This may be helpful, though the answer would be a better fit for Stack Overflow. I built a small parser in Perl that does what you want. Shame there's no syntax highlighting here.
#!/usr/bin/perl
use strict; use warnings;
use feature qw(say);
use Data::Dumper;
use Unicode::String;
use utf8;
my $line_no = 1;
# Read stuff from the __DATA__ section as if it were a file,
# one line at a time
while (my $line = <DATA>) {
    # Create a Unicode::String object
    my $us = Unicode::String->new($line);
    # Iterate over the length of the string
    for (my $i = 0; $i < $us->length; $i++) {
        # Get the next char
        my $char = $us->substr($i, 1);
        # Output a description, one line per character
        printf "Line %i, column %i, 0x%x '%s' (%s)\n",
            $line_no,         # line number (starts at 1)
            $i,               # column number (starts at 0)
            $char->ord,       # the ordinal of the char, in hex
            $char->as_string, # the stringified char (as in the input)
            $char->name;      # the character's name
    }
    # Increment line number
    $line_no++;
}
# Below is the DATA section, which can be used as a file handle
__DATA__
This is some very strange unicode stuff right here:
٩(-̮̮̃-̃)۶ ٩(●̮̮̃•̃)۶ ٩(͡๏̯͡๏)۶ ٩(-̮̮̃•̃).
Let's see what this does:
- Read from a file handle line by line (the DATA section can be used like one).
- Create an object that represents a Unicode string from the line.
- Iterate over the characters in that string.
- Output the name, code point and other details of each character.
It's really very straightforward. Maybe you can adapt it to PHP, though I don't know whether there's a handy library around for the character names.
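If Unicode::String isn't available, the same steps can probably be done with core Perl alone. Take this as a rough sketch rather than a drop-in replacement: describe_chars is a helper name I made up for illustration, the sample string is written with \x{...} escapes so the script doesn't depend on how the file is encoded, and the name lookup uses charnames::viacode from the core charnames module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use charnames ();   # core module; gives us charnames::viacode

# describe_chars is a hypothetical helper (not part of the script above).
# It returns one description line per code point in $text.
sub describe_chars {
    my ($text, $line_no) = @_;
    my @out;
    my $col = 0;
    for my $char (split //, $text) {
        my $code = ord $char;
        # viacode returns undef when a code point has no name
        my $name = charnames::viacode($code);
        $name = '<unnamed>' unless defined $name;
        push @out, sprintf "Line %d, column %d, 0x%x '%s' (%s)",
            $line_no, $col++, $code, $char, $name;
    }
    return @out;
}

# Encode output as UTF-8 so the non-ASCII characters print cleanly
binmode STDOUT, ':encoding(UTF-8)';

# A short smiley: U+0669, '(', U+2022, U+0303, ')', U+06F6
print "$_\n" for describe_chars("\x{669}(\x{2022}\x{303})\x{6f6}", 1);
```

Note that this walks code points, not user-perceived characters, so each combining mark gets its own line, which is what you want when taking a smiley apart.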
Hope it helps.
I lifted the smiley thingies from this question: Which Unicode characters do smilies like ٩(•̮̮̃•̃)۶ consist of?