On my PHP site, users currently log in with an email address and a password. I would like to add a username as well; the username they set will be unique, and they cannot change it. How can I make this name contain no spaces and work in a URL, so that I can use their username to link to their profiles and other pages? If there is a space in their username, it should be replaced with an underscore, e.g. jason_davis. I am not sure of the best way to do this.
        - 
                    There are plenty of questions like this. Didn’t you get an answer by searching? – Gumbo Jan 20 '10 at 18:21
- 
                    @Gumbo I searched SO, not Google. Possibly not the correct term, but I did search for "URL friendly username" without much luck. I didn't know it was called a slug before this. – JasonDavis Jan 20 '10 at 18:27
- 
                    Maybe not everyone is trying to convert usernames. But searching for “URL friendly string” is returning usable results. – Gumbo Jan 20 '10 at 19:04
- 
                    Similar: http://stackoverflow.com/questions/5305879 – GG. Jan 29 '12 at 01:12
- 
                    Nowadays, you can use libraries like https://github.com/cocur/slugify or https://github.com/ausi/slug-generator to achieve that. – ausi Oct 30 '17 at 22:14
2 Answers
103
function Slug($string)
{
    // convert to entities
    $string = htmlentities( $string, ENT_QUOTES, 'UTF-8' );
    // regex to convert accented chars into their closest a-z ASCII equivalent
    $string = preg_replace( '~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i', '$1', $string );
    // convert back from entities
    $string = html_entity_decode( $string, ENT_QUOTES, 'UTF-8' );
    // any straggling characters that are not strictly alphanumeric are replaced with a dash
    $string = preg_replace( '~[^0-9a-z]+~i', '-', $string );
    // trim / cleanup / all lowercase
    $string = trim( $string, '-' );
    $string = strtolower( $string );
    return $string;
}
$user = 'Alix Axel';
echo Slug($user); // alix-axel
$user = 'Álix Ãxel';
echo Slug($user); // alix-axel
$user = 'Álix----_Ãxel!?!?';
echo Slug($user); // alix-axel
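The question asked for underscores (`jason_davis`) rather than dashes. Below is a minimal sketch of the same entity-based approach with a configurable separator. The function name `slugWithSeparator` and its `$separator` parameter are hypothetical and not part of the answer above; the sketch assumes valid UTF-8 input and a separator without regex replacement metacharacters (`$` or `\`).

```php
<?php
// Hypothetical variant of Slug() with a configurable separator.
// Assumes $string is valid UTF-8 and $separator contains no "$" or "\".
function slugWithSeparator($string, $separator = '_')
{
    // named entities expose the base letter of accented characters (&Aacute; -> A)
    $string = htmlentities($string, ENT_QUOTES, 'UTF-8');
    $string = preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i', '$1', $string);
    $string = html_entity_decode($string, ENT_QUOTES, 'UTF-8');
    // every run of non-alphanumeric characters becomes a single separator
    $string = preg_replace('~[^0-9a-z]+~i', $separator, $string);
    // drop leading/trailing separators and lowercase the result
    return strtolower(trim($string, $separator));
}

echo slugWithSeparator('Jason Davis'), "\n";       // jason_davis
echo slugWithSeparator('Álix----_Ãxel!?!?'), "\n"; // alix_axel
```

Whichever separator is chosen, it is probably best kept fixed, since changing it later would change existing profile URLs.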
 
    
    
        squarecandy
        Alix Axel
- 
                    This is dangerous! Multiple unique user names can map to the same URL. That's not what you want, is it? Consider, e.g., `AB` and `ab`, which are unique strings but map to the same slug string. You should store the slug as the identifier. – John Feminella Jan 20 '10 at 18:18
- 
                    @John Feminella: He would obviously have to check for duplicates at some point before storing the slug. – Pekka Jan 20 '10 at 18:19
- 
                    Perfect, thank you. BTW, what does this part look for: `acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml`? – JasonDavis Jan 20 '10 at 18:19
- 
                    As a slight improvement, using `iconv()` to convert to `ASCII//TRANSLIT` would probably catch a lot more chars. – Frank Farmer Jan 20 '10 at 18:22
- 
                    @Pekka: Thanks! =) @JasonDavis: It removes accents; instead of keeping a huge lookup table, we convert to HTML entities and take the unaccented character from the entity name. – Alix Axel Jan 20 '10 at 18:22
- 
                    @Frank Farmer: `iconv('UTF-8', 'ASCII//TRANSLIT', 'Álix Ãxel')` returns `'Alix ~Axel'`. That is with PHP 5.3.0 and a `.php` file encoded as UTF-8 with **no** BOM. – Alix Axel Jan 20 '10 at 18:25
- 
                    Anyone know why Á and Ã would cause no output or error to occur when using this function? – John Conde Jan 20 '10 at 18:42
- 
                    I'll answer my own question: remove the "'UTF-8'" parameter from htmlentities. That did the trick. – John Conde Jan 20 '10 at 19:36
- 
                    @John Conde: I thought you were talking about the `iconv()` function! You shouldn't remove the `UTF-8` from `htmlentities`; instead, you should **save all your `.php` files encoded as UTF-8 no BOM**. – Alix Axel Jan 20 '10 at 19:44
- 
                    Alix, thanks for the info. I restored the "UTF-8" parameter, saved the file as UTF-8, and it worked like a charm. – John Conde Jan 20 '10 at 20:17
- 
                    @John Conde: No problem! ;) You should always save your files UTF-8 encoded. – Alix Axel Jan 20 '10 at 21:07
- 
                    This function does not convert Polish chars "ąćęłńóśźż" => "acelnoszz", because they have no named HTML entities (only numeric representations). You still need to replace those with a table. Paste the code below before `return` in the function: `$string = strtr(mb_strtolower($string), array('ą'=>'a','ć'=>'c','ę'=>'e','ł'=>'l','ń'=>'n','ó'=>'o','ś'=>'s','ź'=>'z','ż'=>'z'));` – s3m3n Aug 05 '12 at 21:22
- 
                    @s3m3n: Indeed, not only Polish characters but others as well (Chinese, Japanese, Turkish, Arabic, ...). – Alix Axel Aug 05 '12 at 21:36
- 
                    After a few more tests, iconv did the trick, as someone wrote above. `setlocale(LC_ALL, "en_US.utf8"); $string = iconv("UTF-8", "ascii//TRANSLIT", '>ĄĘŁŹÓżół<');` gives me `>AELZOzol<` – s3m3n Aug 05 '12 at 23:31
- 
                    Codepad demo of the `Slug()` function, with a second identical but spaced out `nSlug()` function (for the eyeball impaired): http://codepad.org/rJNSQmGJ – Jared Farrish Dec 13 '12 at 08:02
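The `iconv()` suggestion from the comments above can be sketched as follows. This is a hedged illustration, not the accepted answer's method: the function name `slugTranslit` is hypothetical, the sketch assumes the `iconv` extension is loaded, and, as the differing results reported in the comments show, what `//TRANSLIT` produces depends on the platform's iconv implementation and locale.

```php
<?php
// Hypothetical helper; the name slugTranslit() is not from the thread.
function slugTranslit($string)
{
    // Assumption: on glibc the transliteration table follows the LC_CTYPE locale.
    // setlocale() returns false and changes nothing if the locale is not installed.
    setlocale(LC_CTYPE, 'en_US.utf8');
    // @ silences the notice iconv() raises for characters it cannot convert
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $string);
    if ($ascii === false) {
        // conversion failed: fall back to the raw input and let the regex strip it
        $ascii = $string;
    }
    // anything that is not a-z or 0-9 (including leftover bytes) becomes a dash
    $ascii = preg_replace('~[^0-9a-z]+~i', '-', $ascii);
    return strtolower(trim($ascii, '-'));
}

echo slugTranslit('Alix Axel'), "\n";  // alix-axel (plain ASCII passes through iconv unchanged)
echo slugTranslit('>ĄĘŁŹÓżół<'), "\n"; // result depends on the platform's iconv and locale
```

Because the transliteration step is platform-dependent, it seems worth testing on the production server, or computing the slug once at registration and storing it rather than recomputing it on every request.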
3
            
            
        In other words, you need to create a username slug. Doctrine (an ORM for PHP) has a nice function for this: `Doctrine_Inflector::urlize()`.
EDIT: You should also store the username slug in the database, in a column with a unique key. Every lookup should then be done against that column, not the original username.
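A minimal sketch of the unique-slug-column idea, using an in-memory SQLite database through PDO so that it runs on its own. The table name, the column names, and the `registerUser()` helper are assumptions for illustration, and the sketch assumes the `pdo_sqlite` extension is available; a real site would use its existing users table and database driver.

```php
<?php
// Sketch only: an in-memory SQLite table stands in for the real users table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (username TEXT NOT NULL, slug TEXT NOT NULL UNIQUE)');

// Hypothetical helper: returns true if the user was stored,
// false if another user already owns the same slug.
function registerUser(PDO $pdo, $username, $slug)
{
    try {
        $stmt = $pdo->prepare('INSERT INTO users (username, slug) VALUES (?, ?)');
        $stmt->execute(array($username, $slug));
        return true;
    } catch (PDOException $e) {
        // SQLSTATE class 23 = integrity constraint violation (here: the UNIQUE slug)
        if (substr((string) $e->getCode(), 0, 2) === '23') {
            return false;
        }
        throw $e;
    }
}

var_dump(registerUser($pdo, 'Alix Axel', 'alix-axel'));   // bool(true)
var_dump(registerUser($pdo, 'ALIX---AXEL', 'alix-axel')); // bool(false): same slug

// Profile lookups go through the slug column, not the display name.
$stmt = $pdo->prepare('SELECT username FROM users WHERE slug = ?');
$stmt->execute(array('alix-axel'));
echo $stmt->fetchColumn(), "\n"; // Alix Axel
```

Letting the unique key reject duplicates, instead of running a separate "does it exist?" query first, avoids a race between two simultaneous registrations.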
 
    
    
        Crozin
- 
                    Broken link... can now be found here: https://github.com/doctrine/inflector/blob/2.0.x/lib/Doctrine/Inflector/Inflector.php – squarecandy Dec 21 '22 at 22:36
- 
                    Though it's worth noting that this takes the "keep a comprehensive list of all characters to replace" approach, which is much harder to maintain than the approach of the accepted answer. – squarecandy Dec 21 '22 at 22:38
