I'll try to answer my own question based on Broken Link's comment (thank you for this):
You've extracted phrases consisting of 1 to 3 words from your database of documents. Among the extracted phrases are the following:
- Half Blood Prince
- Half-Blood Prince
- Halfblood Prince
 
For each phrase, you strip all special characters and blank spaces and make the string lowercase:
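$phrase = 'Half Blood Prince';
// remove everything that is not a letter (note: this also removes digits)
$phrase = preg_replace('/[^a-z]/i', '', $phrase);
// make the remaining letters lowercase
$phrase = strtolower($phrase);
// result is "halfbloodprince"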
When you've done this, all 3 phrases (see above) have one spelling in common:
- Half Blood Prince => halfbloodprince
- Half-Blood Prince => halfbloodprince
- Halfblood Prince => halfbloodprince
 
So "halfbloodprince" is the parent phrase. For every extracted phrase, you insert both the normal phrase and its parent phrase into your database.
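The normalization step can be wrapped in a small helper so that every phrase is paired with its parent phrase before it is stored. This is only a minimal sketch: the function names `toParentPhrase` and `buildRows` are made up for illustration, and the `phrase` / `parentPhrase` keys are assumed to match the columns of the `phrases` table.

```php
<?php
// Hypothetical helper: reduces a phrase to its parent phrase by
// dropping every non-letter character and lowercasing the rest.
function toParentPhrase(string $phrase): string
{
    return strtolower(preg_replace('/[^a-z]/i', '', $phrase));
}

// Hypothetical helper: pairs each phrase with its parent phrase,
// ready to be inserted as (phrase, parentPhrase) rows.
function buildRows(array $phrases): array
{
    $rows = array();
    foreach ($phrases as $phrase) {
        $rows[] = array(
            'phrase'       => $phrase,
            'parentPhrase' => toParentPhrase($phrase),
        );
    }
    return $rows;
}

$rows = buildRows(array('Half Blood Prince', 'Half-Blood Prince', 'Halfblood Prince'));
foreach ($rows as $row) {
    echo $row['phrase'], ' => ', $row['parentPhrase'], "\n";
}
```

All three spellings end up with the same `parentPhrase`, which is what makes the later `GROUP BY parentPhrase` work.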
To show a "Trending Topics Admin" like Twitter's you do the following:
// first select the top 10 parent phrases
$sql1 = "SELECT parentPhrase, COUNT(*) AS cnt FROM phrases GROUP BY parentPhrase ORDER BY cnt DESC LIMIT 0, 10";
$sql2 = mysql_query($sql1);
while ($sql3 = mysql_fetch_assoc($sql2)) {
    $parentPhrase = $sql3['parentPhrase'];
    $childPhrases = array(); // collects the child phrases of this parent phrase
    $fifthPart = round($sql3['cnt'] * 0.2); // 20% of the parent phrase's occurrences
    // now select all child phrases that account for 20% or more of the parent phrase's occurrences
    $sql4 = "SELECT phrase FROM phrases WHERE parentPhrase = '".mysql_real_escape_string($parentPhrase)."' GROUP BY phrase HAVING COUNT(*) >= ".$fifthPart;
    $sql5 = mysql_query($sql4);
    while ($sql6 = mysql_fetch_assoc($sql5)) {
        $childPhrases[] = $sql6['phrase']; // $sql6 is the current child row
    }
    // now $parentPhrase holds the parent phrase (left side of the arrow)
    // and $childPhrases holds all child phrases (right side of the arrow)
}
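Since the `mysql_*` functions were removed in PHP 7, here is how the same two-step lookup might look with PDO and prepared statements. This is a sketch under assumptions, not a drop-in replacement: it uses an in-memory SQLite database (which needs the `pdo_sqlite` extension) with made-up sample counts in place of the real `phrases` table, and the `$trending` array and the `ORDER BY` on the child query are my own additions.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one,
// with a table phrases(phrase, parentPhrase).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE phrases (phrase TEXT, parentPhrase TEXT)');

// Made-up sample data: 6 + 3 + 1 occurrences of the same parent phrase.
$insert = $pdo->prepare('INSERT INTO phrases (phrase, parentPhrase) VALUES (?, ?)');
$sample = array('Half-Blood Prince' => 6, 'Half Blood Prince' => 3, 'Halfblood Prince' => 1);
foreach ($sample as $phrase => $count) {
    for ($i = 0; $i < $count; $i++) {
        $insert->execute(array($phrase, 'halfbloodprince'));
    }
}

// Top 10 parent phrases by number of occurrences.
$parents = $pdo->query(
    'SELECT parentPhrase, COUNT(*) AS cnt FROM phrases
     GROUP BY parentPhrase ORDER BY cnt DESC LIMIT 10'
);

// Child phrases that make up at least 20% of their parent's occurrences,
// most frequent spelling first.
$children = $pdo->prepare(
    'SELECT phrase FROM phrases WHERE parentPhrase = :parent
     GROUP BY phrase HAVING COUNT(*) >= :minCount ORDER BY COUNT(*) DESC'
);

$trending = array(); // parent phrase => list of child phrases
foreach ($parents->fetchAll(PDO::FETCH_ASSOC) as $parent) {
    $minCount = (int) round($parent['cnt'] * 0.2);
    $children->bindValue(':parent', $parent['parentPhrase'], PDO::PARAM_STR);
    $children->bindValue(':minCount', $minCount, PDO::PARAM_INT);
    $children->execute();
    $trending[$parent['parentPhrase']] = $children->fetchAll(PDO::FETCH_COLUMN);
}

print_r($trending);
```

With the sample data the parent phrase occurs 10 times, so the threshold is 2: "Half-Blood Prince" and "Half Blood Prince" are kept, while "Halfblood Prince" (1 occurrence) is filtered out.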
Is this what you thought of, Broken Link? Would this work?