I'm trying to write a search query to find articles in a database. I would like to take the search string the user enters and generate a specific set of search terms from it. For example, if the user enters "listing of average salaries in germany for 2011", I want to search for the whole string and for partial strings of consecutive words. That is, I want to search for "listing of average salaries" and "germany for 2011", but not for "listing germany 2011".
So far I have this bit of code to generate my search terms:
  $searchString = "listing of average salaries in germany for 2011";
  $searchTokens = explode(" ", $searchString);
  // Start with the full search string as the first term.
  $searchTerms = array($searchString);
  $tokenCount = count($searchTokens);
  // $max is the number of words in the terms built on this pass.
  for($max=$tokenCount - 1; $max>0; $max--) {
      $termA = ""; // the first $max words of the string
      $termB = ""; // the last $max words of the string
      for ($i=0; $i < $max; $i++) {
          $termA .= $searchTokens[$i] . " ";
          $termB .= $searchTokens[($tokenCount-$max) + $i] . " ";
      }
      array_push($searchTerms, $termA);
      array_push($searchTerms, $termB);
  }
  print_r($searchTerms);
and it's giving me this list of terms:
- listing of average salaries in germany for 2011
- listing of average salaries in germany for
- of average salaries in germany for 2011
- listing of average salaries in germany
- average salaries in germany for 2011
- listing of average salaries in
- salaries in germany for 2011
- listing of average salaries
- in germany for 2011
- listing of average
- germany for 2011
- listing of
- for 2011
- listing
- 2011
What I can't work out is how to generate the missing terms:
- of average salaries in germany for
- of average salaries in germany
- average salaries in germany for
- of average salaries in
- average salaries in germany
- salaries in germany for
- etc...
Update
I'm not looking for a "power set", so answers like this or this aren't valid. For example, I do not want these in my list of terms:
- average germany
- listing salaries 2011
- of germany for
I'm looking for consecutive words only.
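To make the requirement concrete: the desired list is every contiguous run of tokens, which for n tokens is n(n+1)/2 terms (36 for the eight-word example above). The following is only a minimal sketch of one way to enumerate them, not a definitive implementation. The helper name `consecutiveTerms` is hypothetical, and splitting on whitespace with `preg_split` is an assumption rather than something taken from the code above.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the original code):
// returns every run of consecutive words, longest runs first.
function consecutiveTerms(string $searchString): array
{
    // Split on any run of whitespace and drop empty tokens.
    $tokens = preg_split('/\s+/', trim($searchString), -1, PREG_SPLIT_NO_EMPTY);
    $count = count($tokens);
    $terms = array();

    // $length is the number of words in a term; $start is the index of its first word.
    for ($length = $count; $length > 0; $length--) {
        for ($start = 0; $start + $length <= $count; $start++) {
            $terms[] = implode(' ', array_slice($tokens, $start, $length));
        }
    }
    return $terms;
}

$searchTerms = consecutiveTerms("listing of average salaries in germany for 2011");
print_r($searchTerms);
```

Because each term is built from a single `array_slice` window, non-consecutive combinations such as "average germany" can never appear. If the search string repeats a word, the result may contain duplicate terms, which `array_unique` would remove.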