Background
Users can type in a name and the system should match the text, even if the either the user input or the database field contains accented (UTF-8) characters. This is using the pg_trgm module.
Problem
The code resembles the following:
  SELECT
    t.label
  FROM
    the_table t
  WHERE
    label % 'fil'
  ORDER BY
    similarity( t.label, 'fil' ) DESC
When the user types fil, the query matches filbert but not filé powder. (Because of the accented character?)
Failed Solution #1
I tried to implement an unaccent function and rewrite the query as:
  SELECT
    t.label
  FROM
    the_table t
  WHERE
    unaccent( label ) % unaccent( 'fil' )
  ORDER BY
    similarity( unaccent( t.label ), unaccent( 'fil' ) ) DESC
This returns only filbert.
Failed Solution #2
As suggested:
CREATE EXTENSION pg_trgm;
CREATE EXTENSION unaccent;
CREATE OR REPLACE FUNCTION unaccent_text(text)
  RETURNS text AS
$BODY$
  SELECT unaccent($1); 
$BODY$
  LANGUAGE sql IMMUTABLE
  COST 1;
All other indexes on the table have been dropped. Then:
CREATE INDEX label_unaccent_idx 
ON the_table( lower( unaccent_text( label ) ) );
This returns only one result:
  SELECT
    t.label
  FROM
    the_table t
  WHERE
    label % 'fil'
  ORDER BY
    similarity( t.label, 'fil' ) DESC
Question
What is the best way to rewrite the query to ensure that both results are returned?
Thank you!
Related
http://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.0#Unaccent_filtering_dictionary
http://postgresql.1045698.n5.nabble.com/index-refuses-to-build-td5108810.html
 
     
    