What is the best way to rank a sql varchar column by the number (count)/match of words in a parameter, with four distinct unique criteria. This is likely not a trivial question but I am challenged to order the rows based on "best match" using my criteria.
column: description varchar(100) Parameter: @MyParameter varchar(100)
Output with this order preference:
- Exact match (entire string matches) - always first
 - Begins with (decending based on the parameter length of match)
 - Count of words rank with contiguous words ranking higher for the same count of words matched
 - Word(s) match anywhere (not contiguous)
 
Words might NOT match exactly, as partial matches of a word are allowed and likely, lessor value should be applied to partial words for ranking but not critical (pot would match each in: pot, potter, potholder, depot, depotting for instance). Begins with with other word matches should rank higher than those with no subsequent matches but that is not a deal killer/super important.
I would like to have a method to rank where the column "begins with" the value in the parameter. Say I have the following string:
'This is my value string as a test template to rank on.'
I would like to have, in the first case a rank of the column/row where the greatest count of words exist.
And the second to rank based on occurance (best match) at the begining as:
'This is my string as a test template to rank on.' - first
'This is my string as a test template to rank on even though not exact.'-second
'This is my string as a test template to rank' - third
'This is my string as a test template to' - next
'This is my string as a test template' - next etc.
Secondly:(possibly second set/group of data after the first (begins with) - this is desired
I want to rank (sort) the rows by the count of words in the @MyParameter that occur in @MyParameter with a rank where contiguous words rank higher than the same count separate.
Thus for the example string above, 'is my string as shown' would rank higher than 'is not my other string as' due the the "better match" of the contiguous string (words together) with the same count of words.  Rows with a higher match (count of words that occur) would rank descending best match first.
If possible, I would like to do this in a single query.
No row should occur twice in the result.
For performance considerations, there will occur no more than 10,000 rows in the table.
The values in the table are fairly static with little change but not totally so.
I cannot change the structure at this time but would consider that later (like a word/phrase table)
To make this slightly more complicated, the word list is in two tables - but I could create a view for that, but one table results (smaller list) should occur prior to a second, larger data set results given the same match - there will be duplicates from these tables as well as within a table and I only want distinct values. Select DISTINCT is not easy as I want to return one column (sourceTable) which could quite possibly make the rows distinct and in that case only select from the first (smaller) table, but all the other columns DISTINCT is desired (do not consider that column in the "distinct" evaluation.
Psuedo Columns in table:
procedureCode   VARCHAR(50),
description VARCHAR(100), -- this is the sort/evaluation column
category    VARCHAR(50),
relvu       VARCHAR(50),
charge  VARCHAR(15),
active  bit
sourceTable   VARCHAR(50) - just shows which table it comes from of the two
NO unique index exists like an ID column
Matches NOT in a third table to be excluded SELECT * FROM (select * from tableone where procedureCode not in (select procedureCode from tablethree))
UNION ALL
(select * from tabletwo where procedureCode not in (select procedureCode from tablethree))
EDIT: in an attempt to address this I have created a table value paramter like so:
0       Gastric Intubation & Aspiration/Lavage, Treatmen
1       Gastric%Intubation%Aspiration%Lavage%Treatmen
2       Gastric%Intubation%Aspiration%Lavage
3       Gastric%Intubation%Aspiration
4       Gastric%Intubation
5       Gastric
6       Intubation%Aspiration%Lavage%Treatmen
7       Intubation%Aspiration%Lavage
8       Intubation%Aspiration
9       Intubation
10      Aspiration%Lavage%Treatmen
11      Aspiration%Lavage
12      Aspiration
13      Lavage%Treatmen
14      Lavage
15      Treatmen
where the actual phrase is in row 0
Here is my current attempt at this:
CREATE PROCEDURE [GetProcedureByDescription]
(   
        @IncludeMaster  BIT,
        @ProcedureSearchPhrases CPTFavorite READONLY
)
AS
    DECLARE @myIncludeMaster    BIT;
    SET @myIncludeMaster = @IncludeMaster;
    CREATE TABLE #DistinctMatchingCpts
    (
    procedureCode   VARCHAR(50),
    description     VARCHAR(100),
    category        VARCHAR(50),
    rvu     VARCHAR(50),
    charge      VARCHAR(15),
    active      VARCHAR(15),
    sourceTable   VARCHAR(50),
    sequenceSet VARCHAR(2)
    )
    IF @myIncludeMaster = 0
        BEGIN -- Excluding master from search   
          INSERT INTO #DistinctMatchingCpts (sourceTable, procedureCode, description    ,   category  ,charge, active, rvu, sequenceSet
) 
      SELECT DISTINCT sourceTable, procedureCode, description, category ,charge, active, rvu, sequenceSet
          FROM (
                  SELECT TOP 1
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[COMBO])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      ''True'' AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''0CPTMore'' AS sourceTable,
                      ''01'' AS sequenceSet
                  FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [CPTMORE] AS CPT
                      ON CPT.[LEVEL] = PP.[LEVEL]
                  WHERE 
                      (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                      AND CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                  ORDER BY PP.CODE
          UNION ALL
                  SELECT 
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[COMBO])) AS category,
                      LTRIM(RTRIM([CHARGE])) AS charge,
                      ''True'' AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''0CPTMore'' AS sourceTable, 
                      ''02'' AS sequenceSet
                  FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [CPTMORE] AS CPT
                      ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
                  WHERE 
                      (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                      AND CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
          UNION ALL
            SELECT 
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[COMBO])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      ''True'' AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''0CPTMore'' AS sourceTable,
                      ''03'' AS sequenceSet
                  FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [CPTMORE] AS CPT
                      ON CPT.[LEVEL] LIKE ''%'' + PP.[LEVEL] + ''%''
                  WHERE 
                      (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                      AND CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
            ) AS CPTS
            ORDER BY 
                 procedureCode, sourceTable, [description]
        END -- Excluded master from search
    ELSE
        BEGIN -- Including master in search, but present favorites before master for each code
            -- Get matching procedures, ordered by code, source (favorites first), and description.
            -- There probably will be procedures with duplicated code+description, so we will filter
            -- duplicates shortly.
      INSERT INTO #DistinctMatchingCpts (sourceTable, procedureCode, description    ,   category  ,charge, active, rvu, sequenceSet) 
      SELECT DISTINCT sourceTable, procedureCode, description, category ,charge, active, rvu, sequenceSet
          FROM (
                  SELECT TOP 1
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[COMBO])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      ''True'' AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''0CPTMore'' AS sourceTable,
                      ''00'' AS sequenceSet
                FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [CPTMORE] AS CPT
                      ON CPT.[LEVEL] = PP.[LEVEL]
                  WHERE 
                      (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                      AND CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                  ORDER BY PP.CODE
                  UNION ALL
                  SELECT TOP 1
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''2MasterCPT'' AS sourceTable,
                      ''00'' AS sequenceSet
                  FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [MASTERCPT] AS CPT
                      ON CPT.[LEVEL] = PP.[LEVEL]
                  WHERE 
                      CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                  ORDER BY PP.CODE
                  UNION ALL
                  SELECT 
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[COMBO])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      ''True'' AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''0CPTMore'' AS sourceTable,
                      ''01'' AS sequenceSet
                FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [CPTMORE] AS CPT
                      ON CPT.[LEVEL] = PP.[LEVEL]
                  WHERE 
                      (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                      AND CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                  UNION ALL
                  SELECT 
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''2MasterCPT'' AS sourceTable,
                      ''01'' AS sequenceSet
                  FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [MASTERCPT] AS CPT
                      ON CPT.[LEVEL] = PP.[LEVEL]
                  WHERE 
                      CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                  UNION ALL
                  SELECT TOP 1
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[COMBO])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      ''True'' AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''0CPTMore'' AS sourceTable,
                      ''02'' AS sequenceSet
                FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [CPTMORE] AS CPT
                      ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
                  WHERE 
                      (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                      AND CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                  ORDER BY PP.CODE
                  UNION ALL
                  SELECT TOP 1
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''2MasterCPT'' AS sourceTable,
                      ''02'' AS sequenceSet
                  FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [MASTERCPT] AS CPT
                      ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
                  WHERE 
                      CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                  ORDER BY PP.CODE
                  UNION ALL
                  SELECT 
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[COMBO])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      ''True'' AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''0CPTMore'' AS sourceTable,
                      ''03'' AS sequenceSet
                FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [CPTMORE] AS CPT
                      ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
                  WHERE 
                      (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                      AND CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                  UNION ALL
                  SELECT 
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''2MasterCPT'' AS sourceTable,
                      ''03'' AS sequenceSet
                  FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [MASTERCPT] AS CPT
                      ON CPT.[LEVEL] LIKE PP.[LEVEL] + ''%''
                  WHERE 
                      CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                  UNION ALL
                  SELECT 
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[COMBO])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      ''True'' AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''0CPTMore'' AS sourceTable,
                      ''04'' AS sequenceSet
                FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [CPTMORE] AS CPT
                      ON CPT.[LEVEL] LIKE ''%'' + PP.[LEVEL] + ''%''
                  WHERE 
                      (CPT.[COMBO] IS NULL OR CPT.[COMBO] NOT IN (''Editor'',''MOD'',''CATEGORY'',''Types'',''Bundles''))
                      AND CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
                  UNION ALL
                  SELECT 
                      LTRIM(RTRIM(CPT.[CODE])) AS procedureCode, 
                      LTRIM(RTRIM(CPT.[LEVEL])) AS description, 
                      LTRIM(RTRIM(CPT.[CATEGORY])) AS category,
                      LTRIM(RTRIM(CPT.[CHARGE])) AS charge,
                      COALESCE(CASE [ACTIVE] WHEN 1 THEN ''True'' WHEN 0 THEN ''False'' WHEN '''' THEN ''False'' ELSE ''False'' END,''True'') AS active,
                      LTRIM(RTRIM([RVU])) AS rvu,
                      ''2MasterCPT'' AS sourceTable,
                      ''04'' AS sequenceSet
                  FROM 
                    @ProcedureSearchPhrases PP
                    INNER JOIN  [MASTERCPT] AS CPT
                      ON CPT.[LEVEL] LIKE ''%'' + PP.[LEVEL] + ''%''
                  WHERE 
                      CPT.[CODE] IS NOT NULL
                      AND CPT.[CODE] NOT IN (''0'', '''')
                    AND CPT.[CODE] NOT IN (SELECT CPTE.[CODE] FROM CPT AS CPTE WHERE CPTE.[CODE] IS NOT NULL)
             ) AS CPTS 
            ORDER BY 
                 sequenceSet, sourceTable, [description]
        END
        /* Final select - uses artificial ordering from the insertion ORDER BY */
        SELECT procedureCode, description,  category, rvu, charge, active FROM
        ( 
        SELECT TOP 500 *-- procedureCode, description,  category, rvu, charge, active
        FROM #DistinctMatchingCpts
        ORDER BY sequenceSet, sourceTable, description
        ) AS CPTROWS
        DROP TABLE #DistinctMatchingCpts
However, this does NOT meet the criteria of best match on the count of words (as in the row 1 value in the sample) which should match the best (most) words found count from that row.
I have complete control over the form/format of the table value parameter if that makes a difference.
I am returning this result to a c# program if that is useful.