I have a C# program whose pseudocode is shown below. It runs correctly, but it takes 5 days to complete, and I need to run it every day with new criteria. We are trying to tag every flower name found in biology books. Each day the input is a different textbook or journal in .txt format. We are using Lucene search to make it faster. We have a list of FlowerFamilyIDs in the database, and a CSV file of FlowerID|FlowerCommonName pairs with about 10,000 entries.
Step 1// Get the FlowerFamilyIDs from SQL Server and dump them to a text file. This has about 100 entries:
FlowerFamilyID  FamilyName
1               Acanthaceae
2               Agavaceae
Step 2// Read the CSV file of flowerid|flowercommonname (about 10,000 entries) and store the entries in a list, e.g.:
1|Rose
2|American water willow
3|false aloe
Step 3// Create a Lucene index on that day's book/journal.
Step 4// For every FlowerFamilyID in the text file, call searchText(flowerfamilyID, flower list), which returns all flowers found in that book/journal.
Step 5// In the search function, call the Lucene query parser and search for each of the 10,000 flower entries; if one is found, store the first hit with its score in a list.
    // VERSION, analyzer, luceneIndexDirectory, indexWriter and MINSCORE are
    // defined elsewhere in the class (not shown).
    public static List<flowerResult> searchText(String flowerfamilyid, List<flower> flowers)
    {
        List<flowerResult> results = new List<flowerResult>();
        // Characters used to decide whether a name has more than one word.
        string[] separators = { ",", ".", "!", "?", ";", ":", " " };
        foreach (var flower in flowers)
        {
            string value = flower.getFlower().Trim().ToLower();
            // Line breaks inside a name would break the query string.
            value = value.Replace("\r", " ").Replace("\n", " ");
            if (string.IsNullOrEmpty(value))
                continue;
            string[] words = value.Split(separators, StringSplitOptions.RemoveEmptyEntries);
            // Multi-word names are searched as a quoted phrase.
            string criteria = words.Length > 1 ? "\"" + value + "\"" : value;
            try
            {
                // Parsing happens inside the try so that ParseException is actually caught.
                QueryParser queryParser = new QueryParser(VERSION, "body", analyzer);
                Query query = queryParser.Parse(" +body:" + criteria);
                // A new reader and searcher are opened for every flower name.
                IndexReader reader = IndexReader.Open(luceneIndexDirectory, true);
                Searcher indexSearch = new IndexSearcher(reader);
                TopDocs hits = indexSearch.Search(query, 1);
                if (hits.TotalHits > 0)
                {
                    // Keep only the best hit, and only if it scores above the threshold.
                    float score = hits.ScoreDocs[0].Score;
                    if (score > MINSCORE)
                    {
                        flowerResult result = new flowerResult(flower.getId(), flower.getFlower(), score);
                        results.Add(result);
                    }
                }
                indexSearch.Dispose();
                reader.Dispose();
                indexWriter.Dispose();
            }
            catch (ParseException e)
            {
                // "Could not parse article. Details: " + e.Message
            }
        }
        return results;
    }
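    public class flower
    {
        public long flowerID { get; set; }
        public string familyname { get; set; }
        // Common name. A member cannot share the name of its enclosing class in C#.
        public string name { get; set; }
        // Accessors used by searchText.
        public long getId() { return flowerID; }
        public string getFlower() { return name; }
    }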
I tried running this, and it completed in 5 days. But I need it to finish within a day because the results are used for further analysis. So I split the CSV file into 10 different files, and the job completed in 2 days. My team leader told me to use multiple threads to improve the speed, but I have no clue how to do that. Can somebody help me?
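
A minimal sketch of what "multiple threads" could look like for the per-flower loop, using Parallel.ForEach from the Task Parallel Library. It is not the program above: the names ParallelSearchSketch, SearchAll, lookup and maxThreads are hypothetical, and lookup is a stand-in for a single Lucene search that returns the top score or null when nothing is found. In the real program it would presumably wrap one IndexSearcher opened once before the loop, on the assumption (which Lucene's documentation supports) that a searcher can be shared between threads.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelSearchSketch
{
    // names:      flower names to look up
    // lookup:     hypothetical stand-in for one Lucene search; returns the
    //             top score for a name, or null when there is no hit
    // minScore:   hits at or below this score are discarded
    // maxThreads: upper bound on the number of concurrent lookups
    public static List<KeyValuePair<string, float>> SearchAll(
        IEnumerable<string> names,
        Func<string, float?> lookup,
        float minScore,
        int maxThreads)
    {
        // ConcurrentBag can be added to from several threads without locking.
        var results = new ConcurrentBag<KeyValuePair<string, float>>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        Parallel.ForEach(names, options, name =>
        {
            float? score = lookup(name);
            if (score.HasValue && score.Value > minScore)
                results.Add(new KeyValuePair<string, float>(name, score.Value));
        });

        // Completion order is not deterministic, so sort for a stable result.
        return results.OrderBy(r => r.Key, StringComparer.Ordinal).ToList();
    }
}
```

Calling SearchAll(flowerNames, name => TopScoreOrNull(name), MINSCORE, Environment.ProcessorCount) would run the lookups on up to one thread per core, where TopScoreOrNull is again a hypothetical wrapper around the existing query code. Whether this alone brings the run under a day depends on how much of the time goes into reopening the index for every name rather than into the searches themselves.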
Thanks R
 
    