I am using uthash.h as a hash table implementation in C, for a basic word count exercise. I have a file containing words and I have to count the frequency of each word. The implementation of uthash.h requires me to generate an integer id for each entry, and I wanted to calculate a unique integer corresponding to each string. I tried using the MD5 hash algorithm, but it generates strings of digits and letters, so it's no use. Can anybody suggest such an algorithm?
        Love Bisaria
        
 
- 
                    A good implementation of the MD5 hash should be able to give you the raw 16-byte array. Split this into four 32-bit integers and XOR them together. That alphanumeric string is just a convenient representation for displaying the hash. – Gareth A. Lloyd Feb 20 '15 at 21:43
 - 
                    See http://stackoverflow.com/questions/16521148/string-to-unique-integer-hashing and http://stackoverflow.com/questions/1010875/string-to-integer-hashing-function-with-precision. – Mihai8 Feb 20 '15 at 21:44
 - 
                    @user1929959, the second link that you mentioned has hashing functions that return unsigned long values, but in the `uthash.h` implementation the id needs to be an integer. I am not sure whether this will work or not. I will try this approach and post my results once done. In the meantime, if you have any more suggestions, please post them. – Love Bisaria Feb 20 '15 at 21:56
 - 
                    I hear [murmur3](http://en.wikipedia.org/wiki/MurmurHash) is pretty good for strings. – Niklas B. Feb 20 '15 at 22:18
 - 
                    And please don't use md5 or any other cryptographic hash function for this. Their computation is *much* slower than good non-cryptographic hash functions. – Niklas B. Feb 20 '15 at 22:19
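
The folding step suggested in the first comment can be sketched as follows. This is a minimal sketch that assumes the 16 raw digest bytes are already available from whatever MD5 library is in use; the name `fold_digest32` and the little-endian byte order are illustrative choices, not part of any library.

```c
#include <stdint.h>

/* Fold a raw 16-byte digest into one 32-bit value by XOR-ing its four
   32-bit words. Each word is assembled byte by byte (little-endian, an
   arbitrary choice for this sketch) so the result does not depend on the
   byte order of the host machine. fold_digest32 is a hypothetical name. */
static uint32_t fold_digest32(const unsigned char digest[16])
{
    uint32_t folded = 0;
    for (int word = 0; word < 4; word++) {
        const unsigned char *p = digest + 4 * word;
        uint32_t w = (uint32_t)p[0]
                   | ((uint32_t)p[1] << 8)
                   | ((uint32_t)p[2] << 16)
                   | ((uint32_t)p[3] << 24);
        folded ^= w;
    }
    return folded;
}
```

Folding keeps only 32 of the 128 digest bits, so two different words can still end up with the same integer.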
 
1 Answer
            Use Robert Sedgewick's algorithm for hashing:
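    /* Multiplicative string hash; str need not be NUL-terminated, len is its byte count. */
    unsigned int GenerateHash(const char *str, unsigned int len)
    {
       unsigned int result = 0;
       unsigned int b    = 378551;  /* factor applied to the multiplier each step */
       unsigned int a    = 63689;   /* initial multiplier */
       unsigned int i    = 0;
       for (i = 0; i < len; str++, i++)
       {
          /* Cast so bytes above 0x7F are not sign-extended where char is signed. */
          result = result * a + (unsigned char)(*str);
          a = a * b;
       }
       return result;
    }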
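
A 32-bit hash like this cannot be unique for every possible string, so a word counter still has to compare the strings themselves when two words land on the same value. The following is a minimal sketch of that idea, assuming a fixed-size chained table; `NBUCKETS`, `struct word_entry`, `rs_hash`, `wc_add` and `wc_get` are hypothetical names for illustration and are not part of uthash. Note also that uthash itself provides string-key macros (`HASH_ADD_STR`, `HASH_FIND_STR`), which may remove the need for an integer id altogether.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024u  /* hypothetical table size, chosen for illustration */

struct word_entry {
    char *word;               /* owned copy of the key */
    unsigned int count;       /* occurrences seen so far */
    struct word_entry *next;  /* next entry sharing this bucket */
};

static struct word_entry *buckets[NBUCKETS];

/* Same multiplicative scheme as GenerateHash, applied to a NUL-terminated
   string, with each byte read as unsigned char. */
static unsigned int rs_hash(const char *str)
{
    unsigned int result = 0, a = 63689, b = 378551;
    for (; *str != '\0'; str++) {
        result = result * a + (unsigned char)*str;
        a *= b;
    }
    return result;
}

/* Increment the count for word, creating the entry on first sight.
   Returns the new count, or 0 if memory allocation fails. */
static unsigned int wc_add(const char *word)
{
    unsigned int slot = rs_hash(word) % NBUCKETS;
    for (struct word_entry *e = buckets[slot]; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0)   /* the hash only picks the bucket */
            return ++e->count;

    struct word_entry *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return 0;
    size_t n = strlen(word) + 1;
    fresh->word = malloc(n);
    if (fresh->word == NULL) {
        free(fresh);
        return 0;
    }
    memcpy(fresh->word, word, n);
    fresh->count = 1;
    fresh->next = buckets[slot];
    buckets[slot] = fresh;
    return 1;
}

/* Return the count recorded for word, or 0 if it was never added. */
static unsigned int wc_get(const char *word)
{
    unsigned int slot = rs_hash(word) % NBUCKETS;
    for (struct word_entry *e = buckets[slot]; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0)
            return e->count;
    return 0;
}
```

Calling `wc_add` once per word read from the file and then `wc_get` for any word gives its frequency; correctness does not depend on the hash being collision-free, only the speed does.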
        mprivat
        