I have a dataset of 130,000 points in the format (x, y). My final goal is to cluster this data using k-means, but to do that I first need to find the optimal number of clusters to pass to the k-means algorithm. How can I apply something like the gap statistic or Levene's test in Python to achieve this?
- Check [this](https://gist.github.com/michiexile/5635273) example using scipy. – Burak Nov 19 '15 at 20:06
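
For the gap statistic part of the question, here is a minimal sketch, assuming NumPy and scikit-learn are available (the linked gist uses scipy instead; this is not a copy of it). The function names `log_wk` and `gap_statistic`, the uniform bounding-box reference distribution, and the defaults `k_max=10` and `n_refs=10` are illustrative choices, not something from the question. Note that Levene's test (`scipy.stats.levene`) compares variances between groups and does not by itself pick a number of clusters, so it is not covered here.

```python
import numpy as np
from sklearn.cluster import KMeans


def log_wk(points, k, seed):
    """Log of the within-cluster sum of squares (KMeans inertia) for k clusters."""
    model = KMeans(n_clusters=k, n_init=5, random_state=seed).fit(points)
    return np.log(model.inertia_)


def gap_statistic(points, k_max=10, n_refs=10, seed=0):
    """Return (chosen k, gap values, standard errors) for k = 1..k_max.

    Assumption: reference datasets are drawn uniformly from the
    bounding box of the data, the simplest of the usual choices.
    """
    rng = np.random.default_rng(seed)
    lo, hi = points.min(axis=0), points.max(axis=0)
    gaps = np.empty(k_max)
    errs = np.empty(k_max)
    for i, k in enumerate(range(1, k_max + 1)):
        # Log dispersion of n_refs uniform reference datasets of the same shape.
        ref_logs = np.array([
            log_wk(rng.uniform(lo, hi, size=points.shape), k, seed)
            for _ in range(n_refs)
        ])
        gaps[i] = ref_logs.mean() - log_wk(points, k, seed)
        errs[i] = ref_logs.std() * np.sqrt(1 + 1 / n_refs)
    # Pick the smallest k with Gap(k) >= Gap(k+1) - s(k+1).
    for i in range(k_max - 1):
        if gaps[i] >= gaps[i + 1] - errs[i + 1]:
            return i + 1, gaps, errs
    return k_max, gaps, errs
```

With the data in an `(n, 2)` array called `points`, something like `k, gaps, errs = gap_statistic(points)` followed by `KMeans(n_clusters=k).fit(points)` would be the intended use. Each value of `k` triggers `n_refs + 1` k-means fits, so with 130,000 points it may be worth estimating `k` on a random subsample first.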
