Here is my DataFrame:
In [106]: ogl.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000163 entries, 0 to 1000162
Data columns (total 5 columns):
 #   Column                       Non-Null Count    Dtype
---  ------                       --------------    -----
 0   geolocation_zip_code_prefix  1000163 non-null  int64
 1   geolocation_lat              1000163 non-null  float64
 2   geolocation_lng              1000163 non-null  float64
 3   geolocation_city             1000163 non-null  object
 4   geolocation_state            1000163 non-null  object
dtypes: float64(2), int64(1), object(2)
memory usage: 38.2+ MB
It comes from the Brazilian E-Commerce Public Dataset by Olist (olist_geolocation_dataset.csv). Oddly enough, geolocation_city and geolocation_state are not redundant given geolocation_zip_code_prefix: the same prefix can appear with more than one city. For example, row 49285 is "03203",-23.598384873160597,-46.56677381072186,sao paulo,SP while row 51000 is "03203",-23.216648333054426,-46.86137071772756,jundiaí,SP
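A quick way to see how widespread this is would be to count the distinct cities per zip code prefix. The snippet below is only a sketch on a toy stand-in for ogl: the first two rows are the ones quoted above, the third row is made up for illustration. Note that because the column is int64, the prefix "03203" is stored as 3203.

```python
import pandas as pd

# Toy stand-in for ogl. Rows 1 and 2 are quoted in the post;
# row 3 is a made-up example with an unambiguous prefix.
ogl = pd.DataFrame({
    "geolocation_zip_code_prefix": [3203, 3203, 1037],
    "geolocation_lat": [-23.598384873160597, -23.216648333054426, -23.5],
    "geolocation_lng": [-46.56677381072186, -46.86137071772756, -46.6],
    "geolocation_city": ["sao paulo", "jundiaí", "sao paulo"],
    "geolocation_state": ["SP", "SP", "SP"],
})

# Number of distinct city names observed for each zip code prefix.
cities_per_prefix = (
    ogl.groupby("geolocation_zip_code_prefix")["geolocation_city"].nunique()
)

# Prefixes that map to more than one city, i.e. where the prefix
# alone does not determine the city.
ambiguous = cities_per_prefix[cities_per_prefix > 1]
print(ambiguous)
```

On the real data the same two lines (groupby plus nunique) should work unchanged on ogl.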
I am curious how well (geolocation_lat, geolocation_lng) can predict (geolocation_state, geolocation_city, geolocation_zip_code_prefix). Each distinct combination of these three fields can be thought of as a category, such as (03203, sao paulo, SP), which contains a list of (geolocation_lat, geolocation_lng) points, such as [(-23.598384873160597,-46.56677381072186), ...]. I thought this could be done with one-way ANOVA, but I am beginning to doubt it. How would I measure the strength of association, something like Cramér's V, but for predicting categories from quantitative data (the geolocations)?
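To make the setup concrete, here is a minimal sketch of the categories and of the effect size that goes with one-way ANOVA, eta squared (the correlation ratio, between-group sum of squares divided by total sum of squares), computed separately for each coordinate. The toy frame and the helper name eta_squared are made up for illustration. This measures how much of the variance of one coordinate is explained by the category, which is the ANOVA direction and part of why I doubt it answers the prediction question.

```python
import pandas as pd

cols = ["geolocation_zip_code_prefix", "geolocation_city", "geolocation_state"]

# Made-up toy data: two categories with two points each.
toy = pd.DataFrame({
    "geolocation_zip_code_prefix": [1, 1, 2, 2],
    "geolocation_city": ["a", "a", "b", "b"],
    "geolocation_state": ["X", "X", "X", "X"],
    "geolocation_lat": [0.0, 2.0, 10.0, 12.0],
    "geolocation_lng": [1.0, 3.0, 1.0, 3.0],
})

# One integer label per distinct (zip prefix, city, state) combination.
category = toy.groupby(cols, sort=False).ngroup()


def eta_squared(values, labels):
    """Hypothetical helper: between-group SS / total SS for one variable."""
    grand_mean = values.mean()
    ss_total = ((values - grand_mean) ** 2).sum()
    # Replace every observation by the mean of its own group.
    group_means = values.groupby(labels).transform("mean")
    ss_between = ((group_means - grand_mean) ** 2).sum()
    return ss_between / ss_total


eta_lat = eta_squared(toy["geolocation_lat"], category)
eta_lng = eta_squared(toy["geolocation_lng"], category)
print(eta_lat, eta_lng)
```

In this toy example the latitude separates the two categories almost completely while the longitude does not separate them at all, so the two values differ sharply. It treats the two coordinates independently, which is another limitation for 2-D locations.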
