Sorry for bad English for writing the title. What I would like to do is to determine the location of df by comparing with another data frame df2.
df is a data frame that shows the variation of longitude and latitude in a period of time. It is a data frame with more than 40,000 of rows.
    df=
                        Longitude Latitude     
        Time               
        2020-01-01 01:00    100.0   20.0       
        2020-01-01 01:01    100.2   20.1       
        2020-01-01 01:02    300.1   30.0       
        2020-01-01 01:03    200.1   40.0     
        2020-01-01 01:04     50.0   50.0  
df2 is the dataframe that contain the location data. It is a data frame with around 30 rows.
df2=
        Longitude Latitude   Location
    0        90.0   20.0     District A 
    1       210.0   60.0     District B
The method to determine where I am located will be done by Pythagoras theorem. Each row of data in df will compare with every row in df2 to find the shortest distance, which the shortest distance is where I am located.
Take 2020-01-01 01:01 as an example.
First, compare with df2 row 0, the calculation is [(100.2-90.0)^2+(20.1-20.0)^2]^0.5.
Second, compare with df2 row 1, the calculation is [(100.2-210.0)^2+(20.1-60.0)^2]^0.5
Since the distance between the location and District A is smaller than the distance between the location and District B, therefore the location on 2020-01-01 01:01 will be District A. The expected output is given as follows.
df=
                            Longitude Latitude  Location   
            Time               
            2020-01-01 01:00    100.0   20.0    District A   
            2020-01-01 01:01    100.2   20.1    District A   
            2020-01-01 01:02    300.1   30.0    District B   
            2020-01-01 01:03    200.1   40.0    District B 
            2020-01-01 01:04     50.0   50.0    District A
My approach: To get my expected result first I will create a df['Loaction'] column full of NaN, then I will do the calculation and extract the df2['Location'] to substitute the NaN value in df['Loaction'].
However, what I am confused about is how to get the calculation results. Are there any related pandas function or NumPy function that can compare a row from df with multiple rows from df2? Is there a better approach? Thank you!
 
    