I have a list of IP addresses in a pandas DataFrame (df). Each IP address is sent in a GET request to the ARIN database using requests, and I want to get the organization or customer associated with that address. I am using a requests Session() inside a requests-futures FuturesSession() in the hope of speeding up the API calls. Here is the code:
import requests
from requests_futures.sessions import FuturesSession

# Reuse one underlying Session so connections are pooled across requests
s = requests.Session()
session = FuturesSession(session=s, max_workers=10)

def getIPAddressOrganization(IP_Address):
    url = 'https://whois.arin.net/rest/ip/' + IP_Address + '.json'
    request = session.get(url)
    response = request.result().json()
    try:
        organization = response['net']['orgRef']['@name']
    except KeyError:
        # No orgRef in the response, so use the customer name instead
        organization = response['net']['customerRef']['@name']
    return organization

df['organization'] = df['IP'].apply(getIPAddressOrganization)
Adding the regular requests Session() helped performance a lot, but the requests-futures FuturesSession() has not helped (likely due to my lack of knowledge).
How should pandas apply() be used in tandem with requests-futures, and/or is there another option for speeding up API calls that could be more effective?
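One likely reason the FuturesSession() has no effect here: apply() calls the function one row at a time, and request.result() blocks until that row's response arrives, so the next request is never sent while the previous one is in flight. A minimal sketch of the alternative, which submits all lookups first and collects the results afterwards, is below. It uses the standard library's ThreadPoolExecutor rather than requests-futures, and the names extract_organization, lookup_organizations, and fetch_json are hypothetical helpers introduced for illustration, not part of any library.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_organization(payload):
    """Return the organization name, falling back to the customer name."""
    net = payload['net']
    try:
        return net['orgRef']['@name']
    except KeyError:
        return net['customerRef']['@name']


def lookup_organizations(ips, fetch_json, max_workers=10):
    """Fetch every unique IP concurrently and return {ip: organization}.

    fetch_json is any callable that takes an IP string and returns the
    parsed JSON response (an assumption made so the sketch stays testable).
    """
    unique_ips = list(dict.fromkeys(ips))  # de-duplicate, keep first-seen order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map submits all calls up front; results come back in input order
        payloads = list(pool.map(fetch_json, unique_ips))
    return {ip: extract_organization(p) for ip, p in zip(unique_ips, payloads)}


# Possible wiring with the Session from the question (not executed here):
#   fetch = lambda ip: s.get('https://whois.arin.net/rest/ip/' + ip + '.json').json()
#   mapping = lookup_organizations(df['IP'], fetch)
#   df['organization'] = df['IP'].map(mapping)
```

De-duplicating the IPs first means repeated addresses cost only one request each, and Series.map() then writes the results back in a single vectorized step instead of a per-row apply().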