I have a few questions involving apply(). I often see comments saying that I shouldn't edit or change a DataFrame inside .apply(). Why is this?
Here's a simple use case I have for apply: ping an API and append the results to the DataFrame.
Initial setup:
import pandas as pd
from random import sample
x = pd.DataFrame({'col1':['john','jim','mary'],
                 'col2':['a@gmail.com', 'b@gmail.com', 'c@gmail.com']})
print(x)
   col1         col2
0  john  a@gmail.com
1   jim  b@gmail.com
2  mary  c@gmail.com
Fake API call that returns a random result from a list:
mylist = ['valid','invalid']
def api(email):
    # sample() returns a one-element list, so unwrap it to get a plain string
    return sample(mylist, 1)[0]
The function I pass to apply takes the email, feeds it to the API, parses the JSON, then appends the result to the row:
def myfun(row):
    email = row['col2']
    # fake API call
    api_response = api(email)
    # NOTE: THIS WOULD BE WHERE I PARSE THE JSON
    # if email is valid
    if api_response == 'valid':
        # append status
        row['status'] = 'success'
        # append some other data
        row['other_data'] = 'api_check_done'
        #return the row
        return row
    # otherwise fail status
    else:
        row['status'] = 'fail'
        row['other_data'] = 'api_check_done'
        #return the row
        return row
# apply the function to each row
x.apply(myfun, axis=1)
   col1         col2 status      other_data
0  john  a@gmail.com   fail  api_check_done
1   jim  b@gmail.com   fail  api_check_done
2  mary  c@gmail.com   fail  api_check_done
It seems to work fine.
So I am wondering, what is the problem with this, and is there a better way to do it?
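For comparison, here is a rough sketch of the kind of alternative I have seen suggested: compute the new values from the email column first, then attach them as new columns, so the function never modifies a row. The helper name check_email and the use of random.choice for the fake API are my own assumptions for illustration, and I'm not certain this is the recommended pattern.

```python
import pandas as pd
from random import choice

# Same toy data as in the setup above.
x = pd.DataFrame({'col1': ['john', 'jim', 'mary'],
                  'col2': ['a@gmail.com', 'b@gmail.com', 'c@gmail.com']})

def api(email):
    # Stand-in for the real API call; returns a single string.
    return choice(['valid', 'invalid'])

def check_email(email):
    # Hypothetical helper: returns the new values and does not touch any row.
    response = api(email)
    status = 'success' if response == 'valid' else 'fail'
    return {'status': status, 'other_data': 'api_check_done'}

# One API call per email; the results are collected in a list of dicts.
results = [check_email(email) for email in x['col2']]

# Build the new columns on the same index and join them into a new frame.
new_cols = pd.DataFrame(results, index=x.index)
result = x.join(new_cols)
print(result)
```

Here the original x keeps its two columns, and result holds the original columns plus status and other_data.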
