What's the most effective way to solve the following pandas problem?
Here's a simplified example with some data in a data frame:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,10,size=(10, 4)), columns=['a','b','c','d'], 
                  index=np.random.randint(0,10,size=10))
This data looks like this:
   a  b  c  d
1  0  0  9  9
0  2  2  1  7
3  9  3  4  0
2  5  0  9  4
1  7  7  7  2
6  4  4  6  4
1  1  6  0  0
7  8  0  9  3
5  0  0  8  3
4  5  0  2  4
Now I want to apply some function f to each value in the data frame (the function below, for example) and get a data frame back as the result. The tricky part is that the function I'm applying depends on the index value of the row the cell is in.
def f(cell_val, row_val):
    """some function which needs to know row_val to use it"""
    try:
        return cell_val/row_val
    except ZeroDivisionError:
        return -1
Normally, if I wanted to apply a function to each individual cell in the data frame, I would just pass f to .applymap(). Even if I had to pass in a second argument ('row_val', in this case), as long as that argument were a fixed number I could just write a lambda expression such as lambda x: f(x, i), where i is the fixed number I wanted. However, my second argument varies with the row of the data frame the cell belongs to, and .applymap() only hands the function the cell value, so I can't use it directly.
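To make the fixed-argument case concrete, here is a minimal sketch. The small frame and the value i = 2 are made up for illustration, and the hasattr check is only there because pandas 2.1 and later name the element-wise method DataFrame.map instead of applymap:

```python
import pandas as pd


def f(cell_val, row_val):
    """Same function as in the question."""
    try:
        return cell_val / row_val
    except ZeroDivisionError:
        return -1


# Small deterministic frame (illustrative data, not the random frame above).
df = pd.DataFrame({"a": [0, 2, 9], "b": [0, 2, 3]}, index=[1, 0, 3])

# pandas >= 2.1 calls the element-wise method DataFrame.map;
# older versions only have DataFrame.applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap

i = 2  # fixed second argument, the same for every cell
fixed = elementwise(lambda x: f(x, i))
print(fixed)
```

This works only because i is the same for every cell; the lambda never sees which row it is being called for.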
How would I go about solving a problem like this efficiently? I can think of a few ways to do this, but none of them feel "right". I could:
- loop through each individual value and replace them one by one, but that seems really awkward and slow.
- create a completely separate data frame containing (cell value, row value) tuples and use the built-in pandas applymap() on my tuple data frame. But that seems pretty hacky, and I'm also creating a completely separate data frame as an extra step.
There must be a better solution than these (a fast one would be appreciated, because my data frame could get very large).
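For concreteness, the two workarounds I have in mind would look roughly like the sketch below. The small frame is illustrative, and the names looped, pairs and via_pairs are just placeholders. One assumption worth flagging: values are converted to plain Python ints before calling f, because NumPy integers return inf or nan with a warning on division by zero instead of raising ZeroDivisionError.

```python
import pandas as pd


def f(cell_val, row_val):
    """Same function as in the question."""
    try:
        return cell_val / row_val
    except ZeroDivisionError:
        return -1


# Small deterministic frame (illustrative data); note the index value 0.
df = pd.DataFrame({"a": [0, 2, 9], "b": [0, 2, 3]}, index=[1, 0, 3])

# Workaround 1: loop over every cell by position and replace it.
# Positional access (iloc) also copes with duplicate index labels.
looped = df.astype(float)
for row_pos, row_val in enumerate(df.index):
    for col_pos in range(df.shape[1]):
        cell_val = int(df.iloc[row_pos, col_pos])  # plain Python int
        looped.iloc[row_pos, col_pos] = f(cell_val, int(row_val))

# Workaround 2: build a second frame of (cell value, row value) tuples,
# then apply f element-wise to the tuples.
row_vals = df.index.tolist()  # tolist() yields plain Python ints
pairs = pd.DataFrame(
    {col: list(zip(df[col].tolist(), row_vals)) for col in df.columns},
    index=df.index,
)
# pandas >= 2.1 calls the element-wise method DataFrame.map;
# older versions only have DataFrame.applymap.
elementwise = pairs.map if hasattr(pairs, "map") else pairs.applymap
via_pairs = elementwise(lambda pair: f(*pair))

print(looped)
print(via_pairs)
```

Both produce the same values, but each calls f once per cell in Python, which is exactly the cost I would like to avoid on a large frame.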