Let's say I have an empty DataFrame, already set up with columns but no rows. I'm scraping some data from the web, so let's say I need to add a row with index '2176' to the empty DataFrame. How can I have this row added to the DataFrame automatically when I try to assign to it? Is this even what pandas is meant for, or should I use something else?
        Cleb
        
 
    
    
        Hexagon789
        
2 Answers
10
            
            
As an alternative to .loc, you might want to consider .at. Using @NickBraunagel's example:
import pandas as pd
df = pd.DataFrame(columns=['foo1','foo2'])
Then
df.at['2716', 'foo1'] = 10
yields
     foo1 foo2
2716   10  NaN
Timings are quite different:
# @NickBraunagel's solution
%timeit df.loc['2716', 'foo1'] = 10
1000 loops, best of 3: 212 µs per loop
# the at solution
%timeit df.at['2716', 'foo1'] = 10
100000 loops, best of 3: 12.5 µs per loop
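The %timeit lines above only work inside IPython or Jupyter. As a rough sketch for reproducing the comparison in a plain Python script, the standard-library timeit module can be used instead. The helper names set_with_loc and set_with_at and the loop count n are made up for this illustration, and the absolute numbers will depend on the pandas version and the machine.

```python
import timeit

import pandas as pd

df = pd.DataFrame(columns=['foo1', 'foo2'])


def set_with_loc():
    # Label-based assignment through .loc
    df.loc['2716', 'foo1'] = 10


def set_with_at():
    # Scalar assignment through .at
    df.at['2716', 'foo1'] = 10


# Create the row once, so both timings measure overwriting an existing cell.
set_with_loc()

n = 1000  # arbitrary number of repetitions, chosen for illustration
loc_time = timeit.timeit(set_with_loc, number=n) / n
at_time = timeit.timeit(set_with_at, number=n) / n

print(f"loc: {loc_time * 1e6:.1f} µs per call")
print(f"at:  {at_time * 1e6:.1f} µs per call")
```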
If you want to add several column entries at the same time, you can do:
d = {'foo1': 20, 'foo2': 10}
df.at['1234', :] = d
yielding
     foo1 foo2
2716   10  NaN
1234   20   10
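The pandas documentation describes .at as an accessor for a single value, so passing a slice to it may not be accepted by every pandas version. A minimal sketch of the same row insertion through .loc, assuming the dict is wrapped in a Series so that pandas aligns its keys with the column labels:

```python
import pandas as pd

df = pd.DataFrame(columns=['foo1', 'foo2'])
df.loc['2716', 'foo1'] = 10

d = {'foo1': 20, 'foo2': 10}
# Wrapping the dict in a Series lets pandas match its keys to the columns;
# assigning to a label that does not exist yet appends a new row.
df.loc['1234'] = pd.Series(d)

print(df)
```

This trades the speed of .at for an accessor that is meant for whole rows.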
However, make sure to always add the same datatype to avoid errors or other undesired effects as explained here.
 
    
    
        Cleb
        
- Good call, assuming you're only updating one value/cell at a time (which works for this example). For reference: https://stackoverflow.com/a/37216587/4245462 – NickBraunagel Dec 30 '17 at 02:59
- @NickBraunagel: I guess this assumption is valid as OP was talking about single rows. Thanks for the reference! – Cleb Dec 30 '17 at 03:04
6
import pandas as pd
df = pd.DataFrame(columns=['foo1','foo2'])
df.loc[2176,'foo1'] = 'my_value'
df is then:
        foo1        foo2
2176    my_value    NaN
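For the scraping use case in the question, the same assignment can be run in a loop, with each new label creating its row on first assignment. The sketch below uses a made-up scraped_values dict as a stand-in for whatever the scraper returns; the labels and values are purely illustrative.

```python
import pandas as pd

# Hypothetical scraped results: index label -> value for column 'foo1'.
scraped_values = {
    2176: 'my_value',
    2177: 'other_value',
}

df = pd.DataFrame(columns=['foo1', 'foo2'])

for idx, value in scraped_values.items():
    # Assigning to a label that is not in the index yet adds the row.
    df.loc[idx, 'foo1'] = value

print(df)
```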
 
    
    
        NickBraunagel
        
- More details: https://github.com/pandas-dev/pandas/issues/2801#issuecomment-17644076 – Tomek C. Apr 29 '21 at 15:47