I'm doing some web scraping and storing the variables of interest in the form of:
a = {'b':[100, 200],'c':[300, 400]}
This is for one page, where there were two b's and two c's. The next page could have three of each, where I'd store them as:
b = {'b':[300, 400, 500],'c':[500, 600, 700]}
When I create a DataFrame from the list of dicts, I get:
import pandas as pd
df = pd.DataFrame([a, b])
df
                 b                c
0       [100, 200]       [300, 400]
1  [300, 400, 500]  [500, 600, 700]
What I'm expecting is:
df
     b    c
0  100  300
1  200  400
2  300  500
3  400  600
4  500  700
I could create a DataFrame for each page as I store it and concat the list of DataFrames at the end. However, in my experience this is very expensive: constructing thousands of DataFrames costs far more than building a single DataFrame from lower-level objects (i.e., a list of dicts).
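For reference, here is a minimal sketch of the per-page concat approach described above. It assumes every page's dict has lists of equal length under each key, and the names `pages` and `frames` are only for illustration:

```python
import pandas as pd

# Per-page results, as in the examples above
a = {'b': [100, 200], 'c': [300, 400]}
b = {'b': [300, 400, 500], 'c': [500, 600, 700]}
pages = [a, b]  # hypothetical name for the collected pages

# Build one DataFrame per page (each list becomes a column),
# then concatenate them once at the end with a fresh 0..n-1 index.
frames = [pd.DataFrame(page) for page in pages]
df = pd.concat(frames, ignore_index=True)
print(df)
```

This produces the expected five-row layout, but it constructs one DataFrame per page, which is the cost I'd like to avoid.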