Sample Data:
| id | start_to_end_date | attribute | val | 
|---|---|---|---|
| 1395287 | 2019-01-01 - 2019-12-31 | CostOfGoodsAndServicesSold | 49.369 | 
| 1395288 | 2020-01-01 - 2020-12-31 | CostOfGoodsAndServicesSold | 75.476 | 
| 1395289 | 2020-04-01 - 2020-06-30 | CostOfGoodsAndServicesSold | 17.69 | 
| 1395290 | 2020-07-01 - 2020-09-30 | CostOfGoodsAndServicesSold | 19.056 | 
| 1395291 | 2021-01-01 - 2021-12-31 | CostOfGoodsAndServicesSold | 137.292 | 
If some of you are familiar with financial statements and Index Match in Excel, I would basically like to recreate what an Income Statement looks like.
I have all my unique column_names in a list. This current order is important:
column_names = ['2020-04-01 - 2020-06-30',
 '2020-07-01 - 2020-09-30',
 '2021-01-01 - 2021-03-31',
 '2021-04-01 - 2021-06-30',
 '2021-07-01 - 2021-09-30',
 '2022-01-01 - 2022-03-31',
 '2019-01-01 - 2019-12-31',
 '2020-01-01 - 2020-12-31',
 '2021-01-01 - 2021-12-31']
I also have all my unique row names in a list. This current order is important:
row_names = ['RevenueFromContractWithCustomerExcludingAssessedTax',
 'CostOfGoodsAndServicesSold',
 'GrossProfit',
 'ResearchAndDevelopmentExpense']
The final dataframe would have the above column_names going across, the above row_names going down (as the index).
The values that would fill in dataframe would be from df['val'], retrieved by matching df['attribute'] == column_names[i] and df['start_to_end_date'] == row_names[i].
How would I do this?
- I tried to create an empty Dataframe first with the 2 lists but there is no place to query for df['val']:
pd.DataFrame(row_names , columns=column_names, index=[i[0] for i in row_names ])
- I'm imagining something in the end would look like:
df_new = np.where((df['attribute'] == column_names) & (df['start_to_end_date'] == row_names), df['val'], None).
Ideal End result:
| 2020-04-01 - 2020-06-30 | 2020-07-01 - 2020-09-30 | 2021-01-01 - 2021-03-31 | 2021-04-01 - 2021-06-30 | |
|---|---|---|---|---|
| RevenueFromContractWithCustomerExcludingAssessedTax | ##.## from df[val] | ##.## | ##.## | ##.## | 
| CostOfGoodsAndServicesSold | 17.69 | 19.056 | ##.## | ##.## | 
| GrossProfit | ##.## from df[val] | ##.## | ##.## | ##.## | 
