I am trying to create a 2D array so that I can draw a heatmap with matplotlib.pyplot, similar to the example "A simple categorical heatmap".
I have looked at the solutions in "How to select rows from a DataFrame based on column values?" and "Return single cell value from Pandas DataFrame", but I cannot get them to work for my purpose.
Here is my code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
age = np.unique(ageVehicle['Age'])
vehicle = np.unique(ageVehicle['_Vehicle_Type'])
ageVehicleType = np.array([])
innerList = np.array([])
for i in age:
    for j in vehicle:
        if len(innerList) == len(vehicle) - 1:
            innerList+=(int(ageVehicle.loc[(ageVehicle['_Vehicle_Type'] == j) & (ageVehicle['Age'] == i)]['_Count(vehicle_Type)'].values))
            ageVehicleType.append(innerList)
            innerList = np.array([])
            break
        else:
            innerList+=(int(ageVehicle.loc[(ageVehicle['_Vehicle_Type'] == j) & (ageVehicle['Age'] == i)]['_Count(vehicle_Type)'].values))
fig, ax = plt.subplots()
im = ax.imshow(ageVehicleType)
# We want to show all ticks...
ax.set_xticks(np.arange(len(vehicle)))
ax.set_yticks(np.arange(len(age)))
# ... and label them with the respective list entries
ax.set_xticklabels(vehicle)
ax.set_yticklabels(age)
# Rotate the tick labels and set their alignment.
plt.setp(ax.get_xticklabels(), rotation=45, ha="right",
         rotation_mode="anchor")
fig.tight_layout()
plt.show()
My dataframe ageVehicle has 3 columns: Age, _Vehicle_Type and _Count(vehicle_Type). In the nested loops (for i ... for j), I am trying to build 1D arrays (innerList), which are then combined into a 2D array (ageVehicleType). The age and vehicle lists contain the unique ages and vehicle types in my ageVehicle dataframe.
For example:
age = [8,9,10,11,12,13,14,15,16]
vehicle = ['toyota', 'bmw', 'mazda', 'benz', 'tesla']
_Count(vehicle_Type) is the number of rows for each combination of age and vehicle.
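A note on the row-building step described above: NumPy arrays have no append method, and += on an empty array is an element-wise add over zero elements, so innerList would stay empty instead of growing. A minimal sketch of the same idea with plain Python lists follows. The counts dictionary and its numbers are made up for illustration, and treating a missing combination as 0 is an assumption.

```python
import numpy as np

# += on an empty NumPy array adds element-wise over zero elements,
# so nothing is appended and the array stays empty.
inner = np.array([])
inner += 5
print(len(inner))

# Hypothetical counts keyed by (age, vehicle); the numbers are made up.
age = [8, 9]
vehicle = ["bmw", "toyota"]
counts = {(8, "bmw"): 3, (8, "toyota"): 1, (9, "toyota"): 4}

rows = []
for a in age:
    # Assumption: a combination that never occurs counts as 0,
    # which keeps every row the same length.
    rows.append([counts.get((a, v), 0) for v in vehicle])

# Convert once at the end, when all rows are complete.
grid = np.array(rows)
print(grid)
```

Collecting rows in a list and converting once at the end avoids growing an array inside the loop.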
The 2D array ageVehicleType will essentially hold all possible combinations of age and vehicle in the ageVehicle dataframe. Its values will determine the cell colors on the heatmap.
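One possible way to get such a 2D array without explicit loops is to reshape the long-form dataframe with pandas' pivot_table. This is only a sketch under assumptions: the sample rows below are invented, and filling combinations that never occur with 0 may or may not be what the heatmap should show.

```python
import pandas as pd

# Hypothetical long-form data using the same three column names.
ageVehicle = pd.DataFrame({
    "Age": [8, 8, 9, 10],
    "_Vehicle_Type": ["bmw", "toyota", "toyota", "bmw"],
    "_Count(vehicle_Type)": [3, 1, 4, 2],
})

# Rows become ages and columns become vehicle types; combinations
# that never occur in the data are filled with 0 (an assumption).
table = ageVehicle.pivot_table(
    index="Age",
    columns="_Vehicle_Type",
    values="_Count(vehicle_Type)",
    aggfunc="sum",
    fill_value=0,
)

ageVehicleType = table.to_numpy()   # 2D array of counts
age = table.index.to_numpy()        # row labels
vehicle = table.columns.to_numpy()  # column labels
print(table)
```

The resulting ageVehicleType could then be passed to ax.imshow, with vehicle and age used as the x and y tick labels.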
Questions:
1. The more important question is that I already have the counts (to use for coloring cells on the heatmap) in one of the columns, _Count(vehicle_Type). Is it possible to somehow use this column in my ageVehicle dataframe to build the heatmap, instead of creating the 2D array that contains all combinations of age and vehicle?
2. Should the 2D array ageVehicleType necessarily be a cross-product of all combinations of age and vehicle? If so, the logic of the code may need to be altered.
3. I am getting an error. I'd appreciate your help on how I can rewrite my conditions to resolve this issue:
TypeError                                 Traceback (most recent call last)
<ipython-input-54-4e2a48f8339f> in <module>
     15         else:
     16             innerList+=(int(ageVehicle.loc[(ageVehicle['_Vehicle_Type'] == j) & (ageVehicle['Age'] == i)]\
---> 17                         ['_Count(vehicle_Type)'].values))
     18
TypeError: only size-1 arrays can be converted to Python scalars
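A likely trigger for this TypeError is that int() receives an array whose size is not exactly 1, for example the empty array that .values returns when no row matches a given age and vehicle. The sketch below reproduces that case in isolation. The to_scalar helper is hypothetical, and returning 0 for an empty selection is an assumption. The exact wording of the message may differ between NumPy versions.

```python
import numpy as np

def to_scalar(values):
    """Return the single element of values as an int, or 0 if empty.

    Hypothetical helper; mapping "no matching row" to 0 is an assumption.
    """
    flat = np.asarray(values).ravel()
    if flat.size == 0:
        return 0
    if flat.size > 1:
        raise ValueError("expected at most one matching row")
    return int(flat[0])

# An empty selection, like .values for a combination with no rows.
empty = np.array([], dtype=int)
try:
    int(empty)
    raised = False
except TypeError:
    raised = True
print(raised)

print(to_scalar(empty))
print(to_scalar(np.array([7])))
```

Checking the size before converting makes the missing-combination case explicit instead of failing inside int().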
Thanks in advance.