I have a dataset with 17 features, 10,000 observations, and one column containing integer labels ranging from 1 through 4, so the dataset is 10,000 × 18 (17 features plus one label). I want to create a list of arrays in which each array holds one contiguous block of rows that share the same label. For example, if the first 10 rows are labeled 1,1,1,2,2,3,1,1,1,3, those rows should produce five arrays (rows 1-3, 4-5, 6, 7-9, and 10). I first tried aggregating by label in Pandas, but that does not work because I end up with only four arrays in the list, one per label. Any ideas on how to code this in numpy or pandas?
    1 Answer
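            First get the unique labels, then select the rows for each label and store them in a dictionary keyed by label:
# Distinct label values in the label column (1 through 4 here)
unique_labels = df["label_col"].unique()
label_blocks = {}
for label in unique_labels:
    # Boolean mask keeps every row whose label equals the current one
    block_df = df.loc[df["label_col"] == label]
    label_blocks[label] = block_df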
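
Note that the loop above produces one DataFrame per distinct label, so `label_blocks` holds four entries. If the goal is one array per consecutive run of identical labels, as in the question's example, one possible approach is to mark every row where the label changes and group on the running count of those changes. The following is only a minimal sketch under assumptions: the helper name `split_into_label_runs` and the toy DataFrame are made up for illustration, and the label column is assumed to be called `label_col` as in the snippet above.

```python
import numpy as np
import pandas as pd


def split_into_label_runs(df, label_col="label_col"):
    """Return a list of NumPy arrays, one per consecutive run of equal labels.

    Hypothetical helper for illustration; not part of pandas.
    """
    labels = df[label_col]
    # A new run starts wherever the label differs from the previous row;
    # the cumulative sum of those change points gives each run its own id.
    run_id = (labels != labels.shift()).cumsum()
    # sort=False keeps the runs in their original row order.
    return [block.to_numpy() for _, block in df.groupby(run_id, sort=False)]


# Toy data (invented): the labels follow the example given in the question.
toy = pd.DataFrame({
    "feature_a": np.arange(10, dtype=float),
    "label_col": [1, 1, 1, 2, 2, 3, 1, 1, 1, 3],
})
blocks = split_into_label_runs(toy)
print([len(b) for b in blocks])  # [3, 2, 1, 3, 1]
```

Each array keeps the label column alongside the features; selecting only the feature columns before calling `to_numpy()` would drop it if that is preferred.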
 
    
    
        Laggs
        
