I'm working in geopandas with a large number (around 4.5 million) objects, where each has a unique ID number ('PARCEL_SPI') and also another code ('PC_PLANNO').
What I would like to do is write some code that, for each object, finds all other objects with the same PLANNO and adds their ID number as a list in a new attribute, say 'Same_code' for the object. The df is called spine_copy.
Here's a quick sample of what I have:
| PARCEL_SPI | PC_PLANNO | 
|---|---|
| 23908 | LP12345 | 
| 90435 | LP12345 | 
| 329048 | LP90803 | 
| 6409 | LP2399 | 
| 34534 | LP90803 | 
| 092824 | LP12345 | 
and what I want out:
| PARCEL_SPI | PC_PLANNO | Same_code | 
|---|---|---|
| 23908 | LP12345 | [90435, 092824] | 
| 90435 | LP12345 | [23908,092824] | 
| 329048 | LP90803 | 34534 | 
| 6409 | LP2399 | None | 
| 34534 | LP90803 | 329048 | 
| 092824 | LP12345 | [23908, 90435] | 
I'm not too sure how to do this, but here's my attempt using groupby:
spine_copy.groupby('PC_PLANNO')['PARCEL_SPI'].apply(list)
However, this doesn't add the list as a new attribute for each object, and I'm unsure how to do this.
Thanks in advance!