I have a text file that is structured as follows:
P,ABC,DEF
P,GHI,JKL
B,ABC,DEF
B,MNO,PQR
For each line, I want a count of how many lines in the file share the same fields 2 and 3, while preserving field 1. So, the output would look something like this:
2,P,ABC,DEF
1,P,GHI,JKL
2,B,ABC,DEF
1,B,MNO,PQR
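One way to get this output is a two-pass awk over the same file. This is a minimal sketch, not a definitive solution. The function name count_by_fields_2_3 is made up for illustration, and it assumes the file is plain comma-separated with no quoted or embedded commas.

```shell
# Hypothetical helper: prefix each line with the number of lines
# that share its fields 2 and 3. Assumes simple comma-separated input.
count_by_fields_2_3() {
  awk -F, -v OFS=, '
    # First pass over the file: tally each (field 2, field 3) pair.
    NR == FNR { count[$2 FS $3]++; next }
    # Second pass: print the tally, then the original line unchanged.
    { print count[$2 FS $3], $0 }
  ' "$1" "$1"
}
```

Calling it as count_by_fields_2_3 input.txt (input.txt being a placeholder name) reads the file twice, so the original line order and field 1 are both kept.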
uniq -c won't work (as far as I know) because it can't compare on comma-separated fields. sort -u -t, -k2,2 -k3,3 won't work either: it can't count (as far as I know), and the command as written simply drops the third line as a duplicate while keeping the first.
Ultimately, what I need returned are lines 2 and 4, because their combination of fields 2 and 3 appears only once in the file. I also need to preserve field 1, as it identifies which dataset (in the real world) fields 2 and 3 originate from.
Accordingly, output like the following would work as well:
P,GHI,JKL
B,MNO,PQR
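For this second form, the same two-pass idea can be used as a filter instead of a counter. Again, this is only a sketch under the assumption of plain comma-separated input, and unique_by_fields_2_3 is a made-up name.

```shell
# Hypothetical helper: print only the lines whose (field 2, field 3)
# pair occurs exactly once in the file. Assumes simple comma-separated input.
unique_by_fields_2_3() {
  awk -F, '
    # First pass over the file: tally each (field 2, field 3) pair.
    NR == FNR { count[$2 FS $3]++; next }
    # Second pass: a bare condition prints the line when it is true.
    count[$2 FS $3] == 1
  ' "$1" "$1"
}
```

Run as unique_by_fields_2_3 input.txt (placeholder file name), it leaves each surviving line untouched, so field 1 is preserved.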