Suppose I have a dataset that looks like the following:
obs id1 id2
1 a 1
2 b 2
3 c 2
4 d 3
5 e 4
6 b 5
7 f 6
I want to create a unique, transitive id variable for this dataset. Both id1 and id2 identify individuals, so if individual X has the same id1 as individual Y, or the same id2 as individual Y, then X = Y.
So, in this example, the intended output would look like this:
obs id1 id2 uniqid
1 a 1 1
2 b 2 2
3 c 2 2
4 d 3 3
5 e 4 4
6 b 5 2
7 f 6 5
Here, observation 6 has id1 "b", which was already assigned uniqid 2 (by observation 2), and so, observation 6 identifies the same individual as observation 2.
Now, comparing observations 3 and 6, we see that they share neither id1 nor id2 but still identify the same individual, since each identifies the same individual as observation 2.
I am currently working in Stata and am wondering what the best way to do this is. I would prefer a Stata-based solution, but I would also be interested in seeing R or Python solutions.
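
To make the intended logic concrete, here is a minimal Python sketch of what I am after, assuming the problem can be treated as finding connected components with a union-find structure. The function name `transitive_ids` and the `("id1", ...)` / `("id2", ...)` tagging are my own illustration, not an existing library API. Each id1 value and each id2 value is a node, each observation links its two nodes, and components are numbered in order of first appearance:

```python
def transitive_ids(rows):
    """Assign one id per connected group of (id1, id2) pairs.

    rows: list of (id1, id2) tuples, one per observation.
    Returns a list of integer ids, numbered by first appearance.
    """
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk up to the root,
        # shortening the path as we go (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Tag each value with its column so an id1 value can never be
    # confused with an equal-looking id2 value.
    for id1, id2 in rows:
        union(("id1", id1), ("id2", id2))

    labels = {}
    result = []
    for id1, _ in rows:
        root = find(("id1", id1))
        if root not in labels:
            labels[root] = len(labels) + 1
        result.append(labels[root])
    return result


rows = [("a", 1), ("b", 2), ("c", 2), ("d", 3), ("e", 4), ("b", 5), ("f", 6)]
print(transitive_ids(rows))  # [1, 2, 2, 3, 4, 2, 5]
```

This reproduces the uniqid column above for the example data, but I do not know how to express the same idea efficiently in Stata.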