I want to compare and get number of corresposndeces between two arrays in PostgresSQL.
For example: given the table
| id | array |
|---|---|
| 1 | a, b, d |
| 2 | a, e |
| 3 | a, b, d, e |
| 4 | b, c, d |
I would like to obtain a comparison table:
| id_1 | id_2 | number_of_matches |
|---|---|---|
| 1 | 2 | 1 |
| 1 | 3 | 3 |
| 1 | 4 | 2 |
| 2 | 3 | 2 |
| 2 | 4 | 0 |
| 3 | 4 | 2 |
There is no repeated values in each array, and the order of items inside arrays does not matter.
I found here about the overlap operator (&&) but it only returns a boolean if there is or not any correspondence between two arrays, but not the number of correspondences.
CREATE TABLE data (id integer, val text[]);
INSERT INTO data VALUES (1, '{a,b,d}');
INSERT INTO data VALUES (2, '{a,e}');
INSERT INTO data VALUES (3, '{a,b,d,e}');
INSERT INTO data VALUES (4, '{b,c,d}');
SELECT a.id id_1, b.id id_2, a.val && b.val match
FROM data a, data b
WHERE a.id < b.id;
returns
| id_1 | id_2 | match |
|---|---|---|
| 1 | 2 | t |
| 1 | 3 | t |
| 1 | 4 | t |
| 2 | 3 | t |
| 2 | 4 | f |
| 3 | 4 | t |