I have two PySpark DataFrames with different columns. One of them holds the row indexes, as follows:
+------------+--------------+
|     rec_id1|       rec_id2|
+------------+--------------+
|rec-3301-org|rec-3301-dup-0|
|rec-2994-org|rec-2994-dup-0|
|rec-2106-org|rec-2106-dup-0|
|rec-3771-org|rec-3771-dup-0|
|rec-3886-org|rec-3886-dup-0|
| rec-974-org| rec-974-dup-0|
| rec-224-org| rec-224-dup-0|
|rec-1826-org|rec-1826-dup-0|
| rec-331-org| rec-331-dup-0|
|rec-4433-org|rec-4433-dup-0|
+------------+--------------+
The other DataFrame holds the feature columns:
+----------+-------+-------------+------+-----+-------+
|given_name|surname|date_of_birth|suburb|state|address|
+----------+-------+-------------+------+-----+-------+
|         0|    1.0|            1|     1|    1|    1.0|
|         0|    1.0|            0|     1|    1|    1.0|
|         0|    1.0|            1|     1|    1|    0.0|
|         0|    1.0|            1|     1|    1|    1.0|
|         0|    1.0|            1|     1|    1|    1.0|
|         0|    1.0|            1|     1|    1|    1.0|
|         0|    1.0|            1|     1|    1|    1.0|
|         0|    1.0|            0|     1|    1|    1.0|
|         0|    1.0|            1|     1|    1|    1.0|
|         0|    1.0|            1|     0|    1|    1.0|
+----------+-------+-------------+------+-----+-------+
I would like to merge the two PySpark DataFrames into one, so that the new DataFrame looks like this:
                             given_name  surname   ...     state  address
rec_id_1     rec_id_2                              ...                   
rec-3301-org rec-3301-dup-0           0      1.0   ...         1      1.0
rec-2994-org rec-2994-dup-0           0      1.0   ...         1      1.0
rec-2106-org rec-2106-dup-0           0      1.0   ...         1      0.0
Assume both DataFrames have the same number of rows.
 
    