I have multiple Excel sheets of a dataset that I need to align.
Simplifying, it looks like this (this is just a simplification, in fact it has hundreds of columns and rows and there are multiple sheets of data):
 As you can see, in the first sheet every patient has age, however in sheet n° 2 only patients 1, 3 and 4 have data on sex (and also those with sex missing are not present in the sheet). The result I want to have is in the third image so that the patients with sex missing are still reported but with blank. So I want to perform the alignment based on PATIENT NUMBER.
I want to know if there are packages or rapid ways to do it.
As you can see, in the first sheet every patient has age, however in sheet n° 2 only patients 1, 3 and 4 have data on sex (and also those with sex missing are not present in the sheet). The result I want to have is in the third image so that the patients with sex missing are still reported but with blank. So I want to perform the alignment based on PATIENT NUMBER.
I want to know if there are packages or rapid ways to do it.
            Asked
            
        
        
            Active
            
        
            Viewed 33 times
        
    0
            
            
         
    
    
        user19745561
        
- 145
- 10
- 
                    This is called a left join. Please see the examples in the linked duplicate. – Allan Cameron Mar 09 '23 at 20:47
1 Answers
1
            Merge them with a left join:
library(tidyverse)
a <- tibble(
  patient = 1:5,
  age = c(22, 30, 31, 50, 60)
)
b <- tibble(
  patient = c(1, 3, 4),
  sex = c(0, 1, 0)
)
left_join(a, b, join_by(patient))
#> # A tibble: 5 × 3
#>   patient   age   sex
#>     <dbl> <dbl> <dbl>
#> 1       1    22     0
#> 2       2    30    NA
#> 3       3    31     1
#> 4       4    50     0
#> 5       5    60    NA
Created on 2023-03-09 with reprex v2.0.2
 
    
    
        dufei
        
- 2,166
- 1
- 7
- 18