I have an Excel spreadsheet that I would like to change from wide to long format in R. However, the original layout of the excelsheet makes this problematic.
Spreadsheet:
| Date | Unit | Day | Treatment | Value | Unit | Day | Treatment | Value | 
|---|---|---|---|---|---|---|---|---|
| 01-03-2023 | 1 | 1 | treatment1 | 8.5 | 2 | 1 | treatment2 | 9.4 | 
| 01-03-2023 | 1 | 2 | treatment1 | 10.2 | 2 | 2 | treatment2 | 10.5 | 
This goes on for more than 200 different units, i.e. it is a very wide spreadsheet and there are more columns than added in this example
# I would like to change the format to the following:
| Date | Unit | Day | Treatment | Value | 
|---|---|---|---|---|
| 01-03-2023 | 1 | 1 | treatment1 | 8.5 | 
| 01-03-2023 | 2 | 1 | treatment2 | 9.4 | 
| 02-03-2023 | 1 | 2 | treatment1 | 10.2 | 
| 02-03-2023 | 2 | 2 | treatment2 | 10.5 | 
The first problem arises when trying to import the sheet, because R creates unique names for each column by adding numbers sequentially to the names of each column, e.g.
| Date | Unit...2 | day...3 | treatment...4 | value...5 | Unit...6 | day...7 | treatment...8 | value...9 | 
|---|
The second problem is how to obtain the desired long-format table using pivot_longer (or other alternatives), with grouped/repeated data like this?
The original layout of the spreadsheet cannot be modified since many persons use it and add data to it everyday.
I hope to find a solution using the pivot_longer function, since I'm familiar with this - but I will accept all suggestions with graditude
I found some inspiration from other posts, but these all have the "unit identifiers" added to each column name, whereas I have random unique identifiers for each column generated by R, e.g. https://stackoverflow.com/questions/59891956/converting-data-from-wide-to-long-format-when-id-variables-are-encoded-in-column
or
 
     
     
     
    