This is the second time I'm trying to formulate a question here - hopefully this time I'll make my self clear and in line with recommendations on this site. To the problem: I've a dataset on certain companies and their headquarters. The structure of the data comes a bit messy to me (Please see the link below) - even more problematic is that I've the data on 15 separate for the years 2003, 2007, 2011, 2015 and 2019 (three csv files for each year because of the size I guess).
For the purpose of this question I've merged three files into one (for the year 2003).
Now, what I want is this: 1) merge all the 15 files and from there 2) generate a set of variables that would indicate the total number of companies per country and year [note though that the year variable is not included as a variable].
Since I've the data on four main addresses, I'd like to create separate "sum variables" based on the order (1, 2, 3, 4) and, in addition, one variable that doesn't take into account the order of countries.
Just to give an example of how I'd like it to look like:
country year    total_c1    total_c2   ...
USA     2003        100         100
USA     2007        150         120
CAN     2003        50          50
CAN     2007        100         60
I intend to merge this data with a panel data that I have (country-year data).
Please click on the link to access the data. Data sample for 2003. The first variable indicates the ID of companies. The second (country_1) means country of first address. The third (country_2) means country of second address and so on. After that, comes a bunch of variables (over 2800) indicating a single company in the dataset.
Now, what I've come up with in my attempt to do this in R (rather than doing manually). Credit to @Duck in helping me with the merging part.
myfun <- function(df)
{
  #Code
  new <- df %>%
    pivot_longer(starts_with('country')) %>%
    group_by(name) %>%
    summarise_all(sum,na.rm=T)
  return(new)
}
#Load files
myfiles <- list.files(pattern = '.csv')
#List of files
L <- lapply(myfiles, read.csv)
#Apply function
L <- lapply(L,myfun)
# turn to a df
df <- as.data.frame(L)
But this didn't work out for me since I couldn't figure out which year the data come from. Instead I merged the files for one year (for example 2003) and tried to create the variables I want by running this:
  df2<- df %>%
  mutate(Total_c1 = select(., A2654:U9340) %>% rowSums(na.rm = TRUE))
  df3<–df2 %>% group_by(country_1) %>%
  summarise(Total_c1=sum(Total_c1,na.rm = T)
And here I'm stuck. Any suggestion that can take me forward from here (and start from the right side) would be much appreciated!