I'm not sure the title is clear, but I want to shuffle a column in a data frame. Shuffling every individual row is very simple with sample(); what I want instead is to shuffle by pairs of observations from the same sample, so that both rows of a pair always receive the same label.
For instance, I have the following data frame df1:
> df1
sampleID groupID A B C D E F
     438       1 1 0 0 0 0 0
     438       1 0 0 0 0 1 1
     386       1 1 1 1 0 0 0
     386       1 0 0 0 1 0 0
     438       2 1 0 0 0 1 1
     438       2 0 1 1 0 0 0
     582       2 0 0 0 0 0 0
     582       2 1 0 0 0 1 0
     597       1 0 1 0 0 0 1
     597       1 0 0 0 0 0 0
I want to randomly shuffle the groupID labels per sample rather than per observation, so that the result looks something like this:
> df2
sampleID groupID A B C D E F
     438       1 1 0 0 0 0 0
     438       1 0 0 0 0 1 1
     386       2 1 1 1 0 0 0
     386       2 0 0 0 1 0 0
     438       1 1 0 0 0 1 1
     438       1 0 1 1 0 0 0
     582       1 0 0 0 0 0 0
     582       1 1 0 0 0 1 0
     597       2 0 1 0 0 0 1
     597       2 0 0 0 0 0 0
Notice that in column 2 (groupID), sample 386 is now 2 (for both observations).
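To make the target concrete, here is a minimal base R sketch of a pair-level shuffle. It assumes that rows 1-2, 3-4, and so on each form one pair and that both rows of a pair start with the same groupID; columns A to F are left out because the shuffle does not touch them. The names pair_labels and the set.seed() call are only there for illustration.

```r
# Rebuild the two label columns of df1 (A-F omitted for brevity).
df1 <- data.frame(
  sampleID = c(438, 438, 386, 386, 438, 438, 582, 582, 597, 597),
  groupID  = c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1)
)

# Assumption: consecutive rows form pairs, so the row count must be even.
# Note that sampleID alone is not a unique key here (438 occurs in two pairs).
stopifnot(nrow(df1) %% 2 == 0)

# Take one label per pair (the label of the first row of each pair).
pair_labels <- df1$groupID[seq(1, nrow(df1), by = 2)]

set.seed(1)  # only to make the sketch reproducible

# Permute the pair-level labels, then repeat each one for both rows of its pair.
df2 <- df1
df2$groupID <- rep(sample(pair_labels), each = 2)
```

Because the labels are permuted rather than redrawn, the number of pairs in each group stays the same as in df1, which matches the df2 example above (three pairs labelled 1, two labelled 2).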
I have searched around but haven't found anything that works the way I want. What I have now just shuffles the second column observation by observation. I tried to use dplyr as follows:
library(dplyr)

# Attempt: for each sampleID, draw 2 values from the whole groupID column
df2 <- df1 %>%
  group_by(sampleID) %>%
  mutate(groupID = sample(df1$groupID, size = 2))
But of course that only draws 2 values at random from all the group IDs, so the two rows of a pair are not guaranteed to get the same label.
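One possible way to adapt the dplyr attempt, sketched under the assumption that the combination of sampleID and groupID identifies a pair (which holds in the example, where 438 appears once in each group): shuffle the labels in a one-row-per-pair lookup table and join it back. The names lookup and newGroupID are made up for this sketch, and A to F are again omitted; they would simply be carried through the join.

```r
library(dplyr)

df1 <- data.frame(
  sampleID = c(438, 438, 386, 386, 438, 438, 582, 582, 597, 597),
  groupID  = c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1)
)

set.seed(1)  # only to make the sketch reproducible

# One row per pair, with the labels permuted across pairs.
# Assumption: (sampleID, groupID) uniquely identifies a pair.
lookup <- df1 %>%
  distinct(sampleID, groupID) %>%
  mutate(newGroupID = sample(groupID))

# Attach the shuffled label to every row of its pair, then overwrite groupID.
df2 <- df1 %>%
  left_join(lookup, by = c("sampleID", "groupID")) %>%
  mutate(groupID = newGroupID) %>%
  select(-newGroupID)
```

The key difference from the attempt above is that sample() is called once on the pair-level table instead of once per group on the full column, so each pair receives exactly one label.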
Any tips or suggestions would be appreciated!