I conducted a diary study in which participants had to answer a questionnaire twice a day for 5 days.
My criterion was that people had to answer on at least 3 full days out of the 5. That is, out of the 10 occasions on which the questionnaire was administered, they had to answer at least 6 times. Every time they filled in the questionnaire they had to enter a personal code, which is why I can see who answered and how many times.
This is what I ran (Morning_Afternoon_PT_EN is the name of the data frame):
respfreq <- calc.nomiss(Morning_Afternoon_PT_EN$day, tolower(Morning_Afternoon_PT_EN$code), data=Morning_Afternoon_PT_EN)
print(respfreq)
   952345172    alju12    amou79    amou91    baab81 
        0         5        10        10        10        10 
   base85    beju58    cade61    caju21    chno45    crju09 
       10        10        10        10         5         7 
   faap52    fuau48    fude38    fuma07    huju03    leja26 
       10         8         3        10         8        10 
   leju40    lema32    leno81    liab14    liab20    liab50 
       10         9         8         9        10         9 
  liabr14    liag30    liag60   liap520    liau35    lide50 
        1        10         9        10         9         9 
   life10    life74    lija05    lija45    lija78    liju65 
        9         1        10        10         9        10 
   liju94    lima40    lima82    limf96    lioc46    lioc84 
        9        10        10         4        10        10 
   lise50    lise88    maab31    moag91    moap58    pode04 
        9        10        10        10         9         8 
   sade61    saja28    saja79    saoc06    sema72    sema83 
        9        10        10         9        10        10 
   tose37    vima32 
        9         9 
length(respfreq)
[1] 56
So, I see that "952345172", "chno45", "limf96", "liabr14", "life74", and "fude38" do not meet the requirement, and I want to remove them from the overall data frame.
I tried to use subset(), like this:
NewDataFrame<-subset(Morning_Afternoon_PT_EN, respfreq>6)
But I get this error:
NewDataFrame<-subset(Morning_Afternoon_PT_EN, respfreq>6)
Error: Must subset rows with a valid subscript vector.
i Logical subscripts must match the size of the indexed input.
x Input has size 485 but subscript `r` has size 56.
I understand the error, but I don't know how to solve it.
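One direction that might work, sketched on toy data: the mismatch arises because respfreq has one entry per participant (56) while the data frame has one row per response (485), so the counts have to be mapped back to rows through the code. The toy data frame `dat`, the threshold `min_resp`, and the tapply() call standing in for calc.nomiss() are assumptions made only for illustration. The comparison uses >= because the criterion is "at least" a given number of responses.

```r
# Toy stand-in for Morning_Afternoon_PT_EN (hypothetical values)
dat <- data.frame(
  code = c("alju12", "ALJU12", "alju12", "fude38", "fude38", "chno45"),
  day  = c(1, 1, 2, 1, NA, 3),
  stringsAsFactors = FALSE
)

# Lower-case the codes once so counting and filtering use the same key
code_lc <- tolower(dat$code)

# Base-R stand-in for calc.nomiss(): non-missing `day` entries per code
respfreq <- tapply(!is.na(dat$day), code_lc, sum)

# Threshold for the toy data; the real study would use 6
min_resp <- 3

# Codes that meet the criterion (one entry per participant)
keep_codes <- names(respfreq)[respfreq >= min_resp]

# Map back to rows: one TRUE/FALSE per row of the data frame
NewDataFrame <- dat[code_lc %in% keep_codes, ]
print(NewDataFrame)
```

With the real data, the same idea would be names(respfreq)[respfreq >= 6] followed by keeping the rows where tolower(Morning_Afternoon_PT_EN$code) is %in% that set, assuming respfreq is a vector named by code, as the printed output suggests.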
 
    