In my dataset:
# A tibble: 240 x 1,415
   matchcode S001  S002  S002EVS S003  S003A S004  S006  S007  S007_01    S008  S009  S009A S010  S010_01 S010_02 S010_03 S010_04 S011  S012  S013  S013B S014  S015  S016  S017      S017A    
   <fct>     <dbl> <dbl> <dbl+l> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl+lbl>  <dbl> <fct> <fct> <dbl> <dbl+l> <dbl+l> <dbl+l> <dbl+l> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl+lbl> <dbl+lbl>
 1 "JPN 198~ 2     1     -4      392   392   -4     324    324 3920120324 -4    JP    JP     -4   -4      -4      -4      -4        -4  -4    -4    -4    -4    -4    -4    0.6789805 0.6789805
 2 "MEX 198~ 2     1     -4      484   484   -4     933   2130 4840120926 -4    MX    MX     -4   -4      -4      -4      -4        -4  -4    -4    -4    -4    -4    -4    1.1378840 1.1378840
 3 "HUN 198~ 2     1     -4      348   348   -4    1280   4321 3480121280 -4    HU    HU     -4   -4      -4      -4      -4        -4  -4    -4    -4    -4    -4    -4    1.0635516 1.0635516
 4 "AUS 198~ 2     1     -4       36    36   -4     973   5478  360120973 -4    AU    AU     -4   -4      -4      -4      -4        -4  -4    -4    -4    -4    -4    -4    0.9616138 0.9616138
 5 "ARG 198~ 2     1     -4       32    32   -4     874   6607  320120874 -4    AR    AR     -4   -4      -4      -4      -4        -4  -4    -4    -4    -4    -4    -4    0.9266260 0.9266260
 6 "FIN 198~ 2     1     -4      246   246   -4     385   7123 2460120385 -4    FI    FI     -4   -4      -4      -4      -4        -4  -4    -4    -4    -4    -4    -4    1.0000000 1.0000000
 7 "KOR 198~ 2     1     -4      410   410   -4       3   7744 4100120003 -4    KR    KR     -4   -4      -4      -4      -4        -4  -4    -4    -4    -4    -4    -4    1.0000000 1.0000000
 8 "ZAF 198~ 2     1     -4      710   710   -4    5420  10260 7100121549 -4    ZA    ZA     -4   -4      -4      -4      -4        -4  -4    -4    -4    -4    -4    -4    1.0000000 1.0000000
 9 "ARG 199~ 2     2     -4       32    32   -4     856  11163  320240856 -4    AR    AR    125   -4      -4      -4      -4      1210  -4     1    -4    -4    -4    -4    1.0000000 1.0000000
10 "BLR 199~ 2     2     -4      112   112   -4     106  11415 1120240106 -4    BY    BY     -4   -4      -4      -4      -4        -4  -4    -4    -4    -4    -4    -4    1.0000000 1.0000000
to replace all negative values with NA's, I used the following code:
df [ df < 0 ] <- NA
I however only want to have this operation carried out on columns that are not characters (I want to get rid of the error messages, without suppressing them). The variable charcol holds the names of the columns that should be skipped. I tried:
df [-charcol] df [-charcol] < 0] <- NA
Which gave me the error:
Error: cannot allocate vector of size 1.8 Gb
In addition to still giving me the warnings:
In addition: Warning messages:
1: In Ops.factor(left, right) : ‘<’ not meaningful for factors
Although I probably got the syntax wrong, I am wondering what would be the most efficient solution for such problems for large datasets. I have been looking at the data.table vignette for a while, but I cannot really figure out how to do the syntax.
Any suggestions?
str(WVSsample)
Classes ‘data.table’ and 'data.frame':  240 obs. of  1415 variables:
 $ matchcode  : Factor w/ 240 levels "ALB 1998 ","ALB 2002 ",..: 108 134 88 12 4 73 117 232 5 25 ...
 $ S001       :Class 'labelled'  atomic [1:240] 2 2 2 2 2 2 2 2 2 2 ...
  .. ..- attr(*, "label")= chr "Study"
  .. ..- attr(*, "format.stata")= chr "%8.0g"
  .. ..- attr(*, "labels")= Named num [1:7] -5 -4 -3 -2 -1 1 2
  .. .. ..- attr(*, "names")= chr [1:7] "Missing; Unknown" "Not asked in survey" "Not applicable" "No answer" ...
 $ S002       :Class 'labelled'  atomic [1:240] 1 1 1 1 1 1 1 1 2 2 ...
  .. ..- attr(*, "label")= chr "Wave"
  .. ..- attr(*, "format.stata")= chr "%8.0g"
  .. ..- attr(*, "labels")= Named num [1:11] -5 -4 -3 -2 -1 1 2 3 4 5 ...
  .. .. ..- attr(*, "names")= chr [1:11] "Missing; Unknown" "Not asked in survey" "Not applicable" "No answer" ...
 $ S002EVS    :Class 'labelled'  atomic [1:240] -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 ...
  .. ..- attr(*, "label")= chr "EVS-wave"
  .. ..- attr(*, "format.stata")= chr "%8.0g"
  .. ..- attr(*, "labels")= Named num [1:9] -5 -4 -3 -2 -1 1 2 3 4
  .. .. ..- attr(*, "names")= chr [1:9] "Missing; Unknown" "Not asked in survey" "Not applicable" "No answer" ...
 $ S003       :Class 'labelled'  atomic [1:240] 392 484 348 36 32 246 410 710 32 112 ...
  .. ..- attr(*, "label")= chr "Country/region"
  .. ..- attr(*, "format.stata")= chr "%8.0g"
  .. ..- attr(*, "labels")= Named num [1:199] -5 -4 -3 -2 -1 4 8 12 16 20 ...
  .. .. ..- attr(*, "names")= chr [1:199] "Missing; Unknown" "Not asked in survey" "Not applicable" "No answer" ...
 $ S003A      :Class 'labelled'  atomic [1:240] 392 484 348 36 32 246 410 710 32 112 ...
  .. ..- attr(*, "label")= chr "Country/regions [with split ups]"
  .. ..- attr(*, "format.stata")= chr "%8.0g"
  .. ..- attr(*, "labels")= Named num [1:199] -5 -4 -3 -2 -1 4 8 12 16 20 ...
  .. .. ..- attr(*, "names")= chr [1:199] "Missing; Unknown" "Not asked in survey" "Not applicable" "No answer" ...
 $ S004       :Class 'labelled'  atomic [1:240] -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 ...
  .. ..- attr(*, "label")= chr "Set"
  .. ..- attr(*, "format.stata")= chr "%8.0g"
  .. ..- attr(*, "labels")= Named num [1:7] -5 -4 -3 -2 -1 1 2
  .. .. ..- attr(*, "names")= chr [1:7] "Missing; Unknown" "Not asked in survey" "Not applicable" "No answer" ...
EDIT: @chinsoon12 mentioned using the following piece of code:
f_dowle3 = function(DT) {
  for (j in seq_len(ncol(DT)))
    set(DT,which(is.na(DT[[j]])),j,0)
}
This code however does not do two things:
- It replaces NA's with zero, while I want to replace negative values with NA's. I need to change the - which(is.na(DT[[j]]))part to something like- DT[[j]]) < 0.
- It does not account for character columns. 
I changed the code to:
f_dowle3 = function(DT) {
  # or by number (slightly faster than by name) :
  for (j in seq_len(ncol(DT)))
    set(DT,which(DT[[j]]<0),j,NA)
}
But this makes the dataset NULL. Could anyone help me with adapting the code properly?
 
    