I have a data.table with sequences and number of reads, like so:
   sequence   num_reads
1: AACCTGCCG          1
2: CGCGCTCAA         12
3: AGTGTGAGC          3
4: TGGGTACAC         11
5: GGCCGCGTG         15
6: CCTTAAGAG          2
7: GCGGAACTG          9
8: GCGTTGTAG         17
9: GTTGTAGCG         20
10: ACACGTGAC        16
I'd like to use data.table to add two new columns to this table, based on the results of applying dpois() with two weights and two lambdas. The correct output should be this (based on using data.frame):
sequence     num_reads                       clus1                       clus2
1  AACCTGCCG         1 2.553269503552647000377e-03 1.610220613932057849571e-03
2  CGCGCTCAA        12 1.053993989051599418361e-02 2.887608256917401083896e-02
3  AGTGTGAGC         3 2.085170094567994833468e-02 1.717568654860860896672e-02
4  TGGGTACAC        11 1.806846838374168498498e-02 4.331412385376097462508e-02
5  GGCCGCGTG        15 1.324248858039188620275e-03 5.415587646672919558410e-03
6  CCTTAAGAG         2 8.936443262434262332916e-03 6.440882455728230530922e-03
7  GCGGAACTG         9 4.056186780023639942838e-02 7.444615037365168164207e-02
8  GCGTTGTAG        17 2.385595369261770803265e-04 1.274255916864215588610e-03
9  GTTGTAGCG        20 1.196285397159046524451e-05 9.538289904012846548518e-05
10 ACACGTGAC        16 5.793588753921446012421e-04 2.707793823336458478163e-03
But when I try to use data.table I can't seem to get the right result. Here is what I tried (based on similar questions asked around this topic):
    pois = function(n, p, l){return(dpois(as.numeric(as.character(n)), l)*p) }
    x = x[, c(paste("clus", seq(1,2), sep = '')) := pois(num_reads, c(0.4,0.6), c(7,8)), by = seq_len(nrow(x))]
And here is the result:
          sequence num_reads                             clus1                      clus2
    1:   AACCTGCCG         1       2.553269503552647000377e-03 2.553269503552647000377e-03
    2:   CGCGCTCAA        12       1.053993989051599418361e-02 1.053993989051599418361e-02
    3:   AGTGTGAGC         3       2.085170094567994833468e-02 2.085170094567994833468e-02
    4:   TGGGTACAC        11       1.806846838374168498498e-02 1.806846838374168498498e-02
    5:   GGCCGCGTG        15       1.324248858039188620275e-03 1.324248858039188620275e-03
    6:   CCTTAAGAG         2       8.936443262434262332916e-03 8.936443262434262332916e-03
    7:   GCGGAACTG         9       4.056186780023639942838e-02 4.056186780023639942838e-02
    8:   GCGTTGTAG        17       2.385595369261770803265e-04 2.385595369261770803265e-04
    9:   GCGTTGTAG        20       1.196285397159046524451e-05 1.196285397159046524451e-05
    10:  ACACGTGAC        16       5.793588753921446012421e-04 5.793588753921446012421e-04
The reason I'm using data.table and not data.frame is that my real data has 100,000s of rows. I studied the answers to this and this but I haven't been able to come up with a solution.
Any tips you have would be much appreciated. Thanks!
 
     
    