I have an output table in which each row holds multiple isoforms for one gene. The isoforms are separated by commas. When I import the table into R, the data frame looks like this:
Df:
gene isoform                sample1_read_number        p-value
A    'A1','A2','A3'         0:23,1:12,2:122            0.9,0.01,0.5
B    'B1','B2','B3'         0:3,1:45,2:76              0.43,0.001,0.12
C    'C1','C2','C3','C4'    0:5,1:56,2:166,3:7         0.004,0.002,0.23,0.12
D    'D1','D2'              0:43,1:100                 0.1,0.0003
For each gene, there are multiple isoforms. For each isoform, I have a read number, with the values separated by commas (0:23 for A1 means the A1 read number is 23), and a p-value, also separated by commas (the p-value for A1 is 0.9 and for A2 is 0.01). So within every cell, the comma-separated values follow the same isoform order.
For example, when I call df[1,2], the result is [1] 'A1','A2','A3'
and for df[1,4] the result is [1] 0.9,0.01,0.5, returned as a single object. I couldn't figure out how to make R separate those values in df[X,Y].
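To illustrate what "separating" one cell could look like, here is a minimal sketch using base R's strsplit(). The variable `cell` is a hypothetical stand-in for the value of df[1,4], and the sketch assumes the cell holds a plain character string (a factor column would need as.character() first).

```r
# Hypothetical single cell, mirroring the value shown for df[1, 4]
cell <- "0.9,0.01,0.5"

# strsplit() returns a list with one character vector per input string,
# so [[1]] extracts the vector for this single cell
parts <- strsplit(as.character(cell), ",", fixed = TRUE)[[1]]

# Convert to numeric so the values can be compared against a threshold
pvals <- as.numeric(parts)
pvals
```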
The reason I want to do this is that I want to filter the data based on p-value or read number. To do that, I first need to break the data frame up by isoform, which means finding a way to separate the values in each cell.
The final data frame should look like this (only genes A and B are shown here):
Df_I:
gene isoform sample1_read_number  p-value 
A    A1      0:23                 0.9
A    A2      1:12                 0.01
A    A3      2:122                0.5
B    B1      0:3                  0.43
B    B2      1:45                 0.001
B    B3      2:76                 0.12
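One possible way to get from Df to Df_I in base R is sketched below. It is not a definitive solution: the example table is rebuilt by hand, the column name `p_value` is an assumption (R usually renames a header such as "p-value" on import), and `split_col` is a hypothetical helper introduced only for illustration.

```r
# Rebuild the example table by hand; column names are assumptions
df <- data.frame(
  gene = c("A", "B", "C", "D"),
  isoform = c("'A1','A2','A3'", "'B1','B2','B3'",
              "'C1','C2','C3','C4'", "'D1','D2'"),
  sample1_read_number = c("0:23,1:12,2:122", "0:3,1:45,2:76",
                          "0:5,1:56,2:166,3:7", "0:43,1:100"),
  p_value = c("0.9,0.01,0.5", "0.43,0.001,0.12",
              "0.004,0.002,0.23,0.12", "0.1,0.0003"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split every cell of a column on commas.
# The result is a list with one character vector per row.
split_col <- function(x) strsplit(as.character(x), ",", fixed = TRUE)

iso   <- split_col(df$isoform)
reads <- split_col(df$sample1_read_number)
pvals <- split_col(df$p_value)

# Sanity check: every row must have the same number of items in all columns
stopifnot(lengths(iso) == lengths(reads), lengths(iso) == lengths(pvals))

# Repeat each gene once per isoform and flatten the split columns
df_i <- data.frame(
  gene = rep(df$gene, times = lengths(iso)),
  isoform = gsub("'", "", unlist(iso), fixed = TRUE),  # drop the quotes
  sample1_read_number = unlist(reads),
  p_value = as.numeric(unlist(pvals)),
  stringsAsFactors = FALSE
)

# Example filter on p-value (the 0.05 cutoff is arbitrary)
subset(df_i, p_value < 0.05)
```

Once the table has one row per isoform, filtering works with ordinary data frame tools. If the tidyr package is available, its separate_rows() function is a packaged alternative for the same kind of splitting.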
Can anybody give me ideas on how to build this second data frame? Any help would be much appreciated!
Cheers! A
 
     
    