I have the following series of commands:
my_data = read.csv(file='r-stats.out', sep='\t', skip=1)
histsub = subset(my_data, my_data[,10] != "Invalid")
hist(as.numeric(histsub[,10]))
r-stats.out is a tab-separated file with 10 columns. Column 10, the one I am trying to plot, contains either numbers ranging from -2000 to 10000 or the word "Invalid", which I filter out first. For some reason, my histogram only covers the range 0 to 2500 and ignores everything else. Why? What is happening? I ran
print(histsub)
and everything looks okay: the numbers are there in histsub, but not on the plot. Please help.
EDIT: Here are a few lines from the printed output of my_data and of histsub.
my_data:
38    629345  1  633201  0   -41 Invalid    0   g    0     -37
39    633201  0  628727  0  4496     323    0   g    0    4629
40    628727  0  631371  1  7835     202    0   g    0 Invalid
41    631371  1  625871  1  7317     112    0   g    0    7379
42    625871  1  633427  1  1351     348    0   g    0    1321
histsub:
38    629345  1  633201  0  -41 Invalid    0   g    0   -37
39    633201  0  628727  0 4496     323    0   g    0  4629
41    631371  1  625871  1 7317     112    0   g    0  7379
42    625871  1  633427  1 1351     348    0   g    0  1321
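For anyone who wants to reproduce this without the file, here is a minimal sketch with made-up values. The `toy` data frame and its column `v` are hypothetical stand-ins, not my real data. It assumes, which I have not confirmed, that column 10 was read in as a factor because of the "Invalid" entries. In that case `as.numeric()` would return the internal level codes rather than the printed numbers, which might explain a compressed range on the plot.

```r
# Hypothetical stand-in for column 10: numbers mixed with the word "Invalid".
# stringsAsFactors = TRUE mimics what read.csv() did by default before R 4.0.
toy <- data.frame(v = c("-37", "4629", "Invalid", "7379", "1321"),
                  stringsAsFactors = TRUE)

# Same filtering step as in the question.
toy_sub <- subset(toy, v != "Invalid")

# Direct conversion of a factor gives its integer level codes.
codes <- as.numeric(toy_sub$v)

# Going through character first gives the numbers as they are printed.
values <- as.numeric(as.character(toy_sub$v))

print(class(toy_sub$v))  # shows whether the column is a factor
print(codes)
print(values)
```

If `class(histsub[,10])` reports "factor" on the real data, comparing `hist(as.numeric(histsub[,10]))` with `hist(as.numeric(as.character(histsub[,10])))` would show whether this is the cause.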
 
     
     
    