I am struggling to read a large (22 MB) data file in R with read.csv. I know there are several similar questions, but none of the ones I found helped me solve my problem.
My file has 6 columns, all but one of which are integers, and 330,001 rows.
I can open the file in Excel. There are no empty cells anymore, and every column except the first one (id) contains NA values.
When I use the following code:
mt <- read.csv("C:/path/master.csv", header=T, sep=",", encoding="utf-8")
I get only 79,024 rows.
Changing the call to the following (adding quote=""):
mt <- read.csv("C:/Users/slebex/Desktop/PhB Data/master.csv", header=T, sep=",", quote="", encoding="utf-8")
increases the row count to 104,510, but reads all my integer columns as factors (or as characters when I add stringsAsFactors = F).
Additionally, the following code gives the warnings below and loads a character vector of only 26,234 elements:
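One way to check whether stray separators in a text column are splitting rows unevenly is to count the fields on each line with base R's count.fields. The sketch below is only an illustration: it uses a small made-up file in place of master.csv (the rows and the temporary path are assumptions), and it mirrors the quote="" setting used above. On the real file, any line whose count differs from 6 would be worth inspecting.

```r
# Hypothetical stand-in for master.csv: 6 columns, one row with an
# unquoted comma inside the Assignee text.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "id,App_date,Grant_date,Prior_date,Num_inventors,Assignee",
  "1,NA,18370630,NA,1,NA",
  "2,19990826,20010206,19990826,2,Novo Nordisk Biotech, Inc.",
  "3,19941108,20010206,19931108,4,Mcgill University"
), tmp)

# Count comma-separated fields per line, with quoting and comment
# handling switched off so every character is taken literally.
n <- count.fields(tmp, sep = ",", quote = "", comment.char = "")

table(n)        # distribution of field counts across lines
which(n != 6)   # line numbers (header included) that do not have 6 fields
```

Running the same call with the default quote setting and comparing the two results may also show whether unbalanced quotation marks are merging lines.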
mt <- readLines(file("C:/Users/slebex/Desktop/PhB Data/master.csv", encoding="utf-8"))
Warning messages:
1: invalid input found on input connection 'C:/path/master.csv' 
2: In readLines(file("C:/path/master.csv",  :
  incomplete final line found on 'C:/path/master.csv'
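The "invalid input" and "incomplete final line" warnings suggest the reader may be stopping at, or stumbling over, an unexpected byte. A possible check is to read the file through a connection opened in binary mode ("rb") and compare the line count with the expected 330,001. The sketch below uses a tiny made-up file containing a 0x1A byte as a stand-in; whether the real file contains such a byte is an assumption, not something established here.

```r
# Hypothetical stand-in file with an embedded 0x1A (SUB) byte in one field.
tmp <- tempfile(fileext = ".csv")
writeBin(
  charToRaw("id,Assignee\n1,Mcgill\x1a University\n2,Novo Nordisk\n"),
  tmp
)

# Read all lines through a binary-mode connection, so no byte is
# given special meaning by text-mode handling.
con <- file(tmp, open = "rb")
lines <- readLines(con)
close(con)
length(lines)   # number of lines actually read

# Count occurrences of the suspect byte in the raw file contents.
bytes <- readBin(tmp, what = "raw", n = file.info(tmp)$size)
sum(bytes == as.raw(0x1a))
```

If the binary-mode line count on the real file matches the expected row count while the text-mode read does not, that would point at the file contents rather than at the read.csv arguments.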
The data frame looks like this:
head(mt)
  id App_date Grant_date Prior_date Num_inventors Assignee
1  1       NA   18370630         NA             1     <NA>
2  2       NA   18371028         NA             1     <NA>
3  3       NA   18371028         NA             1     <NA>
4  4       NA   18380109         NA             1     <NA>
5  5       NA   18380203         NA             1     <NA>
6  6       NA   18380210         NA             1     <NA>
tail(mt)
         id App_date Grant_date Prior_date Num_inventors
79019 79019 19990826   20010206   19990826             2
79020 79020 19990127   20010206   19920501             2
79021 79021 19980213   20010206   19951002             4
79022 79022 19941108   20010206   19931108             4
79023 79023 19941208   20010206   19901025             1
79024 79024 19980918   20010206   19931214             1
                                                                                                                                 Assignee
79019                                                                                                          Novo Nordisk Biotech, Inc.
79020                                                                                          Trustees Of The University Of Pennsylvania
79021 Mohammad W. Katoot, Katoot, Administrator Karen Robbyn Goodan, Katoot, Administrator Ali Maroof, Katoot, Administrator Ahmed Maroof
79022                                                                                                                   Mcgill University
79023                                                                         The Trustees Of Columbia University In The City Of New York
79024                                                                                                         Centr Embrionalnikh Tkaney 
As you can see, the Assignee variable contains various forms of punctuation. Perhaps this causes the problem, but I'm not sure. I removed all double spaces, changed all commas into semicolons, and removed all quotation marks, but that has not helped.
Following another question, I also tried readr:
library(readr)
mt <- read_csv("C:/Users/slebex/Desktop/PhB Data/master.csv")
This gives me the following warning:
Warning message:
660101 problems parsing 'C:/Users/slebex/Desktop/PhB Data/master.csv'. See problems(...) for more details.
Despite the warning, the dataset is fully loaded, but two of my columns now consist entirely of NA values.
In case it is relevant, here is my sessionInfo():
R version 3.2.0 (2015-04-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_Singapore.1252  LC_CTYPE=English_Singapore.1252   
[3] LC_MONETARY=English_Singapore.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Singapore.1252    
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
loaded via a namespace (and not attached):
[1] tools_3.2.0
Any suggestions would be welcome.
 
    