Some rules of thumb for new R users (According to myself):
If you are working with `data.frame`s, forget there is a function called `apply`. Whatever you do, don't use it, especially with a margin of 1 (`apply` first converts its input to a `matrix`, which is slow and can silently coerce your column types). The only good use case for this function is operating over `matrix` columns, i.e. a margin of 2.
- Some good alternatives: `?do.call`, `?pmax/pmin`, `?max.col`, `?rowSums/rowMeans/etc`, the awesome `matrixStats` package (for matrices), `?rowsum` and many more.
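As a rough illustration of a few of these alternatives, here is a minimal sketch on a made-up, all-numeric `data.frame` (the data and the column names `a`, `b`, `c` are hypothetical, not from any real dataset):

```r
# Hypothetical toy data, for illustration only.
df <- data.frame(a = c(1, 5, 3), b = c(4, 2, 6), c = c(7, 0, 1))

# Row-wise maximum without apply(df, 1, max):
# do.call() passes each column as a separate argument to the vectorized pmax().
row_max <- do.call(pmax, df)

# Row sums and means via the dedicated helpers.
row_sum  <- rowSums(df)
row_mean <- rowMeans(df)

# Name of the column that holds each row's maximum.
which_max <- names(df)[max.col(df, ties.method = "first")]

row_max    # 7 5 6
which_max  # "c" "a" "b"
```

Note that `rowSums()` and `max.col()` still convert to a matrix internally, so they are best kept for data that is numeric throughout.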
`for` loops are not bad - don't listen to anyone who says otherwise. They are bad only in certain cases:
- If you use them to iterate over rows.
- If you are performing an unvectorized/inefficient operation within each iteration.
- If you are writing a loop for something that is already vectorized.
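A small sketch of the last case next to a loop that is perfectly reasonable. The vector `x` and the toy `data.frame` are invented for the example:

```r
x <- c(2, 4, 6, 8)

# A loop for something that is already vectorized: a running total.
total_loop <- numeric(length(x))   # preallocate the result
running <- 0
for (i in seq_along(x)) {
  running <- running + x[i]
  total_loop[i] <- running
}

# The built-in vectorized equivalent.
total_vec <- cumsum(x)

# A loop that is fine: few iterations (one per column, not per row),
# with a vectorized operation inside each iteration.
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))
for (col in names(df)) {
  df[[col]] <- df[[col]] - mean(df[[col]])   # center each column
}
```

Both `total_loop` and `total_vec` hold the same values here; the difference is that `cumsum()` does its looping in compiled code.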
R is a vectorized language, meaning many operations were already written as C loops, so don't reinvent the wheel and write stuff in R loops if it was already written. With one caveat: many of these functions work only with matrices. Hence, if you have a `data.frame`, think twice about whether you want to convert it to a `matrix` (you may experience some unexpected consequences as a result) or whether you can avoid it.
Learn base R before you learn any fancy packages such as `dplyr`. It is a nice package and all, but it was designed for very specific things. Many, many operations can be done much more efficiently using base R.
Get familiar with R classes. Learn what a `factor` is and how to use it. Know the difference between a `matrix` (a vector with a `dim` attribute) and a `data.frame` (a `list` of vectors). Learn how and when to work with `list`s or `array`s. Know the difference between `numeric` and `integer`. Read about floating points.
Learn how and when to use `lapply/sapply/vapply` - these could come in useful many times.
You must learn some `?regex`. Must.
Read `?S4groupGeneric` in order to discover which functions have `data.frame` methods (very useful to know).
Learn about `?methods`.
Read `?strptime` very carefully (note the `Sys.setlocale("LC_TIME", "C")` part - could be a life saver).
Read the damn docs. R has awesome documentation - please use it. You won't find anything even nearly as good in any other language (that I know of).
Like Barry Rowlingson once said: "This is all documented in TFM. Those who WTFM don't want to have to WTFM again on the mailing list. RTFM."
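To make the advice about classes and `vapply` a bit more concrete, here is a minimal sketch of some of the surprises mentioned above. The `data.frame` and its column names are hypothetical:

```r
# Hypothetical mixed-type data.frame.
df <- data.frame(id = 1:3, label = c("x", "y", "z"), stringsAsFactors = FALSE)

# Converting to a matrix forces a single type: everything becomes character.
m <- as.matrix(df)
m_type <- typeof(m)                          # "character"

# integer vs numeric (double).
int_type <- typeof(1L)                       # "integer"
dbl_type <- typeof(1)                        # "double"

# Floating point: compare with a tolerance rather than with ==.
exact_eq <- (0.1 + 0.2 == 0.3)               # FALSE
tol_eq   <- isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE

# factor: as.numeric() returns the internal codes, not the labels.
f <- factor(c("10", "20", "10"))
codes  <- as.numeric(f)                      # 1 2 1
values <- as.numeric(as.character(f))        # 10 20 10

# vapply() is like sapply(), but you declare the type and length of each result.
col_classes <- vapply(df, function(col) class(col)[1], character(1))
```

`vapply()` raises an error if a result does not match the declared template, which makes it a safer default than `sapply()` inside functions.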