Here's a solution using data.table's := operator, building on Andrie and Ramnath's answers.
require(data.table)  # v1.6.6
require(gdata)       # v2.8.2
set.seed(1)
dt1 = create_dt(2e5, 200, 0.1)
dim(dt1)
[1] 200000    200    # more columns than Ramnath's answer which had 5 not 200
f_andrie = function(dt) remove_na(dt)
f_gdata = function(dt, un = 0) gdata::NAToUnknown(dt, un)
f_dowle = function(dt) {     # see EDIT later for more elegant solution
  na.replace = function(v,value=0) { v[is.na(v)] = value; v }
  for (i in names(dt))
    eval(parse(text=paste("dt[,",i,":=na.replace(",i,")]")))
}
system.time(a_gdata = f_gdata(dt1)) 
   user  system elapsed 
 18.805  12.301 134.985 
system.time(a_andrie = f_andrie(dt1))
Error: cannot allocate vector of size 305.2 Mb
Timing stopped at: 14.541 7.764 68.285 
system.time(f_dowle(dt1))
  user  system elapsed 
 7.452   4.144  19.590     # EDIT has faster than this
identical(a_gdata, dt1)   
[1] TRUE
Note that f_dowle updated dt1 by reference. If a local copy is required then an explicit call to the copy function is needed to make a local copy of the whole dataset. data.table's setkey, key<- and := do not copy-on-write.
Next, let's see where f_dowle is spending its time.
Rprof()
f_dowle(dt1)
Rprof(NULL)
summaryRprof()
$by.self
                  self.time self.pct total.time total.pct
"na.replace"           5.10    49.71       6.62     64.52
"[.data.table"         2.48    24.17       9.86     96.10
"is.na"                1.52    14.81       1.52     14.81
"gc"                   0.22     2.14       0.22      2.14
"unique"               0.14     1.36       0.16      1.56
... snip ...
There, I would focus on na.replace and is.na, where there are a few vector copies and vector scans. Those can fairly easily be eliminated by writing a small na.replace C function that updates NA by reference in the vector. That would at least halve the 20 seconds I think. Does such a function exist in any R package?
The reason f_andrie fails may be because it copies the whole of dt1, or creates a logical matrix as big as the whole of dt1, a few times. The other 2 methods work on one column at a time (although I only briefly looked at NAToUnknown).
EDIT (more elegant solution as requested by Ramnath in comments) :
f_dowle2 = function(DT) {
  for (i in names(DT))
    DT[is.na(get(i)), (i):=0]
}
system.time(f_dowle2(dt1))
  user  system elapsed 
 6.468   0.760   7.250   # faster, too
identical(a_gdata, dt1)   
[1] TRUE
I wish I did it that way to start with!
EDIT2 (over 1 year later, now)
There is also set(). This can be faster if there are a lot of column being looped through, as it avoids the (small) overhead of calling [,:=,] in a loop. set is a loopable :=. See ?set.
f_dowle3 = function(DT) {
  # either of the following for loops
  # by name :
  for (j in names(DT))
    set(DT,which(is.na(DT[[j]])),j,0)
  # or by number (slightly faster than by name) :
  for (j in seq_len(ncol(DT)))
    set(DT,which(is.na(DT[[j]])),j,0)
}