How can I use dplyr to filter a data frame so that I keep only the rows that have at least one missing value?
        tyluRp
        
 
        Joni Hoppen
        
 
3 Answers
Using @DJack's sample data, we can do this in dplyr with filter_all. filter_all takes a predicate wrapped in all_vars() or any_vars() and applies it to every column. Here, we keep each row for which is.na() returns TRUE in at least one column.
m <- matrix(1:25, ncol = 5)
m[c(1, 6, 13, 25)] <- NA
df <- data.frame(m)
library(dplyr)
df %>%
  filter_all(any_vars(is.na(.)))
#>   X1 X2 X3 X4 X5
#> 1 NA NA 11 16 21
#> 2  3  8 NA 18 23
#> 3  5 10 15 20 NA
Created on 2018-05-08 by the reprex package (v0.2.0).
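In more recent dplyr releases the scoped verbs such as filter_all are superseded. A minimal sketch of the same filter, assuming dplyr 1.0.4 or later (where if_any() is available) and the same sample data, might look like this; the name rows_with_na is only illustrative:

```r
library(dplyr)

# Same sample data as above
m <- matrix(1:25, ncol = 5)
m[c(1, 6, 13, 25)] <- NA
df <- data.frame(m)

# Keep rows where is.na() is TRUE in at least one column
# (assumes dplyr >= 1.0.4 for if_any())
rows_with_na <- df %>%
  filter(if_any(everything(), is.na))
```

Swapping if_any() for if_all() would instead keep only rows in which every column is missing.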
        Calum You
        
 
- That worked just fine, in a very elegant way. Any hints for these two other situations: removing all columns with missing values, and keeping only columns with missing values? – Joni Hoppen May 08 '18 at 20:31
- Both are done with `select_if`. In `dplyr`, `filter` verbs let you keep rows, and `select` verbs let you keep columns in various ways. – Calum You May 08 '18 at 20:34
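The select_if suggestion in the comment above could be sketched roughly as follows, reusing the sample data from this answer. The names complete_cols and na_cols are made up for illustration:

```r
library(dplyr)

# Same sample data as in the answer
m <- matrix(1:25, ncol = 5)
m[c(1, 6, 13, 25)] <- NA
df <- data.frame(m)

# Remove every column that contains a missing value
complete_cols <- df %>% select_if(~ !any(is.na(.)))

# Keep only the columns that contain a missing value
na_cols <- df %>% select_if(~ any(is.na(.)))
```

With this data, only X4 has no missing values, so complete_cols holds X4 alone and na_cols holds the other four columns. In dplyr 1.0.0 and later, select(where(...)) is the non-superseded way to write the same predicate-based selection.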
 
Here is a base R (non-dplyr) solution:
df[rowSums(is.na(df)) > 0,]
#  X1 X2 X3 X4 X5
#1 NA NA 11 16 21
#3  3  8 NA 18 23
#5  5 10 15 20 NA
Or as suggested by MrFlick:
df[!complete.cases(df),]
Sample data
m <- matrix(1:25, ncol = 5)
m[c(1,6,13,25)] <- NA
df <- data.frame(m)
df
#  X1 X2 X3 X4 X5
#1 NA NA 11 16 21
#2  2  7 12 17 22
#3  3  8 NA 18 23
#4  4  9 14 19 24
#5  5 10 15 20 NA
        DJack
        
 
        I don't know how to solve this with dplyr, but maybe this helps:
First, I created this df:
library(tibble)  # tribble() comes from the tibble package

df <- tribble(
  ~a, ~b, ~c,
   1, NA,  0,
   2,  0,  1,
   3,  1, NA,
   4,  1,  0
)
Then, this will return only rows with NA:
df[!complete.cases(df),]
See more: Subset of rows containing NA (missing) values in a chosen column of a data frame
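For the chosen-column case that the linked question covers, a minimal base R sketch on the same tibble might be the following. Picking column b is an arbitrary choice for illustration, as is the name na_in_b:

```r
library(tibble)

df <- tribble(
  ~a, ~b, ~c,
   1, NA,  0,
   2,  0,  1,
   3,  1, NA,
   4,  1,  0
)

# Keep only the rows where the chosen column `b` is missing
na_in_b <- df[is.na(df$b), ]
```

Unlike complete.cases(), this ignores missing values in the other columns, so the row with NA in c is not returned.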
        Wlademir Ribeiro Prates
        
 
- This is good. I'm trying both approaches and will let you know how it goes. – Joni Hoppen May 08 '18 at 20:24
- I was answering at the same time! But this validates your solution, right? – Wlademir Ribeiro Prates May 08 '18 at 21:03