I try to implement SQL query using sqldf and dplyr.
I need to do this separately using these 2 different libraries.
Unfortunately, I cannot produce the same result using dplyr.
library(sqldf)
library(dplyr)
Id       <- c(1,2,3,4)
HasPet   <- c(0,0,1,1)
Age      <- c(20,1,14,10)
Posts <- data.frame(Id, HasPet, Age)
# sqldf way
ref <- sqldf("
      SELECT Id, HasPet, MAX(Age) AS MaxAge
      FROM Posts
      GROUP BY HasPet
  ")
# dplyr way
res <- Posts %>%
  group_by(HasPet) %>%
  summarize(
    Id,
    HasPet,
    MaxAge = max(Age)
    ) %>%
  select(Id, HasPet, MaxAge)
head(ref)
head(res)
Output for sqldf is:
> head(ref)
  Id HasPet MaxAge
1  1      0     20
2  3      1     14
while the output for sqldf is different:
> head(res)
# A tibble: 4 x 3
# Groups:   HasPet [2]
     Id HasPet MaxAge
  <dbl>  <dbl>  <dbl>
1     1      0     20
2     2      0     20
3     3      1     14
4     4      1     14
UPD. SQL query cannot be modified.
 
     
    