I have a Spark data frame:
library(SparkR); library(magrittr)
as.DataFrame(mtcars) %>%
groupBy("am")
How can I ungroup this data frame? There doesn't seem to be any ungroup function in the SparkR library!
There doesn't seem to be any ungroup function in the SparkR library
That's because groupBy doesn't have the same meaning as group_by in dplyr.
SparkR::group_by / SparkR::groupBy returns not a SparkDataFrame but a GroupedData object, which corresponds to the GROUP BY clause in SQL. To convert it back to a SparkDataFrame, call SparkR::agg (or, if you prefer dplyr nomenclature, SparkR::summarize), which corresponds to the SELECT component of the SQL query.
Once you aggregate, you get back a SparkDataFrame and the grouping is no longer present.
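As a minimal sketch of that round trip, assuming a local Spark installation that sparkR.session() can start (the column name avg_mpg is just an illustrative choice):

```r
library(SparkR)

# Assumption: a local Spark installation is available to SparkR.
sparkR.session()

df <- as.DataFrame(mtcars)

# groupBy returns a GroupedData object, not a SparkDataFrame.
grouped <- groupBy(df, "am")

# agg turns the GroupedData back into a SparkDataFrame, one row per group.
# The name avg_mpg is chosen here for illustration only.
result <- agg(grouped, avg_mpg = avg(df$mpg))

class(grouped)  # inspect the intermediate object
class(result)   # inspect the aggregated result
head(result)
```

Comparing the two class() calls should show the difference between the intermediate grouped object and the aggregated data frame.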
Additionally, SparkR::groupBy has no equivalent of dplyr's group_by(...) %>% mutate(...). Instead, we use window functions with a frame definition.
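A rough sketch of the window-function approach, again assuming a local Spark session is available. It adds a per-group mean next to every row, roughly what group_by(am) %>% mutate(avg_mpg = mean(mpg)) would do in dplyr; the names ws, with_avg and avg_mpg are illustrative:

```r
library(SparkR)

# Assumption: a local Spark installation is available to SparkR.
sparkR.session()

df <- as.DataFrame(mtcars)

# Window partitioned by am. With no ordering specified, the frame
# covers the whole partition, so avg is computed over every row in the group.
ws <- windowPartitionBy("am")

# Add the per-group average as a new column; all original rows are kept.
with_avg <- withColumn(df, "avg_mpg", over(avg(df$mpg), ws))

head(select(with_avg, "am", "mpg", "avg_mpg"))
```

If a narrower frame is needed (for example a running average), the window specification can be extended with orderBy and rowsBetween or rangeBetween.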
So the takeaway is: if you don't plan to aggregate, don't use groupBy.