I would like to format my date variable to %d %b %Y (e.g. 05 May 2020). However, once it has been formatted, it becomes a character variable and sorting the variable from the earliest date to the latest date would not be possible (e.g. 05 May 2020 is sorted before 26 Apr 2020).
Data:
df <- structure(list(Date = structure(c(1588204800, 1587945600, 1588464000, 1588032000,
1588291200, 1588377600, 1588118400), class = c("POSIXct",
"POSIXt"), tzone = "UTC")), class = "data.frame", row.names = c(NA, -7L))
# > df
# Date
# 1 2020-04-30
# 2 2020-04-27
# 3 2020-05-03
# 4 2020-04-28
# 5 2020-05-01
# 6 2020-05-02
# 7 2020-04-29
Here is how it looks like sorting a formatted date variable:
df %>%
mutate(Date = format(Date, "%d %b %Y")) %>%
arrange(Date)
# Date
# 1 01 May 2020
# 2 02 May 2020
# 3 03 May 2020
# 4 27 Apr 2020
# 5 28 Apr 2020
# 6 29 Apr 2020
# 7 30 Apr 2020
So, this is what I have done, which works, but I would like to know if this is really correct or if there are alternatives to solve this.
df %>%
mutate(Date = factor(Date, labels = format(sort(unique(Date)), "%d %b %Y"), ordered = TRUE)) %>%
arrange(Date)
# Date
# 1 27 Apr 2020
# 2 28 Apr 2020
# 3 29 Apr 2020
# 4 30 Apr 2020
# 5 01 May 2020
# 6 02 May 2020
# 7 03 May 2020
Edit: Actually the reason behind wanting to format it and arranging it, is so that I can have direct access to more readable date formats when building my dashboard for my users.
When it comes to ggplot(), even after you do arrange and mutate with format, the facetted plots, will always give in sorted character order. Example below:
df %>%
arrange(Date) %>%
mutate(n = 1:n(),
Date = format(Date, "%d %b %Y")) %>%
ggplot() +
geom_bar(aes(x = n)) +
facet_wrap(~Date)
