I have a dataframe with columns describing company names (id), functions (category), indicators (factor), and the values of these factors. The goal is to draw several boxplots showing the distribution of factor values by function.
Data:
sample_data <- structure(list(id = c("Chee, Chelsea", "Chee, Chelsea", "Chee, Chelsea",
"Chee, Chelsea", "Chee, Chelsea", "Chee, Chelsea", "Chee, Chelsea",
"Chee, Chelsea", "Chee, Chelsea", "Chee, Chelsea", "Hatchett, Dante",
"Hatchett, Dante", "Hatchett, Dante", "Hatchett, Dante", "Hatchett, Dante",
"Hatchett, Dante", "Hatchett, Dante", "Hatchett, Dante", "Hatchett, Dante",
"Hatchett, Dante", "Hagemeier, Wilmer", "Hagemeier, Wilmer",
"Hagemeier, Wilmer", "Hagemeier, Wilmer", "Hagemeier, Wilmer",
"Hagemeier, Wilmer", "Hagemeier, Wilmer", "Hagemeier, Wilmer",
"Hagemeier, Wilmer", "Hagemeier, Wilmer", "el-Jabour, Suhaa",
"el-Jabour, Suhaa", "el-Jabour, Suhaa", "el-Jabour, Suhaa", "el-Jabour, Suhaa",
"el-Jabour, Suhaa", "el-Jabour, Suhaa", "el-Jabour, Suhaa", "el-Jabour, Suhaa",
"el-Jabour, Suhaa", "Salihi, Divya", "Salihi, Divya", "Salihi, Divya",
"Salihi, Divya", "Salihi, Divya", "Salihi, Divya", "Salihi, Divya",
"Salihi, Divya", "Salihi, Divya", "Salihi, Divya", "al-Jamil, Jaad",
"al-Jamil, Jaad", "al-Jamil, Jaad", "al-Jamil, Jaad", "al-Jamil, Jaad",
"al-Jamil, Jaad", "al-Jamil, Jaad", "al-Jamil, Jaad", "al-Jamil, Jaad",
"al-Jamil, Jaad", "Porter, Elijah", "Porter, Elijah", "Porter, Elijah",
"Porter, Elijah", "Porter, Elijah", "Porter, Elijah", "Porter, Elijah",
"Porter, Elijah", "Porter, Elijah", "Porter, Elijah", "Ridgley, Matthew",
"Ridgley, Matthew", "Ridgley, Matthew", "Ridgley, Matthew", "Ridgley, Matthew",
"Ridgley, Matthew", "Ridgley, Matthew", "Ridgley, Matthew", "Ridgley, Matthew",
"Ridgley, Matthew", "Oats, Jiair", "Oats, Jiair", "Oats, Jiair",
"Oats, Jiair", "Oats, Jiair", "Oats, Jiair", "Oats, Jiair", "Oats, Jiair",
"Oats, Jiair", "Oats, Jiair", "Thompson, Asien", "Thompson, Asien",
"Thompson, Asien", "Thompson, Asien", "Thompson, Asien", "Thompson, Asien",
"Thompson, Asien", "Thompson, Asien", "Thompson, Asien", "Thompson, Asien"
), category = c("will", "will", "will", "will", "will", "deal",
"deal", "deal", "deal", "deal", "will", "will", "will", "will",
"will", "deal", "deal", "deal", "deal", "deal", "will", "will",
"will", "will", "will", "deal", "deal", "deal", "deal", "deal",
"will", "will", "will", "will", "will", "deal", "deal", "deal",
"deal", "deal", "will", "will", "will", "will", "will", "deal",
"deal", "deal", "deal", "deal", "will", "will", "will", "will",
"will", "deal", "deal", "deal", "deal", "deal", "will", "will",
"will", "will", "will", "deal", "deal", "deal", "deal", "deal",
"will", "will", "will", "will", "will", "deal", "deal", "deal",
"deal", "deal", "will", "will", "will", "will", "will", "deal",
"deal", "deal", "deal", "deal", "will", "will", "will", "will",
"will", "deal", "deal", "deal", "deal", "deal"), factor = c("f1",
"f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2",
"f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3",
"f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4",
"f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5",
"f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1",
"f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2",
"f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3",
"f4", "f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4",
"f5", "f1", "f2", "f3", "f4", "f5", "f1", "f2", "f3", "f4", "f5"
), value = c(0.339243657473717, 0.384596983617986, 0.0903604942291727,
0.622299975399853, 0.878426613848986, 0.619932561033423, 0.768372484010595,
1.3720186467304, 0.516137222110122, 0.0939216356224454, 0.423330163104718,
1.09092813025095, 1.19417177287019, 0.719465669220584, 0.452970378504298,
-0.262289594598489, 1.22689933746316, 0.816430627598565, 0.225885114542236,
0.632040744287071, 0.104560237280194, 0.381714309901825, 0.62676961473864,
-0.0497874636348734, 0.950027143102881, 0.770846095346556, 0.148980694426281,
0.0441704598142616, 0.490668306336729, 1.02471661138678, 0.156174816905824,
0.31746617387743, 0.156617889567164, 0.0424322867402526, -0.468906139291209,
0.240259904852959, 0.477319222715837, 0.838721253256597, 0.445074674905288,
0.549554109125289, -0.226713556713281, 0.118250559860738, 0.479740692801046,
0.0787136404239509, -0.796681488556265, 0.191482860752725, 0.28786926088113,
0.87763251227066, 0.0338514723682836, 0.235576477670443, -0.0690121807547427,
-0.268401095627916, 0.525430078156439, -0.292741297006626, 0.204765160519623,
0.332993835314161, 0.410545410766758, 0.686637667590553, 0.149842772573679,
0.700177571955539, 0.945997668337351, 0.32488054941514, 0.993151127821943,
0.524358293364559, 0.743356027756573, 0.0247172637782763, 0.205738918048416,
0.922272051144243, 0.264568168014215, 0.800444985485889, 0.0490291076301935,
-0.182296829387635, 0.275266536310165, 0.723462807292679, 1.37681045703127,
0.996572375062412, 0.78567025822639, 0.852269626584109, -0.257367673879751,
0.998810021760118, 0.90491311313343, 1.33803924723801, 1.44241236118906,
1.20343139126242, 0.666758519859951, 1.0151075718858, 0.820298727592033,
1.26452544892297, 0.937448475295236, 0.363135203972494, 0.633056112436769,
0.965685304671053, 0.640992301458128, -0.083835315236123, 1.14088770490309,
0.402326393668432, 0.117951239403618, 0.403472929718899, 1.32109715429833,
0.937023659882023)), class = "data.frame", row.names = c(NA,
-100L))
I am thinking about automating this process. I would like to know how I can:
- filter my dataframe within a function for each category, will and deal;
- make boxplots of the factors within each category.
I tried to write a lambda function but did not understand the indexing or how to filter the abstract dataframe that is passed to the function. Conceptually, I understand that I am supposed to do something like this:
plots_fun <- function(dataframe){
a <- ggplot(data = dataframe[,1], ...)
}
I also thought about using lapply, but my first step is to write the function, which is exactly what I am struggling with.
For my sample data, the desired output is two plots, one for will and one for deal:
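library(dplyr)    # provides %>% and filter()
library(ggplot2)

# Boxplots of value by factor for the "will" category
ggplot(data = sample_data %>% filter(category == "will"),
       aes(factor, value)) +
  geom_boxplot()

# Same plot for the "deal" category
ggplot(data = sample_data %>% filter(category == "deal"),
       aes(factor, value)) +
  geom_boxplot()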
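To make the goal concrete, here is a minimal sketch of the kind of function I have in mind. It is only an illustration under assumptions: `toy_data` is a made-up stand-in with the same columns as my data, and `plot_category` and `category_name` are hypothetical names, not something from an existing package.

```r
library(ggplot2)

# Toy stand-in for the real data (same columns; the values are made up).
set.seed(1)
toy_data <- data.frame(
  id       = rep(paste0("person_", 1:4), each = 10),
  category = rep(rep(c("will", "deal"), each = 5), times = 4),
  factor   = rep(paste0("f", 1:5), times = 8),
  value    = rnorm(40)
)

# Hypothetical helper: keep the rows of one category, then draw one
# boxplot per factor for that subset.
plot_category <- function(dataframe, category_name) {
  subset_df <- dataframe[dataframe$category == category_name, ]
  ggplot(subset_df, aes(x = factor, y = value)) +
    geom_boxplot() +
    ggtitle(category_name)
}

# Build one plot per distinct category and name the list elements.
categories <- unique(toy_data$category)
plots <- lapply(categories, function(category_name) {
  plot_category(toy_data, category_name)
})
names(plots) <- categories

plots$will  # printing a list element draws that plot
```

The filtering happens inside the function with plain logical indexing, so the same function should work for any number of categories without listing them by hand.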