OK, so I've read the question "Confusion between factor levels and factor labels", but I still feel like I am missing a lot. So this is maybe not a question per se - more like a presentation of my frustration.
Sample data
sample <- dput(structure(list(Logistik_1 = structure(c(3L, 2L, 3L, 3L, 3L, 4L), .Label = c("I meget ringe grad", "I ringe grad", "I nogen grad", "I høj grad", "I meget høj grad"), class = "factor"),
Logistik_2 = structure(c(4L, 4L, 4L, 3L, 3L, 4L), .Label = c("I meget ringe grad", "I ringe grad", "I nogen grad", "I høj grad", "I meget høj grad"), class = "factor"),
Logistik_3 = structure(c(3L, 4L, 3L, 4L, 3L, 4L), .Label = c("I meget ringe grad", "I ringe grad", "I nogen grad", "I høj grad", "I meget høj grad"), class = "factor"),
Logistik_4 = structure(c(4L, 2L, 3L, 4L, 2L, 3L), .Label = c("I meget ringe grad", "I ringe grad", "I nogen grad", "I høj grad", "I meget høj grad"), class = "factor")),
.Names = c("Logistik_1","Logistik_2", "Logistik_3", "Logistik_4"), row.names = c(NA, 6L), class = "data.frame"))
The output of sample shows me the labels.
    Logistik_1   Logistik_2   Logistik_3   Logistik_4
1 I nogen grad   I høj grad I nogen grad   I høj grad
2 I ringe grad   I høj grad   I høj grad I ringe grad
3 I nogen grad   I høj grad I nogen grad I nogen grad
4 I nogen grad I nogen grad   I høj grad   I høj grad
5 I nogen grad I nogen grad I nogen grad I ringe grad
6   I høj grad   I høj grad   I høj grad I nogen grad
I cannot do calculations on these nominal data directly. Calling rowSums(sample) gives:
Error in rowSums(sample) : 'x' must be numeric
I can convert each variable to numeric one at a time. E.g. if I want to add all the integer values I can do this: sample$test <- as.numeric(sample[[1]]) + as.numeric(sample[[2]]) + as.numeric(sample[[3]]) + as.numeric(sample[[4]]), which works. But it's a lot of typing, I think?
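The per-column conversion can be written without the repetition. A minimal sketch, assuming every column is a factor with the same level order; `lv`, `df`, `codes` and `total` are illustrative names, and `df` is a made-up stand-in for the first two rows and columns of the data:

```r
# Illustrative stand-in for part of the survey data (hypothetical names)
lv <- c("I meget ringe grad", "I ringe grad", "I nogen grad",
        "I høj grad", "I meget høj grad")
df <- data.frame(
  Logistik_1 = factor(c("I nogen grad", "I ringe grad"), levels = lv),
  Logistik_2 = factor(c("I høj grad", "I høj grad"), levels = lv)
)

# sapply() applies as.integer() to every column and returns an integer matrix
codes <- sapply(df, as.integer)

# Row-wise sum of the integer codes
total <- rowSums(codes)
total
```

Note that sapply() simplifies to a plain vector when the data frame has a single row, so this sketch assumes at least two rows.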
However, if I cbind the columns, the result contains the levels. Output of with(sample, cbind(Logistik_1, Logistik_2)):
     Logistik_1 Logistik_2
[1,]          3          4
[2,]          2          4
[3,]          3          4
[4,]          3          3
[5,]          3          3
[6,]          4          4
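For context on why integers appear here: a factor is stored as an integer vector of codes plus an attribute holding the strings, and cbind() on bare factors builds a plain matrix that keeps the codes and drops the factor class. Note that R's own documentation calls the strings "levels" and the integers "codes". A small sketch with a made-up factor `f` (hypothetical, not taken from the data above):

```r
# A made-up factor with an explicit level order
f <- factor(c("I nogen grad", "I ringe grad", "I høj grad"),
            levels = c("I ringe grad", "I nogen grad", "I høj grad"))

levels(f)      # the strings, in level order
as.integer(f)  # the underlying integer codes: 2 1 3
unclass(f)     # the same codes, with the strings attached as an attribute

# cbind() on factors yields a matrix of codes; the factor class is gone
m <- cbind(f, f)
is.numeric(m)
```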
And I can do calculations on these levels. E.g. if I want to add all the integer values I can do this: sample$total_score <- with(sample, rowSums(cbind(Logistik_1, Logistik_2, Logistik_3, Logistik_4))) [a]
    Logistik_1   Logistik_2   Logistik_3   Logistik_4 total_score
1 I nogen grad   I høj grad I nogen grad   I høj grad          14
2 I ringe grad   I høj grad   I høj grad I ringe grad          12
3 I nogen grad   I høj grad I nogen grad I nogen grad          13
4 I nogen grad I nogen grad   I høj grad   I høj grad          14
5 I nogen grad I nogen grad I nogen grad I ringe grad          11
6   I høj grad   I høj grad   I høj grad I nogen grad          15
But I am confused, and I think I am making something simple too complicated. Is there a canonical 'correct' way to do calculations on factor levels? Is as.numeric more correct than cbind? And why does cbind work like this to begin with?
My hope was that something like this would work: sum(as.numeric(sample[1:4])), but that returns Error: (list) object cannot be coerced to type 'double' (because I am calling as.numeric on a data frame).
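Something close to that one-liner appears to be available through data.matrix(), which converts a data frame to a numeric matrix and replaces factors by their internal codes. A hedged sketch on a made-up two-column data frame; `lv`, `df`, `m`, `row_totals` and `grand_total` are illustrative names:

```r
# Made-up data frame of factors sharing one level order (hypothetical)
lv <- c("I ringe grad", "I nogen grad", "I høj grad")
df <- data.frame(
  a = factor(c("I nogen grad", "I høj grad"), levels = lv),
  b = factor(c("I ringe grad", "I høj grad"), levels = lv)
)

# data.matrix() replaces each factor by its integer codes
m <- data.matrix(df)

row_totals  <- rowSums(m)  # per-respondent score
grand_total <- sum(m)      # sum over all cells
```

With the data in the question, the equivalent calls would be rowSums(data.matrix(sample[1:4])) and sum(data.matrix(sample[1:4])). This only gives meaningful scores when the level order of every factor matches the intended scoring.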
[a] I am aware that most statisticians will frown upon the common practice of assigning integer values to survey responses (e.g. "Highly agree" = 5, "Agree somewhat" = 4, etc.) - but please just accept that's how we do it in the social sciences :-). The labels are the responses in a survey and the levels are the integer values assigned to those responses.