To explain my problem, I have created the following data frame (df):
hh_01 <- c(rep(1:4, each = 3), rep(5:10, each = 5))
vill <- c(rep(100, 12), rep(101, 30))
hh_02 <- c(2:4, 1, 3, 4, 1:2, 4, 1:3, 6:10, 5, 7:10, 5:6, 8:10, 5:7, 9:10, 5:8, 10, 5:9)
set.seed(1); dist <- abs(rnorm(42, mean = 0, sd = 1000))
df <- matrix(c(hh_01, vill, hh_02, dist), nrow = 42, ncol = 4)
colnames(df) <- c("hh_01", "vill", "hh_02", "dist")
df <- as.data.frame(df)
df
   hh_01 vill hh_02       dist
1      1  100     2 1728.39791
2      1  100     3  979.05280
3      1  100     4  972.09301
4      2  100     1  461.72457
5      2  100     3  384.84236
6      2  100     4  523.10665
7      3  100     1  482.88891
8      3  100     2  218.27501
9      3  100     4  878.32424
10     4  100     1   41.75679
11     4  100     2  967.72103
12     4  100     3  661.80881
13     5  101     6  851.74364
14     5  101     7  852.48595
15     5  101     8  471.51824
16     5  101     9  862.90742
17     5  101    10  750.57410
18     6  101     5 1714.03797
19     6  101     7   93.43975
20     6  101     8  640.15912
21     6  101     9  601.66437
22     6  101    10  969.44271
23     7  101     5   77.95871
24     7  101     6  604.71114
25     7  101     8  169.18386
26     7  101     9  435.42663
27     7  101    10  604.22278
28     8  101     5  475.18935
29     8  101     6   13.09895
30     8  101     7 2873.04565
31     8  101     9 1019.03810
32     8  101    10   41.51445
33     9  101     5  914.63453
34     9  101     6   67.62432
35     9  101     7   85.45653
36     9  101     8  971.21044
37     9  101    10 2074.87280
38    10  101     5   98.43913
39    10  101     6  437.63773
40    10  101     7  620.47573
41    10  101     8  376.56226
42    10  101     9 1013.93106
My task: for each distinct value of hh_01, calculate the mean of dist and save the result in a new data frame with the following structure:
hh_01  vill  mean_dist
1      100   1226.515
2      100   .......
I know I have to use a for loop (or maybe alternatively sapply/lapply), but I don't know how to finish this command:
for (i in seq(along=df[,df$hh_01])){
  ifelse(df$hh_01[i] == df$hh_01[i+1])
}
I know these are basics in programming (not just in R), but I'm not a programmer and am pretty new to this area. I would appreciate any help. The simpler the code the better for me (please include a short explanation). I would like to understand this kind of looping (or looping in general) because I will have to work on this type of problem very often in the future. Thank you very much.
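A minimal sketch of two ways the task described above might be approached, shown on the first six rows of the printed data so that it runs on its own. It assumes each hh_01 belongs to exactly one vill, which holds in the sample data. The names ids, res_loop and res_agg are made up for illustration. The loop visits each distinct household once instead of comparing neighbouring rows; aggregate() from base R does the same grouping without an explicit loop.

```r
# First six rows of the data printed above, so the snippet runs on its own.
df <- data.frame(
  hh_01 = c(1, 1, 1, 2, 2, 2),
  vill  = c(100, 100, 100, 100, 100, 100),
  hh_02 = c(2, 3, 4, 1, 3, 4),
  dist  = c(1728.39791, 979.05280, 972.09301, 461.72457, 384.84236, 523.10665)
)

# Loop version: iterate over the distinct households, not over the rows.
ids <- unique(df$hh_01)
res_loop <- data.frame(hh_01 = ids, vill = NA_real_, mean_dist = NA_real_)
for (i in seq_along(ids)) {
  rows <- df$hh_01 == ids[i]                  # TRUE for rows of this household
  res_loop$vill[i]      <- df$vill[rows][1]   # assumes one village per household
  res_loop$mean_dist[i] <- mean(df$dist[rows])
}

# Same grouping without an explicit loop: mean of dist per hh_01/vill pair.
res_agg <- aggregate(dist ~ hh_01 + vill, data = df, FUN = mean)
names(res_agg)[names(res_agg) == "dist"] <- "mean_dist"

res_loop
res_agg
```

For hh_01 equal to 1, both versions give a mean_dist of about 1226.515, which matches the expected value in the question. Running the same code on the full 42-row df should give one row per household.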