I have a dataset containing the number of infants born per gestational week.
I am trying to determine the median gestational age (GA) at delivery for a particular year, based on the number of infants born in each week.
For example:
| GA | num_infants_born |
|---|---|
| 20 weeks | 16 |
| 21 weeks | 22 |
| 22 weeks | 34 |
| 23 weeks | 45 |
| 24 weeks | 60 |
| 25 weeks | 67 |
| 26 weeks | 94 |
and so on, up to 41 weeks. The distribution is (not surprisingly) left-skewed.
I also calculated cumulative frequencies using:

```r
data$cumulative_freq = cumsum(data$num_infants_born)
```
Should I use the `cumulative_freq` column to find the gestational week in which the median birth falls? Calling

```r
median(medianGA2001a$cumulative_freq)
```

gives me an unexpected number.
Based on the distribution, I expect the median GA to be around 35 weeks.
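
To make the target quantity concrete: the median GA is the week containing the middle infant, which is not the same as the median of the cumulative counts themselves. Below is a minimal sketch of two ways this could be computed. It uses only the seven rows shown in the table, so its result says nothing about the full 20 to 41 week data. The `toy` data frame and the numeric `ga_weeks` column are hypothetical names introduced for illustration and are not part of the actual dataset.

```r
# Toy data: only the seven rows shown in the table above (20 to 26 weeks).
# `toy` and `ga_weeks` are illustrative names, not from the real dataset.
toy <- data.frame(
  ga_weeks = 20:26,
  num_infants_born = c(16, 22, 34, 45, 60, 67, 94)
)

# Option 1: expand to one GA value per infant, then take the ordinary median.
expanded_median <- median(rep(toy$ga_weeks, times = toy$num_infants_born))

# Option 2: find the first week whose cumulative count reaches half of all births.
toy$cumulative_freq <- cumsum(toy$num_infants_born)
half_total <- sum(toy$num_infants_born) / 2
cumulative_median <- toy$ga_weeks[which(toy$cumulative_freq >= half_total)[1]]

print(expanded_median)
print(cumulative_median)
```

On these seven rows both options give 24 weeks. The two can differ when the halfway point falls exactly on the boundary between two weeks, because `median()` averages the two middle values in that case.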