In the recent TIMSS report that I happened to come across, there's a plot (shown below) that in my opinion is very communicative. I've read that such plots are called Cleveland dot plots, though this one adds confidence intervals as well. I was wondering if it can be reproduced in ggplot2 or matplotlib. All hints are welcome.

(source: timss2015.org) 
        John Smith
        
- Can you please include data that will provide us with a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? – Ben Bolker Dec 02 '16 at 15:08
- I believe data for plot is [here](http://timss2015.org/wp-content/uploads/filebase/science/1.-student-achievement/1_1_science-distribution-of-science-achievement-grade-4.xls) – John Smith Dec 02 '16 at 16:21

2 Answers
        Using the iris data set:

    library(dplyr)
    library(ggplot2)

    # Per-species summary of Sepal.Length: mean, sd, n and percentiles
    plot_data <- iris %>%
      group_by(Species) %>%
      summarise(
        mean = mean(Sepal.Length),
        sd   = sd(Sepal.Length),
        n    = n(),
        q95  = quantile(Sepal.Length, 0.95, names = FALSE),
        q75  = quantile(Sepal.Length, 0.75, names = FALSE),
        q25  = quantile(Sepal.Length, 0.25, names = FALSE),
        q5   = quantile(Sepal.Length, 0.05, names = FALSE)
      ) %>%
      mutate(se      = sd / sqrt(n),
             left95  = mean - 2 * se,  # approximate 95% CI of the mean
             right95 = mean + 2 * se)

    # Layer three bars per species, widest range first: 5-95%, 25-75%, CI
    ggplot(plot_data, aes(x = Species, y = mean)) +
      geom_crossbar(aes(ymin = q5, ymax = q95), fill = "aquamarine1", color = "aquamarine1", width = 0.2) +
      geom_crossbar(aes(ymin = q25, ymax = q75), fill = "aquamarine4", color = "aquamarine4", width = 0.2) +
      geom_crossbar(aes(ymin = left95, ymax = right95), fill = "black", color = "black", width = 0.2) +
      coord_flip() +
      theme_minimal()

This should give you the gist of how to do this in ggplot2. The data you linked can be used directly, without the dplyr summarising step.
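
For data that already comes summarised, a minimal sketch might look like the following. It assumes a table with one row per group and columns for the mean, its standard error, and the four percentiles. The column names (`group`, `mean`, `se`, `p5`, `p25`, `p75`, `p95`) are hypothetical, and the table is built from iris only so that the example runs on its own; the layout of the linked spreadsheet is not shown in this thread. It uses `geom_segment()` instead of `geom_crossbar()`, which avoids `coord_flip()`, and orders the groups by their mean.

```r
library(ggplot2)

# Stand-in for a pre-summarised table (hypothetical column names),
# built from iris with base R only so the example is self-contained.
x <- split(iris$Sepal.Length, iris$Species)
summary_df <- data.frame(
  group = names(x),
  mean  = sapply(x, mean),
  se    = sapply(x, function(v) sd(v) / sqrt(length(v))),
  p5    = sapply(x, quantile, probs = 0.05, names = FALSE),
  p25   = sapply(x, quantile, probs = 0.25, names = FALSE),
  p75   = sapply(x, quantile, probs = 0.75, names = FALSE),
  p95   = sapply(x, quantile, probs = 0.95, names = FALSE),
  row.names = NULL
)

# Order the groups by their mean so the rows read from low to high
summary_df$group <- reorder(summary_df$group, summary_df$mean)

# Three thick horizontal segments per group, widest range drawn first.
# `linewidth` is the ggplot2 >= 3.4.0 name; older versions use `size`.
p <- ggplot(summary_df, aes(y = group, yend = group)) +
  geom_segment(aes(x = p5, xend = p95), linewidth = 6, colour = "aquamarine1") +
  geom_segment(aes(x = p25, xend = p75), linewidth = 6, colour = "aquamarine4") +
  geom_segment(aes(x = mean - 2 * se, xend = mean + 2 * se),
               linewidth = 6, colour = "black") +
  labs(x = "Sepal.Length", y = NULL) +
  theme_minimal()
p
```

With real data, the `summary_df` construction would be replaced by reading the spreadsheet and renaming its columns to match.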
 
    
    
        Jake Kaupp
        
A Cleveland [edited] dot plot displays every value in a dataset as a point, ordered along the x-axis simply by its position in the dataset (not the averages, as in the other answer). Using ggplot2 (and the iris dataset again as an example):

    library(ggplot2)
    ggplot(iris) + geom_point(aes(x = seq_along(Sepal.Length), y = Sepal.Length))

If you have a unique ID for each row, you can use that instead of x = seq_along(Sepal.Length), since both x and y are required aesthetics for geom_point.
 
    
    
        Simone Bianchi
        
- Pretty sure Cleveland would recommend sorting them in a meaningful way instead of just by row order, but I don't have a reference in front of me to quote from. – Aaron left Stack Overflow Oct 08 '19 at 05:32
- I guess @Aaron is right: https://en.wikipedia.org/wiki/Dot_plot_(statistics)#Cleveland_dot_plots – Simone Bianchi Oct 09 '19 at 13:15
- And for a primary reference, just look at the cover of his [Visualizing Data](https://www.amazon.com/Visualizing-Data-William-S-Cleveland/dp/0963488406) book. – Aaron left Stack Overflow Oct 09 '19 at 14:34
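
The sorting suggested in the comments above can be sketched with `reorder()`, which orders the labels by value rather than by row position. This is only an illustration: the built-in mtcars dataset is an assumption chosen because it has one value per labelled item, and it does not appear elsewhere in the thread.

```r
library(ggplot2)

# One row per car; mtcars stores the car names as row names
cars <- data.frame(car = rownames(mtcars), mpg = mtcars$mpg)

# reorder() turns `car` into a factor whose levels follow mpg,
# so the rows are sorted by value instead of by position in the data
dot_plot <- ggplot(cars, aes(x = mpg, y = reorder(car, mpg))) +
  geom_point() +
  labs(x = "Miles per gallon", y = NULL) +
  theme_minimal()
dot_plot
```

Putting the labels on the y-axis keeps long names readable without rotating text.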

