Consider the following:
Plot a histogram using R and save it in PDF:
    set.seed(42)
    x = c(rnorm(1000, 1, 1), rnorm(1000, 8, 3))
    pdf("Rplot.pdf", width = 10, height = 3.33)
    par(mar = c(4, 5, 0, 0), family = "serif")
    hist(x, breaks = 100, border = NA, col = "gray",
         xlab = "x", ylab = "Frequency", cex.lab = 2.75, cex.axis = 2,
         main = "", las = 1, xaxt = "n")
    axis(side = 1, at = seq(-2.5, by = 2.5, len = 30), cex.axis = 2)
    dev.off()

Plot a histogram using Python and save it in PDF:
    import numpy as np
    import matplotlib.pyplot as plt

    np.random.seed(42)
    x = np.concatenate((np.random.normal(1, 1, size = 1000),
                        np.random.normal(8, 3, size = 1000)))

    plt.close()
    plt.rcParams["figure.figsize"] = (10, 3.33)
    plt.rcParams["font.family"] = "Times New Roman"
    plt.rcParams["axes.spines.bottom"] = True
    plt.rcParams["axes.spines.left"] = True
    plt.rcParams["axes.spines.top"] = False
    plt.rcParams["axes.spines.right"] = False

    tmp = plt.hist(x, bins = 100, color = 'lightgray')
    plt.xlabel('x', fontsize = 30)
    plt.ylabel('Frequency', fontsize = 30)
    tmp = plt.xticks(fontsize = 25)
    tmp = plt.yticks(fontsize = 25)
    plt.tight_layout()
    plt.savefig("pyPlot.pdf", bbox_inches='tight')
Not only is pyPlot.pdf (13 KB) 2.6x the size of Rplot.pdf (5 KB), but when the two are compared in Adobe Reader, the text in pyPlot.pdf is also noticeably blurrier than in Rplot.pdf.
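For anyone who wants to reproduce the size comparison, here is a minimal sketch. `size_ratio` is a hypothetical helper name of mine, and the script assumes both PDFs sit in the current working directory.

```python
import os


def size_ratio(path_a, path_b):
    """Return the size of path_a divided by the size of path_b (bytes)."""
    return os.path.getsize(path_a) / os.path.getsize(path_b)


if __name__ == "__main__":
    # Assumption: both files were written to the current working directory.
    if os.path.exists("pyPlot.pdf") and os.path.exists("Rplot.pdf"):
        ratio = size_ratio("pyPlot.pdf", "Rplot.pdf")
        print(f"pyPlot.pdf is {ratio:.1f}x the size of Rplot.pdf")
```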
Further investigation shows that if we save both plots as .svg instead, the two files are comparable in size and visual quality. pyPlot.pdf also looks like a direct clone of pyPlot.svg in terms of visual quality.
Is it possible to generate the level of visual quality and file size of Rplot.pdf using Matplotlib?
PS: I uploaded the two PDFs here: https://github.com/WhateverLiu/twoImages . Please check the file size and visual quality. Even in Chrome, if you look closely, Rplot.pdf renders smoother labels. But the main problem is that pyPlot.pdf is roughly 2.6x the size of Rplot.pdf, which really gets in the way of my work. Is it simply because R's graphics device performs extra optimization? I don't want to give up on Python yet.
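One direction that may be worth trying, sketched below under assumptions I have not verified against these two files: as far as I understand, R's `pdf()` device references the standard PDF fonts without embedding them, while Matplotlib's PDF backend embeds the glyphs it uses (as Type 3 fonts by default, `pdf.fonttype = 3`). The rcParams `pdf.use14corefonts`, `pdf.compression`, `font.serif` and `axes.unicode_minus` are real Matplotlib settings; `PDF_RC` and `save_hist_pdf` are hypothetical names introduced only for this example.

```python
import os

# Assumed overrides for the PDF backend; not a verified fix for these files.
PDF_RC = {
    "pdf.use14corefonts": True,   # reference the 14 standard PDF fonts, embed nothing
    "pdf.compression": 9,         # highest compression level for PDF streams (0-9)
    "font.family": "serif",
    "font.serif": ["Times"],      # core-font stand-in for Times New Roman
    "axes.unicode_minus": False,  # avoid U+2212, which the core-font path may not render
    "figure.figsize": (10, 3.33),
    "axes.spines.top": False,
    "axes.spines.right": False,
}


def save_hist_pdf(path, rc=PDF_RC):
    """Draw the histogram from the question with the rc overrides applied.

    Returns the size of the written file in bytes.
    """
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(42)
    x = np.concatenate((rng.normal(1, 1, size=1000),
                        rng.normal(8, 3, size=1000)))

    # rc_context restores the previous rcParams when the block exits.
    with plt.rc_context(rc):
        fig, ax = plt.subplots()
        ax.hist(x, bins=100, color="lightgray")
        ax.set_xlabel("x", fontsize=30)
        ax.set_ylabel("Frequency", fontsize=30)
        ax.tick_params(labelsize=25)
        fig.savefig(path, bbox_inches="tight")
        plt.close(fig)
    return os.path.getsize(path)
```

Calling `save_hist_pdf("pyPlot_core.pdf")` writes the file and returns its size, which can then be compared with the original pyPlot.pdf. If sharper text matters more than file size, setting `pdf.fonttype` to 42 (embedded TrueType) instead of using the core fonts is another option to test, probably at the cost of a larger file.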