I have the following demonstration code, in which I create a simple scatter plot and save it in three formats: png, fully vectorized eps, and partly rasterized eps.
For a large number of points I expect the filesize of the vectorized eps to be much bigger than the png (at least at reasonable dpi), and this is indeed what I observe.
When I rasterize the scatter plot, I would expect the filesize to drop back towards that of the png, since I'm practically just "embedding" the png in an eps, right? However, the rasterized version bloats up by a factor of ~20 relative to the vectorized eps:
png: 48K, fully vectorized eps: 184K, rasterized eps: 3.8M (on Linux openSUSE, python 3.4.6, matplotlib 2.2.2)
What's the reason for this? Is my understanding of what happens when one rasterizes the plot completely wrong? When I import the png into Inkscape and export it as eps, I get a file (which is obviously rasterized) that is only marginally larger than the original png.
Demonstration code:
import matplotlib.pyplot as plt
import numpy as np
# Prepare some random data
N = 10000
x = np.random.rand(N)
y = np.random.rand(N)
dpi = 150
# Create a figure and plot some points
fig_mesh = plt.figure()
ax_mesh = fig_mesh.add_subplot(111)
scatter = ax_mesh.scatter(x, y, zorder=0.5)
# Save it as png or unrasterized eps
fig_mesh.savefig('mesh.png', dpi=dpi) # 48K
fig_mesh.savefig('mesh.eps') # 184K
# Save it with rasterized points
ax_mesh.set_rasterization_zorder(1)
fig_mesh.savefig('mesh_rasterized.eps', dpi=dpi, rasterized=True) # 3.8M!
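
In case anyone wants to reproduce the numbers, here is a minimal sketch for printing the sizes of the three output files. The helper names `human_size` and `report_sizes` are made up for illustration, and it assumes the files were written to the current working directory by the code above:

```python
import os


def human_size(num_bytes):
    """Format a byte count in units of 1024 (B, K, M, G)."""
    for unit in ("B", "K", "M", "G"):
        if num_bytes < 1024 or unit == "G":
            return "{:.1f}{}".format(num_bytes, unit)
        num_bytes /= 1024.0


def report_sizes(paths):
    """Print the size of each existing file and return {path: size in bytes}."""
    sizes = {}
    for path in paths:
        if os.path.exists(path):
            sizes[path] = os.path.getsize(path)
            print("{}: {}".format(path, human_size(sizes[path])))
        else:
            print("{}: not found".format(path))
    return sizes


if __name__ == "__main__":
    # File names taken from the demonstration code above
    report_sizes(["mesh.png", "mesh.eps", "mesh_rasterized.eps"])
```

The exact values will vary a little from run to run because the data is random.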
Thanks in advance!