Given an image that has some actual content inside and usually an unwanted white, black, or transparent margin around it, I would like to trim or crop that margin using ImageMagick.

The following image has been drawn digitally on a computer (on the HTML <canvas>):

canvas.png

The following ImageMagick command is what I tried:

$ convert canvas.png -trim +repage canvas_trimmed.png

And it worked perfectly:

canvas_trimmed.png

So this is exactly what I want. But now I want this to work with scanned documents as well, which are not as "perfect" as computer-generated images, i.e. they have more shades of "white" and "black" and no transparency which would be easier to detect. Sometimes, they even have some black bars around the white background of the paper because the area of the scanner is larger than the paper:

scan.jpg

With this image, I tried the following commands in the given order, each trying to be more aggressive, but none yielding any results – you can't see any difference between the original image and the "trimmed" images, i.e. the trimming or cropping does not work at all:

$ convert scan.jpg -trim +repage scan_trimmed.jpg
$ convert scan.jpg -fuzz 10% -trim +repage scan_trimmed.jpg
$ convert scan.jpg -fuzz 60% -trim +repage scan_trimmed.jpg
$ convert scan.jpg -fuzz 60% -bordercolor white -border 1x1 -trim +repage scan_trimmed.jpg
$ convert scan.jpg -fuzz 60% -bordercolor black -border 1x1 -trim +repage scan_trimmed.jpg
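
To see what -trim would keep at each fuzz level without writing any output file, the `%@` format escape can print the trim bounding box. This is only a diagnostic sketch, not something I tried above: `trim_box` is a hypothetical helper name, and it assumes `convert` is on the PATH and scan.jpg is in the current directory.

```shell
# Hypothetical helper: print the box (WxH+X+Y) that -trim would keep.
trim_box() {
    # $1 = image file, $2 = fuzz percentage (e.g. 10)
    convert "$1" -fuzz "$2%" -format '%@' info:
    printf '\n'
}

# Only run when ImageMagick and the example file are actually present.
if command -v convert >/dev/null 2>&1 && [ -f scan.jpg ]; then
    for fuzz in 0 10 60; do
        printf 'fuzz %s%%: ' "$fuzz"
        trim_box scan.jpg "$fuzz"
    done
fi
```

If the reported box equals the full image size for every fuzz value, -trim is finding nothing it considers removable border.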

What am I doing wrong here? How can the ImageMagick command that reliably trims computer-generated images be modified so that it trims scanned documents of the style above just as reliably?

caw
  • 218

3 Answers

6

You can use -shave to cut a fixed margin off every edge of the image, and then run the rest of your processing on the shaved result.

Note: The amount you shave off (the argument after -shave, e.g. 40x40 or 10x10) matters, so test thoroughly to make sure the value works for all of the images in your environment.

Example Logic

@ECHO ON

SET Convert="C:\Program Files\ImageMagick\Convert.exe"
%convert% C:\Folder\Circle.jpg -shave 40x40 C:\Folder\ShavedCircle.jpg
REM The rest of your logic now runs against C:\Folder\ShavedCircle.jpg

Further Resources

  • Shave, removing edges from a image

    The reverse of the "-border" or "-frame" operators, is "-shave", which if given the same arguments, will remove the space added by these commands.

    The main thing to keep in mind about these three operators is that they add and remove space on opposite sides of the images, not just one side, or adjacent sides.

    If you want to only remove one edge of an image, then you will need to use the "-chop" operator instead. (See the Chop Examples below).

    As before all the operators "-border", "-frame", and "-shave", only effect the real image on the virtual canvas and not the virtual canvas itself.

    source
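
To illustrate the difference between removing space on opposite sides (-shave) and on one edge only (-chop), here is a small sketch on a synthetic 200x100 canvas. `shaved_size` is a hypothetical helper that just does the arithmetic; the `convert` calls are skipped if ImageMagick is not installed.

```shell
# Hypothetical helper: size left after "-shave WxH" on an image of the given size.
shaved_size() {
    # $1 = width, $2 = height, $3 = shave width, $4 = shave height
    echo "$(( $1 - 2 * $3 ))x$(( $2 - 2 * $4 ))"
}

shaved_size 200 100 40 10    # prints 120x80

if command -v convert >/dev/null 2>&1; then
    # -shave removes 40 columns from the left AND right, 10 rows from the top AND bottom.
    convert -size 200x100 xc:white -shave 40x10 -format '%wx%h\n' info:
    # -chop with -gravity West removes 40 columns from the left edge only.
    convert -size 200x100 xc:white -gravity West -chop 40x0 -format '%wx%h\n' info:
fi
```

The first `convert` line should report the same size as `shaved_size`, which is a quick way to sanity-check a shave value against your scan dimensions before running it on real files.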

2

Remove Dirt Specks or Noise from Images with ImageMagick

Below is what I used to remove the dirt specks from the image in your question. I first applied -shave 90x90, which you confirmed resolved the problem in the other answer I posted (the one awarded the bounty).

Example Logic

@ECHO ON

SET Convert="C:\Program Files\ImageMagick\Convert.exe"
%convert% C:\Folder\Circle.jpg -shave 90x90 C:\Folder\ShavedCircle.jpg
%convert% C:\Folder\ShavedCircle.jpg -write MPR:source ^
  -morphology close rectangle:3x4 ^
  -morphology erode square    MPR:source -compose Lighten -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
C:\Folder\cleaned.jpg

Due to the nature of the ringing noise, all black noise specks are separated by at least 1 pixel from the letters.

One good approach to remove this noise would be to dilate the image so that at least one "seed" part of each letter remains, then erode these seeds while using the original image as a mask; in effect a flood-fill for each letter.

This way the shape of the letters and other large blobs is preserved perfectly, and smaller blobs disappear.

The biggest dilate that still leaves a part of each letter shape seems to be a 3x4 rectangle for the example data; perhaps use something smaller to be on the safe side.

This command first dilates with that 3x4 rectangle, and then erodes until the letters are all whole again.

Code

convert cleanup.tif -write MPR:source ^
  -morphology close rectangle:3x4 ^
  -morphology erode square    MPR:source -compose Lighten -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  cleaned.png

source
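
The nine identical erode-and-composite lines can also be generated in a loop, which makes the number of passes easy to tune. This is a sketch for a POSIX shell rather than the Windows prompt used above; `ITERATIONS` and `erode_args` are names I made up, and nine passes is simply what the command above uses.

```shell
# Number of erode passes; an assumption taken from the nine lines in the command above.
ITERATIONS=9
erode_args=""
i=0
while [ "$i" -lt "$ITERATIONS" ]; do
    # -compose Lighten is a setting, so repeating it on every pass is redundant but harmless.
    erode_args="$erode_args -morphology erode square MPR:source -compose Lighten -composite"
    i=$((i + 1))
done

if command -v convert >/dev/null 2>&1 && [ -f cleanup.tif ]; then
    # $erode_args is intentionally unquoted so it splits into separate arguments.
    convert cleanup.tif -write MPR:source \
        -morphology close rectangle:3x4 \
        $erode_args \
        cleaned.png
fi
```

Too few passes will leave the letters thinner than in the original, so it is worth comparing the output against the input when changing `ITERATIONS`.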


1

What ultimately provided perfect results, at least for my specific example shown in the original question (scan.jpg), was the following two-step solution:

convert \
    scan.jpg \
    -write MPR:source \
    -morphology close rectangle:3x4 \
    -clip-mask MPR:source \
    -morphology erode:8 square \
    +clip-mask \
    scan_intermediate.jpg

convert scan_intermediate.jpg -shave 40x40 -fuzz 10% -trim +repage scan_final.jpg

This solution is composed of three parts:

  1. The command from my original question
  2. The noise removal shown in this answer
  3. The -shave operator suggested in this answer
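
For more than one scan, the two commands can be wrapped in a small script. This is only a sketch: `final_name` and `trim_scan` are hypothetical helper names, the intermediate file naming is my own choice, and the 40x40 and 10% values are the ones that worked for scan.jpg and may need adjusting for other scans.

```shell
# Hypothetical helper: derive the output name, e.g. scan.jpg -> scan_final.jpg.
final_name() {
    echo "${1%.*}_final.${1##*.}"
}

# Hypothetical wrapper around the two commands above.
trim_scan() {
    src=$1
    tmp="${src%.*}_intermediate.${src##*.}"
    # Step 1: noise removal; step 2 runs only if step 1 succeeded.
    convert "$src" -write MPR:source \
        -morphology close rectangle:3x4 \
        -clip-mask MPR:source \
        -morphology erode:8 square \
        +clip-mask \
        "$tmp" &&
    convert "$tmp" -shave 40x40 -fuzz 10% -trim +repage "$(final_name "$src")"
}

# Process every file given on the command line, skipping anything that is not a file.
if command -v convert >/dev/null 2>&1; then
    for f in "$@"; do
        [ -f "$f" ] || continue
        trim_scan "$f"
    done
fi
```

Saved under a name of your choice, for example trim_scans.sh, it would be called as `sh trim_scans.sh scan1.jpg scan2.jpg`.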
caw
  • 218