I would like to use Tesseract OCR with a video.
With ffmpeg I can export (.jpeg) images from a video. Can I convert a .jpeg into a valid .tiff, or export .tiff images directly from the video with ffmpeg?
Converting to TIFF
You can convert a JPEG to TIFF:
ffmpeg -i input.jpeg -pix_fmt rgba output.tiff
Or from a video:
ffmpeg -i input.mp4 -pix_fmt rgba out%05d.tiff
What's important is specifying an RGB pixel format, here via -pix_fmt rgba (rgb24 also works if you do not need an alpha channel). Keeping the YUV 4:2:0 pixel format from the video (-pix_fmt yuv420p) would produce TIFF files that cannot be opened in most programs (even though the YCbCr* colorspace is allowed by the TIFF specification).
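Since the goal is OCR on a video, here is a rough sketch of how the two steps could be chained. The function name video_to_text, the one-frame-per-second rate (-vf fps=1) and the frame_%05d.tiff naming are my own assumptions for illustration, so adjust them to your material:

```shell
#!/bin/sh
# Sketch only: extract frames as TIFF, then run Tesseract on each frame.
# video_to_text, the fps value and the file names are illustrative assumptions.
video_to_text() {
    video=$1
    outdir=$2
    mkdir -p "$outdir" || return 1

    # fps=1 keeps one frame per second; rgba is the pixel format used above
    ffmpeg -i "$video" -vf fps=1 -pix_fmt rgba "$outdir/frame_%05d.tiff" || return 1

    for img in "$outdir"/frame_*.tiff; do
        [ -e "$img" ] || continue   # the glob matched nothing
        # tesseract appends .txt to the output base name it is given
        tesseract "$img" "${img%.tiff}"
    done
}

# Usage: video_to_text input.mp4 frames
```

Each frame_NNNNN.tiff should end up with a frame_NNNNN.txt next to it. A lower frame rate means fewer images to OCR, which is usually what you want for on-screen text that changes slowly.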
How to compress output
By default, FFmpeg's TIFF encoder compresses with PackBits. You can choose a different compression algorithm using the -compression_algo option:
ffmpeg -i input.jpeg -pix_fmt rgb24 -compression_algo lzw output.tiff
Valid options are packbits, raw, lzw and deflate (see ffmpeg -h encoder=tiff).
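To see how the algorithms compare on your own frames, something like the following could help. It is a sketch: the function name compare_tiff_compression and the out_<algo>.tiff file names are made up for this example, and deflate may fail if your FFmpeg build lacks zlib support:

```shell
#!/bin/sh
# Sketch only: encode one input with each compression algorithm and print
# the resulting file sizes. Function and file names are illustrative.
compare_tiff_compression() {
    input=$1
    for algo in raw packbits lzw deflate; do
        out="out_$algo.tiff"
        # -y overwrites files left over from earlier runs
        ffmpeg -loglevel error -y -i "$input" -pix_fmt rgb24 \
            -compression_algo "$algo" "$out" \
            || { echo "$algo: encoding failed" >&2; continue; }
        printf '%s\t%s bytes\n' "$algo" "$(wc -c < "$out")"
    done
}

# Usage: compare_tiff_compression input.jpeg
```

All four options are lossless, so the choice only affects file size and encoding time, not what Tesseract sees.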
* YCbCr refers to what in video compression is usually known as YUV