LinuxCommandLibrary

gocr

Optical character recognition program

TLDR

OCR an image

$ gocr [image.pbm]
Set recognition mode
$ gocr -m [mode] [image.pbm]
Output to file
$ gocr -o [output.txt] [image.pbm]
Set character filter
$ gocr -C "[a-zA-Z0-9]" [image.pbm]

SYNOPSIS

gocr [options] image

DESCRIPTION

gocr is an optical character recognition (OCR) program that reads an image and outputs the text it recognizes. It reads the PNM family of image formats (PBM, PGM, PPM) and can be restricted to specific character sets.
It is typically used to extract text from scanned documents. Recognized text goes to standard output unless an output file is given, and the image can be read from standard input, so gocr fits into document processing pipelines.
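
As a rough sketch of the pipeline use described above, the function below converts a PNG to PNM and pipes the result into gocr. It assumes that the netpbm tool pngtopnm is installed and that gocr reads standard input when given a single dash as the file name; ocr_png is a hypothetical helper name, not part of gocr.

```shell
# ocr_png (hypothetical helper): OCR a PNG by converting it to PNM first.
# Assumes pngtopnm (netpbm) and gocr are on PATH.
ocr_png() {
    if ! command -v gocr >/dev/null 2>&1; then
        echo "gocr not found" >&2
        return 127
    fi
    # "-" asks gocr to read the image from standard input;
    # recognized text is written to standard output.
    pngtopnm "$1" | gocr -
}

# Usage: ocr_png scan.png > scan.txt
```

Keeping the conversion in the pipe avoids writing an intermediate PNM file to disk.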

PARAMETERS

IMAGE

Image file to process.
-o FILE
Output file.
-m MODE
Operation mode, given as a number that combines mode flags.
-C CHARS
Recognize only the listed characters; ranges such as 0-9 or a-z are allowed.
-i FILE
Input image file; a single dash (-) reads from standard input.
--help
Display help information.
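
The options above can be combined. The sketch below restricts recognition to digits with -C and writes the result to a file with -o; ocr_digits and the file names in the usage line are hypothetical, and gocr is assumed to be on PATH.

```shell
# ocr_digits (hypothetical helper): recognize only digits in an image
# and write the recognized text to an output file.
ocr_digits() {
    image=$1
    outfile=$2
    # -C "0-9" limits recognition to the digit range;
    # -o sends the text to a file instead of standard output.
    gocr -C "0-9" -o "$outfile" "$image"
}

# Usage: ocr_digits meter.pbm reading.txt
```

Restricting the character set this way can help on input such as meter readings or serial numbers, where letters would only be misrecognitions.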

CAVEATS

Accuracy depends heavily on image quality; gocr works best on clean, high-contrast scans. Consider tesseract when better accuracy is needed.

HISTORY

gocr (GOCR/JOCR) is an open-source OCR program developed since the late 1990s.

SEE ALSO
