LinuxCommandLibrary

gocr

Perform optical character recognition (OCR) on images

TLDR

Recognize characters in the [i]nput image and write them to the [o]utput file. Store the database ([p]) in path/to/db_directory (make sure the directory exists, or database usage is silently skipped). [m]ode 130 means create, use, and extend the database
$ gocr -m 130 -p [path/to/db_directory] -i [path/to/input_image.png] -o [path/to/output_file.txt]

Recognize characters and assume all [C]haracters are numbers
$ gocr -m 130 -p [path/to/db_directory] -i [path/to/input_image.png] -o [path/to/output_file.txt] -C "[0..9]"

Recognize characters with a cert[a]inty of 100% (characters have a higher chance to be treated as unknown)
$ gocr -m 130 -p [path/to/db_directory] -i [path/to/input_image.png] -o [path/to/output_file.txt] -a 100
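
The database examples above depend on two details that are easy to get wrong: the database directory must already exist, and the mode value is a sum of bit flags. The sketch below is one way to handle both in a POSIX shell script. The helper names gocr_mode and prepare_gocr_db are hypothetical, and the bit values (2 to use the database, 128 to extend it) and the trailing slash on the -p path are assumptions taken from the upstream man page, so check them against the installed version.

```shell
# Hypothetical helper: combine gocr mode bits into a single -m value.
# Assumed bit values: 2 = use database, 128 = extend database.
gocr_mode() {
    mode=0
    for bit in "$@"; do
        mode=$((mode | bit))
    done
    printf '%s\n' "$mode"
}

# Hypothetical helper: create the database directory if it is missing and
# print its path with exactly one trailing slash.
prepare_gocr_db() {
    db_dir="${1%/}"
    mkdir -p "$db_dir" || return 1
    printf '%s/\n' "$db_dir"
}

# Usage sketch (paths are placeholders):
#   db="$(prepare_gocr_db path/to/db_directory)" &&
#       gocr -m "$(gocr_mode 2 128)" -p "$db" \
#            -i path/to/input_image.png -o path/to/output_file.txt
```

Creating the directory up front avoids the silent fallback described in the first example, and spelling out the bits makes it clearer what 130 stands for.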

SYNOPSIS

gocr [options] [-i] image_file

PARAMETERS

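-h
    Show usage information

-V
    Show version information

-i FILE
    Read the input image from FILE (a single dash reads from stdin)

-o FILE
    Write the recognized text to FILE instead of stdout

-e FILE
    Write errors to FILE instead of stderr (a single dash sends them to stdout)

-x FILE
    Write progress information to FILE (a file name, FIFO name, or file descriptor)

-p PATH
    Database path, including the final slash (default: ./db/)

-f FORMAT
    Output format of the recognized text: ISO8859_1, TeX, HTML, XML, UTF8, or ASCII

-l LEVEL
    Grey-level threshold, 0-255 (default: 0, autodetect)

-d SIZE
    Dust size in pixels; smaller clusters are removed (default: -1, autodetect; 0 removes nothing)

-s NUM
    Space width between words in pixels (default: 0, autodetect)

-v VERBOSITY
    Verbose output to stderr; VERBOSITY is a bitfield

-c STRING
    Limit verbose output to the characters in STRING

-C STRING
    Recognize only the characters in STRING

-a CERTAINTY
    Required recognition certainty, 0-100 (default: 95)

-u STRING
    Output STRING for every unrecognized character (default: "_")

-m MODE
    Operation mode; MODE is a bitfield (default: 0). For example, 2 uses the database and 130 (128 + 2) also extends it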

DESCRIPTION

gocr is a lightweight, open-source optical character recognition (OCR) tool that converts scanned images or bitmaps into editable text. It works best on simple, high-contrast monochrome images: it segments the page into characters, recognizes them with its built-in engine, and can apply context correction to the result.

It reads one image per invocation, so batch jobs are usually scripted as a shell loop. Tunable settings include the grey-level threshold (-l), dust removal (-d), and word spacing (-s), and the recognized text can be written as ASCII, UTF-8, ISO 8859-1, TeX, HTML, or XML (-f). A user-trained character database (-p together with -m 2 or -m 130) can supplement the built-in engine for unusual typefaces. While not as accurate as modern neural-network OCR such as Tesseract, gocr is fast, has few dependencies (NetPBM tools are needed only to convert non-PNM input), and suits command-line scripting or legacy document digitization.

Typical usage passes the input image with -i and an output file with -o, then adjusts the threshold, dust size, or required certainty (-a) when results are poor. The verbosity bitfield (-v) helps when tuning these settings. Best results come from scans preprocessed with tools like unpaper.
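
When tuning, a rough quality signal is how many characters gocr gave up on. A minimal sketch, assuming the placeholder for unrecognized characters is the underscore (the default according to the upstream man page); count_unknown is a hypothetical helper, and genuine underscores in the source text are counted as well.

```shell
# Hypothetical helper: count underscore placeholders in a gocr output file.
count_unknown() {
    # Keep only underscores, count the remaining bytes, strip wc's padding.
    tr -cd '_' < "$1" | wc -c | tr -d '[:space:]'
}

# Usage sketch (file names are placeholders):
#   gocr -a 100 -i page.pnm -o strict.txt
#   gocr -a 90  -i page.pnm -o relaxed.txt
#   echo "strict: $(count_unknown strict.txt)  relaxed: $(count_unknown relaxed.txt)"
```

Comparing the counts across runs with different settings gives a quick, if crude, way to see whether a change helped.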

CAVEATS

Works best on clean black-and-white images and performs poorly on color, noisy, or complex-layout input. Development has been largely inactive for years, and recognition of unusual typefaces is limited. NetPBM is required to read formats other than PNM.

SUPPORTED FORMATS

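Reads PNM (PBM, PGM, PPM) directly. Other formats such as PNG, JPEG, TIFF, GIF, and BMP are converted through external programs like the NetPBM tools when those are installed.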
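
When the conversion is done by hand, the matching NetPBM converter can be chosen from the file extension and its output piped into gocr on stdin. A minimal sketch: pnm_converter is a hypothetical helper, the converter names are the classic NetPBM program names (newer NetPBM releases may provide some of them only as compatibility aliases), and reading stdin through "-i -" is an assumption based on the upstream man page.

```shell
# Hypothetical helper: print the NetPBM program that converts a file to PNM,
# chosen by extension. PNM-family files are passed through with cat.
pnm_converter() {
    case "$1" in
        *.png|*.PNG)                echo pngtopnm ;;
        *.jpg|*.jpeg|*.JPG|*.JPEG)  echo jpegtopnm ;;
        *.tif|*.tiff|*.TIF|*.TIFF)  echo tifftopnm ;;
        *.gif|*.GIF)                echo giftopnm ;;
        *.bmp|*.BMP)                echo bmptopnm ;;
        *.pbm|*.pgm|*.ppm|*.pnm)    echo cat ;;
        *)                          return 1 ;;
    esac
}

# Usage sketch (file names are placeholders):
#   conv="$(pnm_converter scan.png)" &&
#       "$conv" scan.png | gocr -i - -o scan.txt
```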

BUILT-IN FONTS

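gocr does not load font files; recognition relies on its built-in engine. For typefaces it handles poorly, a character database can be built and used with -p and -m 2 or -m 130.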

HISTORY

Started around 2000 by Joerg Schulenburg and hosted on SourceForge under the project name JOCR. Released under the GNU GPL. For most uses it has largely been superseded by Tesseract.

SEE ALSO

ocrad(1), tesseract(1), unpaper(1), netpbm(1)
