gocr

Perform optical character recognition (OCR) on images

TLDR

Recognize characters in the [i]nput image and write the [o]utput to the given file. Store the database in the directory given by [p] (make sure the directory exists, otherwise the database is silently skipped). [m]ode 130 means create, use, and extend the database

$ gocr -m 130 -p [path/to/db_directory] -i [path/to/input_image.png] -o [path/to/output_file.txt]

Recognize characters, restricting the [C]haracter filter to digits

$ gocr -m 130 -p [path/to/db_directory] -i [path/to/input_image.png] -o [path/to/output_file.txt] -C "[0-9]"

Recognize characters with a required cert[a]inty of 100% (more characters are likely to be reported as unknown)

$ gocr -m 130 -p [path/to/db_directory] -i [path/to/input_image.png] -o [path/to/output_file.txt] -a 100
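
Because a missing database directory is skipped silently, a script may want to create it before calling gocr. The following is a minimal sketch for a POSIX shell, not a definitive recipe: `db_dir`, `input`, and `output` are hypothetical names chosen for illustration, and the gocr call is skipped when gocr is not installed or the input image is absent.

```shell
#!/bin/sh
# Hypothetical paths, chosen only for illustration.
# The trailing slash is kept because gocr's usage text asks for one.
db_dir="${TMPDIR:-/tmp}/gocr_db/"
input="scan.png"
output="scan.txt"

# Create the database directory up front so that -p is not silently ignored.
mkdir -p "$db_dir"

# Run gocr only if it is installed and the input image exists.
if command -v gocr >/dev/null 2>&1 && [ -f "$input" ]; then
    gocr -m 130 -p "$db_dir" -i "$input" -o "$output"
fi
```

Creating the directory unconditionally with `mkdir -p` is harmless when it already exists, which keeps the script safe to run repeatedly.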

SYNOPSIS

gocr [options] [-i] image_file

PARAMETERS

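-h, --help
    Display usage information

-i FILE
    Read the input image from FILE (use - for standard input)

-o FILE
    Write the recognized text to FILE instead of standard output

-e FILE
    Write error and log messages to FILE instead of standard error

-x FILE
    Write progress information to FILE (for example a FIFO)

-p PATH
    Database directory, including the final slash (default: ./db/)

-f FORMAT
    Output format: ISO8859_1, TeX, HTML, XML, UTF8, or ASCII

-l LEVEL
    Grey-level threshold, 0 to 255 (0 = autodetect)

-d SIZE
    Dust size; clusters smaller than this are removed (-1 = autodetect)

-s NUM
    Space width between words, in dots (0 = autodetect)

-v NUM
    Verbose output for debugging (see the manual page for the values)

-c STRING
    List of characters to report on in verbose output (debugging)

-C STRING
    Character filter; recognize only these characters (e.g., 0-9A-Fx for hex digits, ASCII only)

-m MODE
    Operation mode, given as a bit pattern (e.g., 130 = 2 + 128, use and extend the database)

-a CERTAINTY
    Required certainty of recognition in percent, 0 to 100 (default: 95)

-u STRING
    Output STRING for every unrecognized character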
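
The -m value is a bit pattern, so a mode such as 130 is the sum of individual flags. The sketch below shows how the value used in the examples above can be composed. The names `USE_DB` and `EXTEND_DB` are hypothetical, and the bit values (2 and 128) are taken from the gocr manual page as an assumption worth verifying with `man gocr`.

```shell
#!/bin/sh
# Bit values as listed in the gocr manual page (assumed; verify locally).
USE_DB=2        # use the character database
EXTEND_DB=128   # extend the database with newly learned characters

# Combine the flags with a bitwise OR.
mode=$(( USE_DB | EXTEND_DB ))

# Print the resulting option rather than running gocr.
echo "gocr -m $mode"
```

Naming the bits makes it easier to see which behaviors a numeric mode switches on when the command appears in a script.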

DESCRIPTION

gocr is a lightweight, open-source optical character recognition (OCR) tool that converts scanned images into text. It works best on simple, high-contrast scans of printed text: it thresholds the image, segments it into characters, recognizes them, and applies post-processing to improve accuracy.

Recognized text can be written in several encodings and markup formats (ISO8859_1, UTF8, ASCII, TeX, HTML, XML). A character database (-p, -m) can be trained for specialized needs. While not as accurate as modern neural-network OCR such as Tesseract, gocr is fast, has few dependencies (it relies on external converters such as the NetPBM tools for non-PNM input), and is well suited to command-line scripting or legacy document digitization.

Tuning usually involves the grey-level threshold (-l), dust removal (-d), space width (-s), and the required certainty (-a). Verbose output (-v) helps when adjusting these parameters. Best results come from scans preprocessed with tools like unpaper.
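
For scripting, several images can be handled by calling gocr once per file. The loop below is a sketch under stated assumptions: `out_name` is a hypothetical helper, the images are assumed to be PNG files in the current directory, and PNG input depends on the external converters gocr uses being installed.

```shell
#!/bin/sh
# out_name is a hypothetical helper: it maps an image path to a .txt path
# by stripping the last extension.
out_name() {
    printf '%s.txt\n' "${1%.*}"
}

# Process every PNG in the current directory, one gocr call per image.
for img in ./*.png; do
    # Skip the unexpanded pattern when no PNG exists.
    [ -f "$img" ] || continue
    # Stop quietly when gocr is not installed.
    command -v gocr >/dev/null 2>&1 || break
    gocr -i "$img" -o "$(out_name "$img")"
done
```

Writing each result next to its source image keeps the mapping between scans and text files obvious.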
