avro-tools
TLDR
Convert JSON to Avro
avro-tools fromjson --schema-file schema.avsc input.json > output.avro
SYNOPSIS
avro-tools command [options] [arguments]
DESCRIPTION
avro-tools is a collection of utilities for working with Apache Avro data files. Avro is a data serialization system that provides rich data structures, a compact binary format, and schema evolution capabilities.
The tools enable conversion between Avro's binary format and human-readable JSON, schema extraction and validation, file manipulation, and code generation. Avro files are self-describing, containing their schema alongside the data, which avro-tools can extract and display.
Common workflows include inspecting Avro files produced by data pipelines, converting data between formats for debugging, generating test data, and compiling schemas to language-specific classes for application development.
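The inspection and conversion workflows above can be sketched as a small round trip. This is a minimal sketch, not a definitive recipe: it assumes a wrapper script named avro-tools is on the PATH (otherwise substitute java -jar with the path to the JAR), and the file names user.avsc, users.json, and users.avro are hypothetical.

```shell
# Hypothetical schema with two fields; any valid .avsc file works here.
cat > user.avsc <<'EOF'
{"type": "record", "name": "User", "fields": [
  {"name": "name", "type": "string"},
  {"name": "age", "type": "int"}
]}
EOF

# Two sample records, one JSON object per line, matching the schema above.
printf '%s\n' \
  '{"name": "Ada", "age": 36}' \
  '{"name": "Linus", "age": 54}' > users.json

if command -v avro-tools >/dev/null 2>&1; then
  # JSON -> Avro: fromjson writes the binary file to stdout.
  avro-tools fromjson --schema-file user.avsc users.json > users.avro

  # Print the schema embedded in the Avro file.
  avro-tools getschema users.avro

  # Avro -> JSON: prints one record per line.
  avro-tools tojson users.avro
else
  echo "avro-tools not found on PATH; skipping conversion" >&2
fi
```

Because the schema travels inside users.avro, the getschema and tojson steps need no separate schema file.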
PARAMETERS
fromjson
Convert JSON to Avro format.
tojson
Convert Avro to JSON format.
getschema
Extract and print the schema from an Avro file.
getmeta
Print metadata from an Avro file.
concat
Concatenate multiple Avro files that share the same schema.
compile
Compile schema to Java source files.
idl
Compile Avro IDL to schema.
random
Generate random data based on a schema.
recodec
Change the codec of an Avro file.
--schema-file file
Path to Avro schema file (.avsc).
--count n
Number of records to generate (for random).
--codec name
Compression codec (null, deflate, snappy, bzip2, zstandard, xz).
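As an illustration of how --schema-file, --count, and --codec combine, the following sketch generates test data and then rewrites it with a different codec. It assumes an avro-tools wrapper on the PATH; the schema and the file names event.avsc, events.avro, and events-deflate.avro are hypothetical.

```shell
# Hypothetical schema used only to drive the random generator.
cat > event.avsc <<'EOF'
{"type": "record", "name": "Event", "fields": [
  {"name": "id", "type": "long"},
  {"name": "label", "type": "string"}
]}
EOF

if command -v avro-tools >/dev/null 2>&1; then
  # Generate 100 random records, compressed with snappy.
  avro-tools random --schema-file event.avsc --count 100 --codec snappy events.avro

  # Rewrite the same records with the deflate codec.
  avro-tools recodec --codec deflate events.avro events-deflate.avro

  # Print the file metadata to check the result.
  avro-tools getmeta events-deflate.avro
else
  echo "avro-tools not found on PATH; skipping generation" >&2
fi
```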
CAVEATS
JSON to Avro conversion requires a schema, and the JSON structure must match it. Large files may require more JVM heap space (the -Xmx option to java). concat checks the schemas of its input files and fails if they differ. The tool is distributed as a Java JAR file and requires a JRE.
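When running the JAR directly, the heap limit goes on the java command line, before -jar. A minimal sketch follows, assuming the JAR path is supplied in AVRO_TOOLS_JAR. The variable names, the run_avro_tools helper, the 4g default, and the file large.avro are all hypothetical.

```shell
# Hypothetical defaults; point AVRO_TOOLS_JAR at the downloaded JAR.
AVRO_TOOLS_JAR="${AVRO_TOOLS_JAR:-avro-tools.jar}"
AVRO_HEAP="${AVRO_HEAP:-4g}"

# Helper that forwards its arguments to avro-tools with a larger heap.
run_avro_tools() {
  java "-Xmx${AVRO_HEAP}" -jar "$AVRO_TOOLS_JAR" "$@"
}

if [ -f "$AVRO_TOOLS_JAR" ] && [ -f large.avro ]; then
  run_avro_tools tojson large.avro > large.json
else
  echo "JAR or input file not found; skipping" >&2
fi
```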
HISTORY
Apache Avro was created by Doug Cutting (also creator of Hadoop and Lucene) and first released in 2009. It was developed to address limitations of existing serialization frameworks when working with big data systems like Hadoop. Avro became a top-level Apache project in 2010 and is widely used in data engineering, particularly with Kafka, Spark, and Hadoop ecosystems.


