LinuxCommandLibrary

mongoimport

Import data into MongoDB

TLDR

Import a JSON file into a specific collection

$ mongoimport --file [path/to/file.json] --uri [mongodb_uri] [[-c|--collection]] [collection_name]

Import a CSV file, using the first line of the file to determine field names
$ mongoimport --type [csv] --headerline --file [path/to/file.csv] [[-d|--db]] [database_name] [[-c|--collection]] [collection_name]

Import a JSON array, using each element as a separate document
$ mongoimport --jsonArray --file [path/to/file.json]

Import a JSON file using a specific mode, matching existing documents on the given fields
$ mongoimport --file [path/to/file.json] --mode [delete|merge|upsert] --upsertFields "[field1,field2,...]"

Import a CSV file, reading field names from a separate file (one name per line) and ignoring fields with empty values
$ mongoimport --type [csv] --file [path/to/file.csv] --fieldFile [path/to/field_file.csv] --ignoreBlanks

Display help
$ mongoimport --help

SYNOPSIS

mongoimport [options]

PARAMETERS

--host
    Specifies the host and port of the MongoDB server to connect to. Defaults to localhost:27017.

--port
    Specifies the port to connect to. Defaults to 27017.

--db
    Specifies the database to import into.

--collection
    Specifies the collection to import into.

--file
    Specifies the input file containing the data.

--type
    Specifies the type of the input file: json, csv, or tsv. Defaults to json.

--headerline
    Uses the first line in a CSV or TSV file as the field names.

--fields
    Specifies the field names for CSV or TSV files, separated by commas.

--jsonArray
    Accepts input that consists of a single JSON array and imports each element as a separate document.

--upsert
    Replaces a document that already exists in the collection and inserts it otherwise. Deprecated in favor of --mode upsert.

--upsertFields
    Specifies a comma-separated list of fields used to match existing documents when upserting. Defaults to _id.

--ignoreBlanks
    Ignores fields with empty values in CSV and TSV files instead of importing them as empty fields.

--authenticationDatabase
    Specifies the database to use for authentication.

--username
    Specifies the username to authenticate with.

--password
    Specifies the password to authenticate with.
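
As an illustration of how the connection, target, and format options above combine, the following sketch writes a small headerless TSV file and imports it with explicit field names. The database name "inventory", the collection name "items", and the sample rows are hypothetical; a server on localhost:27017 without authentication is assumed.

```shell
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
tsv="$workdir/items.tsv"

# Headerless TSV: the two columns are "name" and "qty".
printf 'alpha\t1\nbeta\t2\n' > "$tsv"

# Only attempt the import when the tool is installed.
if command -v mongoimport >/dev/null 2>&1; then
    # Field names come from --fields because the file has no header line.
    mongoimport --host localhost --port 27017 \
        --db inventory --collection items \
        --type tsv --fields name,qty \
        --file "$tsv"
else
    echo "mongoimport not found; would import $tsv"
fi
```

If the file did carry a header row, --headerline would replace --fields.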

DESCRIPTION

mongoimport is a command-line tool that imports data from Extended JSON, CSV, or TSV files into a MongoDB collection. You specify the target database and collection, the input format, and, for CSV/TSV input, where the field names come from. It is useful for migrating data, loading initial datasets, or reloading data exported with mongoexport. The input must be in one of the formats mongoimport understands. Documents can be inserted as new or upserted against existing ones.
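
A minimal sketch of the CSV case described above: it creates a three-line CSV file whose first line holds the field names, then imports it. The names "shop" and "products" and the sample rows are made up for illustration, and a local server on the default port is assumed.

```shell
workdir=$(mktemp -d)
csv="$workdir/products.csv"

# The first line supplies the field names used with --headerline.
cat > "$csv" <<'EOF'
sku,name,price
A100,Widget,9.99
A200,Gadget,19.50
EOF

if command -v mongoimport >/dev/null 2>&1; then
    # Each remaining line becomes one document with fields sku, name, price.
    mongoimport --db shop --collection products \
        --type csv --headerline --file "$csv"
else
    echo "mongoimport not found; would import $csv"
fi
```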

It can be significantly faster than inserting documents one at a time from application code. By default, errors are handled per document, so the import continues even if some documents fail validation or cannot be inserted.

With --mode upsert, existing documents that match on the fields given by --upsertFields are replaced by the imported documents, and all other documents are inserted. This mode is useful for replicating data or for updating documents from an external source of truth.
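
The upsert behaviour can be sketched as follows, assuming a hypothetical "users" collection in a database called "app" whose documents are identified by an "email" field. The file holds one document per line, which is the default JSON layout when --jsonArray is not given.

```shell
workdir=$(mktemp -d)
updates="$workdir/users.json"

# Newline-delimited JSON: one document per line.
cat > "$updates" <<'EOF'
{"email": "ada@example.com", "name": "Ada", "plan": "pro"}
{"email": "lin@example.com", "name": "Lin", "plan": "free"}
EOF

if command -v mongoimport >/dev/null 2>&1; then
    # A document whose "email" matches an existing one takes its place;
    # documents with no match are inserted.
    mongoimport --db app --collection users \
        --mode upsert --upsertFields email \
        --file "$updates"
else
    echo "mongoimport not found; would upsert from $updates"
fi
```

Running the same command twice should leave the collection unchanged the second time, since every document then matches on "email".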

CAVEATS

Importing large files can consume significant memory and time. For very large datasets, consider splitting the data into smaller chunks or distributing the target collection across shards. Validate the data after import to confirm it arrived as expected, particularly for CSV or TSV input, where field types are not explicit in the file.
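
One way to split an import into chunks, sketched under the assumption that the input is newline-delimited JSON (one document per line), so cutting on line boundaries never breaks a document. A file in --jsonArray form cannot be split this way. The ten-document file stands in for a large export, and "logs" and "events" are hypothetical names.

```shell
workdir=$(mktemp -d)
big="$workdir/events.json"

# Stand-in for a large export: 10 one-line documents.
i=1
while [ "$i" -le 10 ]; do
    printf '{"seq": %d}\n' "$i"
    i=$((i + 1))
done > "$big"

# Cut into pieces of at most 4 lines: chunk_aa, chunk_ab, chunk_ac.
split -l 4 "$big" "$workdir/chunk_"

# Import the pieces one after another.
for chunk in "$workdir"/chunk_*; do
    if command -v mongoimport >/dev/null 2>&1; then
        mongoimport --db logs --collection events --file "$chunk"
    else
        echo "mongoimport not found; would import $chunk"
    fi
done
```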

PERFORMANCE CONSIDERATIONS

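To improve import performance, consider the following:
1. Lower the write concern with --writeConcern (the default is majority), accepting weaker durability guarantees during the load.
2. Disable journaling during the import process (use with caution; not possible on recent server versions, where journaling is always enabled).
3. Pre-split the collection across shards if you're using a sharded cluster.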
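
A hedged sketch of a tuned invocation. It relaxes the write concern to {w: 1}, on the assumption that waiting for fewer acknowledgements speeds up the load, and it also uses --numInsertionWorkers, a mongoimport option not listed above, to insert with several workers in parallel. The input path, the names "app" and "events", and the choice of one worker per CPU are assumptions, not recommendations from the tool's authors.

```shell
# Hypothetical input file; pass a different path as the first argument.
input=${1:-data.json}

# One worker per online CPU, falling back to 1 if the count is unavailable.
workers=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
case "$workers" in
    ''|*[!0-9]*) workers=1 ;;
esac

if command -v mongoimport >/dev/null 2>&1 && [ -f "$input" ]; then
    # {w: 1} waits only for the primary to acknowledge each write.
    mongoimport --db app --collection events --file "$input" \
        --numInsertionWorkers "$workers" \
        --writeConcern '{w: 1}'
else
    echo "would run mongoimport on $input with $workers insertion worker(s)"
fi
```

Whether extra workers help depends on the server and the data, so it is worth timing a sample import before committing to a setting.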

ERROR HANDLING

mongoimport reports failures as it runs, but you should still monitor its output during and after the import to identify and address any issues. Use --verbose to log more detail about any errors encountered.
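
The monitoring advice above might look like this in a script. The sketch captures the tool's diagnostic output (assumed here to go to standard error) in a log file, adds --stopOnError so the run halts at the first failing document, and checks the exit status. The helper report_import, the file users.json, and the names "app" and "users" are hypothetical.

```shell
log="${TMPDIR:-/tmp}/mongoimport.$$.log"

# Print a summary for the given exit status; on failure, show the end of
# the captured log. Returns the status it was given.
report_import() {
    rc=$1
    logfile=$2
    if [ "$rc" -eq 0 ]; then
        echo "import finished without a fatal error"
    else
        echo "import failed with exit status $rc; last log lines:" >&2
        tail -n 20 "$logfile" >&2
    fi
    return "$rc"
}

if command -v mongoimport >/dev/null 2>&1; then
    mongoimport --db app --collection users --file users.json \
        --stopOnError --verbose 2> "$log"
    report_import $? "$log"
else
    echo "mongoimport not found; skipping import"
fi
```

Dropping --stopOnError restores the default behaviour of continuing past documents that fail, in which case the log is the place to look for what was skipped.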

HISTORY

mongoimport was developed as part of the MongoDB tool suite. It provides a straightforward way to load data from common formats into a MongoDB instance. It has been improved over time to support various data types, authentication methods, and error handling. Its usage has grown with the popularity of MongoDB as a NoSQL database.

SEE ALSO
