LinuxCommandLibrary

dvc-commit

Save changes to tracked data and pipelines

TLDR

Commit changes to all DVC-tracked files and directories

$ dvc commit

Commit changes to a specified DVC-tracked target
$ dvc commit [target]

Recursively commit all DVC-tracked files in a directory
$ dvc commit [-R|--recursive] [path/to/directory]

SYNOPSIS

dvc commit [-h] [-q | -v] [-f] [-d] [-R] [targets ...]

PARAMETERS

-h, --help
    Show this help message and exit.

-q, --quiet
    Suppress all output. Useful in scripts.

-v, --verbose
    Increase verbosity. Useful for debugging.

-f, --force
    Commit without prompting for confirmation.

-d, --with-deps
    Also commit the dependencies of the specified targets. Has an effect only when targets are given.

-R, --recursive
    Commit all DVC-files found in the specified directory and its subdirectories.

targets ...
    Paths to DVC-files or data files/directories tracked by DVC that should be committed. Multiple targets can be specified. With no targets, all DVC-tracked files and directories are committed.

DESCRIPTION

The `dvc commit` command records changes made to data that is already tracked by DVC (Data Version Control). It updates the corresponding DVC-file (a small text file that describes the data's hash, location, and dependencies) and saves the current contents of the data to the DVC cache.

The command is loosely analogous to `git commit`: the data itself goes into the DVC cache, while only the small DVC-file with metadata about the data is versioned in Git. This lets DVC track changes in large datasets without adding the data to the Git repository.

When you modify tracked files or directories, run `dvc commit` to update the corresponding DVC-file and register the changes with DVC. The DVC-files can then be versioned with Git, so the workspace can be restored to a previous state.

CAVEATS

The `dvc commit` command only works on files and directories that are already tracked by DVC (i.e., have associated DVC-files). Use `dvc add` or `dvc stage add` to start tracking new data.
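
The distinction above can be sketched as a small helper that suggests which command to run for a given path. This is only an illustration: the function name `suggest_dvc_command` and the path `data/raw.csv` are hypothetical, and it assumes data tracked with `dvc add` has a DVC-file named `<path>.dvc` next to it. Outputs of pipeline stages are not covered by this check.

```shell
#!/bin/sh
# Sketch: suggest `dvc commit` or `dvc add` for a path.
# Assumption: data tracked with `dvc add` has a DVC-file named "<path>.dvc"
# next to it. Outputs of pipeline stages are not detected by this check.

suggest_dvc_command() {
    target="$1"    # hypothetical path, e.g. data/raw.csv
    if [ -f "$target.dvc" ]; then
        # A DVC-file exists, so the data is already tracked.
        echo "dvc commit $target"
    else
        # No DVC-file found, so tracking has to start first.
        echo "dvc add $target"
    fi
}

suggest_dvc_command "data/raw.csv"
```

The helper only prints a suggestion and never runs DVC itself, so it is safe to try in any directory.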

WORKFLOW EXAMPLE

1. Modify a file or directory tracked by DVC.
2. Run `dvc commit [target]`, where `[target]` is the path to the DVC-file or the tracked data.
3. Run `git add [target].dvc` to stage the changes to the DVC-file in Git.
4. Run `git commit -m "Updated data"` to commit the changes to Git.
5. Run `dvc push` to upload the data to the remote storage.
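
The steps above can be sketched as a short script. It is a minimal sketch, not a definitive implementation: the helper names `run` and `commit_tracked_data`, the `DRY_RUN` variable, and the path `data/raw.csv` are hypothetical, and it assumes the target was tracked with `dvc add`, so that its DVC-file is `<target>.dvc`. By default the script only prints the commands; set `DRY_RUN=0` to execute them inside a real DVC project.

```shell
#!/bin/sh
# Sketch of the workflow: commit a changed DVC-tracked target, record the
# updated DVC-file in Git, then upload the data.
# Assumption: the target was tracked with `dvc add`, so its DVC-file is
# "<target>.dvc". DRY_RUN=1 (the default) prints commands instead of running them.

DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

commit_tracked_data() {
    target="$1"                                   # hypothetical path
    run dvc commit "$target" || return 1          # step 2: update DVC-file and cache
    run git add "$target.dvc" || return 1         # step 3: stage the DVC-file
    run git commit -m "Updated data" || return 1  # step 4: commit to Git
    run dvc push                                  # step 5: upload data to the remote
}

commit_tracked_data "data/raw.csv"
```

Stopping at the first failing step keeps Git from recording a DVC-file whose data was never committed to the cache.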

DVC-FILE CONTENT

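A DVC-file contains a checksum (MD5 by default) of each tracked output. For pipeline stages, DVC also records the command and the checksums of its dependencies. Comparing the recorded checksums with the current state of the workspace lets DVC detect whether the data has changed and needs to be updated.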
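
The idea of detecting a change by comparing checksums can be illustrated without DVC. The sketch below is not how DVC is implemented; it only shows the principle, assuming GNU `md5sum` is available. The helper `md5_of` and the temporary file are hypothetical names used for illustration.

```shell
#!/bin/sh
# Illustration only: detect a change by comparing MD5 checksums.
# Assumes GNU coreutils `md5sum`; this is not DVC's actual implementation.

tmpdir="$(mktemp -d)"
data="$tmpdir/data.txt"

md5_of() {
    # Print only the hex digest, without the file name.
    md5sum "$1" | cut -d ' ' -f 1
}

printf 'version 1\n' > "$data"
recorded="$(md5_of "$data")"     # plays the role of the checksum stored in a DVC-file

printf 'version 2\n' > "$data"   # modify the "tracked" file
current="$(md5_of "$data")"

if [ "$recorded" != "$current" ]; then
    state="changed"              # the point at which `dvc commit` would be needed
else
    state="unchanged"
fi

echo "$state"
rm -rf "$tmpdir"
```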

SEE ALSO

dvc add(1), dvc stage(1), dvc push(1), dvc status(1)
