hive
data warehouse system for Hadoop
TLDR
Start Hive shell
$ hive
Execute query
$ hive -e "SELECT * FROM [table]"
Run script file
$ hive -f [script.hql]
Set configuration
$ hive --hiveconf [key=value]
Silent mode
$ hive -S -e "[query]"
SYNOPSIS
hive [options]
DESCRIPTION
Hive is a data warehouse system for Hadoop. It provides a SQL-like query language (HiveQL) for querying large datasets stored in HDFS.
The tool translates queries to MapReduce, Tez, or Spark jobs. It's used for data analysis and ETL on big data platforms.
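As a rough sketch of how the execution engine might be chosen per invocation, the snippet below passes the hive.execution.engine property through --hiveconf. The table name page_views is a placeholder, and which engine values (mr, tez, spark) actually work is an assumption that depends on the Hive version and how the cluster is set up.

```shell
# Placeholder query; "page_views" is a hypothetical table name.
QUERY='SELECT COUNT(*) FROM page_views'

# Assumed engine value; mr, tez, or spark may or may not be available
# on a given cluster.
ENGINE='tez'

# Build the command as a string so it can be reviewed before running.
CMD="hive --hiveconf hive.execution.engine=${ENGINE} -e \"${QUERY}\""
echo "$CMD"

# Uncomment to execute on a machine with Hive installed:
# hive --hiveconf "hive.execution.engine=${ENGINE}" -e "${QUERY}"
```

Printing the command first keeps the example safe to run on a machine without Hive.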
PARAMETERS
-e QUERY
Execute query.
-f FILE
Execute script file.
-S, --silent
Silent mode.
--hiveconf KEY=VALUE
Set configuration.
--database DB
Use database.
-i FILE
Initialization file.
--help
Display help information.
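The options above can be combined in one invocation. The following is a minimal sketch, not a recommended wrapper: run_hive, DRY_RUN, and the names init.hql, sales, and report.hql are all hypothetical. With DRY_RUN=1 it only prints the command it would run.

```shell
# DRY_RUN=1 (the default here) prints the command instead of running it.
# Set DRY_RUN=0 on a machine with Hive installed to execute it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: forwards all arguments to hive, or echoes them.
run_hive() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "hive $*"
  else
    hive "$@"
  fi
}

# Silent mode, an initialization file, a database, and a script file.
run_hive -S -i init.hql --database sales -f report.hql
```

In dry-run mode the last line prints the assembled command, which is a quick way to check the flags before a long-running job starts.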
CAVEATS
Requires a Hadoop cluster. Query latency is higher than in an RDBMS. Schema is applied on read, not on write.
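To illustrate schema on read, the sketch below writes a small HiveQL script that declares an external table over files assumed to already exist in HDFS. The table name, columns, file layout, and the /data/page_views path are all hypothetical. The table definition only describes how to interpret the files at query time.

```shell
# Write the script to a temporary location; the file name is arbitrary.
HQL_FILE="${TMPDIR:-/tmp}/create_page_views.hql"

# Quoted heredoc delimiter keeps '\t' literal for HiveQL.
cat > "$HQL_FILE" <<'EOF'
-- Hypothetical table over tab-separated files already stored in HDFS.
CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
  user_id   STRING,
  url       STRING,
  view_time TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/page_views';
EOF

echo "Wrote $HQL_FILE"

# Uncomment to run on a machine with Hive installed:
# hive -f "$HQL_FILE"
```

Fields that do not match the declared types are typically returned as NULL when queried, rather than being rejected when the files are written.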
HISTORY
Apache Hive was developed at Facebook and later contributed to the Apache Software Foundation to provide SQL-based analytics on big data.
