Installing Apache Hive

The Apache Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query it using a SQL-like language called HiveQL. HiveQL also lets traditional MapReduce programmers plug in their own custom mappers and reducers when it is inconvenient or inefficient to express that logic in HiveQL.

In the previous post we installed Hadoop 1.2.1. Before continuing, switch to the hduser account:
$ su hduser

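1. Prerequisites

Check that Java and Hadoop are installed, and use jps to list the running Java processes so you can confirm the Hadoop daemons are up: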

$ java -version
$ hadoop version
$ jps
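
The same checks can be wrapped in a small pre-flight script. This is only a minimal sketch, assuming a POSIX shell; check_tool is a made-up helper name for illustration and is not part of Hadoop or Hive. It only tells you whether each command is on your PATH, not whether the daemons are running.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it reports whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
        return 0
    fi
    echo "missing: $1"
    return 1
}

# Tools used in this post. Keep going after a failure so that every
# missing tool is reported, not just the first one.
missing=0
for tool in java hadoop jps; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "$missing prerequisite(s) missing"
```

If anything is reported as missing, go back to the Hadoop installation post before continuing.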

2. Download Apache Hive

$ sudo wget

3. Create the Hive directory hierarchy:

$ cd /usr/local/hadoop/bin
$ hadoop fs -mkdir /tmp
$ hadoop fs -mkdir /user/hive/warehouse
$ hadoop fs -chmod g+w /tmp
$ hadoop fs -chmod g+w /user/hive/warehouse
$ hadoop fs -mkdir /tmp/hive
$ hadoop fs -chmod 777 /tmp/hive
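
If you expect to run this step more than once, the directory setup can be made repeatable. The following is a rough sketch, not the official procedure: ensure_hdfs_dir and the HADOOP_CMD variable are names invented here, and it assumes that "hadoop fs -test -d" exits with status 0 when the directory already exists.

```shell
#!/bin/sh
# HADOOP_CMD is an assumption of this sketch so the launcher can be swapped
# out; it defaults to the hadoop command used in the steps above.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# ensure_hdfs_dir (hypothetical helper): create DIR only if it is missing,
# then apply MODE to it.
ensure_hdfs_dir() {
    dir="$1"
    mode="$2"
    if ! "$HADOOP_CMD" fs -test -d "$dir"; then
        "$HADOOP_CMD" fs -mkdir "$dir" || return 1
    fi
    "$HADOOP_CMD" fs -chmod "$mode" "$dir"
}

# Only touch HDFS when the hadoop launcher is actually available.
if command -v "$HADOOP_CMD" >/dev/null 2>&1; then
    ensure_hdfs_dir /tmp g+w
    ensure_hdfs_dir /user/hive/warehouse g+w
    ensure_hdfs_dir /tmp/hive 777
fi
```

The paths and modes are the ones from this step; only the existence check is new.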

4. Configuration

$ sudo tar -xzvf apache-hive-0.14.0-bin.tar.gz
$ mv apache-hive-0.14.0-bin hive
$ cd hive
$ pwd
$ export HIVE_HOME=/home/hduser/hive
$ export PATH=$HIVE_HOME/bin:$PATH
$ hive

You should see something like this:

Logging initialized using configuration in jar:file:/home/hduser/hive/lib/hive-common-0.14.0.jar!/

If you have problems running apache-hive-0.14.0, this link may help.

hive> show tables;
Time taken: 3.511 seconds
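
One thing to keep in mind: the export commands from step 4 only last for the current shell session. A possible way to make them permanent is to append them to your shell startup file. The sketch below assumes bash with ~/.bashrc as the startup file; append_once is a made-up helper, not something shipped with Hive.

```shell
#!/bin/sh
# append_once (hypothetical helper): add LINE to FILE unless an identical
# line is already there, so running it twice does not duplicate the export.
append_once() {
    file="$1"
    line="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example, to record the two variables from step 4 and reload them:

$ append_once "$HOME/.bashrc" 'export HIVE_HOME=/home/hduser/hive'
$ append_once "$HOME/.bashrc" 'export PATH=$HIVE_HOME/bin:$PATH'
$ source ~/.bashrc

The single quotes keep $HIVE_HOME and $PATH unexpanded, so they are evaluated each time a new shell starts.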

