Thursday, December 04, 2014

Installing Apache Hive

The Apache Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.

In the previous post we installed Hadoop 1.2.1. Switch to the hduser account before continuing:
$ su hduser

1. Prerequisites

Check that Java and Hadoop are installed and that the Hadoop daemons are running:

$ java -version
$ hadoop version
$ jps
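
If you prefer a single check over running the three commands by hand, something like the following sketch could work. The function name check_commands is made up for this example; it only tests whether each tool is on the PATH and does not verify versions or running daemons.

```shell
#!/bin/sh
# Report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent command and
# returns non-zero if at least one is absent.
check_commands() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            rc=1
        fi
    done
    return "$rc"
}

# The three tools used in step 1.
check_commands java hadoop jps || echo "Install the missing tools before continuing."
```

A clean run prints nothing, which means all three commands were found.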

2. Download Apache Hive

$ sudo wget

3. Create the Hive directory hierarchy:

$ cd /usr/local/hadoop/bin
$ hadoop fs -mkdir /tmp
$ hadoop fs -mkdir /user/hive/warehouse
$ hadoop fs -chmod g+w /tmp
$ hadoop fs -chmod g+w /user/hive/warehouse
$ hadoop fs -mkdir /tmp/hive
$ hadoop fs -chmod 777 /tmp/hive
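
The commands in step 3 can also be collected into a small script. This is only a sketch: HADOOP_CMD, DRY_RUN, run and setup_hive_dirs are names invented here, and creating /tmp/hive before changing its permissions is an assumption of the sketch. By default it just prints the commands so you can review them first.

```shell
#!/bin/sh
# Print (DRY_RUN=1, the default here) or run the HDFS commands from step 3.
# HADOOP_CMD and DRY_RUN are hypothetical names for this sketch.
HADOOP_CMD=${HADOOP_CMD:-hadoop}
DRY_RUN=${DRY_RUN:-1}

# Echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_hive_dirs() {
    # Create each directory and make it group-writable.
    # mkdir may complain if a directory already exists; the loop continues anyway.
    for dir in /tmp /user/hive/warehouse; do
        run "$HADOOP_CMD" fs -mkdir "$dir"
        run "$HADOOP_CMD" fs -chmod g+w "$dir"
    done
    # Step 3 also opens up /tmp/hive; creating it first is an assumption of this sketch.
    run "$HADOOP_CMD" fs -mkdir /tmp/hive
    run "$HADOOP_CMD" fs -chmod 777 /tmp/hive
}

setup_hive_dirs
```

Once the printed commands look right, running the script with DRY_RUN=0 executes them against HDFS.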

4. Configuration

$ sudo tar -xzvf apache-hive-0.14.0-bin.tar.gz
$ mv apache-hive-0.14.0-bin hive
$ cd hive
$ pwd
$ export HIVE_HOME=/home/hduser/hive
$ export PATH=$HIVE_HOME/bin:$PATH
$ hive

You should see something like this:

Logging initialized using configuration in jar:file:/home/hduser/hive/lib/hive-common-0.14.0.jar!/
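
Note that the two export lines in step 4 only last for the current shell session. One way to keep them is to append them to a startup file such as ~/.bashrc. The sketch below does that without adding duplicates on repeated runs; the helper names append_once and persist_hive_env are made up for this example, and the path /home/hduser/hive is the one used above.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
append_once() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Write the two exports from step 4 to the given startup file.
persist_hive_env() {
    profile=$1
    append_once 'export HIVE_HOME=/home/hduser/hive' "$profile"
    # Single quotes keep $HIVE_HOME and $PATH unexpanded in the file.
    append_once 'export PATH=$HIVE_HOME/bin:$PATH' "$profile"
}

# Example call (not run automatically): persist_hive_env "$HOME/.bashrc"
```

After calling it, open a new terminal or run `. ~/.bashrc` so the variables take effect.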

If you have problems running apache-hive-0.14.0, this link may help.

hive> show tables;
Time taken: 3.511 seconds
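
The same check can be run without opening the interactive prompt, because the hive launcher accepts a statement through its -e option. A minimal sketch, where HIVE_BIN and hive_query are names invented for this example:

```shell
#!/bin/sh
# Run one HiveQL statement without entering the hive> prompt.
# HIVE_BIN is a hypothetical override; it defaults to the hive launcher on PATH.
HIVE_BIN=${HIVE_BIN:-hive}

hive_query() {
    if ! command -v "$HIVE_BIN" >/dev/null 2>&1; then
        echo "hive not found on PATH; check HIVE_HOME and PATH" >&2
        return 127
    fi
    # -e passes the quoted statement to Hive and exits when it finishes.
    "$HIVE_BIN" -e "$1"
}

# Example call (not run automatically): hive_query 'show tables;'
```

This is handy for a quick smoke test from a script after the installation.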
