Thursday, December 04, 2014

Apache Flume


Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data from many different sources to a centralized data store. It has a simple and flexible architecture based on streaming data flows.




A Flume data flow is configured by defining its endpoints, called sources and sinks. The source produces events (e.g., from the Twitter Streaming API), and the sink writes the events out to a destination. Between the source and the sink sits a channel: the source puts events into the channel, and the sink takes them from it.
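To make the source, channel and sink roles concrete, here is a minimal sketch of an agent configuration, written out by a small shell script. The agent name a1, the component names r1, c1 and k1, the file name example.conf and the port 44444 are assumptions chosen for illustration; they are not part of this post's setup. The netcat source, memory channel and logger sink are standard Flume component types.

```shell
#!/bin/sh
# Writes a minimal single-agent Flume configuration.
# Assumptions (illustrative only): agent name a1, component names r1/c1/k1,
# file name example.conf, port 44444.

CONF_FILE="${CONF_FILE:-example.conf}"

cat > "$CONF_FILE" <<'EOF'
# Name the components of agent a1
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listens on a TCP port and turns each received line into an event
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: buffers events in memory between the source and the sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Sink: writes each event to the agent's log
a1.sinks.k1.type = logger

# Wiring: the source writes into the channel, the sink reads from it
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

echo "Wrote $CONF_FILE"
```

Once Flume is installed as described below, an agent using this file could be started with something like: flume-ng agent --conf "$FLUME_CONF_DIR" --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console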

Installation

1. Download the latest stable release of Apache Flume (1.5.2 at the time of writing)

$ wget https://archive.apache.org/dist/flume/1.5.2/apache-flume-1.5.2-bin.tar.gz

2. Extract the archive and move it to /usr/lib/flume:

$ tar -xzf apache-flume-1.5.2-bin.tar.gz
$ mv apache-flume-1.5.2-bin flume
$ sudo mv flume/ /usr/lib/
$ sudo chmod -R 777 /usr/lib/flume
$ cd /usr/lib/flume

3. Configuration 

$ nano ~/.bashrc

Add these lines to .bashrc:

#BEGIN CONFIGURATION FLUME
export FLUME_HOME=/usr/lib/flume
export FLUME_CONF_DIR=$FLUME_HOME/conf
export FLUME_CLASSPATH=$FLUME_CONF_DIR
export PATH=$FLUME_HOME/bin:$PATH
#END CONFIGURATION FLUME

$ source ~/.bashrc
$ cd /usr/lib/flume/conf
$ mv flume-env.sh.template flume-env.sh

In the file flume-env.sh, add the path to your JDK (adjust it to match your installation):

JAVA_HOME=/usr/lib/jvm/jdk1.7.0_71





That's all. To verify the installation, run:

$ /usr/lib/flume/bin/flume-ng version

You should see something like this:

Flume 1.5.2
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 229442aa6835ee0faa17e3034bcab42754c460f5
Compiled by hshreedharan on Wed Nov 12 12:51:22 PST 2014
From source with checksum 837f81bd1e304a65fcaf8e5f692b3f18
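If the version command fails, a quick check of the environment can help narrow down the cause. The following is a minimal sketch, assuming the variables from the .bashrc snippet above; the function name check_flume_env is hypothetical and not part of Flume.

```shell
#!/bin/sh
# Sanity check for the setup described in this post.
# check_flume_env is a hypothetical helper, not something Flume ships.

check_flume_env() {
    status=0

    # FLUME_HOME should point at the directory that contains bin/flume-ng.
    if [ -z "${FLUME_HOME:-}" ]; then
        echo "FLUME_HOME is not set (did you run 'source ~/.bashrc'?)"
        status=1
    elif [ ! -x "$FLUME_HOME/bin/flume-ng" ]; then
        echo "No executable flume-ng under $FLUME_HOME/bin"
        status=1
    fi

    # flume-ng should also be reachable through PATH.
    if command -v flume-ng >/dev/null 2>&1; then
        echo "flume-ng found at: $(command -v flume-ng)"
    else
        echo "flume-ng is not on PATH"
        status=1
    fi

    return "$status"
}

if check_flume_env; then
    flume-ng version
else
    echo "Flume environment looks incomplete; see the messages above."
fi
```

The script only reads the environment and changes nothing, so it is safe to run repeatedly while adjusting .bashrc.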

