Distributed Systems: Apache Flume. Muhammad Afaq.


1 Distributed Systems: Apache Flume
Muhammad Afaq

2 Overview
What is Flume?
Flume Agent
Flume Components
Conf File
Example Configuration
Example: User Trends Retrieval with Flume Using Twitter API

3 What Is Flume?
Flume is a reliable service for collecting and aggregating large amounts of data, especially streaming data such as log data. It is one of the projects in the Hadoop ecosystem. For Hadoop-based log analysis, Flume can be used to gather the log information, such as website logs or system logs.

4 Flume Agent
A Flume agent, the basic unit of the Flume architecture, has a source that takes in data from an origin such as a web server, application server, or website. From the source, the data moves to a channel, where the log data is held. From the channel, the log data moves to a sink, which delivers it to storage, for example Hadoop or the local file system.

5 Flume Components
Source: an active component that receives events and places them in the channel.
Channel: a passive component that buffers events until a sink takes them.
Sink: writes the data to the next hop or to the final destination.
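The division of labour between the three components can be sketched as a toy model. This is only an illustration in Python, not Flume's actual implementation: the class names, the `capacity` limit, and the list standing in for storage are all assumptions made for the example.

```python
from collections import deque


class Channel:
    """Passive buffer: holds events until a sink takes them."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._events = deque()

    def put(self, event):
        if len(self._events) >= self.capacity:
            raise OverflowError("channel is full")
        self._events.append(event)

    def take(self):
        # Returns None when the channel is empty.
        return self._events.popleft() if self._events else None


class Source:
    """Active component: receives an event and places it in the channel."""

    def __init__(self, channel):
        self.channel = channel

    def receive(self, event):
        self.channel.put(event)


class Sink:
    """Drains the channel and writes each event to the next hop."""

    def __init__(self, channel, next_hop):
        self.channel = channel
        self.next_hop = next_hop  # a plain list standing in for storage

    def drain(self):
        while (event := self.channel.take()) is not None:
            self.next_hop.append(event)


channel = Channel(capacity=1000)
source = Source(channel)
storage = []
sink = Sink(channel, storage)

source.receive("GET /index.html 200")
source.receive("GET /missing 404")
sink.drain()
print(storage)
```

The point of the sketch is that the channel never moves data on its own: the source pushes into it and the sink pulls from it.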

6 Conf File: Basic Rules
Every agent must have at least one channel.
Every source must be connected to at least one channel.
Every sink must be connected to exactly one channel.
Every component must have a type.
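The four rules above can be checked mechanically. The following is a minimal sketch, assuming the `agent.kind.name.property = value` key layout used in Flume configuration files; `parse_properties` and `check_agent` are hypothetical helpers written for this example and are not part of Flume.

```python
def parse_properties(text):
    """Parse 'key = value' lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_agent(props, agent):
    """Return a list of rule violations for one agent (empty if none)."""
    problems = []

    def names(kind):
        # Component lists are whitespace-separated, e.g. "r1 r2".
        return props.get(f"{agent}.{kind}", "").split()

    if not names("channels"):
        problems.append("agent has no channel")
    for src in names("sources"):
        if not props.get(f"{agent}.sources.{src}.channels", "").split():
            problems.append(f"source {src} has no channel")
    for snk in names("sinks"):
        if len(props.get(f"{agent}.sinks.{snk}.channel", "").split()) != 1:
            problems.append(f"sink {snk} must have exactly one channel")
    for kind in ("sources", "channels", "sinks"):
        for name in names(kind):
            if not props.get(f"{agent}.{kind}.{name}.type"):
                problems.append(f"{kind[:-1]} {name} has no type")
    return problems


# A deliberately incomplete configuration: the sink is not bound to a channel.
conf = """
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
"""
problems = check_agent(parse_properties(conf), "a1")
print(problems)  # ['sink k1 must have exactly one channel']
```

Note the asymmetry the rules encode: a source uses the plural `channels` key because it may feed several channels, while a sink uses the singular `channel` key.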

7 Example Configuration
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
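To try this configuration out, something has to send text to the netcat source on port 44444. A minimal Python sketch follows, assuming an agent is already running with the configuration above; `send_line` is a hypothetical helper written for this example, and its defaults simply mirror the `bind` and `port` values from the slide.

```python
import socket


def send_line(text, host="localhost", port=44444, timeout=5.0):
    """Send one newline-terminated line and return the server's reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # The netcat source treats each line of text as one event.
        conn.sendall(text.encode("utf-8") + b"\n")
        return conn.recv(1024).decode("utf-8").strip()
```

With the agent running, `send_line("Hello world!")` should return whatever acknowledgement the source sends back (the Flume user guide shows `OK`), and the logger sink should then print the event in the agent's log output.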

8 User Trends Retrieval with Flume Using Twitter API
In this example, we retrieve users’ trends as logs from a personal Twitter account through the Twitter API. These trends can then be analyzed further as desired.

9 User Trends Retrieval with Flume Using Twitter API (cont.)
Download Flume.

10 User Trends Retrieval with Flume Using Twitter API (cont.)
Check whether the Flume tar file is present.
Create the flume-ng directory.
Copy the Flume tar file to the flume-ng directory.
Check whether the Flume tar file was copied.

11 User Trends Retrieval with Flume Using Twitter API (cont.)
Change directory to flume-ng.
Extract the files from the Flume tar file.
Check whether the Flume files were extracted.
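The slides perform the copy-and-extract steps at the shell prompt, but the commands themselves are only visible in the screenshots. As a rough equivalent, here is a Python sketch of the same sequence; `unpack_flume` is a hypothetical helper, and the tar file name and directory location are whatever you pass in, since the transcript does not state them.

```python
import shutil
import tarfile
from pathlib import Path


def unpack_flume(tar_path, target_dir):
    """Copy the Flume tarball into target_dir, extract it, list the result."""
    tar_path = Path(tar_path)
    target_dir = Path(target_dir)

    # Check whether the tar file is present.
    if not tar_path.is_file():
        raise FileNotFoundError(tar_path)

    # Create the flume-ng directory and copy the tar file into it.
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = Path(shutil.copy(tar_path, target_dir))

    # Extract the files from the tar file inside that directory.
    with tarfile.open(copied) as archive:
        archive.extractall(target_dir)

    # Return the directory contents so the caller can confirm the extraction.
    return sorted(p.name for p in target_dir.iterdir())
```

The returned listing plays the role of the "check whether..." steps: it should contain both the copied tar file and the extracted Flume directory.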

12 User Trends Retrieval with Flume Using Twitter API (cont.)
Move the flume-sources-1.0-SNAPSHOT.jar file to the ‘lib’ directory of apache-flume and check that it is there.
Create the flume-env.sh file in the ‘conf’ directory of apache-flume.

13 User Trends Retrieval with Flume Using Twitter API (cont.)
Open flume-env.sh.

14 User Trends Retrieval with Flume Using Twitter API (cont.)
Edit flume-env.sh as shown in the slide's snapshot.

15 User Trends Retrieval with Flume Using Twitter API (cont.)
Open a browser and go to the URL below:
https://apps.twitter.com
Log in to Twitter.

16 User Trends Retrieval with Flume Using Twitter API (cont.)
Create a new application.

17 User Trends Retrieval with Flume Using Twitter API (cont.)
Twitter Apps

18 User Trends Retrieval with Flume Using Twitter API (cont.)
The part highlighted on the slide will be used in flume.conf.

19 User Trends Retrieval with Flume Using Twitter API (cont.)
Edit flume.conf.

20 User Trends Retrieval with Flume Using Twitter API (cont.)
Change directory to the ‘bin’ folder of apache-flume.
Start fetching the data from Twitter.
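The exact start command is only shown in the slide's screenshot, so the following is a sketch rather than a transcription. It assembles a `flume-ng agent` invocation of the form given in the Flume user guide, run from the ‘bin’ folder. The helper names, the relative paths, and the agent name `a1` are assumptions; use the agent name that your own flume.conf defines.

```python
import subprocess


def flume_command(agent_name, conf_dir="../conf", conf_file="../conf/flume.conf"):
    """Build the argument list for starting one Flume agent from 'bin'."""
    return [
        "./flume-ng", "agent",
        "--conf", conf_dir,            # directory holding flume-env.sh
        "--conf-file", conf_file,      # the agent configuration file
        "--name", agent_name,          # must match the prefix used in the file
        "-Dflume.root.logger=INFO,console",  # log events to the console
    ]


def start_agent(agent_name):
    """Launch the agent as a child process (requires a Flume installation)."""
    return subprocess.Popen(flume_command(agent_name))


# Assumed agent name for illustration only.
print(" ".join(flume_command("a1")))
```

Building the command as a list instead of a single string avoids shell quoting problems if a path contains spaces.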

21 User Trends Retrieval with Flume Using Twitter API (cont.)
Data being fetched from Twitter.

22 User Trends Retrieval with Flume Using Twitter API (cont.)
Browse the filesystem.
Click on user.

23 User Trends Retrieval with Flume Using Twitter API (cont.)
Click on flume.
Click on tweets.

24 User Trends Retrieval with Flume Using Twitter API (cont.)
Click on the FlumeData file.

25 User Trends Retrieval with Flume Using Twitter API (cont.)
This is the data that has been downloaded from Twitter.

