Chapter 4: Installing and Configuring Flume

  1. Install Flume

    yum install flume
    
  2. Create a flume directory in HDFS to hold the log files uploaded from the local machine (/user/flume is the directory that hdfs.path in flume.conf points to)

    hadoop fs -mkdir /user/flume
    
  3. Create a log or txt file locally (for example, create a.txt under /tmp with any content)
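
    A minimal sketch of this step. The SAMPLE_DIR variable and the file contents are illustrative assumptions; any small text file will do.

```shell
# Create a small sample file for Flume to pick up.
# SAMPLE_DIR defaults to /tmp, the directory used as spoolDir in this chapter.
SAMPLE_DIR="${SAMPLE_DIR:-/tmp}"

# Two short lines are enough to verify the upload later.
printf 'hello flume\nsecond line\n' > "$SAMPLE_DIR/a.txt"

# Show what was written.
cat "$SAMPLE_DIR/a.txt"
```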

  4. Go to Flume's default configuration directory and edit flume.conf

    cd /usr/lib/flume/conf
    vi flume.conf
    

    I. Monitoring a directory:

       ## Name the components on this agent
       agent1.sources = source1
       agent1.sinks = sink1
       agent1.channels = ch1
    
       # Describe/configure the source. spoolDir must be the local directory that holds the log or txt files.
       # Flume uploads every file in that directory to HDFS, except files matching ignorePattern.
       agent1.sources.source1.channels = ch1
       agent1.sources.source1.type = spooldir
       agent1.sources.source1.spoolDir = /tmp
       agent1.sources.source1.ignorePattern = .*dat.*
       agent1.sources.source1.fileHeader = true
       agent1.sources.source1.deserializer.outputCharset = UTF-8
    
       # Describe the sink. hdfs.path must point to the Active NameNode.
       agent1.sinks.sink1.type = hdfs
       agent1.sinks.sink1.hdfs.path = hdfs://<Active Name Node IP>:8020/user/flume/
       agent1.sinks.sink1.hdfs.rollInterval = 60
       agent1.sinks.sink1.hdfs.rollSize = 1024
       agent1.sinks.sink1.channel=ch1
    
       # Use a file channel, which buffers events on local disk
       agent1.channels.ch1.type = file
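
    The spooling directory source expects each file to be complete and uniquely named by the time it appears in spoolDir; it stops processing if a file is modified afterwards or a name is reused. One way to meet that requirement, sketched here under assumptions, is to write the file elsewhere and then move it in. STAGING_DIR, SPOOL_DIR, and the file naming scheme are hypothetical and not part of the original setup.

```shell
# Hypothetical paths: adjust to your environment.
SPOOL_DIR="${SPOOL_DIR:-/tmp}"                 # must match spoolDir in flume.conf
STAGING_DIR="${STAGING_DIR:-$(mktemp -d)}"     # scratch area outside Flume's view

# Write the file completely in the staging directory first.
printf 'line 1\nline 2\n' > "$STAGING_DIR/app.log"

# Then move it into the spool directory under a unique name.
# Within one filesystem mv is a rename, so Flume never sees a half-written file.
# The name must not match ignorePattern (.*dat.*), or Flume will skip it.
TARGET="$SPOOL_DIR/app-$(date +%Y%m%d%H%M%S)-$$.log"
mv "$STAGING_DIR/app.log" "$TARGET"
echo "spooled: $TARGET"
```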
    
  5. Go back to the /usr/lib/flume directory and run the following command to start the Flume upload

    bin/flume-ng agent -n agent1 -c conf -f conf/flume.conf -Dflume.root.logger=INFO,console
    

    This may fail with an error indicating that flume.conf is not where Flume expects it. If so, copy the 4 configuration files from /usr/lib/flume/conf to /usr/lib/flume/bin/conf.
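
    The copy can be sketched as a small helper. The function name copy_flume_conf is hypothetical, and it copies every regular file found in conf rather than a fixed list of four.

```shell
# Copy the Flume config files into bin/conf.
# $1 is the Flume install directory (default: /usr/lib/flume).
copy_flume_conf() {
    flume_home="${1:-/usr/lib/flume}"
    if [ ! -d "$flume_home/conf" ]; then
        echo "no conf directory under $flume_home" >&2
        return 1
    fi
    mkdir -p "$flume_home/bin/conf"
    # Copies regular files only; subdirectories of conf are not copied.
    cp "$flume_home"/conf/* "$flume_home/bin/conf/"
}

# Usage: copy_flume_conf /usr/lib/flume
```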

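    Another possible error is: Error: Could not find or load main class org.apache.flume.tools.GetJavaProperty. It means the flume-ng script does not match the installed class files. The fix is to overwrite the existing flume-ng script with the contents of a matching one.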

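  6. Check that the contents of the a.txt file just uploaded have arrived in the HDFS directory /user/flume (the HDFS sink names its output files with its own prefix, FlumeData by default, rather than a.txt)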

    hadoop fs -ls /user/flume
    hadoop fs -cat /user/flume/*
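
    The local side can be checked as well: once the spooling directory source has fully ingested a file, it renames it with a suffix, .COMPLETED by default. The sketch below assumes that default (fileSuffix is not overridden in flume.conf); the helper name list_completed is hypothetical.

```shell
# List files in a spool directory that Flume has already marked as ingested.
# $1 is the spool directory (default: /tmp, as configured in this chapter).
list_completed() {
    spool_dir="${1:-/tmp}"
    find "$spool_dir" -maxdepth 1 -type f -name '*.COMPLETED'
}

# Usage: list_completed /tmp
```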
    

    II. Monitoring a file:

    # a.conf: A single-node Flume configuration
    # Name the components on this agent
    a2.sources=r2
    a2.sinks=k2
    a2.channels=c2
    # Describe/configure the source
    a2.sources.r2.type=exec
    a2.sources.r2.command=tail -f /root/flume/1.txt
    # Describe the sink
    a2.sinks.k2.type=hdfs
    a2.sinks.k2.hdfs.path=hdfs://<Active Name Node IP>:8020/user/flume/file
    a2.sinks.k2.hdfs.filePrefix=data1
    a2.sinks.k2.hdfs.round=true
    a2.sinks.k2.hdfs.rollSize=0
    a2.sinks.k2.hdfs.rollCount=0
    a2.sinks.k2.hdfs.batchSize=1000
    a2.sinks.k2.hdfs.roundValue=1
    a2.sinks.k2.hdfs.fileType=DataStream
    # Use a channel which buffers events in memory
    a2.channels.c2.type=memory
    a2.channels.c2.capacity=100000
    a2.channels.c2.transactionCapacity=1000
    #Bind the source and sink to the channel
    a2.sources.r2.channels=c2
    a2.sinks.k2.channel=c2
    

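    Go back to the /usr/lib/flume directory and run the following command to start the Flume upload. It loads conf/flume.conf, so the a2 configuration above must be saved in that file.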

    bin/flume-ng agent -n a2 -c conf -f conf/flume.conf -Dflume.root.logger=INFO,console
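
    With the agent running, the exec source only emits what tail reads from the file, so new lines have to be appended before anything reaches HDFS. A minimal sketch for generating test lines from a second terminal follows; the helper name append_test_lines and the line format are assumptions.

```shell
# Append a few test lines to the file that the exec source is tailing.
# $1 is the tailed file (default: /root/flume/1.txt, as in the config above).
# $2 is the number of lines to append (default: 5).
append_test_lines() {
    log_file="${1:-/root/flume/1.txt}"
    count="${2:-5}"
    mkdir -p "$(dirname "$log_file")"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "test event $i $(date +%s)" >> "$log_file"
        i=$((i + 1))
    done
}

# Usage: append_test_lines /root/flume/1.txt 5
```

    The result can then be inspected with hadoop fs -ls /user/flume/file, where the output files should carry the data1 prefix set by hdfs.filePrefix.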
    
