Docker Log Collection in Practice

When it comes to logging, the first thing most people think of is ELK, but collecting logs from inside containers still takes some effort.

Overall Approach

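Our earlier approach was to embed filebeat in every application container. filebeat watched the log files inside the container and sent new entries to a Kafka cluster, and ELK then read the log messages from Kafka. This design had several problems: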

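Every application container needs its own filebeat, which adds unnecessary performance overhead.
Embedding filebeat in the base image is intrusive for the application services and is not a clean design.
An image that bundles filebeat ends up fairly large (the official filebeat image is itself based on centos:7).
Taking all of this into account, we decided to install filebeat on the host instead and have it watch the stdout of the Docker applications.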

Why Kafka

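logstash is implemented in Ruby (it runs on JRuby) and its performance is mediocre, so we put a message queue in front of it as a buffer. filebeat, by contrast, performs quite well.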
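The buffering argument can be illustrated with a toy simulation. This is only a sketch under simplified assumptions: the function `simulate`, the tick model, and the rates are hypothetical and do not measure real filebeat, Kafka, or logstash behaviour. It shows how a queue lets a fast producer run ahead of a slower consumer during a burst, with the backlog draining once the burst ends.

```python
from collections import deque


def simulate(produced_per_tick, consumer_rate):
    """Return the queue depth after each tick.

    produced_per_tick: events the shipper emits in each tick (hypothetical).
    consumer_rate: maximum events the slow consumer handles per tick.
    """
    queue = deque()
    depths = []
    for tick, produced in enumerate(produced_per_tick):
        # The producer appends without waiting for the consumer.
        queue.extend((tick, i) for i in range(produced))
        # The consumer takes at most consumer_rate events per tick.
        for _ in range(min(consumer_rate, len(queue))):
            queue.popleft()
        depths.append(len(queue))
    return depths


# Two bursty ticks followed by three quiet ones.
depths = simulate([10, 10, 0, 0, 0], consumer_rate=4)
print(depths)
```

In this run the backlog peaks during the burst and falls back to zero afterwards, so no event is dropped even though the consumer never exceeds four events per tick.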

filebeat Configuration

filebeat.inputs:
- type: docker
  include_lines: ['INFO','WARN','ERROR']
  combine_partial: true
  containers:
    path: "/var/lib/docker/containers"
    stream: "stdout"
    ids:
      - "*"

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.kibana:

setup.template.settings:
  index.number_of_shards: 3

output.kafka:
  hosts: ["broker.elk-kafka:9092"]
  topic: elk-log
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000
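To make the input settings above more concrete, here is a rough Python sketch of the kind of processing they describe. It is not filebeat's implementation. It assumes Docker's default json-file logging driver, which writes one JSON object per line with `log`, `stream`, and `time` fields. The function name `collect` and the sample records are hypothetical. The sketch keeps only the stdout stream, joins records that Docker split into fragments (the idea behind `combine_partial`), and then applies the `include_lines` patterns.

```python
import json
import re

# Patterns mirroring include_lines in the configuration.
INCLUDE = [re.compile(p) for p in ("INFO", "WARN", "ERROR")]


def collect(raw_lines, stream="stdout"):
    """Return complete, filtered messages from json-file log records."""
    messages = []
    pending = ""  # fragments of a long line split across several records
    for raw in raw_lines:
        record = json.loads(raw)
        if record.get("stream") != stream:
            continue  # e.g. skip stderr when only stdout is wanted
        pending += record["log"]
        if not pending.endswith("\n"):
            continue  # partial record: wait for the remaining fragments
        message, pending = pending.rstrip("\n"), ""
        if any(p.search(message) for p in INCLUDE):
            messages.append(message)
    return messages


# Hypothetical records in the shape written by the json-file driver.
sample = [
    json.dumps({"log": "INFO started\n", "stream": "stdout", "time": "2019-01-01T00:00:00Z"}),
    json.dumps({"log": "DEBUG noisy\n", "stream": "stdout", "time": "2019-01-01T00:00:01Z"}),
    json.dumps({"log": "ERROR part one, ", "stream": "stdout", "time": "2019-01-01T00:00:02Z"}),
    json.dumps({"log": "part two\n", "stream": "stdout", "time": "2019-01-01T00:00:02Z"}),
    json.dumps({"log": "WARN on stderr\n", "stream": "stderr", "time": "2019-01-01T00:00:03Z"}),
]

result = collect(sample)
print(result)
```

The DEBUG line is dropped by the pattern filter, the two ERROR fragments are joined into one message, and the stderr record is ignored because only stdout is selected.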

logstash Configuration

input {
    kafka {
        # The kafka input takes a comma-separated string of host:port pairs.
        bootstrap_servers => "broker.elk-kafka:9092"
        auto_offset_reset => "latest"
        consumer_threads => 5
        decorate_events => true
        topics => ["elk-log"]
        # filebeat publishes each event to Kafka as a JSON document.
        codec => "json"
    }
}

output {
    elasticsearch {
        hosts => "http://elasticsearch:9200"
        index => "elk-log"
    }
}