Setting up Elasticsearch, Logstash and Kibana for QA and debugging

Posted by Miguel Lopes on Fri, Mar 4, 2016

One of my latest projects involves a daemon that works with a lot of threads, and its only valuable output is in the log file. So far that is like every other daemon, except that the number of threads it manages is fairly large. In 16 hours it has output almost 10,000 lines, each one corresponding to an event. Monitoring and debugging this daemon from the log file alone seems like a lost cause, so I turned to the ELK stack (Elasticsearch, Logstash and Kibana) to make the task manageable.

By feeding the data to Elasticsearch I can quickly get an overview of what's happening in almost real time, and explore older data at will.

Setting up the ELK stack

I started by turning the log file into something Logstash would support easily, so I formatted the events as JSON and wrote them to the log file, like this:

"timestamp":"2016-02-28 03:07:30,226",

This part is crucial because it defines how your data will be handled by Elasticsearch and what you will be able to do with it. For more information, check out the guide.
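As a rough illustration of the "one JSON event per line" idea, here is a minimal sketch in Python. It assumes the daemon uses Python's standard logging module, which the post does not state. The `JsonLineFormatter` class and the `level` and `message` fields are hypothetical; only `timestamp` and `name` come from the setup described here.

```python
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record):
        event = {
            # With no datefmt, formatTime() yields "YYYY-MM-DD HH:MM:SS,mmm".
            "timestamp": self.formatTime(record),
            # Assumption: "name" holds the name of the thread that logged.
            "name": record.threadName,
            # Hypothetical extra fields, not taken from the original log.
            "level": record.levelname,
            "message": record.getMessage(),
        }
        return json.dumps(event)


def build_logger(path):
    """Return a logger that appends JSON lines to the file at `path`."""
    logger = logging.getLogger("daemon")
    handler = logging.FileHandler(path)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Keeping every event on its own line is what lets a line-oriented codec such as json_lines split the stream into events.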

With that done I was ready to start building the monitoring machine.

After the OS was ready and the ELK stack installed, I needed to feed the log file, which was on a remote server, to Logstash. There were many options to achieve this, but I didn’t want to open any more ports on either machine, so the final setup ended up being the following:

To get the logs, I mounted the remote folder on the monitoring VM through sshfs, then told Logstash to create a Unix socket to listen for the JSON events, like this:

input {
  unix {
    path => "/tmp/logstash/log.sock"
    codec => json_lines
    mode => server
  }
}

To connect them, I added socat to the mix to send the events from the file to the socket:

$ socat -u /home/sshfs/net/debugging.log,seek=0,ignoreeof UNIX-CONNECT:/tmp/logstash/log.sock

The remote file may be truncated at any time, so I need socat to start at the beginning of the file, which is what the seek=0 parameter does. The ignoreeof parameter makes socat ignore the end of the file and keep reading as more events are added.
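To make the effect of those two options more concrete, here is a small Python sketch of the same reading behaviour: start at offset 0, and treat end of file as "no data yet" rather than "done". It only illustrates the idea and is not how socat is implemented. The function name, `poll_seconds` and `max_idle_polls` are hypothetical, and the latter exists only so the loop can stop.

```python
import time


def follow_from_start(path, poll_seconds=0.5, max_idle_polls=None):
    """Yield complete lines from the start of `path`, then wait for more.

    With max_idle_polls=None the generator never stops on its own.
    """
    idle = 0
    buffer = ""
    with open(path, "r") as handle:
        handle.seek(0)  # same intent as seek=0: begin at the first byte
        while max_idle_polls is None or idle < max_idle_polls:
            chunk = handle.read()
            if not chunk:
                # Same intent as ignoreeof: EOF is not the end, poll again.
                idle += 1
                time.sleep(poll_seconds)
                continue
            idle = 0
            buffer += chunk
            # Emit only complete lines; keep a partial last line buffered.
            *lines, buffer = buffer.split("\n")
            for line in lines:
                yield line
```

Each yielded line would then be written to the Unix socket, which is the part socat handles with UNIX-CONNECT.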

With this, the logs are being fed to Elasticsearch. But because socat starts at the beginning of the file every time, there will be duplicates unless we fingerprint the events, which in my case ended up looking something like this:

filter {
  fingerprint {
    concatenate_sources => true
    key => '1421876543'
    method => SHA1
    source => ['timestamp','name']
  }
}
output {
  elasticsearch { document_id => "%{fingerprint}" }
}

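The fingerprint is based on the timestamp and the name (the name of the thread), which seems to be the best approach for this case. Because the fingerprint is used as the document ID, an event that is sent again overwrites the existing document instead of creating a duplicate.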
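The deduplication idea can be sketched in a few lines of Python: derive a deterministic keyed hash from the chosen fields and use it as the storage key. This is only an illustration. The exact string that Logstash hashes is not reproduced here, the `fingerprint` and `index` helpers are hypothetical, and a plain dict stands in for the Elasticsearch index.

```python
import hashlib
import hmac


def fingerprint(event, sources=("timestamp", "name"), key=b"1421876543"):
    """Return a keyed SHA-1 hex digest over the selected fields."""
    # Join field names and values in a stable order. This layout is
    # illustrative, not the exact byte string Logstash builds.
    material = "|".join(f"{field}|{event[field]}" for field in sorted(sources))
    return hmac.new(key, material.encode("utf-8"), hashlib.sha1).hexdigest()


def index(store, event):
    """Write `event` under its fingerprint; a repeat overwrites in place."""
    store[fingerprint(event)] = event
```

Replaying the same file therefore rewrites the same keys. The trade-off is that two genuinely different events from the same thread in the same millisecond would collapse into one.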

Now that the logs are in Elasticsearch, it’s time to create some visualizations and dashboards in Kibana and do some data exploration. Different types of visualizations of the same data help bring some perspective, so play around and tune your logs to get the most out of the ELK stack.

With this setup I have been able to keep track of the performance and overall activity of the daemon, and I have also been able to spot some bugs that weren’t very obvious. The ELK stack is great, but it’s not perfect, nor should it be the only way to monitor a daemon like this. It is a good starting point, though.
