Graylog on AWS with Ansible, Part II

An all-in-one log management solution for smaller sites

In the first part of this series of posts, I went over provisioning AWS resources to create a simple all-in-one Graylog server. In this post, I'll go over how to install and configure the system itself.

Check out Part I for a little more context around why we chose Graylog, Ansible, and AWS.

Getting Started

Before diving in, let's go over specifically what we want to accomplish here. I'm big on checklists, and creating an outline to organize and plan your efforts can be especially helpful when putting together new systems like this.

Provision infrastructure resources (Part I)

Define the instances, network resources and storage devices we'll need for the all-in-one setup. Ensure these resources are configured and secured appropriately, and connect everything together.

Configure the Graylog server (Part II)

Install the required packages and services, ready the storage devices, and install and configure the all-in-one Graylog package.

Set up inputs, streams, and extractors (Part III)

Import an initial set of streams and extractors and set up a few Graylog inputs to enable the service to accept and parse incoming log data appropriately.

Install the Graylog collector (Part IV)

Set up the Graylog collector to monitor its own log files and those of the Graylog server, and turn on syslog forwarding.

Since we're going to use Ansible, this outline translates nicely into its built-in system of roles, plays, and tasks. We'll define a Graylog role, and playbooks for each of the four goals outlined above. Additionally, we want to ensure that our playbooks are written to be idempotent — this allows us to run the plays safely at any time without worrying about duplicating changes.
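In practice, that outline maps onto a project layout along these lines (file and directory names here are illustrative, not prescriptive):

├── graylog-provision.yml    # Part I  – AWS resources
├── graylog-configure.yml    # Part II – this post
├── graylog-inputs.yml       # Part III – inputs, streams, extractors
├── graylog-collector.yml    # Part IV – log collection
└── roles/
    └── graylog/
        ├── files/           # gen-smtp-pass, SSL certificate and key
        └── handlers/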

The Playbook

Whereas the previous playbook focused on using the AWS-specific cloud modules to handle resource provisioning, the next one we're going to put together uses the more traditional Ansible configuration modules. We'll still need to use the AWS Dynamic Inventory script, however, in order to select the EC2 instances without having to define them manually in an inventory file.
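As a sketch, running the play against the dynamic inventory looks something like this (the playbook filename is illustrative; `ec2.py` is the dynamic inventory script shipped with Ansible, which needs AWS credentials in your environment):

```shell
# Point -i at the dynamic inventory script instead of a static hosts file
ansible-playbook -i ec2.py graylog-configure.yml
```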

- name: Configure graylog instance
  tags: configure
  hosts: tag_Name_graylog
  remote_user: ubuntu
  become: yes

For this play, we choose the hosts based on the AWS tag we applied to the instance in the provisioning play. The tag_KEY_VALUE pattern is used by the AWS Dynamic Inventory script to automatically group EC2 instances; see the docs for more information. The remote user is set to ubuntu, since that is the SSH user defined in the Graylog AMI used to provision the instance. We also set become to yes, to ensure that all the commands are run via sudo.
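To see where a group name like tag_Name_graylog comes from, here is a rough approximation of how the dynamic inventory script derives group names from EC2 tags (a sketch of the script's behavior, not its actual code):

```python
import re

def tag_group(key, value):
    """Approximate the ec2.py dynamic inventory's group naming:
    it builds "tag_KEY=VALUE" and replaces characters outside
    [A-Za-z0-9-] with underscores."""
    return re.sub(r"[^A-Za-z0-9\-]", "_", "tag_%s=%s" % (key, value))

print(tag_group("Name", "graylog"))  # tag_Name_graylog
```

So an instance tagged Name=graylog ends up in the tag_Name_graylog group, which is what the play's hosts pattern selects.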

Setting the Hostname

    - name: Set hostname
        name: graylog
        - restart rsyslog

    - name: Ensure hostfile is correct
        dest: /etc/hosts
        owner: root
        group: root
        mode: 0644
        line: " graylog"
        state: present
        create: yes
        - restart rsyslog

Setting up a custom hostname is optional, but it's helpful to ensure that log data sent to the Graylog server (in this case, locally) is tagged with a source attribute that is meaningful. For our purposes, since we're setting up just a single Graylog server, setting it to 'graylog' and ensuring a hostfile entry will do the trick.

The notify directives will ensure that a designated handler is called if the tasks result in a change to the system. We'll go over all the handlers later on in this post.

Format and Mount EBS

    - name: Format EBS volume
        fstype: ext4
        dev: /dev/xvdf
        force: no

    - name: Mount EBS volume
        name: /var/opt/graylog/data
        src: /dev/xvdf
        fstype: ext4
        state: mounted

We attached an EBS volume when we originally provisioned the server, but in order for Graylog to make use of the volume, we need to ensure it's properly formatted and mounted to the correct directory. The all-in-one Graylog package stores all of its data in the /var/opt/graylog/data directory, so we mount the volume there.

ElasticSearch Plugins

    - name: Install elasticsearch head plugin
      command: /opt/graylog/elasticsearch/bin/plugin --install mobz/elasticsearch-head
        creates: /opt/graylog/elasticsearch/plugins/head
        - restart elasticsearch

The head plugin is an optional, but very useful ElasticSearch plugin that lets you interact with the ElasticSearch installation using a web browser. We install it using the plugin command. You can also install any other plugins you might find useful using the same method.

Configure Graylog

    - name: Check NTP service
        name: ntp
        state: started

    - name: Add gen-smtp-pass script
        src: ./gen-smtp-pass
        dest: /usr/local/bin/gen-smtp-pass
        mode: 0755
        owner: root
        group: root

    - name: Install SSL certificate
        src: "{{ item }}"
        dest: /opt/graylog/conf/nginx/ca/
        - ./graylog.crt
        - ./graylog.key
        - restart nginx

    - name: Reconfigure graylog
      shell: |
        graylog-ctl set-email-config email-smtp.{{ region }} \
          --no-ssl \
          --port 587 \
          --user $(cat /etc/graylog/iam_access_key) \
          --password $(cat /etc/graylog/iam_secret_key | xargs gen-smtp-pass) \
          --from-email "{{ graylog_email }}" \
          --web-url "https://{{ graylog_hostname }}" \
        && graylog-ctl set-admin-password "{{ admin_pass }}" \
        && graylog-ctl set-timezone UTC \
        && graylog-ctl enforce-ssl
        creates: /etc/graylog/graylog-settings.json
        - reconfigure graylog

The AMI we used to provision the instance already contains the required Graylog installation package, managed via the graylog-ctl script. This is an Omnibus package, managed via Chef. Before running the reconfigure command, however, we need to set up a few additional items.

First, ensuring that the NTP service is started and running. By default, this is the case, but it doesn't hurt to make sure.

Next, we copy over a custom script called gen-smtp-pass. This is used to generate the correct SMTP password for the IAM user to be able to send emails via SES. Notice that we use the access key and secret key laid down by the user-data script in the previous playbook to specify the username and password. If you're not using SES, you can skip this step and set up the SMTP user and password in some other way. The script itself is fairly straightforward, and credit goes to Charles Lavery for posting it:

#!/usr/bin/env python

import base64
import hmac
import hashlib
import sys

def hash_smtp_pass_from_secret_key(key):
    # The SES SMTP password is a version byte (0x02) prepended to
    # HMAC-SHA256("SendRawEmail", secret_key), base64-encoded.
    message = "SendRawEmail"
    version = '\x02'
    h =, message, digestmod=hashlib.sha256)
    return base64.b64encode("{0}{1}".format(version, h.digest()))

if __name__ == "__main__":
    print hash_smtp_pass_from_secret_key(sys.argv[1])

An additional, optional step is to install your own custom SSL certificates. The Graylog installation will generate its own by default, but you'll want to make your own based on the hostname you're going to use for accessing the web interface (either self-signed or signed by a CA).

Finally, we run the graylog-ctl script, configuring the SMTP settings, the admin password, the timezone, and forcing SSL to be used for the web interface.

The Handlers

    - name: reconfigure graylog
      command: graylog-ctl reconfigure

    - name: restart rsyslog
      service: name=rsyslog state=restarted

    - name: restart elasticsearch
      command: graylog-ctl restart elasticsearch

    - name: restart nginx
      command: graylog-ctl restart nginx

The last section of the play details the handlers mentioned previously. Using handlers ensures that these commands run at most once per play, and only when triggered by a notify directive on a task that changed the system. In this case, we have handlers to run the Graylog reconfiguration, and to restart rsyslogd, elasticsearch, and nginx.

Wrapping Up

Running this playbook after provisioning the server should leave you with a fully-functional Graylog installation, ready to be used for receiving and analyzing your log data. The next steps, and what I'll cover in a subsequent post, are to set up your inputs, extractors, and streams, and to set up a Graylog collector to ship Graylog's own logs to itself!