In this article, I’d like to introduces a solution to collect logs and store them into Azure DocumentDB using fluentd and its plugin, fluent-plugin-documentdb.
Azure DocumentDB is a managed NoSQL database service provided by Microsoft Azure. It’s schemaless, natively support JSON, very easy-to-use, very fast, highly reliable, and enables rapid deployment, you name it. Fluentd is an open source data collector, which lets you unify the data collection and consumption for a better use and understanding of data. fluent-plugin-documentdb is fluentd output plugin that enables to store event collections into Azure DocumentDB.
This article shows how to
To use Azure DocumentDB, you must create a DocumentDB database account using either the Azure portal, Azure Resource Manager templates, or Azure command-line interface (CLI). In addition, you must have a database and a collection to which fluent-plugin-documentdb writes event-stream out. Here are instructions:
First of all, install Fluentd. The following shows how to install Fluentd using Ruby gem packger but if you are not using Ruby Gem for the installation, please refer to this installation guide where you can find many other ways to install Fluentd on many platforms.
Also, install fluent-plugin-documentdb for fluentd aggregator to store collected logs data into Azure DocumentDB.
Next, configure fluent.conf, a fluentd configuration file as follows. Please see this for fluent-plugin-documentdb configuration.
Regarding he port number of the aggregator host above, the default is 24224. Note that both TCP packets (event stream) and UDP packets (heartbeat message) are sent to this port, which mean you need to open both TCP and UDP for this port if you have access controls between forwarders and aggregator. Please see the forward Output plugin article to understand more about forward plugin.
Finally, run fluentd with specifiying fluent.conf that you configurea above.
First, to set up Fluentd, run the following command to setup Fluentd. Again If you are not using Ruby Gem for the installation, please refer to the installation document.
Then, give Fluentd a read access to servers’log files.
Next, configure fluent.conf, a fluentd configuration file as follows to tail apache access logs and forard event to aggregator
Finally, start Fluentd with the configuration above to start log collections
Let’s check if logs will be forwarded from apache nodes to aggegator and ultimately stored in documentdb. First, create log events by sending test requests to web servers somehow (here using apache bench for example)
If logs are collected successfully, you can see the logs stored in DocumentDB easily by using Document DB’s query explorer. Go to Azure Portal > Display your DocumentDB dashboard > Query Explorer.
Fluentd’s Input plugins extend Fluentd to retrieve and pull event logs from external sources. An input plugin typically creates a thread socket and a listen socket. It can also periodically pull data from data sources. For examples, listening syslog events, tailing apache/nginx logs, and pulling data from well-known RDBMS/NoSQLs on-premices or on public cloud. See http://www.fluentd.org/plugins
Happy log collections with Azure DocumentDB and fluentd!!