This article is about an ELK (quite the buzzword these days) implementation on AWS.
The ELK stack consists of Elasticsearch, Logstash, and Kibana.
Logstash is a tool for log data intake, processing, and output. This includes virtually any type of log that you manage: system logs, webserver logs, error logs, and app logs.
In this post, 'Logstash' will be replaced by 'AWS CloudWatch' and 'AWS Kinesis Firehose'.
Elasticsearch is a NoSQL database based on the Lucene search engine and a popular open-source search and analytics engine. It is designed to be distributed across multiple nodes, enabling it to work with large datasets. It handles use cases such as log analytics, real-time application monitoring, clickstream analytics, and text search.
In this post, the 'AWS Elasticsearch' service will be used for the 'Elasticsearch' component.
Kibana is your log-data dashboard. It’s a stylish interface for visualizing logs and other time-stamped data.
It gives you a better grip on your large data stores, with point-and-click pie charts, bar graphs, trend lines, maps, and scatter plots.
First Implementation – ELK With CloudTrail/CloudWatch (as Logstash)
We’ll try to list a few easy steps to do so:
- Go to AWS Elasticsearch
- Create ES Domain – amelasticsearchdomain
o Set Access Policy to Allow All/your ID (a minimal scripted version of this step is sketched below)
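For reference, here is a minimal boto3 sketch of what this domain-creation step does. It is only an illustration: the account ID, region, and instance settings are placeholders, and the wide-open principal should be narrowed to your own IAM ARN in practice.

import json
import boto3

# Sketch only - the console steps above do the same thing.
# Account ID, region and instance sizes below are placeholders.
es = boto3.client('es', region_name='us-east-1')

access_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "*"},   # "Allow All"; replace with your IAM ARN to lock it down
        "Action": "es:*",
        "Resource": "arn:aws:es:us-east-1:111122223333:domain/amelasticsearchdomain/*"
    }]
}

es.create_elasticsearch_domain(
    DomainName='amelasticsearchdomain',
    ElasticsearchClusterConfig={'InstanceType': 't2.small.elasticsearch', 'InstanceCount': 1},
    EBSOptions={'EBSEnabled': True, 'VolumeType': 'gp2', 'VolumeSize': 10},
    AccessPolicies=json.dumps(access_policy),
)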
- Go to AWS CloudTrail Service
- Create CloudTrail trail - amElasticSearchCloudTrail
o Create S3 Bucket – amelasticsearchbucket (used to hold CloudTrail data)
o Create CloudWatch Group - amElasticSearchCloudWatchGroup
o In order to deliver CloudTrail events to the CloudWatch Logs log group, CloudTrail will assume a role with the two permissions below
§ CreateLogStream: Create a CloudWatch Logs log stream in the CloudWatch Logs log group you specify
§ PutLogEvents: Deliver CloudTrail events to the CloudWatch Logs log stream
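If you prefer to see that role policy spelled out, here is a rough boto3 sketch of attaching it. The role name, account ID, and region are placeholders; the CloudTrail console can create the role and policy for you automatically.

import json
import boto3

iam = boto3.client('iam')

# Log-group ARN for the group created above; account ID and region are placeholders.
log_group_arn = ("arn:aws:logs:us-east-1:111122223333:"
                 "log-group:amElasticSearchCloudWatchGroup:log-stream:*")

delivery_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
        "Resource": log_group_arn
    }]
}

# Hypothetical role name - use whatever role CloudTrail assumes in your setup.
iam.put_role_policy(
    RoleName='AM_CloudTrail_CloudWatchLogs_Role',
    PolicyName='cloudtrail-to-cloudwatch-delivery',
    PolicyDocument=json.dumps(delivery_policy),
)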
- Go to and set up CloudWatch
- Select the log group, then choose the action to stream data to the Elasticsearch domain
o Create New Role - AM_lambda_elasticsearch_execution
o Create Lambda (automatically) - LogsToElasticsearch_amelasticsearchdomain - CloudWatch Logs uses this Lambda function to deliver log data to the Amazon Elasticsearch Service cluster (see the sketch below)
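Under the hood, that console action sets up a subscription filter on the log group pointing at the auto-created Lambda function. A rough boto3 sketch of the same wiring (the ARNs are placeholders, and the console also grants CloudWatch Logs permission to invoke the function):

import boto3

logs = boto3.client('logs', region_name='us-east-1')

logs.put_subscription_filter(
    logGroupName='amElasticSearchCloudWatchGroup',
    filterName='stream-to-es',
    filterPattern='',   # empty pattern forwards every log event
    destinationArn=('arn:aws:lambda:us-east-1:111122223333:'
                    'function:LogsToElasticsearch_amelasticsearchdomain'),
)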
- Go to AWS Elasticsearch
o Hit the Kibana link
o On Kibana - Configure an index pattern
Second Implementation – ELK With AWS Kinesis Firehose/CloudWatch (as Logstash)
We’ll try to list a few easy steps to do so:
- Go to AWS Elasticsearch
- Create ES Domain - amelasticsearchdomain
o Set Access Policy to Allow All/your ID
- Create Kinesis Firehose Delivery Stream - amelasticsearchkinesisfirehosestream
o Attach it to the ES domain above
o Create Lambda (Optional) - amelasticsearchkinesisfirehoselambda
o Create S3 Bucket for Backup - amelasticsearchkinesisfirehosebucket
o Create Role - am_kinesisfirehose_delivery_role
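For reference, a boto3 sketch of what this delivery stream ends up looking like, pointed at the domain with S3 backup. The ARNs and the index name ('logs', which the Kibana section below relies on) are placeholders for whatever you choose in the console:

import boto3

firehose = boto3.client('firehose', region_name='us-east-1')

firehose.create_delivery_stream(
    DeliveryStreamName='amelasticsearchkinesisfirehosestream',
    ElasticsearchDestinationConfiguration={
        'RoleARN': 'arn:aws:iam::111122223333:role/am_kinesisfirehose_delivery_role',
        'DomainARN': 'arn:aws:es:us-east-1:111122223333:domain/amelasticsearchdomain',
        'IndexName': 'logs',          # index root used later in Kibana (logs*)
        'TypeName': 'log',
        'IndexRotationPeriod': 'OneDay',
        'S3BackupMode': 'AllDocuments',
        'S3Configuration': {
            'RoleARN': 'arn:aws:iam::111122223333:role/am_kinesisfirehose_delivery_role',
            'BucketARN': 'arn:aws:s3:::amelasticsearchkinesisfirehosebucket',
        },
    },
)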
- Create EC2 instance (to send log data to the Kinesis Firehose stream configured above)
o This will use the 1995 NASA Apache access log (http://ita.ee.lbl.gov/html/contrib/NASA-HTTP.html) to feed into Kinesis Firehose.
o The EC2 instance uses the Amazon Kinesis Agent to flow data from the file system into the Firehose stream.
o The Amazon Kinesis Agent is a standalone Java software application that offers an easy way to collect and send data to Amazon Kinesis and to Firehose.
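Conceptually, the agent just tails the configured file and pushes each line to the delivery stream. If you want a quick end-to-end test without installing the agent, a minimal boto3 sketch (using the same file path and stream name as in the steps below) would be:

import boto3

firehose = boto3.client('firehose', region_name='us-east-1')

# Push each log line to the delivery stream, one record at a time.
# For real volumes, put_record_batch (up to 500 records per call) is far more efficient.
with open('/tmp/mylog.txt', 'rb') as f:
    for line in f:
        firehose.put_record(
            DeliveryStreamName='amelasticsearchkinesisfirehosestream',
            Record={'Data': line},
        )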
- Steps:
- Launch an EC2 Instance (t2.micro) running the Amazon Linux Amazon Machine Image (AMI)
- PuTTY (SSH) into the instance
- Install Kinesis Agent - sudo yum install -y aws-kinesis-agent
- Go to directory - /etc/aws-kinesis/
- Open file - nano agent.json
- Make sure it has this data:
{
  "cloudwatch.emitMetrics": true,
  "firehose.endpoint": "https://firehose.us-east-1.amazonaws.com",
  "flows": [
    {
      "filePattern": "/tmp/mylog.txt",
      "deliveryStream": "amelasticsearchkinesisfirehosestream",
      "initialPosition": "START_OF_FILE"
    }
  ]
}
- Now download the NASA access log file to your local desktop and upload it to S3
- URL - http://ita.ee.lbl.gov/html/contrib/NASA-HTTP.html
- File download - Jul 01 to Jul 31, ASCII format, 20.7 MB gzip compressed
- Unzip and upload this file to any S3 bucket (other than any used above)
- Make sure the file is public (a scripted version of the upload is sketched below)
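If you would rather script the upload, a small boto3 sketch (the bucket name 'arunm' and key are just examples matching the wget URL below; making the object public this way assumes the bucket allows ACLs):

import boto3

s3 = boto3.client('s3', region_name='us-west-1')

# Upload the unzipped NASA log and make it publicly readable,
# so the EC2 instance can fetch it with a plain wget.
s3.upload_file(
    Filename='access_log_Jul95',
    Bucket='arunm',                        # example bucket from the wget URL below
    Key='access_log_Jul95',
    ExtraArgs={'ACL': 'public-read'},
)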
- Again, go back to the EC2 instance (PuTTY)
- Go to directory - /etc/aws-kinesis/
- Download the file from S3 - wget https://s3-us-west-1.amazonaws.com/arunm/access_log_Jul95
- Concatenate this file to mylog.txt - cat access_log_Jul95 >> /tmp/mylog.txt
- Again, go back to the EC2 instance (PuTTY)
- Go to the home directory - cd ~
- Go to directory - /var/log/aws-kinesis-agent/
- Monitor the agent’s log at /var/log/aws-kinesis-agent/aws-kinesis-agent.log.
- Open file - nano aws-kinesis-agent.log
- You’ll find log lines like: 2017-03-01 21:46:38.476+0000 ip-10-0-0-55 (Agent.MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: 1891715 records parsed (205242369 bytes), and 1891715 records sent successfully to destinations. Uptime: 630024ms
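At this point documents should be landing in the domain. As a quick sanity check outside Kibana, you can hit the Elasticsearch count API directly. A rough sketch, assuming the endpoint below is replaced with your own and the domain's access policy allows unsigned requests from your IP (otherwise the request has to be SigV4-signed):

import requests

# Placeholder endpoint - copy the real one from the AWS Elasticsearch console.
endpoint = "https://search-amelasticsearchdomain-xxxxxxxx.us-east-1.es.amazonaws.com"

# Count the documents Firehose has indexed under the "logs" index root.
resp = requests.get(endpoint + "/logs*/_count")
print(resp.json())   # e.g. {"count": 1891715, ...} once the agent has caught up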
- Set up Kibana (to visualize the data)
o Go to AWS Elasticsearch
o Click on link to Kibana
o The first thing you need to do is configure an index pattern. Use the index root you set when you created the Firehose stream (in our case, logs*).
o Kibana should recognize the logs indexes and let you set the Time-field name value. Firehose provides two possibilities:
§ @timestamp – the time as recorded in the file
§ @timestamp_utc – available when time zone information is present in the log data
o Choose either one, and you should see a summary of the fields detected.
o Select the Discover tab, and you'll see a graph of events by time, along with some expandable details for each event.
o As we are using the NASA dataset, we get a message that there are no results. That’s because the data is way back in 1995.
o Expand the time selector in the top right of the Kibana dashboard and choose an absolute time range. Pick a start of June 30, 1995, and an end of August 1, 1995, and the NASA events will show up.
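The same July 1995 window can also be expressed as an Elasticsearch date-range query, which is handy for confirming the data outside Kibana. A sketch, with the endpoint again a placeholder and the time field being whichever of @timestamp or @timestamp_utc you picked above:

import json
import requests

endpoint = "https://search-amelasticsearchdomain-xxxxxxxx.us-east-1.es.amazonaws.com"

# Match everything between June 30 and August 1, 1995.
query = {
    "query": {
        "range": {
            "@timestamp": {                # or "@timestamp_utc"
                "gte": "1995-06-30",
                "lte": "1995-08-01",
                "format": "yyyy-MM-dd"
            }
        }
    }
}

resp = requests.post(
    endpoint + "/logs*/_search",
    headers={"Content-Type": "application/json"},
    data=json.dumps(query),
)
print(resp.json()["hits"]["total"])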
Hope this helps.
Regards,
Arun Manglick