# Derive Application Log Insights Using Amazon CloudWatch Connector for Automated Data Analytics on AWS

## Background
This repository provides an AWS CDK solution that demonstrates the capabilities of the Automated Data Analytics on AWS solution, as described in the blog [here](link). [Automated Data Analytics on AWS (ADA)](https://aws.amazon.com/solutions/implementations/automated-data-analytics-on-aws/) is an AWS Solution that enables users to derive meaningful insights from data in a matter of minutes through a simple and intuitive user interface. ADA offers an AWS-native data analytics platform that is ready to use out of the box by data analysts for a variety of use cases. Using ADA, teams can ingest, transform, govern, and query diverse datasets from a range of data sources without requiring specialist technical skills. ADA provides a set of [pre-built connectors](https://docs.aws.amazon.com/solutions/latest/automated-data-analytics-on-aws/data-connectors-guide.html) to ingest data from a wide range of sources, including Amazon Simple Storage Service (Amazon S3), Amazon Kinesis Data Streams, Amazon CloudWatch, AWS CloudTrail, and Amazon DynamoDB.

Using this repository, we demonstrate how an application developer or application tester can leverage ADA to derive operational insights into applications running on AWS. We also demonstrate how the ADA solution can connect to different data sources in AWS without copying the data from the source. We first [deploy the ADA solution](https://docs.aws.amazon.com/solutions/latest/automated-data-analytics-on-aws/deploy-the-solution.html) into an AWS account and [set up the ADA solution](https://docs.aws.amazon.com/solutions/latest/automated-data-analytics-on-aws/setting-up-automated-data-analytics-on-aws.html) by creating [data products](https://docs.aws.amazon.com/solutions/latest/automated-data-analytics-on-aws/creating-data-products.html) using data connectors. ADA’s data products allow users to connect to a wide range of data sources and query the datasets as if they were querying relational database tables. We then use ADA’s Query Workbench to join the separate datasets and query the correlated data to gain insights. We also demonstrate how ADA can be integrated with BI tools such as Tableau to visualise the data and build reports.

## Solution overview

In this section, we present the solution architecture for the demo and explain the workflow. For demonstration purposes, the bespoke application is simulated using an AWS Lambda function that emits logs in the [Apache Log Format](https://httpd.apache.org/docs/2.4/logs.html#accesslog) at a preset interval scheduled using Amazon EventBridge. This standard format can be produced by many different web servers and read by many log analysis programs. The application (AWS Lambda) logs are sent to a CloudWatch Log Group. The historical application logs are stored in an Amazon S3 bucket for reference and querying purposes. A lookup table with a list of [HTTP status codes](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status) and their descriptions is stored in an Amazon DynamoDB table. These three serve as the sources from which data is ingested into ADA for correlation, query, and analysis. We [deploy the ADA solution](https://docs.aws.amazon.com/solutions/latest/automated-data-analytics-on-aws/deploy-the-solution.html) into an AWS account and set it up, then create the data products within ADA for the Amazon CloudWatch Log Group, the Amazon S3 bucket, and the Amazon DynamoDB table. Once the data products are configured, ADA provisions the data pipelines to ingest the data from the sources into the ADA platform. Using ADA’s Query Workbench, users can query the ingested data using plain SQL for application troubleshooting or issue diagnosis.
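
As a minimal sketch of how the scheduled log generator described above is typically wired in AWS CDK, the snippet below defines a Lambda function and an EventBridge rule that invokes it every 2 minutes. This is illustrative only, not the exact code in this repository; the construct names and the inline handler are assumptions:

```
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as events from 'aws-cdk-lib/aws-events';
import * as targets from 'aws-cdk-lib/aws-events-targets';

// Illustrative stack: names are hypothetical, not the repository's actual constructs.
export class LogGenStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Lambda function that prints one Apache-style access-log line per invocation.
    const logGenFn = new lambda.Function(this, 'AdaLogGenLambdaFunction', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'index.handler',
      code: lambda.Code.fromInline(
        'exports.handler = async () => console.log(\'127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /api/orders HTTP/1.1" 200 2326\');'
      ),
    });

    // EventBridge rule that triggers the function at a 2-minute interval.
    new events.Rule(this, 'LogGenSchedule', {
      schedule: events.Schedule.rate(cdk.Duration.minutes(2)),
      targets: [new targets.LambdaFunction(logGenFn)],
    });
  }
}
```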

Refer to the diagram below to get an overview of the architecture and workflow of using ADA to gain insights into application logs.
![Demo Solution Architecture.](./image/SA.png "Demo Solution Architecture.")

The workflow includes the following steps:
1. An AWS Lambda function is scheduled to be triggered at a 2-minute interval using Amazon EventBridge.
1. The AWS Lambda function emits logs that are stored in a specified Amazon CloudWatch Log Group under /aws/lambda/CdkStack-AdaLogGenLambdaFunction. The application logs follow the Apache Log Format schema but are stored in the Amazon CloudWatch Log Group in JSON format (an illustrative record is shown after this list).
1. Data products for Amazon CloudWatch, Amazon S3, and Amazon DynamoDB are created in ADA. The Amazon CloudWatch data product connects to the Amazon CloudWatch Log Group where the application (AWS Lambda) logs are stored. The Amazon S3 connector connects to an Amazon S3 bucket folder where the historical logs are stored. The Amazon DynamoDB connector connects to an Amazon DynamoDB table where the status codes referenced by the application and historical logs are stored.
1. For each of the three data products, ADA deploys the data pipeline infrastructure to ingest data from the source. Once the data ingestion is complete, users can write SQL queries via ADA’s Query Workbench.
1. The user logs in to the ADA portal and composes SQL queries in the Query Workbench to gain insights into the application logs. The user can optionally save a query and share it with other ADA users in the same domain. ADA’s query feature is powered by Amazon Athena, a serverless, interactive analytics service that provides a simplified, flexible way to analyze petabytes of data.
1. Tableau is configured to access the ADA data products via ADA’s egress endpoints. The user creates a dashboard with two charts. The first chart is a heat map that shows the prevalence of HTTP error codes correlated with the application API endpoints. The second chart is a bar chart that shows the top 10 application APIs with a total count of HTTP error codes from the historical data.
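
For reference, a record in the Apache Log Format (see step 2; the example below is taken from the Apache documentation) carries the client IP, identity, user, timestamp, request line, HTTP status code, and response size in bytes. In CloudWatch, the same information is held as a JSON document, which is what the Amazon CloudWatch data product ingests.

```
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
```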

## Prerequisites

To perform this demo end to end as described in the [blog](link), the user needs the following prerequisites:

1. Install the [AWS Command Line Interface](https://aws.amazon.com/cli/), the AWS CDK [prerequisites](https://docs.aws.amazon.com/cdk/v2/guide/work-with.html), the TypeScript-specific [prerequisites](https://docs.aws.amazon.com/cdk/v2/guide/work-with-cdk-typescript.html), and [git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git).
1. [Deploy](https://docs.aws.amazon.com/solutions/latest/automated-data-analytics-on-aws/deploy-the-solution.html) the ADA solution in the user’s AWS account in the North Virginia (us-east-1) Region.
1. Provide an admin email address while launching the ADA CloudFormation stack; ADA needs it to send the root user password. An admin phone number is required to receive One-Time Password (OTP) messages if Multi-Factor Authentication (MFA) is enabled. For this demo, MFA is not enabled.
1. Build and deploy the sample application ([AWS CDK](https://github.com/aws-samples/operational-insights-with-automated-data-analytics-on-aws)) solution so that the following resources are provisioned in the user’s AWS account in the North Virginia (us-east-1) Region:
    1. An AWS Lambda function that simulates the logging application, and an Amazon EventBridge rule that invokes it at a 2-minute interval.
    1. An Amazon S3 bucket with the relevant bucket policies and a .csv file that contains the historical application logs.
    1. An Amazon DynamoDB table with the lookup data.
    1. Relevant IAM roles and permissions required for the services.
1. (Optional) Install [Tableau Desktop](https://www.tableau.com/products/desktop), a third-party business intelligence application. This demo uses Tableau Desktop version 2021.2. There is a cost involved in using a licensed version of the Tableau Desktop application; for additional details, refer to the Tableau licensing documentation.
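
A quick way to confirm the tooling from prerequisite 1 is available is to run the following version checks in a terminal; the exact version numbers reported will vary:

```
aws --version    # AWS Command Line Interface
node --version   # Node.js (required by the AWS CDK)
npm --version    # Node package manager
cdk --version    # AWS CDK Toolkit
git --version    # git
```
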
## Setting up the Sample Application Infrastructure using AWS CDK
The steps to clone the repo and set up the AWS CDK project are listed below. Before running the following commands, be sure to [configure](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html) your AWS credentials. Create a folder, open a terminal, and navigate to the folder where the AWS CDK solution should be installed.

```
gh repo clone vaijusson/ADALogInsights   # clone the project into a local folder
npm install                              # install the library dependencies
npm run build                            # compile TypeScript to JavaScript
npm run watch                            # (optional) watch for changes and recompile
cdk deploy                               # deploy this stack to your default AWS account/region
cdk diff                                 # compare the deployed stack with the current state
cdk synth                                # emit the synthesized CloudFormation template
```
These steps perform the following:
1. Install the library dependencies
1. Build the project
1. Generate a valid AWS CloudFormation template
1. Deploy the stack using AWS CloudFormation in the user’s AWS account.
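
If the target account and Region have not been used with the AWS CDK before, `cdk deploy` may first ask for the environment to be bootstrapped. This is standard, one-time CDK behaviour rather than anything specific to this demo; replace ACCOUNT-ID with your AWS account number:

```
cdk bootstrap aws://ACCOUNT-ID/us-east-1
```
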
The deployment takes about 1-2 minutes and creates the Amazon DynamoDB lookup table, the application Lambda function, and the Amazon S3 bucket containing the historical log files, which are listed as stack outputs.
![CDK Deployment.](./image/cdk_deploy.jpg "CDK Deployment.")
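
Once the three data products have finished ingesting, queries along the following lines can be composed in ADA’s Query Workbench. The domain, table, and column names below are placeholders for illustration only; the actual names depend on how the data products are created in ADA:

```
-- Top 10 application API endpoints by HTTP error count, with the error
-- description joined in from the DynamoDB lookup table (all names illustrative).
SELECT logs.endpoint,
       logs.status_code,
       lookup.description,
       COUNT(*) AS error_count
FROM my_domain.application_logs AS logs
JOIN my_domain.status_code_lookup AS lookup
  ON logs.status_code = lookup.status_code
WHERE logs.status_code >= 400
GROUP BY logs.endpoint, logs.status_code, lookup.description
ORDER BY error_count DESC
LIMIT 10;
```
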
## Tear Down
Tearing down the sample application infrastructure is a two-step process. First, to remove the infrastructure provisioned for the purposes of this demo, execute the following command in the Terminal.

```
cdk destroy
```

When prompted with the following question, enter ‘y’ and the CDK will delete the resources deployed for the demo.

```
Are you sure you want to delete: CdkStack (y/n)? y
```

Alternatively, the resources can be removed from the AWS Console by navigating to the CloudFormation service, selecting the ‘CdkStack’ stack, and choosing the ‘Delete’ option.
![CloudFormation Destroy.](./image/cf_destroy.jpg "CloudFormation Destroy.")
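
Note that AWS CloudFormation cannot delete an Amazon S3 bucket that still contains objects. Depending on how the bucket is defined in the stack, the deletion may fail on the bucket holding the historical logs; if it does, empty the bucket first (the bucket name below is a placeholder, substitute the name shown in the stack outputs) and retry the deletion:

```
aws s3 rm s3://<historical-logs-bucket-name> --recursive
```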

## Security