Query AWS CloudTrail Events
By Bhuvaneswari Subramani / Mar 14, 2023
This post explains how to store CloudTrail logs in AWS CloudTrail Lake and use SQL queries to analyze the events stored there.
AWS houses a long list of varied services. Once you create an AWS account, there are several ways to create, update, delete, or access AWS resources: the AWS Console, the AWS SDKs, and the AWS CLI.
Ultimately, each of these actions is either user activity or an API call. Monitoring who did what, where, and when is called account monitoring, and AWS CloudTrail was launched in 2013 for exactly that purpose. Since then, CloudTrail has been the single source of truth for tracking user activity and API usage.
AWS CloudTrail Lake followed in 2022. It lets you aggregate, immutably store, and query your activity logs, which simplifies auditing, security investigation, and operational troubleshooting.
In one product, CloudTrail Lake collects, stores, optimizes, and queries activity logs. As a result of combining these features into one environment, CloudTrail Lake eliminates the need for separate data pipelines across teams and products.
Recently, AWS CloudTrail Lake has also added support for integrating non-AWS event sources.
Irrespective of the data source, the success of a service depends on how the data is stored and how seamlessly it can be used or accessed. This blog post focuses on two important features: storing events in CloudTrail Lake and querying them.
So let's dive into the steps to store CloudTrail logs in CloudTrail Lake and run SQL queries on the events stored there.
Table of Contents
- Create CloudTrail
- Create CloudTrail Lake
- Create Query
- Run Query
- Validate Query
- Cleanup
- Learning Reference
- Conclusion
Create CloudTrail
- Sign in to the AWS Management Console and open the CloudTrail console at https://console.aws.amazon.com/cloudtrail/.
- Create a trail named cloudtrail-demo in your region of exploration, and store its logs in an S3 bucket named aws-cloudtrail-logs-demo-321321.
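If you prefer to script this step, here is a minimal sketch. It assumes the boto3 SDK and an S3 bucket that already exists with a bucket policy allowing CloudTrail to write to it (the console can set that up for you; the API call does not). The function name `create_demo_trail` is just an illustrative helper, not something from the console walkthrough.

```python
def create_demo_trail(client, trail_name="cloudtrail-demo",
                      bucket_name="aws-cloudtrail-logs-demo-321321"):
    """Create a trail and start logging.

    `client` is expected to be a boto3 CloudTrail client,
    e.g. boto3.client("cloudtrail") (assumption: boto3 is installed
    and credentials are configured).
    """
    # The bucket must already exist and permit CloudTrail deliveries.
    response = client.create_trail(Name=trail_name, S3BucketName=bucket_name)
    # A trail created through the API does not record events
    # until logging is started explicitly.
    client.start_logging(Name=trail_name)
    return response["TrailARN"]
```

Passing the client in as an argument keeps the helper easy to try out against a stub before pointing it at a real account.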
Create CloudTrail Lake
- Stay on the CloudTrail console, navigate to CloudTrail Lake, click Event data stores, and create an event data store.
- Configure event data store: create an event data store named cloudtrail-event-ds-demo.
- Choose events: make sure you select at least one event type (management or data events); otherwise you will see the following message when moving to the next screen:
Select at least one of management events or data events.
In this example, only management events are selected, and existing trail events are copied from the specified S3 bucket.
- Review and create
Once the event data store is created successfully, you will see a confirmation message.
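The console wizard above roughly corresponds to a single CreateEventDataStore API call. The sketch below assumes boto3; the 90-day retention period and the selector name are placeholder values I picked for illustration, not settings from this demo. It covers only the management-events selection, not the copying of existing trail events.

```python
def management_events_store_params(name="cloudtrail-event-ds-demo",
                                   retention_days=90):
    """Build keyword arguments for CloudTrail's CreateEventDataStore call."""
    return {
        "Name": name,
        "RetentionPeriod": retention_days,  # in days; assumed value
        "TerminationProtectionEnabled": True,
        "AdvancedEventSelectors": [
            {
                "Name": "Management events only",  # free-form label
                "FieldSelectors": [
                    # Collect management events, as selected in the wizard.
                    {"Field": "eventCategory", "Equals": ["Management"]},
                ],
            }
        ],
    }


def create_event_data_store(client, **overrides):
    """Create the store; `client` is assumed to be boto3.client("cloudtrail")."""
    params = management_events_store_params(**overrides)
    response = client.create_event_data_store(**params)
    return response["EventDataStoreArn"]
```

The last segment of the returned ARN is the event data store ID that you later use in the FROM clause of your queries.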
Create Query
Next, create a query to run against the event data store cloudtrail-event-ds-demo. Go back to the Event data stores screen and select Run query.
Queries in CloudTrail are authored in SQL. You can build a query on the CloudTrail Lake Editor tab by writing the query in SQL from scratch, or by opening a saved or sample query and editing it.
First, you may run one of the sample queries to get a feel for the query format and results; later, write and run your own query.
Sample Query:
Run the sample query that finds the number of API calls grouped by event name and event source within the past week.
The Query results tab for an active query displays rows of results based on a query. You can filter results by entering all or part of an event field value.
On the Command output tab, you can review metadata about the query run, such as the event data store ID, run time, number of results scanned, and the success or failure of the query. Query results saved to an Amazon S3 bucket will also have a link to the S3 bucket in the metadata.
Custom Query 1:
To list EC2-related events, including the eventTime, eventName, and source IP address, that occurred after a specified date and time (say, within the last 5 days):
SELECT
userIdentity.principalid, userIdentity.userName, eventName, eventTime, sourceIPAddress
FROM
event_data_store_ID
WHERE userIdentity.principalid IS NOT NULL
AND
eventTime > 'yyyy-mm-dd hh:mm:ss'
AND
eventSource='ec2.amazonaws.com'
[Note: replace event_data_store_ID with your event data store ID, and 'yyyy-mm-dd hh:mm:ss' with the start date and time.]
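Filling in the two placeholders by hand gets tedious if you rerun the query often. Here is a small sketch that builds the same query text for "the last N days". The helper name `build_ec2_events_query` is hypothetical, and it assumes the start time should be expressed in UTC, which is how CloudTrail records eventTime.

```python
import uuid
from datetime import datetime, timedelta, timezone

QUERY_TEMPLATE = """SELECT
    userIdentity.principalid, userIdentity.userName, eventName, eventTime, sourceIPAddress
FROM
    {event_data_store_id}
WHERE userIdentity.principalid IS NOT NULL
    AND eventTime > '{start_time}'
    AND eventSource = 'ec2.amazonaws.com'"""


def build_ec2_events_query(event_data_store_id, days=5, now=None):
    """Return Custom Query 1 with the store ID and start time filled in."""
    # Validate the ID before placing it into SQL text;
    # uuid.UUID raises ValueError for a malformed value.
    store_id = str(uuid.UUID(event_data_store_id))
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=days)
    return QUERY_TEMPLATE.format(
        event_data_store_id=store_id,
        start_time=start.strftime("%Y-%m-%d %H:%M:%S"),
    )
```

The optional `now` argument exists only so the output is reproducible when you test the helper.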
In Save query, enter a name and description for the query. Choose Save query to save your changes as the new query. To discard changes to a query, choose Cancel, or close the Save query window.
Try it yourself
Custom Query 2:
To list the events that terminated or stopped EC2 instances after a specified date and time (say, within the last 5 days):
SELECT
userIdentity.principalid, userIdentity.userName, eventName, eventTime, sourceIPAddress
FROM
d56bf0c1-fee5-4667-986d-b0d9e6048e4b
WHERE
userIdentity.principalid IS NOT NULL
AND
eventTime > '2023-02-01 00:00:00'
AND
(eventName='TerminateInstances' OR eventName='StopInstances')
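A WHERE clause that mixes AND with OR alternatives, as Custom Query 2 does, depends on operator precedence: SQL evaluates AND before OR, so the alternatives must be grouped (with parentheses or IN) for the date filter to apply to both event names. The sketch below shows the difference using Python's built-in sqlite3 module and a tiny made-up table. It is not CloudTrail Lake, but the precedence rule is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (eventName TEXT, eventTime TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("TerminateInstances", "2023-02-10 08:00:00"),
        ("StopInstances", "2023-01-05 09:30:00"),  # older than the cutoff
        ("StopInstances", "2023-02-12 17:45:00"),
    ],
)

# Without grouping: parsed as (date AND Terminate) OR Stop,
# so every StopInstances row matches regardless of its date.
ungrouped = conn.execute(
    "SELECT COUNT(*) FROM events "
    "WHERE eventTime > '2023-02-01 00:00:00' "
    "AND eventName = 'TerminateInstances' OR eventName = 'StopInstances'"
).fetchone()[0]

# With grouping: the date filter applies to both event names.
grouped = conn.execute(
    "SELECT COUNT(*) FROM events "
    "WHERE eventTime > '2023-02-01 00:00:00' "
    "AND eventName IN ('TerminateInstances', 'StopInstances')"
).fetchone()[0]

print(ungrouped, grouped)  # prints: 3 2
```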
Run Query
Follow these steps to run a query in CloudTrail Lake:
- Sign in to the AWS Management Console and open the CloudTrail console at https://console.aws.amazon.com/cloudtrail/.
- From the navigation pane, open the Lake submenu, then choose Query.
- On the Saved queries or Sample queries tabs, choose a query to run by choosing the value in the Query SQL column.
- On the Editor tab, for Event data store, choose an event data store from the drop-down list.
- (Optional) On the Editor tab, choose Save results to S3 to save the query results to an S3 bucket.
Query results can be saved in an S3 bucket.
Key points to remember while saving query results to S3:
- CloudTrail delivers the query results to the S3 bucket in compressed gzip format.
- On average, after the query scan completes, you can expect a latency of 6 minutes for every GB of data delivered to the S3 bucket.
- Queries that run for longer than one hour might time out. Partial results are not saved to S3, so fine-tune your query to limit the data scanned and complete within an hour.
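The console steps above can also be sketched with the API: start the query, poll until it reaches a terminal state, then page through the results. This assumes a boto3 CloudTrail client; the helper name `run_lake_query` and the 5-second polling interval are my own choices, and the S3 URI you pass is whatever bucket you set up for results.

```python
import time

# Query states after which polling should stop.
TERMINAL_STATES = {"FINISHED", "FAILED", "CANCELLED", "TIMED_OUT"}


def run_lake_query(client, sql, s3_uri=None, poll_seconds=5, sleep=time.sleep):
    """Start a CloudTrail Lake query, wait for it, and return (status, rows).

    `client` is assumed to be boto3.client("cloudtrail").
    """
    kwargs = {"QueryStatement": sql}
    if s3_uri:
        # Optional: also deliver the results to S3.
        kwargs["DeliveryS3Uri"] = s3_uri
    query_id = client.start_query(**kwargs)["QueryId"]

    status = client.describe_query(QueryId=query_id)["QueryStatus"]
    while status not in TERMINAL_STATES:
        sleep(poll_seconds)
        status = client.describe_query(QueryId=query_id)["QueryStatus"]

    rows = []
    if status == "FINISHED":
        token = None
        while True:
            page_args = {"QueryId": query_id}
            if token:
                page_args["NextToken"] = token
            page = client.get_query_results(**page_args)
            for row in page.get("QueryResultRows", []):
                # Each row arrives as a list of single-column dicts;
                # merge them into one dict per row.
                merged = {}
                for cell in row:
                    merged.update(cell)
                rows.append(merged)
            token = page.get("NextToken")
            if not token:
                break
    return status, rows
```

If the status comes back as TIMED_OUT, that is the cue to narrow the time range or add filters before retrying.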
Validate Query
If you want to determine whether the query results have been modified, deleted, or unchanged after CloudTrail delivered them, you can use CloudTrail query results integrity validation.
Cleanup
- Sign in to the AWS Management Console and open the CloudTrail console at https://console.aws.amazon.com/cloudtrail/
- From the navigation pane, open the Trails submenu, then choose the trail and click the Delete button.
- You will receive a confirmation prompt.
- Click Delete to delete the trail, then proceed to delete the S3 bucket as detailed below.
Delete CloudTrail S3 bucket:
- Go to the Amazon S3 console, select the radio button next to the aws-cloudtrail-logs-demo-321321 bucket, and click the Delete button.
- You might see an error if the bucket still contains CloudTrail events, because a bucket must be empty before it can be deleted.
- Permanently delete all objects, then delete the bucket.
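The empty-then-delete sequence can be scripted as well. This is a minimal sketch that assumes the boto3 S3 resource interface; `empty_and_delete_bucket` is a hypothetical helper name. Be careful with it, since the deletion is permanent.

```python
def empty_and_delete_bucket(s3_resource, bucket_name):
    """Permanently remove every object in the bucket, then the bucket itself.

    `s3_resource` is assumed to be boto3.resource("s3").
    """
    bucket = s3_resource.Bucket(bucket_name)
    # Deletes all object versions and delete markers in batches,
    # which leaves the bucket empty.
    bucket.object_versions.delete()
    # Only an empty bucket can be deleted.
    bucket.delete()
```

For this demo the call would be `empty_and_delete_bucket(boto3.resource("s3"), "aws-cloudtrail-logs-demo-321321")`.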
Delete event data store:
- Click the Event data stores tab in the Lake console.
- Select the event data store from the list (cloudtrail-event-ds-demo).
- From the Actions menu, select “Change termination protection”.
- In the Change termination protection pop-up, select Disabled and click “Save”.
- From the Actions menu, select Delete, and confirm that you want to delete it by entering the name of the data store. Then click “Delete”. This places your event data store in the pending deletion state.
- This disables the data store, and it will be deleted permanently after seven days.
- If you deleted it by mistake, you can restore it from the Actions menu during this time. (I was curious whether an event data store in the pending deletion state could be restored, and it can.)
- Additionally, delete the S3 bucket if one was created to store the query results in this demo. Example: aws-cloudtrail-lake-query-results--
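The event data store steps above can be sketched with the API too: turn off termination protection, delete the store, and restore it if you change your mind while it is pending deletion. This assumes a boto3 CloudTrail client; both helper names are illustrative.

```python
def delete_event_data_store(client, event_data_store_arn):
    """Disable termination protection, then mark the store for deletion.

    `client` is assumed to be boto3.client("cloudtrail").
    """
    # Deletion is rejected while termination protection is enabled.
    client.update_event_data_store(
        EventDataStore=event_data_store_arn,
        TerminationProtectionEnabled=False,
    )
    # The store moves to the pending deletion state.
    client.delete_event_data_store(EventDataStore=event_data_store_arn)


def restore_event_data_store(client, event_data_store_arn):
    """Restore a store that is still pending deletion; returns its new status."""
    response = client.restore_event_data_store(EventDataStore=event_data_store_arn)
    return response["Status"]
```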
Learning Reference
- AWS CloudTrail user guide
- AWS CloudTrail pricing