Partitioning CloudTrail Logs in Athena

Alex Smolen
4 min read · Jan 15, 2018

CloudTrail logs provide information about AWS API calls and are useful in a variety of scenarios.

While the information they contain is undoubtedly useful, interacting with CloudTrail logs can be difficult.

CloudTrail logs are delivered to S3 as JSON by default, so you could download the files and parse them locally with jq for exploration, or write a script for more complex tasks. While handy in a pinch, this approach takes time and bandwidth when the log files are large. There’s no set way to distribute the analysis results, and it’s painful to write out the commands to get at what you’re looking for in a particular use case.
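To give a sense of what the "write a script" route looks like, here is a minimal Python sketch that counts events by name in a single downloaded log file. It assumes the file is in the standard delivered form: gzip-compressed JSON with a top-level "Records" list whose entries carry an "eventName" field. The function name count_event_names is made up for illustration.

```python
import gzip
import json
import sys
from collections import Counter


def count_event_names(path):
    """Count events by eventName in one downloaded CloudTrail log file."""
    # Assumption: the file is gzip-compressed JSON with a top-level
    # "Records" list, as CloudTrail delivers it to S3.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        records = json.load(f).get("Records", [])
    # Records without an eventName are grouped under "unknown".
    return Counter(r.get("eventName", "unknown") for r in records)


if __name__ == "__main__":
    # Usage: python count_events.py <file.json.gz> [<file.json.gz> ...]
    totals = Counter()
    for log_path in sys.argv[1:]:
        totals.update(count_event_names(log_path))
    for name, count in totals.most_common():
        print(f"{count}\t{name}")
```

Even this small script hints at the problem: every new question means another one-off script, and every file has to be downloaded before it can be read.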

Alternatively, if you forward the CloudTrail logs to CloudWatch Logs, you could search via the CloudWatch Logs interface. I find the CloudWatch Logs query syntax to be limited, and the results, again, aren’t easy to forward on for other processing.

You could put the CloudTrail logs in CloudSearch, but this requires creating a new, non-serverless AWS resource, with the associated management overhead and costs. You could forward CloudTrail logs to other search services, like Splunk, but what if you don’t have that infrastructure at your fingertips?
