Skip to content

CloudWatch Logs Insights Example Queries

CloudWatch Logs Insights provides a powerful platform for analyzing and querying CloudWatch log data. It allows you interactively search through your log data using a SQL like query language with a few simple but powerful commands.

CloudWatch Logs insights provides out of the box example queries for the following categories:

  • Lambda
  • VPC Flow Logs
  • CloudTrail
  • Common Queries
  • Route 53
  • AWS AppSync
  • NAT Gateway

In this section of the best practices guide we provide some example queries for other types of logs that are not currently included in the out of the box examples. This list will evolve and change over time and you can submit your own examples for review by leaving an issue on the git hub.

API Gateway

Last 20 Messages containing an HTTP Method Type

filter @message like /$METHOD/ 
| fields @timestamp, @message
| sort @timestamp desc
| limit 20

This query will return the last 20 log messages containing a specific HTTP method sorted in descending timestamp order. Substitute METHOD for the method you are querying for. Here is an example of how to use this query:

filter @message like /POST/ 
| fields @timestamp, @message
| sort @timestamp desc
| limit 20

Tip

You can change the $limit value in order to return a different amount of messages.

Top 20 Talkers Sorted by IP

fields @timestamp, @message
| stats count() by ip
| sort ip asc
| limit 20

This query will return the top 20 talkers sorted by IP. This can be useful for detecting malicious activity against your API.

As a next step you could then add an additional filter for method type. For example, this query would show the top talkers by IP but only the "PUT" method call:

fields @timestamp, @message
| filter @message like /PUT/
| stats count() by ip
| sort ip asc
| limit 20

CloudTrail Logs

API throttling errors grouped by error category

stats count(errorCode) as eventCount by eventSource, eventName, awsRegion, userAgent, errorCode
| filter errorCode = 'ThrottlingException' 
| sort eventCount desc

This query allows you to see API throttling errors grouped by category and displayed in descending order.

Tip

In order to use this query you would first need to ensure you are sending CloudTrail logs to CloudWatch.

Root account activity in line graph

fields @timestamp, @message, userIdentity.type 
| filter userIdentity.type='Root' 
| stats count() as RootActivity by bin(5m)

With this query you can visualize root account activity in a line graph. This query aggregates the root activity over time, counting the occurrences of root activity within each 5-minute interval.

VPC Flow Logs

Filtering flow logs for selected source IP address with action as REJECT.

fields @timestamp, @message, @logStream, @log  | filter srcAddr like '$SOURCEIP' and action = 'REJECT'
| sort @timestamp desc
| limit 20

This query will return the last 20 log messages containing a 'REJECT' from the $SOURCEIP. This can be used to detect if traffic is explicitly rejected, or if the issue is some type of client side network configuration problem.

Tip

Ensure that you substitute the value of the IP address you are interested in for '$SOURCEIP'

fields @timestamp, @message, @logStream, @log  | filter srcAddr like '10.0.0.5' and action = 'REJECT'
| sort @timestamp desc
| limit 20

Grouping network traffic by Availability Zones

stats sum(bytes / 1048576) as Traffic_MB by azId as AZ_ID 
| sort Traffic_MB desc

This query retrieves network traffic data grouped by Availability Zone (AZ). It calculates the total traffic in megabytes (MB) by summing the bytes and converting them to MB. The results are then sorted in descending order based on the traffic volume in each AZ.

Grouping network traffic by flow direction

stats sum(bytes / 1048576) as Traffic_MB by flowDirection as Flow_Direction 
| sort by Bytes_MB desc

This query is designed to analyze network traffic grouped by flow direction. (Ingress or Egress)

Top 10 data transfers by source and destination IP addresses

stats sum(bytes / 1048576) as Data_Transferred_MB by srcAddr as Source_IP, dstAddr as Destination_IP 
| sort Data_Transferred_MB desc 
| limit 10

This query retrieves the top 10 data transfers by source and destination IP addresses. This query allows for identifying the most significant data transfers between specific source and destination IP addresses.

Amazon SNS Logs

Count of SMS message failures by reasons

filter status = "FAILURE"
| stats count(*) by delivery.providerResponse as FailureReason
| sort delivery.providerResponse desc

The query above lists the count of Delivery failures sorted by reason in descending order. This query can be used to find the reasons for delivery failure.

SMS message failures due to Invalid Phone Number

fields notification.messageId as MessageId, delivery.destination as PhoneNumber
| filter status = "FAILURE" and delivery.providerResponse = "Invalid phone number"
| limit 100

This query returns the message that fails to deliver due to Invalid Phone Number. This can be used to identify phone numbers that need to be corrected.

Message failure statistics by SMS Type

fields delivery.smsType
| filter status = "FAILURE"
| stats count(notification.messageId), avg(delivery.dwellTimeMs), sum(delivery.priceInUSD) by delivery.smsType

This query returns the count, average dwell time and spend for each SMS type (Transactional or Promotional). This query can be used to establish thresholds to trigger corrective actions. The query can be modified to filter only certain SMS Type, if only that SMS Type warrants corrective action.

SNS failure notifications statistics

fields @MessageID 
| filter status = "FAILURE"
| stats count(delivery.deliveryId) as FailedDeliveryCount, avg(delivery.dwellTimeMs) as AvgDwellTime, max(delivery.dwellTimeMs) as MaxDwellTime by notification.messageId as MessageID
| limit 100

This query returns the count, average dwell time and spend for each failed Message. This query can be used to establish thresholds to trigger corrective actions.