Recording slow queries and analyzing them using pt-query-digest

This is Onikai (@fat47) from the Service Reliability Group (SRG) of the Technology Division.
The Service Reliability Group (SRG) primarily provides comprehensive support for the infrastructure surrounding our media services, focusing on improving existing services, launching new ones, and contributing to open-source software (OSS).
This article is part of a series in which the SRG DB Working Group (DBWG) publishes database-related materials that it provides to the entire company.
I hope this is of some help.
 

What is a slow query?


Queries whose execution time exceeds the `long_query_time` value specified in the MySQL settings are written to the slow query log.
Generally, slow queries are ones that lack appropriate indexes or that read a large number of rows.
Because such queries are expensive, they place a heavy load on the database.
 

Recommended settings for slow query logs


💡
This document outlines the configuration steps for running MySQL on-premises or on EC2.
 

Enable log output

Set in my.cnf
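A minimal sketch of the relevant `my.cnf` entries (the log path and the 1-second threshold are example values, not prescriptions):

```ini
[mysqld]
# Enable the slow query log
slow_query_log = 1
# Where to write it (example path)
slow_query_log_file = /var/log/mysql/mysql-slow.log
# Log queries that take longer than 1 second (example threshold)
long_query_time = 1
```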
 
Alternatively, log in to MySQL and then execute the following:
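For example, with the same illustrative values (note that `SET GLOBAL` changes do not survive a server restart, so mirror them in `my.cnf` for persistence):

```sql
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL slow_query_log_file = '/var/log/mysql/mysql-slow.log';
SET GLOBAL long_query_time = 1;
```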
 

Set up rotation with logrotate

With the current settings, slow queries will keep being written to the same mysql-slow.log file.
To keep the log easy to analyze, we rotate it by date.
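A sketch of a logrotate configuration (the path, retention count, and credential handling are assumptions; `FLUSH SLOW LOGS` makes mysqld close and reopen the log file after rotation):

```shell
# /etc/logrotate.d/mysql-slow (example)
/var/log/mysql/mysql-slow.log {
    daily
    rotate 14
    missingok
    compress
    delaycompress
    dateext
    postrotate
        # Assumes credentials are available non-interactively, e.g. via ~/.my.cnf
        /usr/bin/mysql -e 'FLUSH SLOW LOGS;'
    endscript
}
```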
 
Run the test
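One simple check is to run a query that deliberately exceeds the threshold (assuming a 1-second `long_query_time`; the log path is illustrative):

```shell
# SLEEP(2) takes about 2 s, so it should exceed a 1-second threshold
mysql -e "SELECT SLEEP(2);"
# Confirm the query was recorded
sudo tail /var/log/mysql/mysql-slow.log
```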
 
Once the setup is complete, log files will accumulate, rotated by date.
 

How to analyze slow query logs


When you examine the contents of the recorded slow query log, you'll see that queries exceeding the threshold are recorded in this format.
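An entry looks roughly like the following (the timestamps, counts, and the query itself are made-up illustrative values):

```
# Time: 2024-01-15T10:23:45.678901Z
# User@Host: app[app] @ localhost []  Id:   123
# Query_time: 3.482160  Lock_time: 0.000120 Rows_sent: 1  Rows_examined: 4823190
SET timestamp=1705314225;
SELECT * FROM orders WHERE customer_id = 42;
```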
In actual logs, these entries can appear in the hundreds or even thousands, making it impossible to review them all.
 

Let's collect statistics on slow queries.

Let's use pt-query-digest, included in the Percona Toolkit, to collect statistics on slow logs.
 
Installation example on CentOS 7
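On CentOS 7, Percona Toolkit can be installed from the Percona yum repository (the repository RPM URL below is the one Percona documents; verify the current one before use):

```shell
sudo yum install -y https://repo.percona.com/yum/percona-release-latest.noarch.rpm
sudo yum install -y percona-toolkit
```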
 
Let's try running it without any options.
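With no options, you simply point it at the slow log file (the path is illustrative):

```shell
pt-query-digest /var/log/mysql/mysql-slow.log
```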
 
Let's check the output.
At the top, log statistics such as Exec Time and Lock Time are summarized.
The next section shows a ranking of the queries with the longest response times and the highest call counts.
  • Response time: the total execution time of the query and its percentage of the whole slow log.
  • Calls: the number of times the query was executed.
  • R/Call: the average response time per execution (Response time / Calls).
  • V/M: the variance-to-mean ratio of the response time; larger values mean more variable execution times.
  • Item: a summary (fingerprint) of the query.
 
The following section records aggregated information for each query.
  • Query_time distribution: a histogram showing the distribution of the query's execution times.
  • Tables: statements for inspecting the tables referenced by the query.
  • EXPLAIN: a sample query that you can actually run through EXPLAIN.
In this log example, we can see that Query1 is an extremely slow query, with most responses taking between 100ms and 10 seconds.
 

Use the options of pt-query-digest to conduct the investigation.

As mentioned earlier, analysis is possible even with the default settings,
but the following options let you analyze more efficiently.
Here are some excerpts.
 
I want to search logs by specifying a date and time range.
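`--since` and `--until` restrict the analysis to a time range, for example:

```shell
pt-query-digest --since '2024-01-15 00:00:00' --until '2024-01-16 00:00:00' mysql-slow.log
```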
 
I want to sort the queries by total execution time.
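`--order-by` takes an attribute:aggregate pair; `Query_time:sum` (which is also the default) ranks queries by total execution time:

```shell
pt-query-digest --order-by Query_time:sum mysql-slow.log
```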
 
I want to sort by the number of lines read, from most to least.
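Sorting on the total number of rows examined uses the same option with a different attribute:

```shell
pt-query-digest --order-by Rows_examined:sum mysql-slow.log
```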
 
I want to display the tables in order of the number of slow queries.
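`--group-by tables` aggregates by table instead of by query fingerprint, so you can see which tables attract the most slow queries:

```shell
pt-query-digest --group-by tables mysql-slow.log
```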
 

Analyze the query with EXPLAIN.


Once you've analyzed the slow query logs and identified queries that look like they could be improved,
the next step is to analyze the queries themselves.
We plan to cover this in a separate article, so please stay tuned.
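As a quick preview, EXPLAIN shows a query's execution plan (the table and query here are hypothetical):

```sql
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- Check the type, key, and rows columns to see whether an index is used
-- and roughly how many rows MySQL estimates it will examine.
```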
 

In conclusion


SRG is looking for new team members. If you are interested, please contact us here.