HOME/Articles/Recording slow queries and analyzing them using pt-query-digest

Recording slow queries and analyzing them using pt-query-digest

2023/1/19 10:552024/9/20 9:05

This is Oniumi (@fat47) from the Service Reliability Group (SRG) of the Technology Headquarters.

#SRG(Service Reliability Group) mainly provides cross-sectional support for the infrastructure of our media services, improving existing services, launching new ones, contributing to OSS, etc.

This article is located in SRGDBWGWe will publish materials regarding the database that the (DB Working Group) provides to the entire company.

I hope this helps in some way.

What is a slow query?Recommended settings for the slow query log Enable log output Rotation setting with logrotate How to analyze the slow query log Get statistics on slow queries Use pt-query-digest options to investigate Analyzing Queries with EXPLAIN Conclusion

What is a slow query?

Anything that exceeds the long_query_time value specified in the MySQL settings will be logged.

Generally, slow queries are caused by poorly indexed or queries that call up a large number of rows.

These queries are often expensive and put a strain on the database.

Recommended settings for the slow query log

💡

This document describes the configuration procedure when you are running MySQL on-premises or on EC2.

Enable log output

Set in my.cnf

Or log in to MySQL and run the following:

Rotation setting with logrotate

With this setting, slow queries will continue to be written to the same mysql-slow.log file.

Since this will make analysis difficult, we will rotate the data by date.

Running the tests

Once the settings are complete, the logs will be accumulated as shown below.

How to analyze the slow query log

If you check the contents of the recorded slow query log, you will see that queries that exceed the threshold are recorded in the following format.

In actual logs, there can be hundreds or even thousands of these items, so it is impossible to check them all.

Get statistics on slow queries

Let's use pt-query-digest from the Percona Toolkit to gather statistics on slow logs.

Download Percona Toolkit

Download the latest version of Percona Toolkit to access advanced command-line tools and perform complex MySQL and system tasks.

https://www.percona.com/downloads/percona-toolkit/LATEST/

Installation example for CentOS7

Try running it without any options:

Let's check the output

At the top, log statistics such as Exec Time and Lock Time are summarized.

The next section shows a ranking of the slowest response times and the highest number of calls.

Response time This shows the total execution time of that query and the percentage of the total slow log that it takes up.

Calls is the number of times the query was executed

R/Call Response Time / Call is the average response time per execution

V/M is the ratio of the response time variance to the mean (?)

Item displays a summary of the query

The next section records aggregate information for each query.

Query_time distribution A graph of average query execution time is displayed.

Tables This is a query that allows you to check the table information related to this query.

EXPLAIN Here is a sample of a query that can be checked with EXPLAIN:

In this log example, we can see that Query1 is an extremely slow query, with most response times ranging from 100 ms to 10 seconds.

Use the pt-query-digest options to investigate

As mentioned before, you can perform analysis with the default settings,

The following options can be used to make analysis more convenient.

pt-query-digest - Percona Toolkit Documentation

pt-query-digest - Analyze MySQL queries from logs, processlist, and tcpdump. pt-query-digest analyzes MySQL queries from slow, general, and binary log files. It can also analyze queries from and MySQL protocol data from tcpdump. By default, queries are grouped by fingerprint and reported in descending order of query time (i.e.

https://docs.percona.com/percona-toolkit/pt-query-digest.html

We will introduce some excerpts.

I want to search the logs by specifying a date or time range.

I want to sort by total query execution time

I want to sort by the number of lines read

I want to display tables with the most slow queries

Analyzing Queries with EXPLAIN

After analyzing the slow query log and identifying queries that can be improved,

Now it's time to analyze the query itself.

We plan to create a separate article about this, so please wait a while.

Conclusion

SRG is looking for people to work with us. If you are interested, please contact us here.

Recruitment information - CyberAgent SRG #ca_srg

About SRG SRG (Service Reliability Group) is working to improve reliability by promoting the introduction of SREs to the media business as a cross-sectional SRE under the vision of "improving reliability across the media business." The work is centered around the following three pillars: Consolidating and deploying the technical know-how of each business

https://ca-srg.dev/0d353a3a68ba4fb0a2e3443d7009a22b