muehlwitz_queries_13822737

Despite the creation of normalized table structures and the setting of suitable indices, MySQL Queries can be generated which take a few seconds to execute. When used on the Web, such response times are not acceptable and can usually be optimized by simply changing the queries-provided that one recognizes at all that Slow Queries are present in the System.

The speed of a database depends on many factors. The work begins here with the design of the database, as both structure and indices contribute significantly to the speed of Queries. But even if you have done everything right, it can happen with the presence of large amounts of data – for example with complex Joins, but also with the use of Foreign Keys (InnoDB) – that a Query devours a computing power of a few seconds. There are no fixed rules when you fall into such a situation, but the optimization possibilities in MySQL are almost infinite in this respect.

Instead of tackling the evil at the root, Caching is usually used. Although this serves its purpose, it only works as long as you are not dependent on real-time data. For example, the user would not be very pleased if Facebook only updated its Newsfeed once per hour.

Aggravating in the case of speed losses of this kind is the fact that they are usually only noticed months after commissioning. This is precisely when the number of Queries becomes too large due to the number of users, or the amount of data has increased exponentially. The error will not be detected in a local development environment with limited data sets, it may not even help to mirror the data. Instead, an accurate monitoring of the server must take place over a longer period of time. MySQL provides the necessary tools.

What are Slow Queries and how do they arise?

The Definition of what a Slow Query is, is up to you. In the case of MySQL, the maximum execution time that must be reached for a Query to be considered a Slow Query is adjustable in the configuration file. This also makes perfect sense, since different execution times can be accepted depending on the application. In the Frontend of a web application, for example, it is only about pure speed optimization, whereas in the Backend you can accept longer running times for statistical evaluations.

Accordingly, a Slow Query is a Query whose execution time is above a limit defined by the Administrator. the default setting of MySQL, the limit for a Slow Query is 10 seconds. A value of which one might actually think that it is difficult to achieve, but which certainly occurs.

The reasons for the occurrence of Slow Queries are different and among other things also dependent on the set limit for the Definition of such. Of course, you can set these so small that you get a variety of Slow Queries delivered, but they cannot be optimized. For this reason, the maximum runtime of a Query must be selected agile and the application accordingly. As a rule, a Slow Query occurs because the corresponding request to MySQL is clumsy, or the database has not been optimized for this request. In the simplest case, a Slow Query can be fixed by introducing indices or a more skilful table structure; in the most complicated case, MySQL must be helped in selecting data with PHP. There are many more solutions between these two, which are based on the fact that the Query itself is surrounded. In most cases, the problems arise with complex JOINs or with the use of Group by, Foreign Keys and triggers.

How to find Slow Queries?

Slow Queries are published in the my.cnf server switched on for all databases. In principle, the construct works as follows: if the Variable log_slow_queries is set to on, all Queries that last longer than long_query_time seconds are written to the file slow_query_log_file. All these values can either be passed to MySQL as a Parameter at startup or can be specified in the configuration file.

In addition, there are other ways to find non-optimized Queries, namely all those that do not use indices for their queries.

For this, the value log_queries_not_using_indexes must also be set to on. Furthermore, only Queries that return large amounts of data can be observed. The Variable min_examined_row_limit must be set accordingly.

The resulting log file is neither legible nor meaningful. This was probably also thought in MySQL and for this reason you should use the command mysqldumpslow to output the log. This outputs the resulting log entries in a meaningful and, if desired, also filtered Format. An overview of the necessary parameters can be obtained via-help. The respective log file can be specified as a Parameter (Listing 1).

Listing 1: example of a log output

shell> mysqldumpslow
Reading mysql slow query log from /usr/local/mysql/data/mysqld51-apple-slow.log
Count: 1  Time=4.32s (4s)  Lock=0.00s (0s)  Rows=0.0 (0), root[root]@localhost
insert into t2 select * from t1
Count: 3  Time=2.53s (7s)  Lock=0.00s (0s)  Rows=0.0 (0), root[root]@localhost
insert into t2 select * from t1 limit N
Count: 3  Time=2.13s (6s)  Lock=0.00s (0s)  Rows=0.0 (0), root[root]@localhost
insert into t1 select * from t1

 

As you can easily see from the example, you can now rejoice once, if each user uses only a single database. Otherwise, one quickly faces the Problem that one cannot locate in which DB a Query has been executed. However, table names usually help here. As you can see, the Log provides all the necessary information:

How often was the Query executed?
What execution time was required?
Is the Query so bad that it has blocked requests (e.g. complex INSERTs)?
How many lines were returned as a result?
Which user executed the Query?
And last but not least: the Query itself.

By the way, after the configuration you can perform a small Test. To make sure you did everything right, you can simply run the Query SELECT SLEEP(30);. It should definitely appear in the Slow Query Log, but will also take 30 seconds to execute.

It would still be worth mentioning at this point that you do not turn on the Log, restart the Server and get a meaningful result after 30 seconds. If you decide to optimize the Slow Query Log, This is a lengthy process, which is characterized by the fact that you have to give the Server enough time to locate the Slow Queries, collect them and enter them into the Log. Because as you can see in the example, many small reasonably reasonable Queries can also become a Problem.

How to optimize Slow Queries?

Once you have located Slow Queries, you quickly face the Problem that you don’t know how to do it better. To make matters worse, MySQL in particular offers several options to combat Slow Queries. The use of various table formats (InnoDB) also slows down the Server per se. The basic configuration is not optimized for Performance.

The first and most common way to accelerate Slow Queries in practice is to set indexes skillfully. If you take over a database from a third party, it often happens that bad or even no indices have been added to a Table. Broken or fragmented Tables (keyword: repair table) also contribute to slow Queries. If you analyze the Select, you will quickly find out which columns (or column combinations) could be queried faster provided with an Index. This is the simplest type of optimization that often brings the greatest benefit. However, one should assume with a professional programmer that this case should not actually occur.

The second option is to query only the data you really need. If you assume that in a table with 1 million entries you only need the Top 3 for a representation, you do not have to return the entire Million sorted anyway. If you limit the Query with a Limit, you drastically reduce the memory consumption.

This was almost the topic of simple optimizations. The rest of the optimizations that can be performed are usually based on trying out, and most of the time you can’t figure out why a modified Query is executed faster than the “old”one. In practice, InnoDB and its foreign Keys in connection with JOINs in particular have proven to be a data miracle. A Join of several tables via Foreign Keys or a GROUP BY can easily develop into a Slow Query, but if you combine the tables in a different order, this can lead to a speed gain. If you use the options on DELETE or on CASCADE or even TRIGGER, even a simple database query can trigger a rat tail of operations in MySQL, which is no longer manageable. I do not generally advise against using these techniques, but as a rule they do not make sense in the default configuration of the server. One should always ask oneself at this point whether one uses these functions for convenience, or whether it is otherwise not better and faster.

A very unique logic with regard to Slow Queries can be found when one goes into the field of Strings and mathematics. Here, one quickly faces an architectural Problem, namely the question of where the program logic belongs: in the programming language that uses the database server, or the database server itself? Of course, in MySQL I can combine three columns into a single one, calculate the mean on the occasion directly and reformat all stored date formats, but is this useful? And does that even belong in the database server?

A question that everyone has to answer for themselves, but at this point also has incredible optimization potential. It’s not uncommon to dramatically speed up an application by exchanging this logic between a database server and a programming language – and funnily enough, it works in both directions from time to time.

As you can see, there is no universal solution for optimizing Queries and in the end only helps to try. Nevertheless, you should always remember that a database query that works does not have to be the one that works quickly. As a rule, you don’t have to optimize, because we are talking about Microoptimizations, but there may be Queries that take several seconds to execute, but which could be drastically accelerated.

Tools for dealing with Slow Queries

Since we can now find and analyze Slow Queries, one can be glad that MySQL provides these functions on the one hand, on the other hand one will very quickly reach the limits of the system. Two basic problems arise here: the clarity of the log file and the question of how to optimize the Query in order to speed it up. One fact that aggravates the situation in particular is the customer who wants to have his application faster. As a rule, it is not possible for you to calculate two days to optimize MySQL – for something that you could have done better from the beginning. Or in other words: the customer has the impression pretty quickly that he has to pay something he does not need to pay: Bugfixing.

Here you go into an argumentative vacuum, because you admittedly really do Bugfixing, and this usually at your own expense. Therefore, you should avoid Slow Queries from the beginning, or you can use the appropriate Tools to accelerate this optimization process.

Since this is not necessarily easy, one leaves at this point quite quickly the range of the Open Source. Here you have to weigh whether it is not nevertheless useful to pay 500 euros once, or whether it is better to sit two days unpaid in front of the computer.

The basic equipment of a MySQL server nowadays definitely includes the Percona Toolkit. Percona is a company that offers services around MySQL and operates the excellent MySQL performance blog. Among other things, Percona specializes in the performance optimization of MySQL and offers some of the Tools developed in this context as Open Source – including the Toolkit. It extends MySQL with some command line tools, which among other things also deal with the analysis of the Slow Query Log. The mk query digest, which used to be often used for these purposes, is now part of the Percona Toolkit and is no longer developed separately.

The presence of appropriate command line tools makes the situation more comfortable, but not necessarily better, because a graphical preparation of the situation usually helps better and faster. At this point, if you want to work sensibly, you leave the field of Open Source. By the way, this concerns the entire area of graphical user interfaces for MySQL. The tools offered by MySQL do not meet my requirements, on Windows I have used HeidiSQL for years. In the meantime, however, it looks like I would rather pay money once for a reasonable Tool than get upset about missing functions. For this reason I only use Navicat since a change to Mac.

In the area of MySQL Profiling (which, in contrast to mysql slow queries, then also represents a suitable search on Google), there are specialized user interfaces that allow both the display and the analysis of Slow Queries and directly provide corresponding optimization tips (fig. 1).

muehlwitz_queries_13822737

If you initially stay in the area of Open Source, you will very quickly find the SQL Profiler, which is sufficient for most use cases. Through a Remote connection, the Tool, which is available for Windows, Mac and Linux, connects to one or more MySQL servers. The actual task of the tool is the general monitoring of the servers, so it analyzes CPU and memory consumption. In addition, it can also collect and display Slow Queries. By its own admission, it is more effective and precise than Jet Profiler. The latter collects much of its information via the SHOW STATUS command, whereas the SQL Profiler works as a real Sniffer and (should) collect more information. The truth is probably somewhere in between, but does not change the fact that the Tool provides interesting results. Again, it is so that it saves a complete Session over a longer period of time and then analyzes it. At this point, by the way, one should not underestimate the performance loss that such a Profiling Tool causes in a Server. It is not recommended to perform a performance analysis at peak times – even if you get the best results at these times.

The advantage of the Jet profiler over the SQL Profiler is nevertheless obvious: not only Slow Queries are found, but also tips are given on how to optimize them. The Jet Profiler distinguishes between a free Version and a paid version and is available for Windows, Mac and Linux. The difference between the two versions for our use case is that in the free version a Monitoring takes a maximum of 2 hours, but unfortunately no Slow Queries are displayed – in fact not usable. Instead, the top queries are displayed, i.e. the Queries that are most frequently performed (fig. 2).

muehlwitz_queries_26321651

Here, too, there is potential for optimization, but not to the extent that Slow Queries offer it. Another advantage of Jet Profiler is the use of an SSH tunnel (also in the free Version), which makes it possible to access servers via SSH that do not allow Remote Connection. This is a not negligible advantage, since most hosted MySQL servers are only accessible via localhost and do not allow Remote connection.

In both cases, interesting functions are provided beyond the Slow Queries, such as the most used databases, the tables that generate the largest load, but also the Queries that are executed most often. At least the latter are a good starting point for Caching, whereby we are here again in the field of micro-optimization. However, Slow Queries are still the Queries with the greatest optimization potential and the ones that burden the Server the most. Due to the structure, it is also likely that Monitoring with Jet Profiler will consume fewer resources than one with SQL Profiler.

The typing function of the Jet Profiler is quite pleasant, because it usually manages to show why a Query is slow, and how to optimize it. However, it also does not always work reliably, and it is rather the overall package that brings clear advantages to the analysis of MySQL. On the Jet Profiler Homepage, a one-minute Video quickly and effectively shows the possibilities of the tool.

If you limit yourself to the main focus of the article – namely the Slow Queries – an analysis via the command line with the help of the Percona Tools should be sufficient, because ultimately it is about the detection of these Queries and not about a complete Server Analysis. The optimization of Queries always remains with you. None of the tools presented here offers a universal solution that solves the problems by pressing a button. Some tips are given and mostly they fit. As an experienced Database Developer and programmer, however, you should know how to fix such problems – provided you find them. Ultimately, this is the big challenge that all the tools presented can help you with. All programs are either Open Source or can be tested extensively.

Conclusion

The optimization possibilities offered by MySQL are almost infinite. For a simple web application, a large speed gain can usually be achieved by using suitable indices. A configuration of the database server adapted to the application, whose Default settings are not particularly optimized, also helps in many cases. Nevertheless, you can create Queries whose execution time is not acceptable. In some cases-especially in the field of data Mining-this cannot be prevented. In most cases, however, it is the case that a working Query does not necessarily represent an optimal Query.

In many places, you have to carefully consider whether you can achieve your goal differently or more easily, and whether the operations performed in MySQL would not be better anchored in the programming language. There is no ideal solution here, a defusing of the situation is usually only achieved by a variety of experiments. In some cases you simply change the Query, in other cases you merge the data in PHP faster, and every now and then you solve the Problem with an application that is much more complex than the original Query, but is still processed faster.

Especially MySQL optimization is an often neglected area, which is often due to the ignorance of the client. The acceptance of an application usually does not take place in a load situation and subsequent optimization work is often at the expense of the programmer. This sees – which is quite understandable – no need to carry out a time-intensive optimization free of charge. Especially for applications that generate a corresponding load, so that an optimization becomes necessary, two circumstances must be assumed: a) the programmer has worked poorly or B) the application is so important/successful that an optimization should be available in the budget. Since we are all experienced programmers, point a) should not be assumed, but the customer should be sensitized regarding the second point. In times when Google considers the loading time of a web application as a ranking factor, every possibility should be used and known to the customer.

LEAVE A REPLY

Please enter your comment!
Please enter your name here