Here we take a look at how one of the example filters supplied with the MaxScale 1.0 beta can answer that simplest of profiling questions - "Which of my database queries run within the MySQL server for the longest time?".
Why A MaxScale Filter?
MaxScale filters, although they can be thought of as simply a way to modify the SQL requests that pass through MaxScale, are also a good place to insert logging or monitoring. They have the advantage that that receive not just the SQL statements that are sent to the server, but they may also receive the returned results. This means that MaxScale can not just log the query information, but it can also record timestamps related to when it observes different parts of the request or response. As a way to demonstrate this a very simple example filter was constructed for the 1.0 Beta release of MaxScale - the top filter.
What Does Topfilter Do?
How Do I Setup My System
The first step in creating a system to allow you to measure your queries is to configure a MaxScale service that is suitable for your application. That may be any of the supported routers; e.g. the readconnroute or Read/Write Splitter routers. Here we will assume that we only have one server, we are not using MaxScale for the scale-out abilities of the tool, merely as a simple performance tuning aid. Therefore we will use the readconnroute router with the option to connect to any server, the only server we have. The basic configuration for the service is as follows:
The service also requires a listener, to define the port on which MaxScale will listen to requests to connect to the service and to link the service with the protocol module.
We have used port 7706 for MaxScale to listen to incoming requests, this will allow the MaxScale to be located on the same machine as the MySQL/MariaDB server, any port may be chosen by the user provided it is not already in use. If no MySQL instance is running on the same machine the default MySQL port of 3306 may be used.
In this example we will assume we have a single database node; which we will call dbnode1.
The database node should be configured as follows;
The address= entry should contain the IP address or a hostname for the server. The assumption here is that the MySQL server is using the default MySQL port of 3306 to listen for incoming requests.
For completeness we should also add a monitor module that can determine if the node is in the correct state to communicate. This is more important if the backend is not a single database server, but a cluster of servers. It is good practice to always include a monitor even when only a single node is configured as it allows for ease of extension to a cluster and provides more information as to the state of the system.
We now have a configuration that basically allows MaxScale to be used as a pure proxy for accesses to the database. There is no intelligence within MaxScale being used, as we have no options to route to more than a single database. Although the principles here are still true if multiple backend databases are given and the more usual scale-out properties of MaxScale are used, in order to reduce the complexity and illustrate the technique only a single backend database has been used here.
At this point the application should work if connected to port 7706 on the MaxScale host, but no monitoring data will be collected. We most add the filter into the MaxScale configuration.
To add the filter we must first create a filter definition that links our filter module to the parameters we want for this particular case. Assuming we want to capture the 10 longest running queries and write them into a log file in the directory /var/log/myapp with filenames of the form session.$$, a filter section as follows can be used.
We then need to add this filter to the service by adding an extra line to the definition for the filter
Now if we connect to the port 7706 with our application we will get a number of files created in /var/log/myapp, these files will contain reports of the statements executed. An example of such a report is shown below.
Session started Thu Jun 19 17:06:34 2014
Connection from 127.0.0.1
Total of 24 statements executed.
Total statement execution time 35.653 seconds
Average statement execution time 1.486 seconds
Total connection time 80.443 seconds
Top 10 longest running queries in session.
Time (sec) | Query
22.856 | select sum(salary), year(from_date) from ...
5.301 | select d.dept_name as "Department", y.y1 ...
2.898 | select year(now()) - year(birth_date) as age, ...
2.221 | select dept_name as "Department", sum(salary) / 12 ...
0.839 | select dept_name as "Department", avg(year(now()) - ...
0.665 | select year(hire_date) as "Hired", d.dept_name, ...
0.337 | select moves.n_depts As "No. of Departments",...
0.245 | select year(now()) - year(birth_date) as age, ...
0.175 | select year(hire_date) as "Hired", count(*) as ...
0.160 | select year(hire_date) - year(birth_date) as ...
Please note, the queries have been truncated in this blog entry, an actual run would include the complete query text.
Refining The Selection Criteria
Other options may be given to the top filter to reduce the subset of queries it will consider; regular expression patterns may be used to include or exclude particular statements, the client address may be used to limit the sessions to only those originating from that address or the client user name may be used to limit the functionality to only that user.
I am only interested in select performanceIn this case you can use the regular expression capabilities to limit the report to simply show select statements, other statements will not be included in the ranking. To do this modify the filter section with the extra parameter match=
I want to see what data modification statements take a long time
In this case, rather than build a complex match expression the exclude= option can be used to exclude any select statements.
I only want select statements from a particular database user
The match= option can be used to restrict the logging to just select statements and the additional user= parameter can be used to further limit the statements that are including in the ranking.
MaxScale was not designed as a tool that would be used to analyse the performance of your SQL application, and it is not as good or as extensive in this role as some of the purpose written tools, but the filters and the general plugin architecture of MaxScale mean that it can be used to perform tasks it was not originally conceived to do. The filter presented here is only really a demonstration of what can be achieved, it was written very quickly and did not require very much code to implement. However it hopefully can be useful in its own right and also to stimulate more complex and useful filters to be created.