You are here

MySQL single query performance - the truth!

MySQL single query performance - the truth!

As suggested by morgo I did a little test for the same query and the same data-set mentioned in Impact of column types on MySQL JOIN performance but looking into an other dimension: the time (aka MySQL versions).

The answer

To make it short. As a good consultant the answer must be: "It depends!" :-)

The test

The query was again the following:

SELECT *
  FROM a
  JOIN b ON b.a_id = a.id
 WHERE a.id BETWEEN 10000 AND 15000
;

The Query Execution Plan was the same for all tested releases.

The relevant MySQL variables where used as follows where possible. Should I have considered join buffer, or any other of those local per session buffers (read_buffer_size, read_rnd_buffer_size, join_buffer_size)?

innodb_buffer_pool_size        = 768M
innodb_buffer_pool_instances   = 1
innodb_file_per_table          = 1

The results

mysql-4.0.30mysql-4.1.25mysql-5.0.96mysql-5.1.73mysql-5.5.35mysql-5.6.15mysql-5.7.3
AVG40.8638.683.714.694.647.226.05
MEDIAN41.0738.133.694.464.656.326.05
STDEV1.512.260.060.340.032.210.03
MIN39.2736.993.674.404.596.266.02
MAX44.1144.453.865.234.6713.166.10
COUNT10.0010.0010.0010.0010.0010.0010.00

mariadb-5.1.44mariadb-5.2.10mariadb-5.3.3mariadb-5.5.34mariadb-10.0.6
AVG4.588.638.345.026.12
MEDIAN4.587.978.015.026.01
STDEV0.011.451.100.020.25
MIN4.557.867.904.995.97
MAX4.6011.3811.465.066.75
COUNT10.0010.0010.0010.0010.00

percona-5.0.92-23.85percona-5.1.72-14.10percona-5.5.34-32.0percona-5.6.14-62.0
AVG3.794.704.9410.53
MEDIAN3.794.704.8912.41
STDEV0.020.030.143.35
MIN3.764.674.865.68
MAX3.834.755.3412.93
COUNT10.0010.0010.0010.00

galera-5.5.33-23.7.6 / 2.7
AVG4.31
MEDIAN3.98
STDEV1.18
MIN3.76
MAX8.54
COUNT30.00

The Graph

the_truth.png

Conclusion

Do not trust benchmarks. They are mostly worthless for your specific workload and pure marketing buzz... Including the one above! ;-)

Database vendors (Oracle/MySQL, Percona, MariaDB) are primarily focussing on throughput and features. In general this is at the costs of single query performance.
MySQL users like Facebook, LinkedIn, Google, Wikpedia, Booking.com, Yahoo! etc. are more interested in throughput than single query performance (so I assume). But most of the MySQL users (95%) do not have a troughput problem but a single query performance problem (I assume here that this is true also for Oracle, MS-SQL Server, DB2, PostgreSQL, etc.).

So database vendors are not primarily producing for the masses but for some specific users/customers (which possibly pay a hell of money for this).

Back to the data:

My first hypothesis: "The old times were always better" is definitely not true. MySQL 4.0 and 4.1 sucked with this specific query. But since MySQL 5.0 the rough trend is: single query performance becomes worse over time (newer versions). I assume this also true for other databases...

Some claims like: "We have the fastest MySQL" or "We have hired the whole optimizer team" does not necessary reflect in better single query performance. At least not for this specific query.

So in short: If you upgrade or sidegrade (MySQL <-> Percona <-> MariaDB), test always very carefully! It is not predictable where the traps are. Newer MySQL release can increase performance of your application or not. Do not trust marketing buzz!

Artefacts

Some artefacts we have already found during this tiny test:

  • In MySQL 5.0 an optimization was introduced (not in the Optimizer!?!) to speed up this specific query dramatically.
  • MariaDB 5.2 and 5.3 were bad for this specific query.
  • I have no clue why Galera Cluster has shown the best results for 5.5. It is no intention or manipulation! It is poor luck. But I like it! :-)
  • MySQL 5.6 seems to have some problems with this query. To much improvement done by Oracle/MySQL?
  • Percona 5.6 sometimes behaves much better with this query than normal MySQL but from time to time something kicks in which makes Percona dramatically slower. Thus the bad results. I have no clue why. I first though about an external influence. But I was capable to reproduce this behaviour (once). So I assume it must be something Percona internally (AHI for example?).

Finally

Do not shoot the messenger!

If you want to reproduce the results most information about are already published. If something is missing please let me know.

Please let me know when you do not agree with the results. So I can expand my universe a bit...

It was fun doing this tests today! And MyEnv was a great assistance doing this kind of tests!

If you want us to do such test for you, please let us know. Our consulting team would be happy to assist you with upgrading or sidegrading problems.

Comments

Thanks for running the test.
morgocomment

Thanks, Morgan!
Shinguzcomment

Good post. You are raising a real topic here. This Query is just in memory full table scan with thread state transition to make a join. I think MariaDB found the issue that your test case perfectly match http://kristiannielsen.livejournal.com/17598.html Can you try to validate this ? Why do you see so much latency variation of the same in memory query what is wrong with the hardware ? Now if you think of it, how many people will read that blob and just think MySQL is bad doing a simple join in 6s, I think it's a lot :)Have you also try to play with nmap barrier of you hardware ? /steph
svaroquicomment

Hello Stephane,

Sorry, I cannot completely follow you what you want me to do. But please feel free to reproduce. All needed information are available I think!

Oli

olicomment

Hi, With the given standard deviation for the MySQL 5.6.15 run, the 95% confidence interval for the average is between 5.85 and 8.59, if my rusty statistics knowledge serves me right. Hence, the difference between 5.6 and 5.7 is not statiscally significant. How did you run this test? Did you run some warmup queries before starting the measurements? I have found that the InnoDB adaptive hash index will gradually be built during the first runs of a query. Hence, I usually run a query 10 times before starting measurements.
oysteingcomment

you wrote: " But most of the MySQL users (95%) do not have a troughput problem but a single query performance problem (I assume here that this is true also for Oracle, MS-SQL Server, DB2, PostgreSQL, etc.)." where the 95% come from?
massimocomment

Hi Massimo,

From my last 14 years experience as DBA and DB consultant (Oracle, MySQL, MS-SQL Server, PostgreSQL). I have seen several hundreds? different DB performance problems in these years and only few of them (1 upon 20? or even less) where real throughput problems.
Most of them where single query performance problems (including optimizer/optimizing and configuration problems).

Regards,
Oli

olicomment

95% of all statistics is made up ... If you consider all users of MySQL including those using it to store their record collection, I would think that more than 95% does not have any performance problem. (100% of the statistics in this comment is made up ...)
oysteingcomment

Hello Øystein,

Unfortunately I have to agree! MySQL is damn fast/cool and hardware nowadays as well. And having much RAM and fast multi-core CPU's sucks from a tuner point of view ;-)

olicomment

Hello Øystein,

The high/bad standard deviation with MySQL 5.6 comes from some very high outlier in the beginning of the test. I did not delete this outlier because I have seen this as an unfair treatment to the other releases which did not show this phenomena.

I dropped the first measurement always (memory heat-up).

And I did all the test series in the same way. So more or less same fair/unfair for all releases.

By ignoring the outlier you still can see around 5% changes in performance.

About statistics: I never understood these guys. But my feeling tells me that in measurement 4 to 11 5.7 is sligthly faster than 5.6. Possibly not significant but sligthly and measureable...

And in parctice. I understand your skipping of 10 queries before measuring. But how realistic is this in real DB life? The first shot must be fast...

Probemysql-5.6.15mysql-5.7.3Comment
113.046.11value dropped
28.586.10
313.166.05outlier
46.266.08
56.316.05
66.346.03
76.286.05
86.266.02
96.286.08
106.386.03
116.336.04
olicomment

In a production system, I think what matters is the steady-state performance of the system, not what happens just after system restart.
oysteingcomment

From throughput point of view I agree. What about from response time point of view? We fire a query once and never again (the same).
olicomment