Things I have learned from troubleshooting mySQL queries

Commands

# Run query without cache
SELECT SQL_NO_CACHE * from users where name like 'Ol%';

# Profiling
SET SESSION profiling = 1;
SHOW PROFILES;
SHOW PROFILE FOR QUERY 3;
SHOW STATUS LIKE 'Last_Query_Cost';

How to index/optimize

Optimize OR

Example: SELECT * FROM users where customerNumber='qwe' OR birthDate='2020-01-01';

You need an index for each OR, which will create a merge index on modern innoDB. If you are missing an index for one OR you will have to do a full table scan. Basically, one condition can ruin the query. If you create a joined index on (customerNumber, birthDate) or (birthDate, customerNumber) they will have not effect. To use a joined index you have to do two solo queries and UNION them.

Optimize AND

Example: SELECT * FROM users where customerNumber='qwe' AND birthDate='2020-01-01';

Every AND index helps, which will create a merge index if possible. If you are missing an index for one AND, you will have to scan the result of the index/merge index. Basically, missing a condition is probably fine. If you create a joined index on (customerNumber, birthDate) or birthDate, customerNumber) , they further most likely optimize the query and be used instead. It does not matter the order of customerNumber/birthDate, no matter the order in the select or AND.

Cover Index

Example: SELECT customerNumber, birthDate, memberStartDate FROM users where customerNumber='qwe' AND birthDate='2020-01-01';

If you want to optimize a query further, you can create a cover index. It is an index which covers all the data in the select statement, making SQL not having to pop by the row to pick the the required data. To do so, create an index on (customerNumber, birthDate, memberStartDate). That index will be prioritized over a index on (customerNumber, birthDate) despite memberStartdate not being in the WHERE-part.

Gotcha: Group by id with a range

Example: You have a table which you want to query for a time range and group by it’s ID. Even if the ID’s are in the same order as the range (which MySQL cannot know). This can be super quick with a low time range span but become super slow with just a fraction longer time range.