Like all relational databases, MySQL can prove to be a
complicated beast that can halt at a time, leaving your applications in the
lurch and your business on the line. The truth is, most MySQL performance
issues are underlying common mistakes. It is important to eliminate these
errors, which are often obscured by some subtlety in your workload or
configuration trap, in order to ensure that your MySQL server hums along at top
speed, providing stable and consistent performance.
Fortunately, many MySQL
performance optimization issues turn out to have similar solutions, making
it a manageable task to troubleshoot and tun MySQL.
Here are 10 tips from MySQL expert for great performance:
Profile the workload - Profiling the workload of the server is the best way to
understand how your server spends its time. You can expose the costliest
queries for further tuning by profiling your workload. Time is the most
important metric here because you care very little about anything except how
quickly it completes when you issue a query against the server.
The best way to profile your workload is by using a tool
like the query analyzer for MySQL Enterprise Monitor or the Percona Toolkit
pt-query-digest. These tools capture server queries and return a task table
sorted by decreasing response time order, instantly bubbling up to the top of
the most expensive and time-consuming tasks so you can see where to focus your
efforts.
Understand the
fundamental resources - Four basic resources are needed for a database server to
function: CPU, memory, disk, and network. If any of these are weak, erratic, or
overloaded, it is highly likely that the database server will perform poorly.
Understanding the basic resources is important in two specific areas: hardware
selection and troubleshooting.
Ensure good-performing components all around when selecting
MySQL hardware. Just as important, they are fairly well balanced against each
other. Organizations often choose servers that have fast CPUs and disks but are
hungry for memory. In some cases, adding memory is a cheap way to increase
performance by magnitude orders, particularly on disk-bound workloads. This may
seem counterintuitive, but disks are overused in many cases because there is
not enough memory to hold the working data set of the server.
Avoid using MySQL as
a queue - Queues and access patterns like queues can sneak into your
application without your knowledge. For example, if you set an item's status so
that it can be claimed by a particular worker process before acting on it, then
you create a queue unwittingly. It is a common example to mark emails as
unsent, send them, then mark them as sent.
Queues cause problems for two main reasons: they serialize
your workload, preventing parallel tasks, and often result in a table that
includes processing work as well as historical data from long processed jobs.
Both add application latency and load MySQL.
Filter the results by
the cheapest first - A great
way to optimize MySQL is first to do cheap, imprecise work, then hard,
accurate work on the smaller, resulting data set. Suppose you're looking for
something within a given geographic point radius, for example. The first tool
in the toolbox of many programmers is the "Haversine" formula for
computing distance along a sphere's surface. The problem with this technique is
that the formula requires a lot of very CPU-intensive trigonometric operations.
Great-circle calculations tend to run slowly and skyrocket the machine's use of
the CPU.
Before applying the formula of the great circle, set your
records to a small subset of the total and adjust the resulting set to an
accurate circle. A square containing the circle is an easy way to do this
(precisely or inaccurately). Thus, with all those expensive trig functions, the
world outside the square is never hit.
Know the scalability
death traps - Scalability isn't as vague as you might think. There are, in
fact, precise mathematical scalability definitions expressed as equations.
These equations emphasize why systems are not as scaleful as they should be.
Take the Universal Scalability Law, a definition useful to express and quantify
the scalability characteristics of a system. It explains problems of scaling in
terms of two basic costs: serialization and crosstalk.
Their scalability is inherently limited in parallel
processes that must be stopped for something to take place serialized.
Likewise, they limit each other if the parallel processes have to chat with
each other all the time to coordinate their work. Evite serialization and
crosstalk, and it will scale your application much better. What translates into
MySQL's inside? It varies, but there are some examples that would avoid
exclusive row locks. That is why Queues, point 3 above, tend to scale poorly.
Indexing is probably the most misunderstood topic in
databases as there are so many ways to get confused about how indexes work and
how they are used by the server. To really understand what is going on, it
takes a lot of effort.