Optimize and Tweak High-Traffic Servers
Focus: Linux, Apache 1.3+, PHP, MySQL
Notes: Use at your own risk. If this has any errors, please let me know and I will correct them.
Summary
If you are reaching the limits of your server running Apache serving a
lot of dynamic content, you can either spend thousands on new equipment
or reduce bloat to increase your server capacity by anywhere from 2 to
10 times. This article concentrates on important and poorly-documented
ways of increasing capacity without additional hardware.
Problems
There are a few common things that can cause server load problems, and a thousand uncommon ones. Let's focus on the common:
Drive Swapping - too many processes (or runaway processes) using too much RAM
CPU - poorly optimized DB queries, poorly optimized code, runaway processes
Network - hardware limits, moron attacks
Solutions: The Obvious
Briefly, and for completeness, here are the most obvious solutions:
Use "top" and "ps axu" to check for processes that are using too much CPU or RAM.
Use "netstat -anp | sort -u" to check for network problems.
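To make the netstat output easier to scan, the connections can be tallied by TCP state. This is a minimal sketch and not part of the original article: the function name count_states is made up for illustration, and it assumes the usual Linux netstat layout in which the state is the sixth column of every line that starts with "tcp".

```shell
# count_states: hypothetical helper, reads "netstat -ant" style output on stdin
# and prints one "COUNT STATE" line per TCP state, most frequent state first.
# Assumes the state is the 6th column of lines whose first column starts with "tcp".
count_states() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print n[s], s }' | sort -rn
}
```

On a live server this could be run as "netstat -ant | count_states". An unusually large count for a single state, such as SYN_RECV, may be a hint that the problem is on the network side rather than in Apache itself.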
Solutions: Apache's RAM Usage
First and most obvious, Apache processes use a ton of RAM. This minor
issue becomes a major issue when you realize that after each process
has done its job, the bloated process sits and spoon-feeds data to the
client, instead of moving on to bigger and better things. This is
further compounded by a bit of essential info that should really be
more common knowledge:
If you serve 100% static files with Apache, each httpd process will use around 2-3 megs of RAM.
If you serve 99% static files & 1% dynamic files with Apache, each
httpd process will use from 3-20 megs of RAM (depending on your MOST
complex dynamic page).
This occurs because a process grows to accommodate whatever it is serving,
and NEVER decreases again unless that process happens to die. Quickly,
unless you have very few dynamic pages and major traffic fluctuation,
most of your httpd processes will take up an amount of RAM equal to the
largest dynamic script on your system. A smart web server would deal
with this automatically. As it is, you have a few options to manually
improve RAM usage.
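To check how much RAM your own httpd processes are holding, their resident set sizes can be summarized. This is a rough sketch under some assumptions: the helper name rss_summary is hypothetical, the input is one RSS value in kilobytes per line (the unit ps reports on Linux), and the process name may be apache or apache2 rather than httpd on your distribution.

```shell
# rss_summary: hypothetical helper, reads one RSS value (in KB) per line on stdin
# and prints the process count plus the average and largest RSS in MB.
rss_summary() {
    awk '{ sum += $1; if ($1 > max) max = $1 }
         END { if (NR) printf "procs=%d avg_mb=%.1f max_mb=%.1f\n", NR, sum / NR / 1024, max / 1024 }'
}
```

A possible invocation is "ps -C httpd -o rss= | rss_summary". Keep in mind that RSS also counts pages shared between processes, so the numbers tend to overstate what each process really costs.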
Reduce wasted processes by tweaking KeepAlive
This is a tradeoff. KeepAliveTimeout is the amount of time a process
sits around doing nothing but taking up space. Those seconds add up in
a HUGE way. But using KeepAlive can increase speed for both you and the
client - disable KeepAlive and the serving of static files like images
can be a lot slower. I think it's best to have KeepAlive on, and
KeepAliveTimeout very low (like 1-2 seconds).
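Before changing anything, it can help to see what the config currently says. The sketch below is only an illustration: show_keepalive is a made-up name, and the config path in the usage note is an assumption that differs between distributions.

```shell
# show_keepalive: hypothetical helper, prints the active (uncommented) KeepAlive
# related directives from the Apache config file given as $1.
show_keepalive() {
    grep -Ei '^[[:space:]]*(KeepAlive|KeepAliveTimeout|MaxKeepAliveRequests)[[:space:]]' "$1"
}
```

For example, "show_keepalive /etc/httpd/conf/httpd.conf". With the settings suggested above, the output would include "KeepAlive On" and a KeepAliveTimeout of 1 or 2.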
Limit total processes with MaxClients
If you use Apache to serve dynamic content, your simultaneous
connections are severely limited. Exceed a certain number, and your
system begins cannibalistic swapping, getting slower and slower until
it dies. IMHO, a web server should automatically take steps to prevent
this, but instead they seem to assume you have unlimited resources. Use
trial & error to figure out how many Apache processes your server
can handle, and set this value in MaxClients. Note: the Apache docs on
this are misleading - if this limit is reached, clients are not "locked
out", they are simply queued, and their access slows. Based on the
value of MaxClients, you can estimate the values you need for
StartServers, MinSpareServers, & MaxSpareServers.
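A starting point for the trial and error can be worked out with simple arithmetic: the RAM you are willing to give Apache, divided by the size of one fully grown httpd process. This rule of thumb is not from the original article, and the helper name estimate_max_clients and its three parameters are hypothetical. Treat the result as a first guess to refine under load.

```shell
# estimate_max_clients: hypothetical helper, integer arithmetic only.
#   $1 = total RAM in MB
#   $2 = RAM in MB to keep free for the OS, MySQL and everything else
#   $3 = RAM in MB used by one grown httpd process
estimate_max_clients() {
    echo $(( ($1 - $2) / $3 ))
}
```

For instance, "estimate_max_clients 2048 512 12" prints 128: a 2 GB machine that reserves 512 MB for everything else has room for about 128 processes of 12 MB each.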
Force processes to reset with MaxRequestsPerChild
Forcing your processes to die after a while makes them start over with
low RAM usage, and this can reduce total memory usage in many
situations. The less dynamic content you have, the more useful this
will be. This is a game of catch-up, with your dynamic files constantly
increasing total RAM usage, and restarting processes constantly
reducing it. Experiment with MaxRequestsPerChild - even values as low
as 20 may work well. But don't set it too low, because creating new
processes does have overhead. You can figure out the best settings
under load by examining "ps axu --sort:rss". A word of warning, using
this is a bit like using heroin. The results can be impressive, but are
NOT consistent - if the only way you can keep your server running is by
tweaking this, you will eventually run into trouble. That being said,
by tweaking MaxRequestsPerChild you may be able to increase MaxClients
as much as 50%.
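When experimenting with MaxRequestsPerChild, it may be easier to judge the effect from a size distribution than from a long ps listing. The following sketch groups processes into buckets by RSS. The helper name rss_histogram and the bucket width are assumptions for illustration, and the input is one RSS value in kilobytes per line.

```shell
# rss_histogram: hypothetical helper, reads one RSS value (in KB) per line on stdin
# and counts processes per bucket of STEP megabytes ($1, default 5).
rss_histogram() {
    awk -v step="${1:-5}" '
        { b = int($1 / 1024 / step) * step; n[b]++ }
        END { for (b in n) printf "%d-%dMB %d\n", b, b + step, n[b] }
    ' | sort -n
}
```

A possible invocation under load is "ps -C httpd -o rss= | rss_histogram 5" (the process name httpd is an assumption). If lowering MaxRequestsPerChild moves a visible share of processes into the smaller buckets, the setting is probably helping. If nearly all of them stay in the largest bucket, it is probably just adding fork overhead.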
Apache Further Tweaking
For mixed-purpose sites (image galleries, download sites, etc.),
you can often improve performance by running two different Apache
daemons on the same server. For example, we recently compiled a
stripped-down Apache that serves only images (GIF, JPEG, PNG, etc.) for
a site with thousands of stock photos. We put both the main Apache and
the image Apache on the same server and noticed a drop in load and RAM
usage. Each page had about 20-50 image calls, and they were all
off-loaded to the stripped-down Apache, which could run 3x more servers
with the same RAM usage as the regular Apache on the server.
Finally, think outside the box: replace or supplement Apache
Use a 2nd server
You can use a tiny, lightning fast server to handle static documents
& images, and pass any more complicated requests on to Apache on
the same machine. This way Apache won't tie up its multi-megabyte
processes serving simple streams of bytes. You can have Apache only get
used, for example, when a php script needs to be executed. Good options
for this are:
TUX / "Red Hat Content Accelerator" - http://www.redhat.com/docs/manuals/tux/
kHTTPd - http://www.fenrus.demon.nl/
thttpd - http://www.acme.com/software/thttpd/
Try lingerd
Lingerd takes over the job of feeding bytes to the client after Apache
has fetched the document, but requires kernel modification. Sounds
pretty good, haven't tried it.
lingerd - http://www.iagora.com/about/software/lingerd/
Use a proxy cache
A proxy cache can keep a duplicate copy of everything it gets from
Apache, and serve the copy instead of bothering Apache with it. This
has the benefit of also being able to cache dynamically generated
pages, but it does add a bit of bloat.
Replace Apache completely
If you don't need all the features of Apache, simply replace it with
something more scalable. Currently, the best options appear to be
servers that use a non-blocking I/O technology and connect to all
clients with the same process. That's right - only ONE process. The
best include:
thttpd - http://www.acme.com/software/thttpd/
Caudium - http://caudium.net/index.html
Roxen - http://www.roxen.com/products/webserver/
Zeus ($$) - http://www.zeus.co.uk
Solutions: PHP's CPU & RAM Usage
Compiling PHP scripts is usually more expensive than running them. So
why not use a simple tool that keeps them precompiled? I highly
recommend Turck MMCache. Alternatives include PHP Accelerator, APC,
& Zend Accelerator. You will see a speed increase of 2x-10x, simple
as that. I have no stats on the RAM improvement at this time.
Solutions: Optimize Database Queries
This is covered in detail everywhere, so just keep in mind a few
important notes: One bad query statement running often can bring your
site to its knees. Two or three bad query statements don't perform much
differently than one. In other words, if you optimize one query you may
not see any server-wide speed improvement. If you find & optimize
ALL your bad queries you may suddenly see a 5x server speed
improvement. The log-slow-queries feature of MySQL can be very helpful.
How to log slow queries:
# vi /etc/rc.d/init.d/mysqld
Find this line:
SAFE_MYSQLD_OPTIONS="--defaults-file=/etc/my.cnf"
change it to:
SAFE_MYSQLD_OPTIONS="--defaults-file=/etc/my.cnf --log-slow-queries=/var/log/slow-queries.log"
As you can see, we added the option of logging all slow queries to /var/log/slow-queries.log
Save and close the file (in vi, press Shift+Z twice, i.e. ZZ).
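Create the log file and make it writable by the user mysqld runs as (assumed here to be mysql):
touch /var/log/slow-queries.log
chown mysql /var/log/slow-queries.log
chmod 644 /var/log/slow-queries.log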
Restart MySQL:
service mysqld restart
mysqld will log all slow queries to this file.
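Once the log has collected some entries, a quick summary shows whether there is anything to chase. This is a sketch only: slow_summary is a made-up helper name, and it assumes that each logged query is preceded by a header line beginning with "# Query_time:", which may differ between MySQL versions.

```shell
# slow_summary: hypothetical helper, summarizes the slow query log given as $1.
# Assumes each logged query has a header line of the form
#   # Query_time: 12  Lock_time: 0  Rows_sent: 1  Rows_examined: 50000
slow_summary() {
    awk '/^# Query_time:/ { n++; if ($3 + 0 > max) max = $3 + 0 }
         END { printf "entries=%d max_query_time=%s\n", n, max + 0 }' "$1"
}
```

For example, "slow_summary /var/log/slow-queries.log". MySQL also ships a mysqldumpslow script that groups similar queries together, which is usually the better tool once the log gets long.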
References
These sites contain additional, better-known methods for optimization.
Tuning Apache and PHP for Speed on Unix - http://php.weblogs.com/tuning_apache_unix
Getting maximum performance from MySQL - http://www.f3n.de/doku/mysql/manual_10.html
System Tuning Info for Linux Servers - http://people.redhat.com/alikins/system_tuning.html
mod_perl Performance Tuning (applies outside perl) - http://perl.apache.org/docs/1.0/guide/performance.html
Once again, if this has any errors or important omissions, please let me know and I will correct them.
If you experience a capacity increase on your server after trying the optimizations, let me know!
Article written by spagmoid; additions by albo and huck on the ev1 forums.