Open Source: Configuring Apache - Don't Succumb To The "Slashdot Effect"
Published January 27, 2006
Like many techno-geeks, I host my LAMP website on a cheap ($150) computer over my home broadband connection. I have also wondered what would happen if my site were linked on Slashdot or Digg. Specifically, would my setup be able to survive the "Slashdot Effect"? A 100 MHz Pentium can easily saturate a T1's worth of bandwidth, and my upload speed is capped (supposedly) at 384 kbps, so the server should easily be able to handle that. My bandwidth will be saturated before the server is incapacitated; at least, that's the idea.
The machine I use for my web server is a $150 PC that I bought from Fry's one day (I always buy their $150 PCs when they're in stock). Here are the relevant specs on my little server:
CPU: AMD Athlon 2600+
RAM: 512MB
Hard Drive: 40GB 7200RPM
Software: Debian Linux, MySQL, Apache, PHP, WordPress
There is additional software installed on this machine because it is also used as a desktop computer. However, none of that software is important for the purposes of this article.
The RAM has been upgraded since I purchased the machine from Fry's; it originally came with 128MB, which is a little low for my tastes. The only other upgrade was a new CPU fan, and that was out of personal preference: the default fan was just too loud.
Below are some directives in my httpd.conf and some general recommendations that I think are vital to helping you survive a good Slashdotting on low-budget hardware.
- MaxKeepAliveRequests 0
The KeepAlive directive in httpd.conf allows persistent connections to the web server, so that a new connection does not have to be initiated for each request. Setting the MaxKeepAliveRequests directive to 0 allows an unlimited number of requests per connection, which makes sense if you think about it. Why allow persistent connections but then terminate them after a fixed number of requests?
- KeepAliveTimeout 15
Because persistent connections are allowed, it is important that they are not kept open indefinitely. This directive will close the connection after 15 seconds of inactivity.
- MinSpareServers 15
This is the minimum number of spare (idle) child processes you want running at any given time. This way, if multiple simultaneous requests are received, there will already be child processes running to handle them. Setting this number too high wastes system resources, and setting it too low slows the server down under a burst of requests, because new children have to be started first.
- MaxSpareServers 65
Same as above, but this is the maximum number of spare (idle) child processes kept running at any given time; Apache shuts down idle children beyond this number.
- StartServers 15
This is the number of servers Apache will start initially. As more servers handle requests, Apache keeps at least 15 spare servers running, up to the maximum of 65 spares.
- MaxClients 500
This is the maximum number of simultaneous clients that can connect to the server at any given time. Setting this number too low will result in users being locked out of the server under normal traffic situations, and setting it too high will result in your server being so overloaded that all the requests time out anyway. I think 500 is about right for most people's needs.
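Put together, the directives above might sit in httpd.conf roughly as follows. This is only a sketch: it assumes Apache 2.0 or 2.2 running the prefork MPM, which the article does not state, and the ServerLimit line is an addition that is not in the original list. On those versions the prefork default for ServerLimit is 256, so a MaxClients of 500 generally needs ServerLimit raised, placed before MaxClients, and a full stop and start of Apache to take effect.

```apache
# Persistent connections, as described in the article
KeepAlive On
MaxKeepAliveRequests 0
KeepAliveTimeout 15

# Assumption: Apache 2.0/2.2 with the prefork MPM
<IfModule prefork.c>
    StartServers      15
    MinSpareServers   15
    MaxSpareServers   65
    # Not in the article: needed on prefork when MaxClients exceeds 256,
    # and it should appear before MaxClients
    ServerLimit      500
    MaxClients       500
</IfModule>
```

Whether 500 is sensible on a given machine depends mostly on available memory, so the values are best treated as starting points to test rather than settings to copy.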
- Type: Opinion
- Section: Sci/Tech
- Part of a feature: Open Source
- Writer: Adam Drake
Comments
Sorry about the typo, you're correct, it will be changed ASAP.
Can't believe that you have an upload capped at 384, a 2800+ server and don't even mention mod_gzip.
Sorry to break this to you like this, but preparing for a slashdotting is more about content than CPU power, especially if you have a slow connection. If you want to survive a /.'ing, remove all large pictures. Even the smallest boxes can survive if they are only serving text. Take your standard webpage: it's one or two pages long and, complete with all the code overhead, that is about 5K. If you have 384KB/s upload, that is enough for nearly 60 users hitting your site per second, and yes, any box faster than a Pentium can fill a 10mbit link. The problem with being slashdotted is the graphics. One average-size JPEG that is 300x300 can be 50K; a screenshot can be 150KB. You can see that once you have 10 people accessing your site, things start to degrade. Of course, 10 people may be fine if they are patient, since each would be getting 3.8KB/s, but most likely someone will think this is too slow and reload after they have grabbed 25% of the file, and the server may not notice and keep the old connection open for a bit while another one is made.
I have a friend who has survived numerous slashdottings, and he sees 25,000 hits a day on a 440 MHz machine. The machine is idle doing this. He even has graphics, though not really large ones, and he has a 10mbit link to the net.
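As a rough check on the bandwidth figures in this comment: the sketch below assumes the article's 384 kbps cap means kilobits per second (about 48 KB/s, not 384 KB/s) and takes the comment's 5 KB page and 150 KB screenshot as given.

```latex
\[
\frac{384\ \text{kbit/s}}{8\ \text{bit/byte}} = 48\ \text{KB/s},
\qquad
\frac{48\ \text{KB/s}}{5\ \text{KB/page}} \approx 9.6\ \text{pages/s},
\qquad
\frac{150\ \text{KB}}{48\ \text{KB/s}} \approx 3.1\ \text{s per screenshot}
\]
```

On that reading the link carries closer to ten text-only pages per second than sixty, and ten visitors sharing it would get about 4.8 KB/s each, so a single 150 KB screenshot would take each of them roughly half a minute. That only strengthens the point about keeping images small.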
You got dugg talking about how to respond to being dugg. Wonderful!
Paul:
I wouldn't say I got dugg as bad as some do because the machine is up. However, I did reboot it because I couldn't get in with SSH and when I walked over to the KB everything was frozen. It's working fine now though and the traffic is more intense so I'm not sure what happened. The CPU has been 90% idle so I think (as I said in my article) that bandwidth will be the problem.
I'm sure before the end of the day the folks from digg will succeed :)
Your CPU can be 90% idle *and* still your server is exhausted.
That's because the major bottleneck in (web)servers is the hard disk.
I don't know of any tool to measure the load on a hard disk. Does anyone know of one?
Another tip: use static pages as much as possible. Then cached pages and at the end dynamically generated pages. In dynamically generated pages, keep SQL-queries as few as possible. Also optimize all your SQL-queries. The database is often the slowest part of a webserver with dynamic content.
I can't be bothered to read your comment policy before I post a comment.
Edwin:
I use iostat to get an overview of HDD activity; the output is more than detailed enough for me. I hear there is another utility called watch that is good as well, but I've never used it.
Frankie:
If you're talking about comments on my site, I'm a bit careful with that because of spam. Apologies for any inconvenience it may have caused you.
theCreator:
Trying out lighttpd is my next project. It (evidently) works great with RAILS and it's possible to use WordPress as well (link).
MaxKeepAliveRequests: because if you don't limit the number of requests I can indefinitely use up a whole lot of clients.
MaxClients 500 -- that one you have to be careful with. You need to run tests where you actually consume the max clients to see if you start to swap and/or hit an I/O wait bottleneck before you get there, which you almost certainly will with many types of dynamic pages running with 500 clients. For example, if you have PHP pages averaging 8 MB allocation, 500 clients means you're using 4 GB just for PHP.
I set MaxClients to 500 and I get an error message about ServerLimit being set too low at 256 for that MaxClients. I added the line ServerLimit 500, but Apache still isn't reading it and gives me the same error about ServerLimit being set to 256.
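The arithmetic in this comment can be written out and turned around to estimate a ceiling for MaxClients. The 8 MB per process comes from the comment; the 400 MB of RAM left over for Apache is a purely hypothetical figure for a machine like the article's 512MB server, and would need to be measured on a real system.

```latex
\[
500 \times 8\ \text{MB} = 4000\ \text{MB} \approx 4\ \text{GB},
\qquad
\text{MaxClients} \lesssim
\frac{\text{RAM available to Apache}}{\text{memory per process}}
= \frac{400\ \text{MB}}{8\ \text{MB}} = 50
\]
```

Under those assumptions a setting nearer 50 than 500 would keep the server out of swap, though lighter pages would allow a higher value.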
Mike:
Did you add a line so that now you have two MaxClients directives? If so then get rid of one of them and just keep one that says MaxClients 256.
If that's not the problem, I don't know how to help you. The Apache package provided in the default apt-get repository doesn't have an issue with it. What distribution of Linux are you using?
err...
"...and just keep the one that says MaxClients 256."
should read
"...and just keep the one that says MaxClients 512."
That's what I get for reading your post and typing at the same time.
If we are talking about a LAMP site, the bottleneck is the database queries; Apache comes after that. Once you are slashdotted, you have to make a static version of that page and upload your files (images, videos, maybe CSS and JS files) to a free hosting service, or maybe several. That will let you survive and save on bandwidth costs.
Alternatively, you could set up a reverse proxy. This doesn't even have to be on your own site.
I've documented it here ( http://blog.subverted.net/?p=351 ), decided to have something in place before the next time I was slashdotted.
I think some of the advice here should be adjusted:
1) DO NOT LEAVE KeepAliveTimeout AT 15!!! By taking it down to 2 or 3, you will easily triple the number of requests that your server can handle (assuming your processor is fast enough). Most people don't realize this, but you are slitting your own throat by letting clients tie up valuable Apache processes for 15 seconds, even though they are just sitting there! When a Slashdotting comes down the wire, you need every available Apache process to be working as much as possible. Don't believe me? When the Slashdotting does come, open up your /server-status page and look at the status of 95% of the processes -- they will all be "K", meaning they're just sitting there, looking stupid and consuming valuable memory resources.
2) The previous comment about Apache complaining about MaxClients being above 256 is correct; you have to change a value in the Apache source code to get it above that. Some distributions may or may not do that, so make sure you check your error_log when starting Apache to see whether increasing MaxClients will work.
3) Don't put MaxClients at 500 unless you know what you are doing. "I think 500 is about right for most people's needs" is wrong, and could end up killing your site. Although it is partially dependent upon your CPU, the major factor here is typically how much available RAM you have. Comment #13 by Not an Admin was exactly correct -- if each Apache process uses 8 MB of memory, then you'll need multiple gigabytes of free memory. If you don't have enough memory for all of those processes, then you'll start swapping and be completely hosed. For a very rough guideline, assume that each Apache process will consume anywhere from 2 to 8 MB of memory. Divide that into your *available* RAM, and you'll get a much better idea of what value to use for MaxClients.
4) The article didn't mention this at all, but you should remove any Apache modules that you don't need. Every module you remove will slim down Apache at least a little, allowing you to slowly inch up that MaxClients value. This isn't as valuable as you might think (due to the way Linux will share the module's non-writable memory between processes), but it will help some.
5) As mentioned by Saiyine: if you still have extra CPU cycles to spare, use mod_gzip -- it will free up a bit of bandwidth, allowing you to service even more requests. Even better would be to pre-compress static pages on disk, but many people don't have that luxury.
6) Do NOT, under any condition, perform an SQL query on every hit to your web site. Trust me, you will kill yourself if you have to do even a light SQL query under Slashdotting conditions. Either don't use a database, or find a way of caching SQL results so you don't hit the database every single time someone views a page.
There are many more things that you could do to improve performance, but this should get people thinking along the right lines. The primary lesson to learn is: memory is usually your most valuable resource when dealing with Apache -- use it as efficiently as possible.
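The first point in the list above, together with the /server-status check it mentions, might look something like this in httpd.conf. It is a sketch under assumptions: mod_status has to be loaded, the access-control lines use the Apache 2.0/2.2 syntax, and restricting the page to 127.0.0.1 is an illustrative choice rather than anything the comment specifies.

```apache
# Keep persistent connections, but free idle processes quickly
KeepAlive On
KeepAliveTimeout 2

# Status page for watching process states during a traffic spike
# (assumes mod_status is loaded; Apache 2.0/2.2 access syntax)
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```

On the status page, processes shown as "K" are waiting on a keep-alive connection, so a scoreboard full of them suggests the timeout is still too long.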
And if all else fails, next time try Lighttpd.
=)
Do a few google searches for Lighttpd vs. Apache benchmarks and you'll see why.
There is one other trick, but I can't recall the attribute. Basically, Linux will update a 'last accessed' attribute on a file every time it is accessed. The key is to turn this off. This will turn a bunch of irrelevant disk writes off, and you'll only need disk reads, which the OS can cache. So you get kind of a double improvement on hard drive utilization.
That attribute is 'noatime', and it can be added to the options section of your /etc/fstab file. On a different topic, but along similar lines to filesystem attributes: if you have a separate partition for /tmp, you should set it to nodev,noexec,nosuid as well. Doing so will add a bit more protection against hackers dropping a zombie bot in your /tmp and running it.
Cheers!,
--
Carl
This article is pretty much useless, except for the comments made by Devin.
Devin's comments are very good. I think that Mikey speaks too glibly about the usefulness of this article: even if the httpd.conf information is not particularly helpful, the point about page size is a good point. Keeping things small is absolutely essential; I wish that more designers would appreciate this.
It is importantly correct, in my estimation, about the necessity of caching -- or more pointedly NOT running queries for every request. Without doing any math here, I'm thinking that an IDE drive spinning at 7200 will not be able to keep up with the necessary IO that would result from the vast quantity of requests that /. generates.
Anyway, it's always valuable to have a discussion about these sorts of things, so thanks for this Adam!
Cheers,
Brian.
Forgot to mention that KeepAlive defaults to off:
#
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive Off
At least it does in Server version: Apache/2.0.40
Server built: Sep 4 2002 17:20:34
A really useful one I set during a slashdotting is the Apache Expires directive. Set this for dirs that contain (mostly) static files like images, so that the browser is told the file expires (say) one week in the future. If your pages have lots of little images - bullet markers, that sort of stuff - this is a big win.
Apache2 and Lighttpd seem to benchmark fairly close to each other (as in, both sides can produce reproducible benchmarks in their favor).
There are a lot of reasons for us to avoid a monoculture, though, so I would not discourage using Lighttpd. It seems to be a solid server, but if you have a well-tuned Apache 2 server, you probably won't see much difference in performance between them. A well-tuned Zope server could outperform a badly tuned Apache 1.3 server, but Apache 2 can scream if you spend some time tuning and benchmarking it. Personally, I have a soft spot for servers like dhttpd, fnord, roxen, micro-httpd, lighttpd, mzscheme, bozohttpd, cherokee, mathopd, boa, yaws, aolserver, thttpd, caudium, and zope, but I still use Apache a fair amount. It is easy to set up, secure, and administer, in addition to being well supported by web apps.
Tuning your web server, shutting off all unneeded extensions, making the content as static as possible, and caching everything you can will get you a lot of the way to being able to withstand heavy traffic. After that you are going to have to benchmark your changes, figure out where your bottleneck is, and go from there.
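A minimal sketch of that Expires setup, assuming mod_expires is loaded; the directory path is hypothetical and the one-week lifetime simply follows the example in the comment.

```apache
# Assumes mod_expires is loaded; /var/www/images is a made-up path
<Directory /var/www/images>
    ExpiresActive On
    # Tell browsers that files here stay fresh for a week after they are fetched
    ExpiresDefault "access plus 1 week"
</Directory>
```

Returning visitors then reuse their cached copies instead of requesting every small image again, which saves both connections and upload bandwidth.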
+1 on that endorsement of Devin's comments as the sane alternative to the advice in the original article, only I would add one further tip:
Do not use .htaccess files in your most travelled directory as this causes twice the traffic for every page and graphic. Turn options off and set your config in httpd.conf (or conf.d) instead.
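One way to do what this tip describes is to disable .htaccess processing in the main configuration, so Apache stops looking for those files on each request. This is a sketch; the directory path is hypothetical, and any rules that lived in .htaccess would have to be moved into the same block.

```apache
# /var/www is a made-up document root
<Directory /var/www>
    # Ignore .htaccess files entirely; put per-directory settings here instead
    AllowOverride None
</Directory>
```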
Also a caveat: when I added that CDM url extension to this post, the form-post would no longer work; that's logical if you think about it (no site should accept form data that's not from its own URL) but could really trip you up if you weren't expecting it.
Also, I notice the many endorsements for mod_gzip -- I thought mod_gzip was dead! I can't find any current maintained code; what I can find is a royal pain to install, and doesn't like apache2.
I use mod_deflate instead, nearly the same function, easy as a single line of config to install, and pre-included by most Linux distro bundles.
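The single line in question is probably something like the following. It is a sketch that assumes Apache 2.x with mod_deflate already loaded, and the list of MIME types is an illustrative choice.

```apache
# Assumes mod_deflate is loaded; compress text responses, leave images alone
AddOutputFilterByType DEFLATE text/html text/plain text/css
```

Images such as JPEGs are already compressed, so the gain comes from HTML, CSS, and other text content.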
Another noteworthy item from the original article: Is it my imagination, or did the author neglect to give us the URL of that Slash-dot-proof $150 server? Did I just miss it? ...
I have a server on a fast connection (100Mbit/s). The disks are very fast SCSI disks in a RAID 5 setup and the CPU is about 800MHz. But I only have 256MB of memory!
What is the best performance tuning solution in this scenario, when the disk, internet connection, and CPU are fast but RAM is low?
Watson:
Option 1: Use Lighttpd, it has a very low memory footprint and is faster than Apache in many implementations.
Option 2: If you must use Apache, make sure you load only essential modules, don't set your MaxClients too high, etc. You can look at some sites that are geared towards that kind of thing, but in the end your best bet is to incorporate the things that work at those sites and see how your memory usage pans out. It will take some tweaking but the payoff will be worth it.
How did this person get qualified to write an article on Apache?
There doesn't seem to be any reference to what the "ideal" worker MPM values should be
(These are the defaults)
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
For Coral, you need to append '.nyud.net:8090' as opposed to what you said.