lighttpd and nginx are silly ways to try to reduce load


raymor
10-08-2009, 12:30 PM
Recently it's become a fad to put lighttpd or nginx
in front of Apache in a proxying configuration. The idea
is to reduce load. However, many people, including
normally competent web hosts, follow this fad without
first taking a close look at what they are doing or at
what is causing the load in the first place. Indeed, for
about nine months lighttpd DID make sense for a very few
sites, but now that other options work better and are
simpler, recent versions of lighttpd never make sense once
you do some testing and look into it. I had actually planned
to write a more extensive article on this, with detailed
test results and graphs proving my conclusions, as soon
as I got some spare time. It looks like my next "spare
time" will be sometime in 2013, so here's the information
you need, without the pretty graphs.

Webmasters generally say the purpose of using
lighttpd or nginx is "to reduce load" or "make the site
faster", which are almost the same thing. Adding lighttpd
in front of Apache, as a proxy, is a really, really bad way
to try to reduce load. Replacing Apache with lighttpd is
almost as bad. There's a much simpler way that gives you
lower load than lighttpd, better security, and far more
flexibility, without lighttpd's drawbacks.

The first thing I do regarding load takes about five or
ten minutes and often reduces load by 60%-80%. That varies
a lot based on the type of site. Since hard drives are slow
compared to the rest of the system, a site with lots of big
videos can be bottlenecked by the raw speed at which the
drives can deliver the videos, so it will see less of a
gain, but an 80% reduction in load average can often be
achieved.

I just had to write about this because the 60%-80%
load reduction is SO easy yet most people miss it, even
fairly knowledgeable sysadmins. Most web servers are
pretty much using the default RedHat / CentOS configuration
of Apache which includes pretty much every module that
anyone might ever possibly want to use. That's around 55
modules in total, including modules for authenticating against
Microsoft workgroup servers, modules for trying to automatically
fix misspelled links, six different proxying and load balancing
modules that you almost surely don't need, suexec, which
would be STUPID to use on a dedicated server since it's a
giant security hole, modules for hosting tens of thousands
of home pages using mysite.com/~user/ URLs, modules for
serving the same pages in several different languages, etc.
You are probably using about 6 of these modules, or about 11%.
The other 89% of the Apache code you are loading is complete
waste. Each of these modules gives you features that you
CAN use, if you need them, such as FrontPage extensions.
You can load that module if you need FrontPage extensions,
but if you're not using them it's useless load and you
can just comment out that module and save the load. Contrast
that with lighttpd, where you get what you get. If you
want FrontPage extensions, tough shit - you can't have them
because there's no lighttpd module for them.

We did the testing and this is what we found. By simply
commenting out unused modules and changing a couple of other
simple settings, Apache is faster than lighttpd, with lower
load. Again, if you want low load, setting up Apache correctly
will give you lower load than lighttpd or nginx.
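
To make the "comment out unused modules" step concrete, here is
roughly what the edit looks like. This is only a sketch: the file
path is the usual RedHat / CentOS location (an assumption), the
module names are standard Apache 2.2 ones matching the features
listed above, and the split between "keep" and "drop" is
illustrative, not a tested recommendation. Which modules your
site actually needs is something you have to check yourself.

```apache
# /etc/httpd/conf/httpd.conf (assumed default location)

# Keep what the site actually uses (illustrative selection)...
LoadModule mime_module modules/mod_mime.so
LoadModule dir_module modules/mod_dir.so
LoadModule log_config_module modules/mod_log_config.so
LoadModule rewrite_module modules/mod_rewrite.so

# ...and comment out what it doesn't.
#LoadModule speling_module modules/mod_speling.so
#LoadModule userdir_module modules/mod_userdir.so
#LoadModule negotiation_module modules/mod_negotiation.so
#LoadModule proxy_module modules/mod_proxy.so
#LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
#LoadModule suexec_module modules/mod_suexec.so
```

Directives elsewhere in the config may depend on a module you
just disabled, so run "apachectl configtest" before restarting
and fix or comment out anything it complains about.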

It used to be that lighttpd was very fast, largely
because it was very bare bones. Most "features" create some
small amount of load, so they left out 99% of the features
of something like Apache in order to make lighttpd fast.
What happened a while back is that a few people used lighttpd,
liked it, and started a fad. A bunch of people started using
lighttpd, each of them wanting a feature or two they were used
to from Apache. Many of those features were added, each one
making lighttpd a little bit slower. It is now measurably
slower than a properly configured Apache installation.
That is of course the key point. If you want a fast server
with low load, testing shows that setting Apache up correctly
gives you lower load than lighttpd does. Yet if and when
you do need to do any proxying, for example, you have six
different proxying modules to choose from that you can load
only if you need one of them, rather than wasting load on
the features you aren't using, as happens with lighttpd and
with your old Apache config.


I also mentioned security. You'll recall that in the
couple of years after PHP became the big fad over 90% of
all server hacks were due to PHP security holes. This was
to be expected because PHP was basically new, so the security
issues hadn't yet been found and addressed. Also, PHP was
designed with one thing in mind - to be easy to use. It
wasn't designed with security in mind, and it showed! Now
lighttpd is in the exact same position. Its newfound
popularity means that the code hasn't been analyzed by thousands
of programmers yet, as Apache's has been. Its security issues
haven't yet been addressed, but crackers are just now becoming
interested in finding and exploiting those issues. Also like
PHP, lighttpd was designed with one thing in mind - to be light.
It wasn't designed to be secure, and that will show.


Anyway, that's the first thing we do: get rid of the 49
or so unused Apache modules you're running for every hit.
The second thing I do is take a quick look at your log
files. I did that on one server today where a webmaster
was trying out lighttpd and I saw that there was a redirect
loop in one of his scripts. That loop was logging several
error messages per second. Each of those messages represents
ten bogus "hits" being generated by that error. At about four
messages per second, that one error was loading the server
with around 3,456,000 bogus hits per day (4 x 10 x 86,400
seconds). That's three and a half MILLION hits of load from
that one error. Doesn't fixing the script error make more
sense than dumping the most powerful server software available
in favor of one that hides the error because it's incapable of
handling that feature anyway? Fixing that error and taking
a few minutes on your Apache config will give you lower load
than lighttpd without sacrificing power or security.
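
For anyone who wants to do the same quick log check, here is a
minimal PHP sketch of the idea: tally identical error messages
so a runaway error floats to the top, then project its rate onto
a full day. The function names and the sample log lines are made
up for illustration, and the code assumes each line starts with a
bracketed timestamp, as in Apache's default error log.

```php
<?php
// Tally identical error messages so a runaway error (such as a
// redirect loop) floats to the top. Assumes each line starts with
// a bracketed timestamp, as in Apache's default error log.
function tallyErrors(array $lines): array
{
    $messages = [];
    foreach ($lines as $line) {
        // Drop the leading "[timestamp] " so repeats compare equal.
        $messages[] = preg_replace('/^\[[^\]]*\]\s*/', '', trim($line));
    }
    $counts = array_count_values($messages);
    arsort($counts); // most frequent message first
    return $counts;
}

// Project a per-second error rate onto a full day of bogus hits.
function bogusHitsPerDay(int $messagesPerSecond, int $hitsPerMessage): int
{
    return $messagesPerSecond * $hitsPerMessage * 86400; // seconds in a day
}

// Made-up sample lines, for illustration only.
$sample = [
    '[Thu Oct 08 12:30:01 2009] [error] redirect limit exceeded: /join.php',
    '[Thu Oct 08 12:30:01 2009] [error] redirect limit exceeded: /join.php',
    '[Thu Oct 08 12:30:02 2009] [error] File does not exist: /favicon.ico',
    '[Thu Oct 08 12:30:02 2009] [error] redirect limit exceeded: /join.php',
];

foreach (tallyErrors($sample) as $message => $count) {
    echo "$count  $message\n";
}

// 4 messages per second, 10 bogus hits each: prints 3456000
echo bogusHitsPerDay(4, 10), "\n";
```

On a real server you would feed it the lines of your actual
error log, for example with file('/path/to/error_log'), where
the path depends on your setup.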

Hell Puppy
10-08-2009, 10:16 PM
Good stuff for us tech heads.

I'm guessing most people today just openly trust that their hosts' SAs are competent.

I'll say this, even the most sloppily configured Apache will rarely be the source of your bottleneck on today's web. If it is, you've got a good problem to have (lots of traffic).

When I see sites running like a slug today it's usually some piss poor database design and config with some sloppy PHP running on the front of it.

raymor
10-10-2009, 07:12 PM
When I see sites running like a slug today it's usually some piss poor database design and config with some sloppy PHP running on the front of it.

Absolutely. How many times have you seen pretty much this exact same code, just
with the variable names changed to protect the clueless:

$i = 0;
while ($i < $max)
{
    // One query per row: $max separate trips to the database.
    $result = mysql_query("SELECT thing, thang, thung FROM stuff WHERE id=$i");
    $row = mysql_fetch_array($result, MYSQL_ASSOC);

    echo "Name : {$row['thing']} <br>" .
         "Subject : {$row['thang']} <br>" .
         "Message : {$row['thung']} <br><br>";
    $i++;
}

Umm, excuse me Mr. "programmer", wouldn't that run 500 times faster if you did the
query once instead of 500 times?:

// One query for all the rows, then loop over the result set.
$result = mysql_query("SELECT thing, thang, thung FROM stuff WHERE id < $max ORDER BY id");
while ($row = mysql_fetch_array($result, MYSQL_ASSOC))
{
    echo "Name : {$row['thing']} <br>" .
         "Subject : {$row['thang']} <br>" .
         "Message : {$row['thung']} <br><br>";
}

and you know none of the tables have any indexes.
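
If you want to see the two patterns side by side, here is a
self-contained sketch. It uses PDO with an in-memory SQLite
database rather than MySQL, which is my assumption so that it
runs without a database server. The table and column names
follow the example above, and the test rows are generated filler.

```php
<?php
// In-memory SQLite stands in for MySQL here (an assumption, so the
// sketch runs anywhere PDO's sqlite driver is installed).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE stuff (id INTEGER PRIMARY KEY, thing TEXT, thang TEXT, thung TEXT)');

// Fill the table with generated filler rows.
$max = 500;
$insert = $db->prepare('INSERT INTO stuff (id, thing, thang, thung) VALUES (?, ?, ?, ?)');
$db->beginTransaction();
for ($i = 0; $i < $max; $i++) {
    $insert->execute([$i, "name$i", "subject$i", "message$i"]);
}
$db->commit();

// Slow pattern: one query per row, $max separate executions.
$start = microtime(true);
$one = $db->prepare('SELECT thing, thang, thung FROM stuff WHERE id = ?');
$slowRows = [];
for ($i = 0; $i < $max; $i++) {
    $one->bindValue(1, $i, PDO::PARAM_INT);
    $one->execute();
    $slowRows[] = $one->fetch(PDO::FETCH_ASSOC);
}
$slowTime = microtime(true) - $start;

// Better pattern: one query, then loop over the result set.
$start = microtime(true);
$all = $db->prepare('SELECT thing, thang, thung FROM stuff WHERE id < ? ORDER BY id');
$all->bindValue(1, $max, PDO::PARAM_INT);
$all->execute();
$fastRows = $all->fetchAll(PDO::FETCH_ASSOC);
$fastTime = microtime(true) - $start;

printf("per-row: %d rows in %.4fs\n", count($slowRows), $slowTime);
printf("single:  %d rows in %.4fs\n", count($fastRows), $fastTime);
```

With an in-memory database there is no network round trip, so
the gap will look much smaller here than against a real MySQL
server, where every query pays for a trip to the database. Note
that the id column in this sketch is a primary key, so it is
indexed automatically. A table with no key or index at all would
make the per-row version even worse.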



I'm guessing most people today just openly trust that their hosts' SAs are competent.


Yeah, I'm hoping that in the next conversation between host and webmaster, when one
of them suggests "let's add another pointless proxy web server onto this machine", the
other will reply "how about commenting out the 85% of crap we're currently running but
not using instead". Competence doesn't seem to be any more common in the hosting
industry than in so many others. Sure, there are a few knowledgeable people, but we
regularly spend significant time educating hosts on the basics.

synapse
10-10-2009, 07:48 PM
for those really wanting to learn more about infrastructure performance, lots of info and examples from the net's biggest sites ... http://highscalability.com/