Gentoo - Apache Configuration #1

As we know from the previous article, Gentoo uses a different layout from other systems - let's move on and take a look at the principal files.

We're not actually going to change a lot at this point, just look at the main settings and see what they mean and what a change will actually do.


Defaults

Why no specific changes to the default?

Well, it's difficult to give a definitive configuration as there are so many variables to consider: expected site traffic, Slice size, site type, etc.

Remember that it is very unlikely the default Apache configuration will be ideal for your Slice. Don't be intimidated by the thought of optimising the install — following the next couple of articles will allow you to understand the meaning behind the concepts.

You'll also find the same principles apply to any web server; the terminology may differ, but the concepts remain the same.

My advice is very simple: experiment. Find what works best on your setup.

/etc/apache2/httpd.conf

We'll start with the main file that glues the entire Apache configuration together. You can view it in your favourite editor:

nano /etc/apache2/httpd.conf

All the lines starting with '#' are comments; they are ignored by Apache but very useful to humans. If you find a directive that you want to know more about, the comment paragraph above it is the first place to look. Have a skim through the comments in the first part of this file.

Scroll down the file. The first section, where each line starts with 'LoadModule', was generated when we emerged Apache, and the recommendation is not to touch it.

The next section determines which Linux 'user' and 'group' the Apache process will run as:

User apache
Group apache

This will be useful in a later article – when we come to set file ownership and permission attributes for our site content.

The next two directives import other, separate configuration files:

Include /etc/apache2/modules.d/*.conf
Include /etc/apache2/vhosts.d/*.conf

According to these settings, Apache will include all the files ending in '.conf' from the 'vhosts.d' and the 'modules.d' subdirectories as part of its overall configuration.

I thought I'd highlight this now, as it is a common problem to create a file named something like '/etc/apache2/vhosts.d/mydomain.com' then wonder why Apache totally ignores it. The solution is to append '.conf' to the file name. Seeing these directives helps to understand why the '.conf' extension is necessary.
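
The effect of the '*.conf' pattern can be seen with a quick shell sketch. It works in a scratch directory rather than the real '/etc/apache2/vhosts.d', the file names are only for illustration, and the shell's own pattern matching stands in for Apache's wildcard handling. The 'conf_files' helper is a made-up name for this example:

```shell
# Scratch directory standing in for /etc/apache2/vhosts.d (illustrative file names).
dir=$(mktemp -d)
touch "$dir/mydomain.com" "$dir/mydomain.com.conf" "$dir/00_default_vhost.conf"

# List only the names that a '*.conf' pattern matches.
conf_files() {
    (cd "$1" && ls *.conf)
}

conf_files "$dir"
```

Only the two names ending in '.conf' are listed; 'mydomain.com' is skipped, which is exactly the trap described above.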

/etc/apache2/modules.d/00_default_settings.conf

Open this file in your favourite editor:

nano /etc/apache2/modules.d/00_default_settings.conf

Let's scroll down slowly and take a look at some of the directives.

Timeout

Default: 300

This is the general timeout for connections. If nothing happens on a connection for this number of seconds, Apache will drop the connection.

The default is deliberately set high to allow for varied situations. You can reduce this to something more specific for your situation, such as 45 or even lower. A decrease may also help in reducing the effects of a DoS attack.

KeepAlive

Default: On

This allows a web browser to re-use a single TCP/IP connection. When a web page is downloaded there are usually quite a few other objects that the page links to: images, JavaScript, AJAX queries, style-sheets, etc.

Usually a browser will open several connections at once to download a lot of these in parallel. However, there are often more page components than there are connections.

Having this setting turned on can save a visitor's browser the overhead of closing one connection and opening a new one for each file it needs to download, making page downloads more efficient.

MaxKeepAliveRequests

Default: 100

With the 'KeepAlive' directive set to 'On' we'll have persistent connections. This directive sets the maximum number of requests per persistent connection.

Keep this high for maximum efficiency. If you have a site with lots of images, JavaScript, CSS, etc., try increasing this to 200.

KeepAliveTimeout

Default: 15

So how long does the persistent connection wait for the next request? The default setting is very high and can easily be reduced to 2 or 3 seconds. If no new requests are received during this time the connection is killed.

What does this mean? Well, once a connection has been established and the client has requested the files needed for the web page, this setting says "sit there and ignore everyone else until the time limit is reached or you get a new request from the client".

Why would you want a higher time? In cases where there will be a lot of interactivity on the site. However, in most cases, people will go to a page, read it for a while and then click for the next page. You don't want the connection to sit there doing nothing and ignoring other visitors.
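
If you want to try lower values without touching the live file, something like the following sketch may help. It builds a scratch copy containing only the four defaults discussed above, writes a tuned copy with Timeout at 45 and KeepAliveTimeout at 3 (the figures suggested in the text), and reads the results back. The 'show_directive' helper is a made-up name, and the sample contents are an assumption, not the full Gentoo file:

```shell
# Scratch copy holding the four defaults discussed above (not the real file).
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Sample excerpt only
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15
EOF

# Print the value of one active (uncommented) directive.
# $1 = directive name, $2 = config file path
show_directive() {
    awk -v name="$1" '$1 == name { print $2 }' "$2"
}

# Write a tuned copy with a lower Timeout and KeepAliveTimeout.
tuned=$(mktemp)
sed -e 's/^Timeout .*/Timeout 45/' \
    -e 's/^KeepAliveTimeout .*/KeepAliveTimeout 3/' "$conf" > "$tuned"

show_directive Timeout "$tuned"
show_directive KeepAliveTimeout "$tuned"
```

The same helper can be pointed at '/etc/apache2/modules.d/00_default_settings.conf' to see what your server currently uses. Bear in mind that Apache only picks up edits to the real file after a reload or restart.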

Other Settings

We won't cover the other settings in this file. I do recommend reading through the comments in the file right to the end though. The other settings have very sensible defaults and you'll rarely need to change them.

/etc/apache2/modules.d/00_mpm.conf

'mpm' stands for 'Multi Processing Module'. Only one mpm can be loaded at a time in Apache — it basically decides how Apache will handle concurrent connections.

By default Apache will be installed using the 'prefork' mpm. If you want to know more about the differences between the various mpms, I will point you towards the official Apache docs (which are actually very good).

The section of this file we are interested in is the one that matches our default 'prefork' mpm:

<IfModule mpm_prefork_module>
        StartServers            5
        MinSpareServers         5
        MaxSpareServers         10
        MaxClients              150
        MaxRequestsPerChild     10000
</IfModule>

Again, it's difficult to give a suggestion here as to what is best for your site. Have a read of the definitions below and see if anything could be improved when you consider the content served by your webserver, the number of your visitors, and their habits on your site.

StartServers: number of child server processes created at start-up.

MinSpareServers: minimum number of child server processes not doing anything (idle).

MaxSpareServers: maximum number of child server processes not doing anything (idle) — any more than the maximum will be killed.

Don't set Max lower than Min. If you do, Apache will ignore the silly number and set Max to Min+1.
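
As a rough sketch of that adjustment, the little function below returns the MaxSpareServers value that would actually be used for a given Min and Max. The name 'effective_max_spare' is made up for this example, and it assumes, as I understand the Apache documentation, that a Max equal to Min is bumped up in the same way as a lower one:

```shell
# Sketch of the adjustment: if Max is not above Min, the value used is Min + 1.
# $1 = MinSpareServers, $2 = MaxSpareServers
effective_max_spare() {
    min=$1
    max=$2
    if [ "$max" -le "$min" ]; then
        echo $((min + 1))
    else
        echo "$max"
    fi
}

effective_max_spare 5 10   # the values from the block above: Max is kept
effective_max_spare 5 3    # Max set lower than Min: adjusted to Min + 1
```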

MaxClients: sets the maximum number of simultaneous requests that Apache will handle. Anything over this number will be queued until a process is free to action the request.

MaxClients is not the same as the maximum number of visitors you can have. It is the maximum number of concurrent requests.

Remember the KeepAliveTimeout? It was set low so that, while the original (now 'idle') visitor sits there reading your web page, a new (active) request can be actioned. If the MaxClients limit has been reached, the new request will be queued, ready for the next available process.

In most cases, the client is not 'active'. Take this page for example: you requested it (using an active process) and then spent a while reading it which uses no processes. You are presently 'idle' (as far as the server is concerned!).

MaxRequestsPerChild: sets how many requests a child process will handle before terminating. A value of zero means the process will never die; the file shown above sets it to 10000.

Why change this if the Max numbers are set as shown above? Well, it can help in managing your Slice's memory usage.

If you set a non-zero value, you give a child a finite number of requests before it will die. This will, in effect, reduce the number of processes in use when the server is not busy, freeing memory.

Freeing it for what though? If other software needed memory then it would also need it when the server is under load. It is unlikely you will have anything that requires memory only when the server is quiet.
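
To put rough numbers on the memory question, it can help to estimate what Apache might use when every allowed child process is busy at once. The sketch below simply multiplies MaxClients by an assumed amount of memory per child. The 20 MB figure is purely hypothetical (measure your own processes), and 'worst_case_mb' is a made-up helper name:

```shell
# Worst-case memory if every allowed child process is busy at the same time.
# $1 = MaxClients, $2 = assumed memory per child in MB (hypothetical, measure your own)
worst_case_mb() {
    echo $(( $1 * $2 ))
}

worst_case_mb 150 20   # MaxClients from the file above, with an assumed 20 MB per child
```

With those assumed figures the answer is 3000 MB, which suggests why the MaxClients shown above may be too generous for a small Slice. Your own numbers will differ, so treat this as a starting point for experimenting rather than a rule.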

Summary

Quite a lot here but as you go through the different settings you will see that the theory is quite simple. Naturally, there is a lot more to it than this article (or set of articles) can go into.

In the second Apache configuration article we will look at other settings that will increase your web server's efficiency and help in increasing the security of your Slice.

matiu

Article Comments:

Michael Battle commented Tue Nov 15 15:30:03 UTC 2011:

great tutorials. this is just a slight correction. on page http://articles.slicehost.com/2009/9/28/gentoo-apache-layout-1 in the last heading /etc/apache2/modules.d/mpm.conf, the actual file is /etc/apache2/modules.d/00_mpm.conf
