Configuring the Apache MPM on Gentoo

Now that you know where the files are, let's look at how to tell apache to stay within the memory available to your Gentoo server.


The apache MPM

Part of the apache web server installation is the "MPM", which stands for "Multi-Processing Module". The MPM determines the mechanism apache uses to handle multiple connections. Now that we have an idea of where apache keeps its configs we'll cover in detail how the main MPMs are configured and how you might optimize their settings for your environment.

The difference

The first thing to know is that there are several MPMs that apache can use, but the main MPMs are "worker" and "prefork".

The worker MPM primarily handles connections by creating new threads within a child process, while the prefork MPM spawns a new process to handle each connection. The worker MPM is considered more efficient, but some modules aren't stable when running under the worker MPM. The portage apache package defaults to the prefork MPM for the best compatibility with modules.

Most users won't notice a difference in performance between the MPMs but it's good to know the options are there. If you find your site is having trouble scaling, for example, you might want to switch to the worker MPM even though it isn't recommended by a module you're using. For a change like that, you'll want to consult your module's documentation to see what it may have to say about apache MPMs.

Which am I running?

To check which MPM your apache installation is using, run the apache2 program with the "-l" option, like so:

/usr/sbin/apache2 -l

The output will be a list of compiled-in modules by the names apache uses for them, like:

Compiled in modules:
  core.c
  prefork.c
  http_core.c
  mod_so.c

That's not a full list of the modules apache loads when it starts, just a list of the modules that are compiled into your base apache installation. That includes the MPM, which in this example is prefork.
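If you'd rather not pick the MPM out of the list by eye, a small filter can do it for you. This is only a sketch: the function name detect_mpm is made up for this example, and it only recognizes the two MPMs covered in this article.

```shell
# Report which MPM appears in "apache2 -l" style output read from stdin.
# detect_mpm is a hypothetical helper name; only prefork and worker are
# recognized, anything else is reported as "unknown".
detect_mpm() {
    awk '
        /^[ \t]*(prefork|worker)\.c[ \t]*$/ {
            name = $1
            sub(/\.c$/, "", name)   # strip the ".c" suffix apache prints
            print name
            found = 1
        }
        END { if (!found) print "unknown" }
    '
}

# Example using the sample output shown above. On a live system you would
# pipe in the real command instead:  /usr/sbin/apache2 -l | detect_mpm
printf 'Compiled in modules:\n  core.c\n  prefork.c\n  http_core.c\n  mod_so.c\n' | detect_mpm
```

With the sample list from this section the filter prints "prefork".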

Changing the MPM

In Gentoo the MPM apache uses is set at compile time. By default Gentoo's apache uses prefork, but you can change that by making a couple adjustments on your system.

The first thing you'd likely need to change is the USE flag for threading. If apache is built with the "threads" USE flag active, apache will default to building with the worker MPM. With that USE flag disabled (either by default or by specifying it as "-threads"), apache will default to building with the prefork MPM.

If you want to explicitly use a particular MPM (bearing in mind that the threads flag has to be set appropriately either way), you can do it with the "APACHE2_MPMS" environment variable. There are two ways to pass this to emerge so the setting will be used when compiling apache.

The first approach is defining that variable when you run emerge. You'd put the variable first on the line followed by the emerge command, as in:

APACHE2_MPMS="prefork" emerge apache

That works when you want to temporarily set USE flags, too:

APACHE2_MPMS="prefork" USE="-threads" emerge apache

If you want to keep that value stored somewhere (if you're uninstalling and reinstalling for testing, for example) you can add it to an environment variable file. To hold the variable, create the file:

/etc/env.d/99apache2

Inside you can put that or another apache-specific environment variable. To tell it to use the worker MPM you would just add this line to the file:

APACHE2_MPMS="worker"

There are other variables that can be set for apache's compile, including a list of modules that should be compiled for apache's use. To make sure apache used the prefork MPM and a specific set of modules, you could create the environment variable file containing:

APACHE2_MPMS="prefork"
APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias"

You can see which options (USE flags, MPM and modules) emerge would use to build apache by running it in "pretend" mode with verbose output:

emerge -pv apache

More details on the options that can be passed to apache in this fashion are available at Gentoo's web site.

Once you've added environment variables to /etc/env.d you'll need to update your environment to use them:

sudo /usr/sbin/env-update && source /etc/profile

IfModule

Before we dive into the settings for each MPM let's look at how apache knows which settings it should actually read. The "IfModule" directive in apache's config files lets you put a set of instructions into a block that will only be read by apache if the indicated module is loaded.

Note that the module name used by IfModule refers to the module as it's known in the server's code, not its "friendly" name. An IfModule block for the SSL module (mod_ssl) would start with:

<IfModule mod_ssl.c>

Anything in that IfModule block would only be read into the server configuration if mod_ssl is active. That would be the place where SSL-specific directives like "Listen 443" should be found.

Similarly, while you can tell which section applies to which MPM by looking at the comments in the main config file, apache can tell by the IfModule directives. The settings for the prefork MPM can be found in the block that starts:

<IfModule prefork.c>

And the block containing worker MPM settings starts with:

<IfModule worker.c>

The main reason we're looking at IfModule is to make sure you can see why different MPM blocks can contain the same directive, but different values. Those settings don't conflict, because apache will only have one MPM loaded when it starts.
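To see that idea in action, here's a rough sketch that pulls one directive out of the block for one module, the way apache effectively reads only the block for the MPM it has loaded. The helper names (sample_config, directive_in_block) are made up for this example, the sample text is abbreviated from the default blocks listed later in this article, and the filter doesn't try to cope with nested IfModule blocks.

```shell
# Abbreviated copy of the two MPM blocks, standing in for a real config file.
sample_config() {
    cat <<'EOF'
<IfModule mpm_prefork_module>
    StartServers          5
    MaxClients          150
</IfModule>
<IfModule mpm_worker_module>
    StartServers          2
    MaxClients          150
</IfModule>
EOF
}

# Print the value of a directive inside the IfModule block for one module.
# Reads config text on stdin; $1 is the module name, $2 is the directive.
# Assumes blocks are not nested inside each other.
directive_in_block() {
    awk -v module="$1" -v directive="$2" '
        $1 == "<IfModule" && $2 == (module ">") { inside = 1; next }
        $1 == "</IfModule>"                     { inside = 0; next }
        inside && $1 == directive               { print $2 }
    '
}

# The same directive name yields a different value depending on the block.
sample_config | directive_in_block mpm_prefork_module StartServers
sample_config | directive_in_block mpm_worker_module StartServers
```

The first call prints 5 and the second prints 2: same directive, different block, no conflict.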

It's also useful to note that when you start installing your own modules and configuring them later, IfModule will come in handy. So long as you keep your instructions inside an IfModule block, your web server will be able to start even if that particular module can't be loaded.

Configuring the MPMs

So why all this attention for the MPMs in an apache set-up tutorial? It's not so much because of how the MPMs work as it is that the configuration of your server's MPM can affect how much memory it tries to use. That in turn can affect how responsive the server will be when your site sees a spike in traffic. We're looking at it here instead of in a series on tuning apache performance because these settings don't just affect performance — they can affect the server's stability if the settings aren't suitable for your environment.

While all this detail on MPMs is probably kind of headache-inducing at this point, it's worth it in the long run, honest. At the least, skim through the section for your MPM's settings and revisit them later, after you've put all your virtual hosts in place and taken a couple aspirin.

We can't give a hard and fast "set this option to this" recommendation since server environments vary wildly. What we can do is look at the defaults for each MPM and highlight how changing some can affect your server. The settings are largely similar so you may notice some duplication between the descriptions.

Apache's default MPM settings are probably best suited to a server environment with 1 GB of memory available. Keep that in mind when you look over the defaults, especially the MaxClients setting.

The prefork MPM

Default:

<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>

It's important to keep in mind that with the prefork MPM a new process (server) is spawned for each connection the server handles.

StartServers

This is the number of child server processes created at startup, ready to handle incoming connections. If you're expecting heavy traffic you might want to increase this number so the server is ready to handle a lot of connections right when it's started.

MinSpareServers

The minimum number of child server processes to keep in reserve.

MaxSpareServers

Maximum number of child server processes that will be held in reserve. Any more than the maximum will be killed.

ServerLimit

The ServerLimit directive sets an absolute limit on the MaxClients directive. The reasons for this aren't interesting enough to go into here, so the main thing to know about this directive is that it should usually be set to the same value as MaxClients and probably shouldn't be set at all if you set MaxClients lower than 256.
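As a quick illustration of that rule of thumb, this sketch prints the lines you'd put in the prefork block for a chosen MaxClients value. The function name prefork_limits is made up for this example, and the 256 threshold is simply the figure mentioned above.

```shell
# Print the ServerLimit/MaxClients lines for a chosen MaxClients value.
# prefork_limits is a hypothetical helper; ServerLimit is only emitted
# when MaxClients goes above the 256 figure mentioned in the text.
prefork_limits() {
    max_clients=$1
    if [ "$max_clients" -gt 256 ]; then
        printf 'ServerLimit %s\n' "$max_clients"
    fi
    printf 'MaxClients  %s\n' "$max_clients"
}

prefork_limits 150   # MaxClients only
prefork_limits 400   # ServerLimit first, then MaxClients
```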

MaxClients

Sets the maximum simultaneous requests that Apache will handle. Anything over this number will be queued until a process is free to handle the request.

MaxClients is not the same as the maximum number of visitors you can have. It is the maximum number of requests that can be fielded at the same time.

Later we'll talk about the KeepAliveTimeout setting, which controls how long connections stay alive after they don't see any new activity. We'll want to set it low so the connections used by idle web clients can be recycled more quickly to handle new web clients. Each active connection uses memory and counts toward the MaxClients total. Web clients will be stuck waiting for a connection slot to free up if you hit the number of connections in the MaxClients setting.

The trick with MaxClients is that you want the number to be high enough that visitors don't have to wait before connecting to your site, but not so high that apache needs to grab more memory than is available on your server. If you go over the available memory for your server it will start dipping into swap memory, which is slow and ugly and, trust me, you don't want to do that.

For the prefork MPM a new process is started when apache handles a new connection. That means MaxClients sets the maximum number of processes apache will create to handle incoming clients. Memory can definitely be a limiting factor here.

The most straightforward way to optimize this setting for the prefork MPM is to look at how much memory each apache process uses when your server sees a decent amount of traffic. Figure out how much memory you want available for apache (taking into account other programs that will use memory, like MySQL), and divide that by the amount of memory you think you want to have available to each of those apache processes. If you don't want to go to all that trouble, then shoot for maybe a MaxClients setting of 40 if you have 256MB on your virtual server. Scale up from there.
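The arithmetic described above is simple enough to sketch in a few lines of shell. Treat this as an illustration only: the function names are made up, and the 200 MB budget and 5 MB per process are hypothetical figures, not measurements. Plug in what you actually see on your server.

```shell
# Average a list of memory sizes (in KB), one value per line on stdin.
# avg_rss_kb is a hypothetical helper name.
avg_rss_kb() {
    awk '
        { total += $1; n++ }
        END {
            if (n) {
                printf "%d\n", total / n
            } else {
                print 0
            }
        }
    '
}

# Estimate MaxClients: memory set aside for apache divided by the memory
# each apache process uses. Both arguments are in MB; the result is rounded
# down so the estimate stays inside the budget.
suggest_max_clients() {
    budget_mb=$1
    per_process_mb=$2
    echo $(( budget_mb / per_process_mb ))
}

# On a live Linux system you could feed real numbers to the averaging step:
#   ps -C apache2 -o rss= | avg_rss_kb
# Here we average three made-up samples instead.
printf '4000\n5000\n6000\n' | avg_rss_kb

# Hypothetical figures: 200 MB set aside for apache, 5 MB per process.
suggest_max_clients 200 5
```

With those made-up numbers the average comes out to 5000 KB and the estimate is a MaxClients of 40.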

MaxRequestsPerChild

Sets how many requests a child process will handle before terminating. The default is usually zero, which means a child process is never retired for reaching a request count.

Why change this if the Max numbers are set as shown above? Well, it can help in managing your server's memory usage.

If you change the default you give a child a finite number of actions before it will die. This will, in effect, reduce the number of processes in use when the server is not busy, thus freeing memory.

Freeing it for what though? If other software needed memory then it would also need it when the server is under load. It is unlikely you will have anything that requires memory only when the server is quiet.

The worker MPM

Defaults:

<IfModule mpm_worker_module>
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>

Each connection in the worker MPM is handled by a thread, and there are several threads per child process (server) started by apache. This approach handles high loads better but can cause problems with modules that aren't built to be "thread safe".

StartServers

This is the number of child server processes created at startup. Each server will run multiple threads to handle incoming connections. If you're expecting heavy traffic you might want to increase this number so the server is ready to handle a lot of connections right when it's started.

MaxClients

Sets the maximum simultaneous requests that Apache will handle. Anything over this number will be queued until a thread is free to handle the request.

MaxClients is not the same as the maximum number of visitors you can have. It is the maximum number of requests that can be fielded at the same time.

Later we'll talk about the KeepAliveTimeout setting, which controls how long connections stay alive after they don't see any new activity. We'll want to set it low so the connections used by idle web clients can be recycled more quickly to handle new web clients. Each active connection uses memory and counts toward the MaxClients total. Web clients will be stuck waiting for a connection slot to free up if you hit the number of connections in the MaxClients setting.

The trick with MaxClients is that you want the number to be high enough that visitors don't have to wait before connecting to your site, but not so high that apache needs to grab more memory than is available on your server. If you go over the available memory for your server it will start dipping into swap memory, which is slow and ugly and, trust me, you don't want to do that.

For the worker MPM a new thread is created when apache handles a new connection. This means MaxClients limits the total number of threads apache can create to handle connections.

Optimizing this setting is kind of a juggling act for the worker MPM. You have to take into account the ThreadsPerChild setting, since apache won't create new child processes if that would cause the total number of threads to exceed MaxClients. That lets you cap the memory used by apache, so long as you keep an eye on how much memory an apache child process typically uses, and thus how many of them your virtual server should be able to handle with its available memory. If you're trying to stay under 256 MB on your virtual server you might try starting MaxClients off at 50 and adjust it from there.
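The juggling act is easier to follow with numbers. This sketch works out how many child processes a given MaxClients and ThreadsPerChild allow, and roughly how much memory that could add up to. The function names are made up, the 20 MB per child process is a hypothetical figure, and rounding down is an assumption: check how your apache version treats a MaxClients that isn't a multiple of ThreadsPerChild.

```shell
# Number of child processes the worker MPM can run for a given MaxClients
# and ThreadsPerChild. Rounds down (an assumption, see the note above).
worker_children() {
    max_clients=$1
    threads_per_child=$2
    echo $(( max_clients / threads_per_child ))
}

# Rough memory ceiling in MB: child processes times memory per child.
# per_child_mb is something you would measure on your own server.
worker_memory_mb() {
    children=$1
    per_child_mb=$2
    echo $(( children * per_child_mb ))
}

# Default worker settings from above: MaxClients 150, ThreadsPerChild 25.
worker_children 150 25

# The smaller starting point suggested above: MaxClients 50.
worker_children 50 25

# Hypothetical 20 MB per child process with the default settings.
worker_memory_mb "$(worker_children 150 25)" 20
```

That prints 6 child processes for the defaults, 2 for a MaxClients of 50, and a ceiling of 120 MB for the made-up 20 MB per child.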

MinSpareThreads

The minimum number of threads to keep active in reserve. If the current number of child processes (servers) won't support enough spare threads and the MaxClients won't be exceeded, a new child process will be spawned to supply the spare threads.

MaxSpareThreads

Maximum number of threads that will be held in reserve. Extra child processes may be killed if the MaxSpareThreads number is exceeded, but only if that wouldn't bring the number of spare threads below MinSpareThreads.

ThreadsPerChild

The number of threads (connections) that will be handled by each child process. When a new child process is started, all of its threads are started as well. This basically means that resources (like memory) are allocated to all potential connections for a child process when it is started, and are only freed when that process ends. The default setting of 25 is usually a good compromise between efficiency and the potential for allocating resources for threads that won't be used.

MaxRequestsPerChild

Sets how many requests a child process will handle before terminating. The default is zero, which means a child process is never retired for reaching a request count.

Why change this if the Max numbers are set as shown above? Well, it can help in managing your server's memory usage.

If you change the default you give a child a finite number of actions before it will die. This will, in effect, reduce the number of processes in use when the server is not busy, thus freeing memory.

Freeing it for what though? If other software needed memory then it would also need it when the server is under load. It is unlikely you will have anything that requires memory only when the server is quiet.

Summary

If you take nothing else away from this article, let it be that you should tailor your MPM's MaxClients setting so that your web server won't try to allocate more resources than you have available. Better that a visitor wait a moment for a connection than that the server should dip into swap for more memory and bring the entire virtual machine to a crawl.

In the next article we'll look at the main configuration options for apache and what sort of changes can be made to optimize it for your needs.

-- Jered