﻿# Debugging a segfault on a production service: tutorial/example based on a true story: Apache2 mod_rpaf segfault

What follows are notes taken while debugging segfaults of Apache on a production environment.
It might be useful to me for future debugging needs, and may be of use to other newbies as well.


     .
    /!\  warning: It has been written by a newbie, for himself and for newbies. 
    ---  I have a very basic understanding of C debugging and the apache internals.  
         
    (Please tell me if there are errors)


## Environment and symptoms

 * Debian Squeeze
 * Apache2 MPM prefork with mod_rpaf (behind a reverse proxy)
 * Apache 2.2.16-7 and libapache2-mod-rpaf 0.5-3 (Debian packages)

Apache sometimes segfault. After investigation, it segfaults under some web vulnerability scans.

## First step: get a backtrace / coredump

Make Apache create a coredump when it segfaults, for being able to study the issue under a debugger.

On Debian, read _/usr/share/doc/apache2.2-common/README.backtrace_.

Basically, you'll need:

    CoreDumpDirectory /var/cache/apache2 # in apache's configuration
    ulimit -c unlimited # before starting apache

You might need to recompile apache with debugging and no optimisations. See below.

### Have a look to the coredump

You'll need debugging symbols and a debugger.

On debian:

    apt-get install apache2-dbg libapr1-dbg libaprutil1-dbg gdb

#### debug symbols, no optimisations

However, compiler does optimisation so that some values won't be printable by the debugger.
So, you might need to recompile Apache with debugging and no optimisation.

In general:

<pre class="brush: bash">
CFLAGS= -g -O0 # debugging, no optimisation
</pre>

And, of course, do not _strip_ the binaries.

On Debian:

<pre class="brush: bash">
apt-get build-dep apache2
apt-get source apache2
cd apache2-2.2.16
export DEB_BUILD_OPTIONS="nostrip noopt"
fakeroot debian/rules binary
</pre>

#### Starting the debugger (GDB)

<pre class="brush: bash">
gdb /usr/sbin/apache2 /path/to/coredump
bt full # show the backtrace
</pre>

In this case, the backtrace was:

<pre style="font-size: 0.7em">
#0  0x00007fb11e60ef32 in ap_add_common_vars (r=0x7fb1229f4d30) at /usr/src/apache/apache2-2.2.16/server/util_script.c:241
        e = 0x7fb1229f3860
        s = 0x7fb11fefbd00
        c = 0x7fb1229ecee0
        rem_logname = 0x7fb1229f4cb8 "xÌ\"\261\177"
        env_path = 0x7fff9585df40 "/usr/local/bin:/usr/bin:/bin"
        host = 0x0
        hdrs_arr = 0x7fb1229f5010
        hdrs = 0x7fb1229f5138
        i = 4
        rport = 0
#1  0x00007fb119e983f9 in suphp_script_handler (r=0x7fb1229f4d30) at mod_suphp.c:834
        p = 0x7fb1229f4cb8
        sconf = 0x7fb1214a1c68
        dconf = 0x7fb1229f2d88
        finfo = {pool = 0x7fb1229f4cb8, valid = 7598960, protection = 1604, filetype = APR_REG, user = 1000, group = 1000, inode = 10193601, device = 18, nlink = 1, size = 19981, csize = 20480, atime = 1309525506000000, mtime = 1317514847000000, ctime = 1341403789000000, 
          fname = 0x7fb1229f3988 "/var/www/index.php", name = 0x7fb11dd22968 "\205\300u\nH\213E\330H\213@\b\353\024H\203E\330\030H\213E\330H;E\320v\301\270", filehand = 0x7fb11ed6e0b0}
        procattr = <optimized out>
        argv = 0x7fb1229f39b0
        env = <optimized out>
        rv = <optimized out>
        strbuf = '\000' <repeats 25 times>, "\023\257\035\261\177\000\000\060Ø\"\261\177\000\000\260\300\205\225\377\177\000\000\200\023\257\035\261\177\000\000x\332}\035\261\177\000\000 \000\000\000\060\000\000\000@\267\205\225\377\177\000\000\200\266\205\225\377\177\000\000\000\000\000\000\000\000\000\000p\267\205\225\377\177\000\000q\363\000\000\000\000\000\000\060Ø\"\261\177\000\000\340Î\"\261\177\000\000\200\023\257\035\261\177\000\000\031\000\000\000\377\177\000\000\000\000\000\000\004\002\000\000\000\000\000\000\031", '\000' <repeats 40 times>, "\b\000\000p\264\205\225\377\177\000\000\260\300\205\225\377\177\000\000\000\000\000\000\000\000\000\000\060Ø\"\261\177\000\000\000\023\257\035\261\177\000\000\260\300\205\225\377\177\000\000\006\000\000\000\000\000\000\000\060Í\225\377\177\000\000q\363\000\000\000\000\000\000\060Ø\"\261\177\000\000\061P\207\031\261\177\000\000\060Í\225\377\177\000\000(Í\225\377\177\000\000LÍ\225\377\177\000\000HÍ\225\377\177", '\000' <repeats 14 times>...
        tmpbuf = <optimized out>
        auth_user = <optimized out>
        auth_pass = 0x7fb1229f4d30 "\270L\237\"\261\177"
        ud_user = 0x0
        ud_group = 0x0
        bb = <optimized out>
        b = <optimized out>
#2  0x00007fb119e991b8 in suphp_handler (r=0x7fb1229f4d30) at mod_suphp.c:569
        dconf = 0x7fb1229f2d88
#3  0x00007fb11e607316 in ap_run_handler (r=0x7fb1229f4d30) at /usr/src/apache/apache2-2.2.16/server/config.c:159
        pHook = 0x7fb11e86c1b0
        n = 5
        rv = -1
#4  0x00007fb11e607c25 in ap_invoke_handler (r=0x7fb1229f4d30) at /usr/src/apache/apache2-2.2.16/server/config.c:377
        handler = 0x7fb11ed6e0b0 "application/x-httpd-php5"
        p = 0x0
        result = 0
        old_handler = 0x0
        ignore = 0x7fb11e5f434a "H\213U\370H\211BP\353\031\203}\344\002u\023\350Ia\377\377H\213U\370H\211BX\353\004\220\353\001\220\311\303UH\211\345\211}\374\211u\370\203", <incomplete sequence \374>
#5  0x00007fb11e61cc21 in ap_process_request (r=0x7fb1229f4d30) at /usr/src/apache/apache2-2.2.16/modules/http/http_request.c:282
        access_status = 0
#6  0x00007fb11e6198fa in ap_process_http_connection (c=0x7fb1229ecee0) at /usr/src/apache/apache2-2.2.16/modules/http/http_core.c:190
        r = 0x7fb1229f4d30
        csd = 0x0
#7  0x00007fb11e611458 in ap_run_process_connection (c=0x7fb1229ecee0) at /usr/src/apache/apache2-2.2.16/server/connection.c:43
        pHook = 0x7fb11e86c890
        n = 1
        rv = -1
#8  0x00007fb11e61195d in ap_process_connection (c=0x7fb1229ecee0, csd=0x7fb1229eccf0) at /usr/src/apache/apache2-2.2.16/server/connection.c:190
        rc = -2
#9  0x00007fb11e624427 in child_main (child_num_arg=3) at /usr/src/apache/apache2-2.2.16/server/mpm/prefork/prefork.c:662
        current_conn = 0x7fb1229ecee0
        csd = 0x7fb1229eccf0
        ptrans = 0x7fb1229ecc78
        allocator = 0x7fb1229e98b0
        status = 0
        i = -1
        lr = 0x7fb11e845f10
        pollset = 0x7fb1229ead50
        sbh = 0x7fb1229ead48
        bucket_alloc = 0x7fb1229f0c98
        last_poll_idx = 0
#10 0x00007fb11e6245f6 in make_child (s=0x7fb11e84dd78, slot=3) at /usr/src/apache/apache2-2.2.16/server/mpm/prefork/prefork.c:758
        pid = 0

[...]
</pre>

The source code can be displayed within GDB:

<pre class="brush: bash">
ctrl-x 2
</pre>

What can we learn from this backtrace ?

The segfault happens in function _ap_add_common_vars()_ in _server/util_script.c_ at line _241_.

Line 241 is:

<pre class="brush: bash">
rport = c->remote_addr->port;
</pre>

Let's have a look to the pointers:

<pre class="brush: bash">
(gdb) print c
$2 = (conn_rec *) 0x7fb1229ecee0
(gdb) print c->remote_addr
$3 = (apr_sockaddr_t *) 0x0
</pre>

Arrg. c->remote_addr is a null pointer (0x0). Were are dealing with a NULL pointer dereference.

(... read some code...)

This is kind of weird, since _remote_addr_ store network information (ip, port, inet familly...) about the remote peer (the HTTP client), and therefore this pointer shouldn't be NULL at that place. 

The core of Apache manages the creation of this _c_ (connection) object and should have aborted if the _remote_addr_ thingies couldn't have been retrieved from the socket (and debug messages would have pop up the in error log in this case)...

Remember that we are in _ap_add_common_vars()_ (... read some code...), so the connection has been established already and we are now reading the request and setting up HTTP headers.

Also, Apache segfaults only under some vulnerability scans. There must be something something in between the socket handling and the _ap_add_common_vars()_ call.

Right: Modules ! In the backtrace, only suphp shows up. But after backward inspecting the frames of the stacktrace (see below), you can see that _c->remote_addr_ was already _NULL_.

Often, especially on a simple own software, the upper frames help to find where such a bug was introduced (where was the pointer nulled). However, Apache relies on hooks and modules, and here it seems that we cannot find this precious information.

Inspecting the upper frames:

<pre class="brush: bash">
(gdb) frame 6
#6  0x00007fb11e6198fa in ap_process_http_connection (c=0x7fb1229ecee0) at /usr/src/apache/apache2-2.2.16/modules/http/http_core.c:190
(gdb) print c->remote_addr
$4 = (apr_sockaddr_t *) 0x0
</pre>

### Track down the issue

Let's have a guess: _c->remote_addr_ is correct at first. Then it is nulled, but we cannot see where from the backtrace. It might have been done by a module.
A module that modifies _c->remote_addr_.

<pre class="brush: bash">
apache2ctl -M # or apache2 -M
[...]
core_module (static)
log_config_module (static)
logio_module (static)
rpaf_module (shared)
[...]
</pre>

Uho ! What is mod_rpaf again ?

    module for Apache2 which takes the last IP from the 'X-Forwarded-For' header

*ding* *ding* *ding* We might have a winner !

Let's read the source:

<pre class="brush: bash">
apt-get source libapache2-mod-rpaf
cd libapache2-mod-rpaf-0.5
</pre>

Arg... Nothing that seems too dangerous... at first...

<pre class="brush: bash">
$ view mod_rpaf-2.0.c
[...]
            r->connection->remote_addr->sa.sin.sin_addr.s_addr = inet_addr(r->connection->remote_ip);
[...]
$ for p in debian/patches/*; do patch -p1 < $p; done
$ view mod_rpaf-2.0.c
</pre>

<pre class="brush: bash">
[...]
            r->connection->remote_ip = apr_pstrdup(r->connection->pool, last_not_in_array(arr, cfg->proxy_ips));
            r->connection->remote_addr->sa.sin.sin_addr.s_addr = inet_addr(r->connection->remote_ip);
            tmppool = r->connection->remote_addr->pool;
            tmpport = r->connection->remote_addr->port;
            memset(r->connection->remote_addr, '\0', sizeof(apr_sockaddr_t));
            r->connection->remote_addr = NULL;
            apr_sockaddr_info_get(&(r->connection->remote_addr), r->connection->remote_ip, APR_UNSPEC, tmpport, 0, tmppool);

[...]
</pre>


Hm :) remote_addr is nulled there ! Then, it is passed by reference to
apr_sockaddr_info_get() to be filled up with the new data (the module
uses the value of the X-Forwarded-For HTTP header to rewrite the ip the
of client, remember?).

However, there is no sanity checks. What happens if
apr_sockaddr_info_get() fails ?

r->connection->remote_addr = NULL;, so c->remote_addr is NULL later on
in ap_add_common_vars() where it is dereferencedâ¦

Fine, apr_sockaddr_info_get() should not fail. But here, it might be the
case.

Let's check that out:


<pre class="brush: diff">
--- libapache2-mod-rpaf-0.5/mod_rpaf-2.0.c      2012-07-28 15:40:09.000000000 +0200
+++ rpaf_/mod_rpaf-2.0.c        2012-07-27 15:32:37.000000000 +0200
@@ -186,6 +186,7 @@
     apr_pool_t *tmppool;
     rpaf_server_cfg *cfg = (rpaf_server_cfg *)ap_get_module_config(r->server->module_config,
                                                                    &rpaf_module);
+    int rv;

     if (!cfg->enable)
         return DECLINED;
@@ -206,7 +207,11 @@
             tmpport = r->connection->remote_addr->port;
             memset(r->connection->remote_addr, '\0', sizeof(apr_sockaddr_t));
             r->connection->remote_addr = NULL;
-            apr_sockaddr_info_get(&(r->connection->remote_addr), r->connection->remote_ip, APR_UNSPEC, tmpport, 0, tmppool);
+            rv = apr_sockaddr_info_get(&(r->connection->remote_addr), r->connection->remote_ip, APR_UNSPEC, tmpport, 0, tmppool);
+            if(rv != APR_SUCCESS) {
+              ap_log_error(APLOG_MARK, APLOG_ERR, 0, NULL,
+                             "MYDEBUG apr_sockaddr_info_get != sucess -- %s : %d", r->connection->remote_ip, tmpport);
+            }
             if (cfg->sethostname) {
                 const char *hostvalue;
                 if (hostvalue = apr_table_get(r->headers_in, "X-Forwarded-Host")) {
</pre>


 * rebuild the module
 * install it
 * restart apache
 * reproduce the problem

<pre class="brush: bash">
$ tail -f /var/log/apache/error.log | egrep '(MYDEBUG|Segmentation)' 
[date] [error] MYDEBUG apr_sockaddr_info_get != success -- remote_ip: GARBAGE, 172.16.0.1 tmpport: 38601
[date] [notice] child pid 9282 exit signal Segmentation fault (11), possible coredump in /var/cache/apache2/coredump
</pre>

Gotcha ! There is some garbage in _remote_ip_ !

### The silly request

I've been able to track the request by using tcpdump and looking for the garbage pattern:

<pre class="brush: bash">
# tcpdump -s 1500 -lnXvvv port 80 > /tmp/log
0x0080:  703a 2031 2722 3430 3030 0d0a 782d 666f  XXXXXXXXXXX.x-fo
0x0090:  7277 6172 6465 642d 666f 723a 2031 2722  rwarded-for:.YYY
0x00a0:  3530 3030 0d0a 6163 6365 7074 2d6c 616e  YYYY..accept-lan
</pre>

Where YYYYYYY is the garbage pattern.

## It is eventually a security issue

Well, a single request makes Apache segfault. On one environment that I tested, it even made the apache processes becoming zombies and killed the HTTP service totally.

Even if there are no zombies and only a single process dies, it is an issue: Apache will have to clone() for each new request.

This can lead to a DOS.

__Conclusion__:

 * if you're able to inject a single X-forwarded-For header in a direct request to the webserver: segfault 
 * if you inject a specific crafted X-forwarded-For header in a request to the reverse proxy: segfault

##### A specific X-forwarded-For header ?

From my experiments:

A specific value is needed when targeting the reverse proxy. The reverse proxy adds his own X-forwarded-For header, and mod_rpaf is supposed to used the last one.
Injecting a X-forwarded-For header can make apache segfault but it has to be a specific value (_toto_ is not, for example)

I won't give the one value I've tested (and approved) before Debian updates its package (or tell me they are fine with this issue... Which I doubt)

##### Fix

###### What is the wrong part of the code ?

I do not understand why there is some garbage in remote_ip. There must be a bug in the memory handling in mod_rpaf.

Here is a quote from the CHANGES document released with mod_rpaf 0.6:

    mod_rpaf was incorrectly using r->pool to allocate memory for the
    ip. The correct pool for this when you are dealing wth keep-alive
    requests was r->connection->pool.

###### *UPDATE* 0.6 version *DOES NOT* fix it 

I previously stated that version 0.6 fixes the issue. I was wrong, I messed up when testing custom patches and packages.

###### Do not patch 

The bug does not trigger when using unpatched 0.5 or 0.6 sources (remote_addr is not nulled.)

###### Workaround

Remove extra X-Forwarded-For headers added by (malicious?) clients on the reverse proxy before passing the requests to the backends.

##### affected environments and warnings sent to upstream

* tested: Debian squeeze MPM prefork: 3 different production environments
* not tested: Debian with event or worker MPM - other distros

__affected__:

* mod_rpaf v0.5 patched with the 'IPv6 support patch' that can be found in Debian's package, see [Debian's bug #409521](http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=409521)
* Custom v0.6 patched with the 'IPv6 support patch'

__warnings sent to upstream__:

* Debian Security team has been mailed. A bug is open: [#683984](http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=683984)
* A bug has been opened on Mandriva's bugtracker (Haven't checked myself if their package is vulnerable, but they seem to have applied the patch: #63373)