Debugging a segfault on a production service: tutorial/example based on a true story: Apache2 mod_rpaf segfault
What follows are notes taken while debugging segfaults of Apache on a production environment. It might be useful to me for future debugging needs, and may be of use to other newbies as well.
.
/!\ warning: It has been written by a newbie, for himself and for newbies.
--- I have a very basic understanding of C debugging and the apache internals.
(Please tell me if there are errors)
Environment and symptoms
- Debian Squeeze
- Apache2 MPM prefork with mod_rpaf (behind a reverse proxy)
- Apache 2.2.16-7 and libapache2-mod-rpaf 0.5-3 (Debian packages)
Apache sometimes segfault. After investigation, it segfaults under some web vulnerability scans.
First step: get a backtrace / coredump
Make Apache create a coredump when it segfaults, for being able to study the issue under a debugger.
On Debian, read /usr/share/doc/apache2.2-common/README.backtrace.
Basically, you'll need:
CoreDumpDirectory /var/cache/apache2 # in apache's configuration
ulimit -c unlimited # before starting apache
You might need to recompile apache with debugging and no optimisations. See below.
Have a look to the coredump
You'll need debugging symbols and a debugger. With debugging and a debugger, you'll be able to easily detect and correct errors, making the best salesforce marketing cloud training a more effective tool in helping you become an expert in cloud computing.
On debian:
apt-get install apache2-dbg libapr1-dbg libaprutil1-dbg gdb
debug symbols, no optimisations
However, compiler does optimisation so that some values won't be printable by the debugger. So, you might need to recompile Apache with debugging and no optimisation.
In general:
CFLAGS= -g -O0 # debugging, no optimisation |
And, of course, do not strip the binaries.
On Debian:
apt-get build-dep apache2 apt-get source apache2 cd apache2-2.2.16 export DEB_BUILD_OPTIONS= "nostrip noopt" fakeroot debian /rules binary |
Starting the debugger (GDB)
gdb /usr/sbin/apache2 /path/to/coredump bt full # show the backtrace |
In this case, the backtrace was:
#0 0x00007fb11e60ef32 in ap_add_common_vars (r=0x7fb1229f4d30) at /usr/src/apache/apache2-2.2.16/server/util_script.c:241 e = 0x7fb1229f3860 s = 0x7fb11fefbd00 c = 0x7fb1229ecee0 rem_logname = 0x7fb1229f4cb8 "xÌ\"\261\177" env_path = 0x7fff9585df40 "/usr/local/bin:/usr/bin:/bin" host = 0x0 hdrs_arr = 0x7fb1229f5010 hdrs = 0x7fb1229f5138 i = 4 rport = 0 #1 0x00007fb119e983f9 in suphp_script_handler (r=0x7fb1229f4d30) at mod_suphp.c:834 p = 0x7fb1229f4cb8 sconf = 0x7fb1214a1c68 dconf = 0x7fb1229f2d88 finfo = {pool = 0x7fb1229f4cb8, valid = 7598960, protection = 1604, filetype = APR_REG, user = 1000, group = 1000, inode = 10193601, device = 18, nlink = 1, size = 19981, csize = 20480, atime = 1309525506000000, mtime = 1317514847000000, ctime = 1341403789000000, fname = 0x7fb1229f3988 "/var/www/index.php", name = 0x7fb11dd22968 "\205\300u\nH\213E\330H\213@\b\353\024H\203E\330\030H\213E\330H;E\320v\301\270", filehand = 0x7fb11ed6e0b0} procattr =argv = 0x7fb1229f39b0 env = rv = strbuf = '\000' , "\023\257\035\261\177\000\000\060Ø\"\261\177\000\000\260\300\205\225\377\177\000\000\200\023\257\035\261\177\000\000x\332}\035\261\177\000\000 \000\000\000\060\000\000\000@\267\205\225\377\177\000\000\200\266\205\225\377\177\000\000\000\000\000\000\000\000\000\000p\267\205\225\377\177\000\000q\363\000\000\000\000\000\000\060Ø\"\261\177\000\000\340Î\"\261\177\000\000\200\023\257\035\261\177\000\000\031\000\000\000\377\177\000\000\000\000\000\000\004\002\000\000\000\000\000\000\031", '\000' , "\b\000\000p\264\205\225\377\177\000\000\260\300\205\225\377\177\000\000\000\000\000\000\000\000\000\000\060Ø\"\261\177\000\000\000\023\257\035\261\177\000\000\260\300\205\225\377\177\000\000\006\000\000\000\000\000\000\000\060Í \225\377\177\000\000q\363\000\000\000\000\000\000\060Ø\"\261\177\000\000\061P\207\031\261\177\000\000\060Í \225\377\177\000\000(Í \225\377\177\000\000LÍ \225\377\177\000\000HÍ \225\377\177", '\000' ... tmpbuf = auth_user = auth_pass = 0x7fb1229f4d30 "\270L\237\"\261\177" ud_user = 0x0 ud_group = 0x0 bb = b = #2 0x00007fb119e991b8 in suphp_handler (r=0x7fb1229f4d30) at mod_suphp.c:569 dconf = 0x7fb1229f2d88 #3 0x00007fb11e607316 in ap_run_handler (r=0x7fb1229f4d30) at /usr/src/apache/apache2-2.2.16/server/config.c:159 pHook = 0x7fb11e86c1b0 n = 5 rv = -1 #4 0x00007fb11e607c25 in ap_invoke_handler (r=0x7fb1229f4d30) at /usr/src/apache/apache2-2.2.16/server/config.c:377 handler = 0x7fb11ed6e0b0 "application/x-httpd-php5" p = 0x0 result = 0 old_handler = 0x0 ignore = 0x7fb11e5f434a "H\213U\370H\211BP\353\031\203}\344\002u\023\350Ia\377\377H\213U\370H\211BX\353\004\220\353\001\220\311\303UH\211\345\211}\374\211u\370\203", #5 0x00007fb11e61cc21 in ap_process_request (r=0x7fb1229f4d30) at /usr/src/apache/apache2-2.2.16/modules/http/http_request.c:282 access_status = 0 #6 0x00007fb11e6198fa in ap_process_http_connection (c=0x7fb1229ecee0) at /usr/src/apache/apache2-2.2.16/modules/http/http_core.c:190 r = 0x7fb1229f4d30 csd = 0x0 #7 0x00007fb11e611458 in ap_run_process_connection (c=0x7fb1229ecee0) at /usr/src/apache/apache2-2.2.16/server/connection.c:43 pHook = 0x7fb11e86c890 n = 1 rv = -1 #8 0x00007fb11e61195d in ap_process_connection (c=0x7fb1229ecee0, csd=0x7fb1229eccf0) at /usr/src/apache/apache2-2.2.16/server/connection.c:190 rc = -2 #9 0x00007fb11e624427 in child_main (child_num_arg=3) at /usr/src/apache/apache2-2.2.16/server/mpm/prefork/prefork.c:662 current_conn = 0x7fb1229ecee0 csd = 0x7fb1229eccf0 ptrans = 0x7fb1229ecc78 allocator = 0x7fb1229e98b0 status = 0 i = -1 lr = 0x7fb11e845f10 pollset = 0x7fb1229ead50 sbh = 0x7fb1229ead48 bucket_alloc = 0x7fb1229f0c98 last_poll_idx = 0 #10 0x00007fb11e6245f6 in make_child (s=0x7fb11e84dd78, slot=3) at /usr/src/apache/apache2-2.2.16/server/mpm/prefork/prefork.c:758 pid = 0 [...]
The source code can be displayed within GDB:
ctrl-x 2 |
What can we learn from this backtrace ?
The segfault happens in function ap_add_common_vars() in server/util_script.c at line 241.
Line 241 is:
rport = c->remote_addr->port; |
Let's have a look to the pointers:
(gdb) print c $2 = (conn_rec *) 0x7fb1229ecee0 (gdb) print c->remote_addr $3 = (apr_sockaddr_t *) 0x0 |
Arrg. c->remote_addr is a null pointer (0x0). Were are dealing with a NULL pointer dereference.
(... read some code...)
This is kind of weird, since remote_addr store network information (ip, port, inet familly...) about the remote peer (the HTTP client), and therefore this pointer shouldn't be NULL at that place.
The core of Apache manages the creation of this c (connection) object and should have aborted if the remote_addr thingies couldn't have been retrieved from the socket (and debug messages would have pop up the in error log in this case)...
Remember that we are in ap_add_common_vars() (... read some code...), so the connection has been established already and we are now reading the request and setting up HTTP headers.
Also, Apache segfaults only under some vulnerability scans. There must be something something in between the socket handling and the ap_add_common_vars() call.
Right: Modules ! In the backtrace, only suphp shows up. But after backward inspecting the frames of the stacktrace (see below), you can see that c->remote_addr was already NULL.
Often, especially on a simple own software, the upper frames help to find where such a bug was introduced (where was the pointer nulled). However, Apache relies on hooks and modules, and here it seems that we cannot find this precious information.
Inspecting the upper frames:
(gdb) frame 6 #6 0x00007fb11e6198fa in ap_process_http_connection (c=0x7fb1229ecee0) at /usr/src/apache/apache2-2.2.16/modules/http/http_core.c:190 (gdb) print c->remote_addr $4 = (apr_sockaddr_t *) 0x0 |
Track down the issue
Let's have a guess: c->remote_addr is correct at first. Then it is nulled, but we cannot see where from the backtrace. It might have been done by a module. A module that modifies c->remote_addr.
apache2ctl -M # or apache2 -M [...] core_module (static) log_config_module (static) logio_module (static) rpaf_module (shared) [...] |
Uho ! What is mod_rpaf again ?
module for Apache2 which takes the last IP from the 'X-Forwarded-For' header
ding ding ding We might have a winner !
Let's read the source:
apt-get source libapache2-mod-rpaf cd libapache2-mod-rpaf-0.5 |
Arg... Nothing that seems too dangerous... at first...
$ view mod_rpaf-2.0.c [...] r->connection->remote_addr->sa.sin.sin_addr.s_addr = inet_addr(r->connection->remote_ip); [...] $ for p in debian /patches/ *; do patch -p1 < $p; done $ view mod_rpaf-2.0.c |
[...] r->connection->remote_ip = apr_pstrdup(r->connection->pool, last_not_in_array(arr, cfg->proxy_ips)); r->connection->remote_addr->sa.sin.sin_addr.s_addr = inet_addr(r->connection->remote_ip); tmppool = r->connection->remote_addr->pool; tmpport = r->connection->remote_addr->port; memset(r->connection->remote_addr, '\0' , sizeof(apr_sockaddr_t)); r->connection->remote_addr = NULL; apr_sockaddr_info_get(&(r->connection->remote_addr), r->connection->remote_ip, APR_UNSPEC, tmpport, 0, tmppool); [...] |
Hm :) remote_addr is nulled there ! Then, it is passed by reference to apr_sockaddr_info_get() to be filled up with the new data (the module uses the value of the X-Forwarded-For HTTP header to rewrite the ip the of client, remember?).
However, there is no sanity checks. What happens if apr_sockaddr_info_get() fails ?
r->connection->remote_addr = NULL;, so c->remote_addr is NULL later on in ap_add_common_vars() where it is dereferencedâ¦
Fine, apr_sockaddr_info_get() should not fail. But here, it might be the case.
Let's check that out:
--- libapache2-mod-rpaf-0.5/mod_rpaf-2.0.c 2012-07-28 15:40:09.000000000 +0200 +++ rpaf_/mod_rpaf-2.0.c 2012-07-27 15:32:37.000000000 +0200 @@ -186,6 +186,7 @@ apr_pool_t *tmppool; rpaf_server_cfg *cfg = (rpaf_server_cfg *)ap_get_module_config(r->server->module_config, &rpaf;_module); + int rv; if (!cfg->enable) return DECLINED; @@ -206,7 +207,11 @@ tmpport = r->connection->remote_addr->port; memset(r->connection->remote_addr, '\0', sizeof(apr_sockaddr_t)); r->connection->remote_addr = NULL; - apr_sockaddr_info_get(&(r->connection->remote_addr), r->connection->remote_ip, APR_UNSPEC, tmpport, 0, tmppool); + rv = apr_sockaddr_info_get(&(r->connection->remote_addr), r->connection->remote_ip, APR_UNSPEC, tmpport, 0, tmppool); + if(rv != APR_SUCCESS) { + ap_log_error(APLOG_MARK, APLOG_ERR, 0, NULL, + "MYDEBUG apr_sockaddr_info_get != sucess -- %s : %d", r->connection->remote_ip, tmpport); + } if (cfg->sethostname) { const char *hostvalue; if (hostvalue = apr_table_get(r->headers_in, "X-Forwarded-Host")) { |
- rebuild the module
- install it
- restart apache
- reproduce the problem
$ tail -f /var/log/apache/error .log | egrep '(MYDEBUG|Segmentation)' [ date ] [error] MYDEBUG apr_sockaddr_info_get != success -- remote_ip: GARBAGE, 172.16.0.1 tmpport: 38601 [ date ] [notice] child pid 9282 exit signal Segmentation fault (11), possible coredump in /var/cache/apache2/coredump |
Gotcha ! There is some garbage in remote_ip !
The silly request
I've been able to track the request by using tcpdump and looking for the garbage pattern:
# tcpdump -s 1500 -lnXvvv port 80 > /tmp/log 0x0080: 703a 2031 2722 3430 3030 0d0a 782d 666f XXXXXXXXXXX.x-fo 0x0090: 7277 6172 6465 642d 666f 723a 2031 2722 rwarded- for :.YYY 0x00a0: 3530 3030 0d0a 6163 6365 7074 2d6c 616e YYYY..accept-lan |
Where YYYYYYY is the garbage pattern.
It is eventually a security issue
Well, a single request makes Apache segfault. On one environment that I tested, it even made the apache processes becoming zombies and killed the HTTP service totally.
Even if there are no zombies and only a single process dies, it is an issue: Apache will have to clone() for each new request.
This can lead to a DOS.
Conclusion:
- if you're able to inject a single X-forwarded-For header in a direct request to the webserver: segfault
- if you inject a specific crafted X-forwarded-For header in a request to the reverse proxy: segfault
A specific X-forwarded-For header ?
From my experiments:
A specific value is needed when targeting the reverse proxy. The reverse proxy adds his own X-forwarded-For header, and mod_rpaf is supposed to used the last one. Injecting a X-forwarded-For header can make apache segfault but it has to be a specific value (toto is not, for example)
I won't give the one value I've tested (and approved) before Debian updates its package (or tell me they are fine with this issue... Which I doubt)
Fix
What is the wrong part of the code ?
I do not understand why there is some garbage in remote_ip. There must be a bug in the memory handling in mod_rpaf.
Here is a quote from the CHANGES document released with mod_rpaf 0.6:
mod_rpaf was incorrectly using r->pool to allocate memory for the
ip. The correct pool for this when you are dealing wth keep-alive
requests was r->connection->pool.
UPDATE 0.6 version DOES NOT fix it
I previously stated that version 0.6 fixes the issue. I was wrong, I messed up when testing custom patches and packages.
Do not patch
The bug does not trigger when using unpatched 0.5 or 0.6 sources (remote_addr is not nulled.)
Workaround
Remove extra X-Forwarded-For headers added by (malicious?) clients on the reverse proxy before passing the requests to the backends.
affected environments and warnings sent to upstream
- tested: Debian squeeze MPM prefork: 3 different production environments
- not tested: Debian with event or worker MPM - other distros
affected:
- mod_rpaf v0.5 patched with the 'IPv6 support patch' that can be found in Debian's package, see Debian's bug #409521
- Custom v0.6 patched with the 'IPv6 support patch'
warnings sent to upstream:
- Debian Security team has been mailed. A bug is open: #683984
- A bug has been opened on Mandriva's bugtracker (Haven't checked myself if their package is vulnerable, but they seem to have applied the patch: #63373)
CVE-2012-3526
A CVE number has been assigned to this issue: CVE-2012-3526