PHP exited on signal 7 (SIGBUS) with Prestashop

PHP exited on signal 7 (SIGBUS)

It breaks

Debug of the day: random PHP crashes (SIGBUS) on a production platform. The website is Prestashop, a (quite) well known online shop software.

PHP crashes and 500 errors had been reported for this website, running under SuPHP/CGI mode. When I migrated this website to a new platform, running PHP-FPM, crashes seemed to be more prevalent, and, more importantly, had a much more important impact:

I/O (or maybe random) functions were "hanging", so that parts of the website were not working at all (timeout). Indeed, a crash of a PHP process in a FPM pool happen to make the whole pool feel... bad.

Well, emergency_restart_threshold / emergency_restart_interval are there for a reason.

However, if a whole pool of PHP processes behave wrongly every single or two crashes, and that crashes are as regular as 1 every 3 or 5 minutes, you'ld better... fix the software.

Necessary advisory

 .
/!\  warning: I am not a developper, and don't know anything about prestashop
---  other than "it is a shop software, and some are hosted on servers I manage"

(Please tell me if there are errors)

Look at the source of the problem

So here we go.

# echo "/tmp/core-%e.%p" > /proc/sys/kernel/core_pattern
# vi /etc/init.d/php5-fpm # add 'ulimit -c unlimited' at a right place
# /etc/init.f/php5-fpm restart

Make yourself a coffee. Wait for coredumps.

Let's see what the coredump can tell us (probably on a dev/preprod machine)

# apt-get install gdb php5-dbg build-essential
# gdb /usr/sbin/php5-fpm /tmp/core-php5-fpm.6739

Core was generated by `php-fpm: pool bigotter                                '.
Program terminated with signal 7, Bus error.
#0  lex_scan (zendlval=0x7fff20bda898) at Zend/zend_language_scanner.l:2000
(gdb) bt
#0  lex_scan (zendlval=0x7fff20bda898) at Zend/zend_language_scanner.l:2000
#1  0x00000000006885e0 in zendlex (zendlval=0x7fff20bda890) at /tmp/buildd/php5-5.4.4/Zend/zend_compile.c:6692
#2  0x0000000000662c42 in zendparse () at /tmp/buildd/php5-5.4.4/Zend/zend_language_parser.c:3432
#3  0x00000000006680fb in compile_file (file_handle=<incomplete type>, type=2) at Zend/zend_language_scanner.l:582
#4  0x000000000051d27a in phar_compile_file (file_handle=<incomplete type>, type=32767) at /tmp/buildd/php5-5.4.4/ext/phar/phar.c:3391
#5  0x00000000006682b0 in compile_filename (type=2, filename=0x7fa871878ce8) at Zend/zend_language_scanner.l:625
#6  0x00000000006e1e2b in ZEND_INCLUDE_OR_EVAL_SPEC_TMP_HANDLER (execute_data=0x7fa871878af8) at /tmp/buildd/php5-5.4.4/Zend/zend_vm_execute.h:7006
#7  0x00000000006fed17 in execute (op_array=0x7fa8718b1af8) at /tmp/buildd/php5-5.4.4/Zend/zend_vm_execute.h:410
#8  0x000000000069eb5e in zend_execute_scripts (type=8, retval=0x7fa8718a8d38, file_count=3) at /tmp/buildd/php5-5.4.4/Zend/zend.c:1279
#9  0x000000000063e133 in php_execute_script (primary_file=0x6aabfc) at /tmp/buildd/php5-5.4.4/main/main.c:2473
#10 0x0000000000431ac0 in main (argc=1904694192, argv=0xdc2660) at /tmp/buildd/php5-5.4.4/sapi/fpm/fpm/fpm_main.c:1885
(gdb) frame 5
#5  0x00000000006682b0 in compile_filename (type=2, filename=0x7fa871878ce8) at Zend/zend_language_scanner.l:625
625     in Zend/zend_language_scanner.l
(gdb) print *filename
$1 = {value = {lval = 1904908816, dval = 6.9347763598749545e-310, str = {val = 0x7fa8718a9a10 "/var/www/bigotter/www/cache/class_index.php", len = 43}, ht = 0x7fa8718a9a10, obj = {handle = 1904908816, handlers = 0x2b}}, refcount__gc = 1904909728, type = 6 '\006', is_ref__gc = 127 '\177'}

So:

After some researches, it appears that this file is regularly rewritten, used in a cache system for Prestashop.

Let's see how it is written:

        // Write classes index on disc to cache it
        $filename = $this->root_dir.Autoload::INDEX_FILE;
        if ((file_exists($filename) && !is_writable($filename)) || !is_writable(dirname($filename)))
                throw new PrestaShopException($filename.' is not writable, please give write permissions (chmod 666) on this file.');
        else
        {
                // Let's write index content in cache file
                // In order to be sure that this file is correctly written, a check is done on the file content
                $loop_protection = 0;
                do
                {
                        $integrity_is_ok = false;
                        file_put_contents($filename, $content);
                        if ($loop_protection++ > 10)
                                break;

                        // If the file content end with PHP tag, integrity of the file is ok
                        if (preg_match('#\?>\s*$#', file_get_contents($filename)))
                                $integrity_is_ok = true;
                }
                while (!$integrity_is_ok);

                if (!$integrity_is_ok)
                {
                        file_put_contents($filename, '');
                        throw new PrestaShopException('Your file '.$filename.' is corrupted. Please remove this file, a new one will be regenerated automatically');
                }
        }

Facepalm. I posted that on a pastebin and shared it around among friends, technical people. It seems that I'm not the only one to be amazed.

What's wrong?

Well... A totally broken approach of handling "corruption problems", as stated in this commit, a "bugfix" for this report.

The good question is not "how to work around corruption problems?". It is "how do corruption problems appear, and how to prevent them?".

Notes:

Concurrency

The issue here is concurrency. Or rather the lack of handling of concurrency.

cache/class_index.php is a PHP file. It is certainly included in some other PHP file, probably loaded for each request to the prestashop application.

It is also rewritten regularly, using the PHP function file_put_contents() which is not atomic: 3 ops: open, write, close.

But, hey, PHP developper... There is more than one PHP process on a loaded production server!

Sometimes, the file needs to be refreshed, so a process opens the file... Begins to write into it. And another process loads the file, as it is required for the application, before the file is fully written.

Boum, big badaboum. "Corruption problem".

The issue is more and more prevalent as more and more PHP processes are running concurrently.

Sad, but true.

The more you get visitors on your website, the more it breaks.

How do I fix that?

Use atomic operations. The correct way to refresh the file is to create a temporary file, and then rename it to the proper name. Why? rename is atomic. file_put_contents is not.

This is an untested patch (against master on github):

    --- a/classes/Autoload.php
    +++ b/classes/Autoload.php
    @@ -136,27 +136,13 @@ class Autoload
                    }
                    else
                    {
    -                       // Let's write index content in cache file
    -                       // In order to be sure that this file is correctly written, a check is done on the file content
    -                       $loop_protection = 0;
    -                       do
    -                       {
    -                               $integrity_is_ok = false;
    -                               file_put_contents($filename, $content, LOCK_EX);
    -                               if ($loop_protection++ > 10)
    -                                       break;
    -
    -                               // If the file content end with PHP tag, integrity of the file is ok
    -                               if (preg_match('#\?>\s*$#', file_get_contents($filename)))
    -                                       $integrity_is_ok = true;
    -                       }
    -                       while (!$integrity_is_ok);
    -
    -                       if (!$integrity_is_ok)
    -                       {
    -                               file_put_contents($filename, '', LOCK_EX);
    -                               // Cannot use PrestaShopException in this context
    -                               die('Your file '.$filename.' is corrupted. Please remove this file, a new one will be regenerated automatically');
    +                       $filename_tmp = $filename.'.tmp.'.getmypid();
    +                       if(file_put_contents($filename_tmp, $content, LOCK_EX) !== FALSE) {
    +                               rename($filename_tmp, $filename);
    +                       } else {
    +                               // $filename_tmp couldn't be written. $filename should be there anyway (even if outdated),
    +                               // no need to die.
    +                               error_log('Cannot write file '.$filename_tmp);
                            }
                    }
     

Notes:

Then, no more corruption problems, no more SIGBUS issues, PHP is in peace with Prestashop.

Update: proper fix merged in Prestashop

I proposed my cheesy patch (with getmypid()) to Prestashop to bring the issue to their attention. I eventually made a PR with a proper fix using tempnam() which has been merged into the development branch so that upcoming Prestashop releases will hopefully be free of this crap :)

Copyright @2023. Zecrazytux . All Rights Reserved .