Article: PHP Web Security

May 1st, 2010

This article is the first part of the HTML version of SektionEins GmbH’s PHP Web Security Poster. You can download an outdated PDF version here.

PHP Web Security

Vulnerabilities & Concepts

Vulnerability Types

Cross Site Scripting (XSS)
This vulnerability allows data to be injected into webpages. This data is then interpreted as code and executed by the viewer’s web browser, which can effectively be seen as remote controlling a victim’s browser.

Cross Site Request Forgery (CSRF)
CSRF refers to a type of exploit in which the victim’s browser is tricked into triggering an authenticated action inside a vulnerable web application. A website can be affected by CSRF regardless of whether it is susceptible to XSS. How dangerous CSRF is depends on the kind of action triggered this way and its impact.

SQL Injection
SQL injection attacks lead to the manipulation of SQL queries. Vulnerable applications allow dynamically built SQL queries to contain unfiltered or improperly sanitised user input. If exploited successfully an attacker can gain access to all data in the database as well as modify data, limited only by the access level of the database user.

Insecure Session Handling
This category covers problems enabling attackers to access or manipulate a session token in order to control or take over a session.

Session Fixation
Session Fixation allows an attacker to control the session of a user. This is done by injecting a known token to be used as a valid session token.

Information Disclosure
As the name suggests, security related information is being divulged by the target system, which may simplify an attack. Such information can be found in various places, e.g. code comments, directory listings, error messages or even in search results of your favourite search engine.

Header Injection
This vulnerability allows HTTP headers to be injected into an HTTP response.

File Inclusion
The inclusion of local or remote files into a web application is a serious security vulnerability, which may lead to arbitrary code execution on the server.

Insecure Configuration
Misconfiguration of server or application software may facilitate or simplify attacks.

Weak randomness
This problem refers to predictable random number generation; e.g. badly chosen random seeds or algorithms using insufficient entropy are known to generate weak random numbers.
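
As an illustration, here is a minimal sketch that draws a token from the operating system's cryptographically secure generator instead of a seeded generator such as mt_rand(). It assumes PHP 7.0 or later for random_bytes(); generate_token() is a hypothetical helper name, not part of the poster.

```php
<?php
// Hypothetical helper: returns a hex-encoded token taken from a CSPRNG.
// random_bytes() is available from PHP 7.0 onwards.
function generate_token(int $bytes = 16): string
{
    return bin2hex(random_bytes($bytes));
}

$token = generate_token();   // 32 hexadecimal characters for 16 bytes
```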


Secure Input Handling
Input filters and validators can be used to scan user input for specific patterns known to trigger unwanted side effects in web applications. User input can contain fragments of JavaScript, SQL, PHP or other code which – if unfiltered – could then lead to code execution within the context of the web application.

Sanitising functions can be used to “repair” user input, according to the application’s restrictions (e.g. specific datatypes, maximum length) instead of rejecting potentially dangerous input entirely. In general, the use of sanitising functions is not encouraged, because certain kinds and combinations of sanitising filters may have security implications of their own. In addition, the automatic correction of typos could render the input syntactically or semantically incorrect.
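
The risk of “repairing” input can be sketched with a deliberately naive filter. Both function names below are hypothetical, and the pattern in the validator is only an assumed example of an application restriction.

```php
<?php
// Naive "repair" filter (hypothetical): strips the literal string "<script>".
function naive_sanitise(string $input): string
{
    return str_replace('<script>', '', $input);
}

// Validator (hypothetical): accepts or rejects, but never rewrites the input.
function is_plain_name(string $input): bool
{
    return preg_match('/^[A-Za-z0-9 _-]{1,64}$/D', $input) === 1;
}

$payload  = '<scr<script>ipt>alert(1)</script>';
$repaired = naive_sanitise($payload);   // removing the inner tag reassembles "<script>"
$accepted = is_plain_name($payload);    // false: the input is rejected outright
```

Here the sanitiser turns a harmless-looking string into a working script tag, while the validator simply refuses it.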

There are several different kinds of escaping:

  • The backslash prefix “\” defines a meta character within strings. For example, \t is a tab character and \n is a newline character. This can be of particular interest for functions where the newline character has a special purpose, e.g. header(). Within regular expressions the backslash is used to escape special characters, such as \. or \*, which is relevant for all functions handling regular expressions.
  • HTML encoding translates characters normally interpreted by the web browser as HTML into their encoded equivalents – e.g. < is &lt; or &#60; or &#x3c; and > is &gt; or &#62; or &#x3e;. HTML encoding should be used for output handling, where user input should be reflected in HTML without injecting code. (See also: htmlentities())
  • URL encoding makes sure that every character not allowed within URLs, according to RFC 1738, is properly encoded. E.g. space converts to + or %20 and < is %3C. This escaping is relevant for functions handling URLs, such as urlencode() and urldecode().
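
The three kinds of escaping can be compared side by side in a short sketch. The sample strings are made up for illustration; the functions are standard PHP.

```php
<?php
$input = 'a < b & "c"';

// HTML encoding: for reflecting data inside HTML output.
$html = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
// $html is: a &lt; b &amp; &quot;c&quot;

// URL encoding: for data placed in a URL.
$query = urlencode($input);      // a+%3C+b+%26+%22c%22
$path  = rawurlencode($input);   // a%20%3C%20b%20%26%20%22c%22

// Backslash escaping for use inside a regular expression.
$regex = preg_quote('1.5*x', '/');   // 1\.5\*x
```

Each encoding is only correct for its own context, so the choice depends on where the value ends up, not on where it came from.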

There are two different approaches to filtering input data – whitelisting and blacklisting. Blacklisting checks input data against a list of “bad patterns”. This way, unwanted input can be discarded and all other content can be processed further. On the other hand, whitelisting checks input data against a list of known “good patterns”. All unmatched input can be discarded and only input recognised as valid is accepted.
In the real world whitelisting turned out to be far more resistant to security vulnerabilities than blacklisting, since it is usually a lot easier to specify the narrow set of valid patterns for the whitelist than to exclude every invalid input with a blacklist. In particular, whitelisting should be used for input directly controlling the program flow, e.g. for include statements or eval().
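
A whitelist for an include statement might look like the following sketch. The page names, file paths and the resolve_page() helper are hypothetical.

```php
<?php
// Hypothetical page map: request value => file to include.
const ALLOWED_PAGES = [
    'home'    => 'pages/home.php',
    'about'   => 'pages/about.php',
    'contact' => 'pages/contact.php',
];

// Returns the file for a whitelisted page name, or null for anything else.
function resolve_page($requested): ?string
{
    if (!is_string($requested)) {
        return null;   // e.g. ?page[]=x arrives as an array
    }
    return ALLOWED_PAGES[$requested] ?? null;
}

$file = resolve_page($_GET['page'] ?? 'home');
if ($file === null) {
    http_response_code(404);
    echo 'Unknown page';
}
// Otherwise: include $file;  (only ever one of the three files listed above)
```

Because user input only selects a key, it never becomes part of the included path, so traversal sequences such as ../ have nothing to act on.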

Security Related PHP Functions

Validation and Sanitising Functions

The PHP core provides a few functions suitable for validation and sanitising:

  • is_numeric() Checks a variable for numeric content.
  • is_array() Checks if a variable is an array.
  • strlen() Returns a string’s length.
  • strip_tags() Removes HTML and PHP tags. Warning: As long as certain HTML tags remain, JavaScript can be injected along with tag attributes.
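
Two of these functions have behaviour that is easy to misjudge, as this short sketch with made-up sample values shows.

```php
<?php
// is_numeric() accepts more than plain digits; ctype_digit() does not.
$a = is_numeric('1e5');      // true: exponent notation counts as numeric
$b = ctype_digit('1e5');     // false: "e" is not a digit

// strip_tags() with an allowed tag leaves that tag's attributes untouched.
$html  = '<b onmouseover="alert(1)">hi</b><i>there</i>';
$clean = strip_tags($html, '<b>');
// $clean is: <b onmouseover="alert(1)">hi</b>there
```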

CType Extension
By default, PHP ships with the CType extension enabled. Each of the following functions checks if all characters of a string fall under the described group of characters:

  • ctype_alnum() alphanumeric characters – A-Z, a-z, 0-9
  • ctype_alpha() alphabetic characters – A-Z, a-z
  • ctype_cntrl() control characters – e.g. tab, line feed
  • ctype_digit() numerical characters – 0-9
  • ctype_graph() characters creating visible output e.g. no whitespace
  • ctype_lower() lowercase letters – a-z
  • ctype_print() printable characters
  • ctype_punct() punctuation characters – printable characters, but not digits, letters or whitespace, e.g. .,!?:;*&$
  • ctype_space() whitespace characters – e.g. newline, tab
  • ctype_upper() uppercase characters – A-Z
  • ctype_xdigit() hexadecimal digits – 0-9, a-f, A-F
if (!ctype_print($_GET['var'])) {
   die("User input contains non-printable characters");
}

Filter Extension – ext/filter
Starting with PHP 5.2.0 the filter extension has provided a simple API for input validation and input filtering.

  • filter_input() Retrieves the value of any GET, POST, COOKIE, ENV or SERVER variable and applies the specified filter.
      <?php $url = filter_input(INPUT_GET, 'url', FILTER_VALIDATE_URL); ?>
  • filter_var() Filters a variable with the specified filter.
      <?php $url = filter_var($var, FILTER_VALIDATE_URL); ?>

List of Filters

Validation Filters

  • FILTER_VALIDATE_INT Checks whether the input is an integer numeric value.
  • FILTER_VALIDATE_BOOLEAN Checks whether the input is a boolean value.
  • FILTER_VALIDATE_FLOAT Checks whether the input is a floating point number.
  • FILTER_VALIDATE_REGEXP Checks the input against a regular expression.
  • FILTER_VALIDATE_URL Checks whether the input is a URL.
  • FILTER_VALIDATE_EMAIL Checks whether the input is a valid email address.
  • FILTER_VALIDATE_IP Checks whether the input is a valid IPv4 or IPv6 address.
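
A few of the validation filters in use, as a sketch with made-up input values. Validation filters return the converted value on success and false on failure, so results should be compared with === rather than tested for truthiness.

```php
<?php
// Integer within a range; returns the int on success, false on failure.
$age = filter_var('42', FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 1, 'max_range' => 120],
]);                                                  // int 42
$bad = filter_var('42abc', FILTER_VALIDATE_INT);     // false

// "0" validates to int 0, which is falsy: compare with === false.
$zero = filter_var('0', FILTER_VALIDATE_INT);        // int 0

// IPv4 only: the flag rejects IPv6 notation.
$ip4 = filter_var('192.168.0.1', FILTER_VALIDATE_IP, FILTER_FLAG_IPV4);
$ip6 = filter_var('::1', FILTER_VALIDATE_IP, FILTER_FLAG_IPV4);   // false

// Booleans: FILTER_NULL_ON_FAILURE separates "no" (false) from garbage (null).
$yes   = filter_var('yes', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);    // true
$maybe = filter_var('maybe', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);  // null
```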

Sanitising Filters

  • FILTER_SANITIZE_STRING / FILTER_SANITIZE_STRIPPED Strips and HTML-encodes characters according to flags and applies strip_tags().
  • FILTER_SANITIZE_SPECIAL_CHARS Encodes ' " < > & \0 and optionally all characters > chr(127) into numeric HTML entities.
  • FILTER_SANITIZE_EMAIL Removes all characters not commonly used in an email address.
  • FILTER_SANITIZE_URL Removes all characters not allowed in URLs.
  • FILTER_SANITIZE_NUMBER_INT Removes all characters except digits and + -.
  • FILTER_SANITIZE_NUMBER_FLOAT Removes all characters not allowed in floating point numbers.
  • FILTER_SANITIZE_MAGIC_QUOTES Applies addslashes().
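
The following sketch, using made-up input, shows why sanitised output should still be validated: removing characters does not guarantee that what remains is well-formed.

```php
<?php
// Keeps digits, "+" and "-" only; everything else is dropped.
$n = filter_var('12 apples - 3', FILTER_SANITIZE_NUMBER_INT);   // "12-3"

// The sanitised string is not a valid integer, so validate afterwards.
$valid = filter_var($n, FILTER_VALIDATE_INT);                   // false

// Encodes HTML-significant characters as numeric entities.
$s = filter_var('<a href="x">', FILTER_SANITIZE_SPECIAL_CHARS);
// $s is: &#60;a href=&#34;x&#34;&#62;
```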

Other Filters

  • FILTER_UNSAFE_RAW Is a dummy filter.
  • FILTER_CALLBACK Calls a userspace callback function defining the filter.
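
FILTER_CALLBACK returns whatever the callback returns, so a whitelist check can be plugged into the filter API. In this sketch validate_username() and its pattern are hypothetical.

```php
<?php
// Hypothetical callback: accepts 3 to 16 lowercase letters, digits or "_".
function validate_username($value)
{
    return preg_match('/^[a-z0-9_]{3,16}$/D', $value) === 1 ? $value : false;
}

$ok  = filter_var('alice_01', FILTER_CALLBACK, ['options' => 'validate_username']);
$bad = filter_var('alice<script>', FILTER_CALLBACK, ['options' => 'validate_username']);
// $ok is "alice_01", $bad is false
```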

Escaping and Encoding Functions

  • htmlspecialchars() Escapes the characters & < and > as HTML entities to protect the application against XSS. The correct character set and the mode ENT_QUOTES should be used.
      <?php echo "Hello " . htmlspecialchars(
    $_GET['name'], ENT_QUOTES, 'utf-8'); ?>
  • htmlentities() Applies HTML entity encoding to all applicable characters to protect the application against XSS. The correct character set and the mode ENT_QUOTES should be used.
      <?php echo "Hello " . htmlentities($_GET['name'], ENT_QUOTES, 'utf-8'); ?>
  • urlencode() Applies URL encoding as seen in the query part of a URL.
      <?php $url = "" .
    "index.php?param=" . urlencode($_GET['pa']); ?>
  • addslashes() Applies a simple backslash escaping. The input string is assumed to be single-byte encoded. addslashes() should not be used to protect against SQL injections, since most database systems operate with multi-byte encoded strings, such as UTF-8.
  • addcslashes() Applies backslash escaping. This can be used to prepare strings for use in a JavaScript string context. However, protection against HTML tag injection is not possible with this function.
  • mysql_real_escape_string() Escapes a string for use with mysql_query(). The character set of the current MySQL connection is taken into account, so it is safe to operate on multi-byte encoded strings. Applications implementing string escaping as protection against SQL injection attacks should use this function.
          $sql = "SELECT * FROM user WHERE" .
             " login='" . mysql_real_escape_string($_GET['login'], $db) . "'";
  • preg_quote() Should be used to escape user input to be inserted into regular expressions. This way the regular expression is safeguarded from semantic manipulations.
          $repl = preg_replace('/^' .
          preg_quote($_GET['part'], '/').
          '-[0-9]{1,4}/', '', $str);
  • escapeshellarg() Escapes a single argument of a shell command. In order to prevent shell code injection, single quotes in user input are being escaped and the whole string enclosed in single quotes.
          system('resize /tmp/image.jpg ' .
             escapeshellarg($_GET['w']).' '.
             escapeshellarg($_GET['h']));
  • escapeshellcmd() Escapes all meta characters of a shell command in a way that no additional shell commands can be injected. If necessary, arguments should be enclosed in quotes.
          system(escapeshellcmd('resize /tmp/image.jpg "' .
             $_GET['w'].'" "'.
             $_GET['h']. '"'));
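
Escaping and validation can be combined. The sketch below reuses the resize command from the examples above and only builds the command line without executing it; build_resize_command() is a hypothetical helper.

```php
<?php
// Hypothetical helper around the "resize" command used in the examples above.
// Returns the command line, or null if a dimension is not a plain integer.
function build_resize_command(string $w, string $h): ?string
{
    if (!ctype_digit($w) || !ctype_digit($h)) {
        return null;   // validate first: only digits are meaningful here
    }
    // Escape anyway, so the call stays safe if the validation is relaxed later.
    return 'resize /tmp/image.jpg ' . escapeshellarg($w) . ' ' . escapeshellarg($h);
}

$cmd = build_resize_command('640', '480');
// On Unix-like systems: resize /tmp/image.jpg '640' '480'
$rejected = build_resize_command('640; rm -rf /', '480');   // null
```
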
Secure Programming

Securing HTML Output
In order to prevent the execution of JavaScript code originating from user input, it is mandatory to perform a suitable string sanitisation on all dynamic data before any HTML output. The use of htmlentities() is considered sufficient within normal HTML context.
However, if data can be injected into tags or tag attributes, JavaScript can be executed by means of event handlers such as onClick or by modifying style attributes. For these cases it is recommended to apply a whitelist filter allowing only predefined tag attributes or style sheets to be inserted.
URLs within tag attributes must be checked as well. Some URI schemes, such as data: vbscript: and javascript: can be used to execute code. Therefore only specific schemes should be allowed. Of course, it is always a good idea to encode the query part of a URL appropriately as well.
Finally, data put directly into JavaScript code must be prevented from breaking out of its JavaScript context. JavaScript strings are known to be particularly prone to incorrect escaping.
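
The last two points can be sketched as follows. Both helper names are hypothetical. Using json_encode() with the JSON_HEX_* flags to produce a JavaScript string literal is a technique chosen for this illustration rather than one named by the poster, and JSON_THROW_ON_ERROR assumes PHP 7.3 or later.

```php
<?php
// Accepts only absolute http(s) URLs; everything else is rejected.
function is_safe_link(string $url): bool
{
    // Browsers may ignore whitespace and control characters inside a scheme.
    if (preg_match('/[\x00-\x20\x7f]/', $url)) {
        return false;
    }
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}

// Encodes a PHP string as a JavaScript string literal for use inside <script>.
function js_string(string $value): string
{
    return json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

$href = 'https://example.com/?q=' . urlencode('a&b');
$attr = is_safe_link($href) ? htmlspecialchars($href, ENT_QUOTES, 'UTF-8') : '#';
$js   = js_string('</script><script>alert(1)</script>');   // contains no "<" or ">"
```

The scheme check is a whitelist: data:, vbscript: and javascript: URLs are rejected because they are not on the list, not because they were anticipated.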

Regular Expressions
Every user input placed inside regular expressions must be prepared using preg_quote(). Otherwise an injection into the expression’s logic can easily lead to incorrect application behaviour, buffer overflows, denial of service or application crashes.
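
The effect of preg_quote() can be seen with a made-up input consisting of regular expression meta characters. The pattern mirrors the preg_replace() example shown earlier.

```php
<?php
$userInput = '.*';            // attacker-supplied value
$subject   = 'anything-42';

// Unquoted: ".*" acts as a wildcard, so the pattern matches any prefix.
$unsafe = preg_match('/^' . $userInput . '-[0-9]{1,4}$/', $subject);   // 1

// Quoted: ".*" is matched literally, so this subject no longer matches.
$quoted  = '/^' . preg_quote($userInput, '/') . '-[0-9]{1,4}$/';
$safe    = preg_match($quoted, $subject);   // 0
$literal = preg_match($quoted, '.*-42');    // 1
```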


HTTP Header Output
HTTP headers can be set using the header() function. User input should always be checked before being passed to header(), otherwise a number of security issues become relevant.
Newline characters should never be used with header() in order to prevent HTTP header injections. Injected headers can be used for XSS and HTTP response splitting attacks, too. In general, user input should be handled in a context-sensitive manner.
Dynamic content within parameters to Location or Set-Cookie headers should be escaped by urlencode().

<?php
if (strpbrk($_GET['x'], "\r\n"))
    die('line break in x');
header("Location: " . urlencode($_GET['x']));
header("Set-Cookie: mycookie=" . urlencode($_GET['x']));
?>
