The New Dawn Of Filter Evasion

Fri, 13 Jul 2007 06:57:16 GMT

This article is about the most important phase in an attack on a web application: the phase in which the markup has just been broken and the attacker tries to inject his own markup, script code or other data. Let's call it the PMBP (post-markup-breaking phase). This phase typically becomes possible when quotes aren't sanitized correctly or when input is placed between two tags. In this article we focus on the first variant, attribute injection, and we will show that protecting your markup from being broken into is the single most important task in client-side security.

Basic filtering

Many developers use standard filter functions to sanitize their output. Mostly a good idea, but the developer has to know what both his filters and the attacker are capable of. Some say that the developer is to blame for the bugs, because he/she doesn't properly implement the security examples from the books. But developers usually do not have enough time and knowledge for cutting-edge security research. Most of the time they need to choose between stripping tags, converting HTML and special characters to entities, URL-encoding the input, using approaches like escapeshellcmd(), or even combining those filtering methods in order to secure the applications they are working on. Usually those methods aren't applied and combined with exact knowledge of what they do, and so they tear open security holes or even cripple the user experience in some situations. Did you know that PHP's escapeshellcmd() leaves the characters .-=a-zA-Z0-9, space and / untouched? This function is recommended by several books as a good filter alternative.
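To make the escapeshellcmd() remark concrete, here is a minimal JavaScript sketch that only approximates the function's escaping rules as described in the PHP manual. The name approxEscapeShellCmd, the simplified quote handling and the sample strings are assumptions made for illustration, not PHP's actual implementation.

```javascript
// Rough approximation of PHP's escapeshellcmd() on Unix-like systems:
// shell metacharacters get a backslash prefix.
// Simplification (assumption): quotes are always escaped here, whereas
// PHP escapes them only when they are unpaired.
function approxEscapeShellCmd(input) {
  return input.replace(/[&#;`|*?~<>^()\[\]{}$\\\n\xFF'"]/g, '\\$&');
}

// An attribute-injection string built only from letters, dots, spaces and "=".
// None of these characters is a shell metacharacter, so nothing changes.
const attributePayload = 'x onerror=location=window.name';
console.log(approxEscapeShellCmd(attributePayload) === attributePayload); // true

// Parentheses, by contrast, are shell metacharacters and do get escaped.
console.log(approxEscapeShellCmd('alert(1)')); // alert\(1\)
```

The point of the sketch is only that a shell-oriented filter says nothing about HTML context: a string can be harmless for the shell and still add an attribute when it lands in an unquoted attribute value.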

Get it running

So what can the attacker do once he/she breaks out of the markup and injects new tags and attributes? When new tags can be injected, the site is considered owned - even if there are filters that block script tags and iframes. The attributes are the more interesting case. Depending on the browser - we're talking about the two major ones in this article - there are several ways to inject attributes that fire JavaScript without requiring user interaction. If the attacker breaks out of a tag which points to binary content, like a link tag, an img tag or even an iframe, embed or object, he/she can use the onerror handler and provide a crippled source attribute. Once the browser tries to load a source like xxx it will fail and fire the event - for which a handler (a JavaScript function call) has already been injected. Alternatively, the style tag or attribute can be used to create XSS without user interaction. Both Gecko-based browsers and all important IE versions ship proprietary properties and functions that execute JavaScript within a style tag or attribute - to name just two: -moz-binding and expression(). In order to exploit these situations the attacker has to inject more code. I've seen several websites whose developers seemed to know that and had implemented filters that stripped a bunch of special characters - sometimes dots and brackets, sometimes other stuff - to prevent the injection of active code. Unfortunately, there's a way to circumvent all of them.
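The onerror case described above can be sketched in a few lines of JavaScript. The template functions renderNaive and renderEncoded, the helper encodeAttr and the payload are hypothetical names chosen for this illustration. The sketch only builds strings and does not touch a real DOM.

```javascript
// Hypothetical template that drops user input into a quoted attribute
// without encoding the quote character.
function renderNaive(src) {
  return '<img src="' + src + '">';
}

// Replace the characters that can end the attribute or the tag with
// entities, so the input stays inside the attribute value.
function encodeAttr(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function renderEncoded(src) {
  return '<img src="' + encodeAttr(src) + '">';
}

// "xxx" is a source that cannot load, so an injected onerror handler would fire.
const payload = 'xxx" onerror="alert(1)';

console.log(renderNaive(payload));
// <img src="xxx" onerror="alert(1)">
console.log(renderEncoded(payload));
// <img src="xxx&quot; onerror=&quot;alert(1)">
```

In the naive output the quote ends the src attribute and a second attribute appears. In the encoded output the whole input remains one (broken) src value, so the markup was never left.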

Circumvent the ignorance

One of the most common filters - I never understood why - is the stripping of the pattern http://. It seems thousands of developers out there believe that without this string it's not possible to get an external inclusion running, and furthermore that without an external inclusion you can't do much harm to the targeted application. This is not true. Months ago I published a miniaturized vector for script inclusions - 20 characters long, and without the http:// string. The cross-browser version is 27 characters long. The same goes for filtering the dot, brackets, spaces etc. Here is an example:

<script src=// 
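Why removing http:// does not stop external inclusions can be shown with a short sketch. The filter stripHttp and both host names are hypothetical. The resolution step uses the standard URL class to mimic how a browser resolves a scheme-relative reference against the current page.

```javascript
// Hypothetical filter of the kind described above: it removes "http://".
function stripHttp(input) {
  return input.replace(/http:\/\//gi, '');
}

// A scheme-relative reference contains no "http://" at all...
const src = '//attacker.example/x.js';
const filtered = stripHttp(src);

// ...yet it resolves against the page's own scheme to an external host.
const resolved = new URL(filtered, 'http://victim.example/page.php').href;

console.log(filtered === src); // true
console.log(resolved);         // http://attacker.example/x.js
```

The filter has nothing to remove, and the browser supplies the scheme on its own.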

Some filters transform all user input to uppercase letters - which is even more useless than stripping the http:// string. A major portal about a server-side programming language once suffered from a bad filter which stripped out all event handlers matching on\w+ as well as the word style. This is a good example. Plain stripping will never make sense - the filter was easily circumvented by the following:
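The following is not the vector that was used against that portal. It is a hypothetical JavaScript sketch of one reason why plain stripping tends to fail: when each pattern is removed in a single pass, removing one pattern can assemble another. The function stripHandlersAndStyle, the order of its two passes and the payload are assumptions.

```javascript
// Hypothetical filter: first remove anything that looks like an event
// handler name (on\w+), then remove the word "style". Each pass runs once.
function stripHandlersAndStyle(input) {
  return input
    .replace(/on\w+/gi, '')
    .replace(/style/gi, '');
}

// A plain handler is removed as intended.
console.log(stripHandlersAndStyle('xxx" onerror="alert(1)'));
// xxx" ="alert(1)

// Hiding "style" inside the handler name survives the first pass,
// and the second pass then reassembles the handler.
console.log(stripHandlersAndStyle('xxx" ostylenerror="alert(1)'));
// xxx" onerror="alert(1)
```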


But it gets even better. Some weeks ago a fairly new and very clever kind of filter-evading vector came to light - these vectors are capable of carrying large payloads in total stealth mode. They do not require externally hosted scripts to perform the task, which is why they are called self-contained XSS.

CSO's nightmare

Self-contained XSS is mostly based on the fact that values can be passed via the URL that are seen only by the client, not by the server. Those values are called fragment identifiers. Everything after the URL hash (#) is visible only to the client, which includes the JavaScript runtime engine. So the attacker just needs to inject a code snippet that evaluates the contents of this part of the URL - a snippet which is usually very short and reveals nothing about what the real payload consists of. Due to the very dynamic nature of JavaScript it is possible to create myriads of variants of those payload triggers, and in combination with browser peculiarities the number of possible triggers becomes practically unlimited. Here are some examples:
,,location))
code:_=eval,__=unescape,___=document.URL,_(__(___))
code:with(location)with(hash)eval(substring(1))
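The split between what the server receives and what client-side script can read can be illustrated with the standard URL class. The URL below and the variable names are hypothetical. The sketch only parses a string and evaluates nothing.

```javascript
// Hypothetical URL; the part after "#" carries the payload.
const url = new URL('http://victim.example/search.php?q=test#alert(document.cookie)');

// What the browser puts on the request line: path and query only.
const requestTarget = url.pathname + url.search;

// What client-side script can read via location.hash (minus the "#").
const fragment = decodeURIComponent(url.hash.slice(1));

console.log(requestTarget); // /search.php?q=test
console.log(fragment);      // alert(document.cookie)
```

A server-side filter or log therefore only ever sees the short trigger that was injected into the page, never the payload it goes on to evaluate.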

Of course, it's also possible to rip these functions apart, shuffle their fragments and compose them back together at the end like this:

l=0||'str',m= 0||'sub',x= 0||'al',y= 0||'ev',g= 0||'tion.h',f= 0 ||'ash',k= 0||'loca',d=(k)+(g)+(f),a
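The principle behind such split-and-reassemble vectors can be sketched as follows. This is not a completion of the vector above. The blacklist function containsBlacklistedWord, the sample string and the variable names are assumptions. The sketch uses globalThis so that it also runs outside a browser.

```javascript
// Hypothetical blacklist that rejects input containing well-known names.
function containsBlacklistedWord(input) {
  return /eval|location|hash|substring/i.test(input);
}

// Fragments in the style of the vector above; no blacklisted word appears whole.
const vector = "x='al',y='ev',k='loca',g='tion'";
const x = 'al', y = 'ev', k = 'loca', g = 'tion';

console.log(containsBlacklistedWord(vector)); // false

// Bracket notation turns the reassembled strings back into live references.
const reassembledName = y + x;              // "eval"
const fn = globalThis[reassembledName];     // the real eval function

console.log(fn === eval); // true
console.log(k + g);       // location
```

Because property names are ordinary strings in JavaScript, any keyword filter has to anticipate every way of building those strings at runtime, which is not a realistic goal.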

Conclusion and credits

So - why are we showing all those vectors and talking about all the possibilities to hide and obfuscate attacks and to evade filter mechanisms? To prove a point, I must say! Once the attacker breaks the markup he/she is in charge of that content - whether there's a filter between the application and the user or not. There's no way of finding a perfect balance between effective filtering and not crippling the user's experience. To make it short - the PMBP must never happen. In order to make this article a dirty cliffhanger - in the next part we will talk about how to avoid this exact phase. We'll learn how to make sure the user won't be annoyed by your filter while still guaranteeing stalwart security. Credits for the shown vectors go to the author, the PHPIDS Group, Giorgio Maone, Kishor, Martin Hinks, Christian Matthies, sirdarckcat and the guys from

good stuff, quite interesting
Mario, this article is definitely a Saturday morning read :)
:( I'm not in the credits,13209,page=1#msg-13326,,location))
code:_=eval,__=unescape,___=document.URL,_(__(___)) code:with(location)with(hash)eval(substring(1))
that is from ma1:,13209,page=2#msg-13366 but based on:,13209,page=2#msg-13361 and the last vector.. doesn't work:
l=0||'str',m= 0||'sub',x= 0||'al',y= 0||'ev',g= 0||'tion.h',f= 0 ||'ash',k= 0||'loca',d=(k)+(g)+(f),a
a is not defined.. it should be something like..
instead of a like this:
l=0||'str',m= 0||'sub',x= 0||'al',y= 0||'ev',g= 0||'tion.h',f= 0 ||'ash',k= 0||'loca',d=(k)+(g)+(f),1[(y)+(x)](d)
well, I don't know what to say. definitely, we are going to discuss your comments with .mario and see if there is a problem. no worries, there must be some kind of misunderstanding :) cheers
Hi! Oh man I am sorry - you are perfectly right. We'll fix that asap!! The vector you are talking about is from Martin and worked - but I guess the blog's filter must have cut something. Here's the full version: Greetings, .mario
Shame on you who don't give credits. Correctness is the first thing to look at. d. c. marktwin
Thx for your comment marktwin. Mistakes happen to (most) people and some even learn from them. Few have reached perfection and avoid them and some of those few even share their knowledge to do so - thx again. Greetings, .mario
:P no problem .mario, I hope you get better from that flu. Greetz!!