Yahoo Site Explorer Spider
I've been talking about client-side spiders for quite some time now, over here and here, and I even came up with a POC based on Yahoo Pipes for my OWASP presentation, "Advanced Web Hacking Revealed", which you can find over there.
Web spiders, in themselves, are nothing particularly interesting. They have been with us for quite some time now and there is no point in discussing what they can do. Still, spiders are the first step towards a successful web attack. Obviously, in order to find the weaknesses within a web application, we first have to enumerate all of its entry points. This is where we launch spiders. Spiders can be semi-automatic or completely automatic, and they may carry attack payloads.
The Page Data service allows you to retrieve information about the subpages in a domain or beneath a path that exist within the Yahoo! index. Yahoo Developer
You see, worms are often quite stupid in nature. They propagate either too fast or too slow. Very often, they are static and attack from specific IP ranges. During the first stage, we are able to see a rise in a particular type of traffic that originates from a particular geographical region. In order to stop further propagation, we can simply block the malicious traffic based on the worm signature. Game Over for the worm. The good guys win!
The spider that I wrote is anything but malicious. It just spiders. However, keep in mind that it would take no time to equip it with the latest client-side and server-side exploits. So, here is the spider's source code:
[http://www.gnucitizen.org/static/blog/2007/07/spider.js](http://www.gnucitizen.org/static/blog/2007/07/spider.js) _and this is how I use it:_ [http://www.gnucitizen.org/static/blog/2007/07/spider-init.js](http://www.gnucitizen.org/static/blog/2007/07/spider-init.js)
Keep in mind that this spider is ultra fast. It makes only a few connections in order to obtain the entire directory structure of the targeted website. You can launch the POC from here.
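To give a feel for why so few connections are needed, here is a minimal sketch of the last step: turning a flat list of indexed page URLs into the directory structure they imply. This is not taken from spider.js. The function name `directoriesFromUrls` and the sample URLs are my own assumptions, and the input array simply stands in for whatever list of subpages an index lookup hands back.

```javascript
// Hypothetical helper (not part of spider.js): collapse a flat list of
// page URLs into the sorted set of directories that those URLs imply.
function directoriesFromUrls(urls) {
  const dirs = new Set();
  for (const raw of urls) {
    let parsed;
    try {
      parsed = new URL(raw);
    } catch (e) {
      continue; // skip anything that is not an absolute URL
    }
    // Drop the leading empty segment and the final one (file name or
    // trailing empty string), leaving only the directory names.
    const segments = parsed.pathname.split("/").slice(1, -1);
    let path = "/";
    dirs.add(parsed.origin + path); // the site root is always a directory
    for (const segment of segments) {
      path += segment + "/";
      dirs.add(parsed.origin + path); // record every parent on the way down
    }
  }
  return Array.from(dirs).sort();
}

// Usage with made-up URLs standing in for an index response.
const sample = [
  "http://example.com/blog/2007/07/post.html",
  "http://example.com/static/img/logo.png",
];
console.log(directoriesFromUrls(sample).join("\n"));
```

The point of the sketch is that once somebody else has already crawled the site, the directory tree falls out of plain string handling, with no requests sent to the target at all.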