Today, these criticisms seem rather arbitrary: although framed navigation had its share of amusing missteps (no worse than most other HTML features, I'd argue), frames have become an important and unobtrusive part of the modern web, and a valuable content compartmentalization tool. But shockingly, even if for all the wrong reasons, the original detractors had one thing right: in a sense, frames turned out to be our doom.
How so? Recall that framed browsing dates back to the days of the web being a simple tool for distributing static content - and in that context, the technology warranted no special consideration from the security community; but as our browsers morphed into de facto operating systems for increasingly complex, dynamic applications - well, we quickly discovered that the ability to selectively embed fully functional, third-party content on unrelated and potentially malicious websites is pretty bad news.
One of the earliest problems - with early reports dating back to at least 2004, and variants still being discovered several years later - is that frames are implemented using essentially the same model as standalone windows; this model allows any website in possession of a window's name (or its DOM handle) to navigate it at will. This property is mostly harmless when dealing with proper windows equipped with an address bar - but is a disaster for seamlessly framed regions on trusted websites: if malicious-site.com can open trusted-application.com in a new window, and then navigate that application's frames to any other location, it can, essentially, silently hijack the UI.
Following this discovery, Adam Barth and others spent a fair amount of time proposing a better approach, and convincing several browser vendors to implement it; but even today, certain unavoidable weaknesses in this model prevail.
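To make the difference concrete, here is a minimal TypeScript sketch that models the legacy "anyone with a handle can navigate" behavior next to a descendant-style restriction of the kind proposed in that line of work. Everything in it is an assumption made for illustration: the FrameNode type, the function names, and the exact rules are simplified stand-ins, not the algorithm any particular browser ships.

```typescript
// Hypothetical, stripped-down model of a window or frame; real browsers
// track far more state than this.
interface FrameNode {
  origin: string;           // e.g. "https://trusted-application.com"
  parent: FrameNode | null; // null for a top-level window
}

// Legacy behavior described above: anybody holding the window name or a
// DOM handle may navigate the target, no questions asked.
function canNavigatePermissive(_source: FrameNode, _target: FrameNode): boolean {
  return true;
}

// True if `candidate` appears somewhere in the parent chain of `frame`.
function isAncestor(candidate: FrameNode, frame: FrameNode): boolean {
  for (let f: FrameNode | null = frame.parent; f !== null; f = f.parent) {
    if (f === candidate) return true;
  }
  return false;
}

// Descendant-style restriction (simplified; an assumption of this sketch,
// not the exact rule any browser implements).
function canNavigateDescendant(source: FrameNode, target: FrameNode): boolean {
  if (source === target) return true;               // navigating yourself
  if (source.origin === target.origin) return true; // same-origin content
  if (target.parent === null) return true;          // top-level window: the address bar shows the change
  return isAncestor(source, target);                // otherwise, only your own descendants
}
```

Under this model, a top-level malicious-site.com window asking to navigate a frame nested inside a trusted-application.com window is allowed by canNavigatePermissive but refused by canNavigateDescendant, because the attacker is neither same-origin with the frame nor one of its ancestors.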
The next notable milestone: clickjacking - a seemingly obvious threat essentially ignored by the security community (perhaps in the hope it would disappear), until extravagantly publicized by Jeremiah Grossman and Robert 'RSnake' Hansen in 2008. The idea behind the attack is simple: if a frame containing trusted-application.com is placed on malicious-site.com, and then partly obscured or made transparent - the user can be easily tricked into thinking he is interacting with the UI of malicious-site.com - but end up sending the UI event to trusted-application.com instead.
As the name implies, their analysis focused on mouse clicks - which in a sense, did the attack some disservice: the reporting led the community to assume that only certain exceedingly simple UI actions (such as the "like" buttons on social networking sites) could be realistically targeted - and that the attacker would still be facing difficulties computing the right alignment of visual elements for all targeted systems, browsers, and screen resolutions. But that's simply not true.
To demonstrate other perils of cross-domain frames, I posted a proof-of-concept exploit for an attack I jokingly dubbed strokejacking - showing that with the use of onkeydown events, selective keystroke redirection across domains can be used to perform very complex UI actions in the targeted application, far beyond what is possible with clickjacking alone. I also discussed reverse strokejacking - an even more depressing variant where evil embeddable gadgets on a targeted site are able to silently intercept user input by playing with the focus() method. These reports received very little attention - but given the ridiculous name, that's perhaps for the best.
Since then, the situation with framed content has gotten even worse: not long ago, we witnessed a presentation from Paul Stone. Paul discussed drag-and-drop attacks on third-party frames: text selected in one obscured frame pointing to trusted-application.com could be unintentionally dragged and dropped into the area controlled by malicious-site.com - thus revealing the content across domains. Many researchers and browser vendors summarily dismissed this threat, on the grounds that the necessary interactions must be complex and unusual - for example, triple-clicking or pressing Ctrl-A to select text - and therefore, that they are difficult to solicit; but this is incorrect.
What have we missed, then? Paul casually mentioned one special type of a common UI interaction we all frequently engage in on even the least interesting sites: using the scroll bar. Note that the act of grabbing the slider, dragging it down, and releasing it... is eerily similar to the act of selecting text, or dragging and dropping a selection across the page. The attack can be modified thus:
- Create a page with an article that spans more than a single screen - or has a TEXTAREA with an EULA that needs to be scrolled to the end before the "I agree" button is enabled, instead.
- Have a transparent frame pointing to trusted-application.com that follows the mouse pointer.
- As soon as the user clicks the slider and holds the mouse button, reposition the frame up in relation to the cursor. This ensures that the entire framed text is selected, regardless of mouse movement (yes, this works!).
- Wait for the mouse button to be released.
- Reposition the frame so that the next click will begin to drag the selection.
- While the user is interacting with the slider, move the frame away, and place a receiving designMode container under the mouse pointer.
- Steal documents across domains!
In the end, cross-domain frames proved to be a giant and completely unexpected attack surface; and very depressingly, we still have no idea how to properly address the problem once and for all. There simply are no simple and elegant solutions compatible with the modern web; and rest assured, browser vendors are extremely hesitant to experiment with complex heuristics instead. The only thing we decided to do to tackle the general threat is plastering the holes over with X-Frame-Options - a naive opt-in mechanism that allows websites to refuse being framed across domains. Alas, this mechanism will never be used by all the sites that actually need it - and it offers no protection in more complex cases, such as the increasingly prevalent embeddable gadgets.
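For readers who have not seen the header in action, the following TypeScript sketch shows roughly what opting in looks like, together with a toy model of the check a browser performs before rendering a framed response. The names Header, FramePolicy, withFrameOptions, and framingAllowed are hypothetical helpers invented for this example; the model compares against a single embedding origin, uses the last header if several are present, and treats unrecognized values as if the header were absent, all of which are simplifying assumptions.

```typescript
// A response header as a [name, value] pair (hypothetical helper type).
type Header = [string, string];
type FramePolicy = "DENY" | "SAMEORIGIN";

// Server side: return a copy of the headers with X-Frame-Options set,
// replacing any earlier copy of that header. The input is not modified.
function withFrameOptions(headers: Header[], policy: FramePolicy): Header[] {
  const added: Header = ["X-Frame-Options", policy];
  return headers
    .filter(h => h[0].toLowerCase() !== "x-frame-options")
    .concat([added]);
}

// Browser side (toy model): may a page at `embedderOrigin` display a
// response from `contentOrigin` inside a frame?
function framingAllowed(
  headers: Header[],
  contentOrigin: string,
  embedderOrigin: string
): boolean {
  const value = headers
    .filter(h => h[0].toLowerCase() === "x-frame-options")
    .map(h => h[1])
    .pop();
  if (value === undefined) return true; // opt-in: no header means no protection
  const policy = value.trim().toUpperCase();
  if (policy === "DENY") return false;
  if (policy === "SAMEORIGIN") return contentOrigin === embedderOrigin;
  return true; // assumption: unrecognized values are ignored
}
```

The first return in framingAllowed is the weakness described above: a site that never sends the header stays frameable by anyone. The sketch also hints at why gadgets are not helped, since content that is meant to be embedded across domains cannot send DENY or SAMEORIGIN without breaking itself.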
Because of this, I often fear that we are bound to repeat the painful security lessons of framed browsing very soon; for example, I am simply intimidated by the rush to deploy some of the more complex and at times exotic features as a part of HTML5 - web sockets, workers, sandboxing, storage, application caches, notifications, CORS, UMP, and countless other new HTML, CSS, and JS extensions added there every other week.
Yes, it's called "job security". But at times, it tends to suck.
PS. Yes, yes, I know. Interesting bugs coming soon. I have a very cool and major fuzzer waiting to be released, but I am still waiting for all vendors to fix the outstanding issues. Neat Firefox SOP bug is coming soon, too.