As some of you may be aware, Microsoft Internet Explorer supports MHTML, a simple container format that uses MIME encapsulation (nominally
multipart/related) to combine several documents into a single file. Each container may consist of a number of possibly
base64-encoded documents, with their content type determined solely by the inline MIME data.
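To make the container format a bit more concrete, here is a minimal Python sketch that assembles a multipart/related document from base64-encoded parts using the standard email package, and then reads the parts back. The Content-Location header used to name each part comes from RFC 2557 rather than from anything stated above; the function names, part names, and payloads are made up for illustration.

```python
from email import encoders, message_from_string
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart


def build_container(parts):
    """Assemble a multipart/related container.

    parts: iterable of (name, content_type, payload_bytes) tuples.
    """
    container = MIMEMultipart("related")
    for name, content_type, payload in parts:
        maintype, subtype = content_type.split("/", 1)
        part = MIMEBase(maintype, subtype)
        part.set_payload(payload)
        # Base64-encode the body and set Content-Transfer-Encoding.
        encoders.encode_base64(part)
        # Assumption: parts are named via Content-Location (RFC 2557).
        part["Content-Location"] = name
        container.attach(part)
    return container.as_string()


def list_parts(container_text):
    """Map each part name to (content type, decoded payload)."""
    message = message_from_string(container_text)
    return {
        part["Content-Location"]: (
            part.get_content_type(),
            part.get_payload(decode=True),
        )
        for part in message.get_payload()
    }
```

Note that the type of each part is taken only from the headers inside the container, which is the property the rest of this post is concerned with.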
Perhaps by virtue of not having cross-browser support, the MHTML format is not commonly used on the web - but it is employed by Internet Explorer itself to save downloaded pages to disk, and it is embraced by some third-party applications to deliver HTML-based documentation and help files.
To facilitate access to MHTML containers, the browser also supports a special mhtml: URL scheme, in which the prefix is followed by a fully-qualified URL from which the document is to be retrieved; a "!" delimiter; and the name of the target resource inside the container. Unfortunately, when MHTML containers are accessed over protocols that provide other, normally authoritative means for specifying document type (e.g. Content-Type in HTTP traffic), this protocol-level information is ignored, and a very lax MIME envelope parser is invoked on the retrieved document instead. The behavior of this parser is not documented, but it appears that in many cases, even adequately sanitized user input appearing on HTML pages, in JSON responses, CSV exports, image metadata, and so forth, is sufficient to trick it into treating the underlying document as valid MHTML. All that is needed to keep this parser happy is the ability to place several alphanumeric and punctuation characters on the target page, on several separate lines.
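The URL layout described above can be sketched in a few lines of Python. This is only an illustration of the syntax as described in this post: the helper names and example.com addresses are hypothetical, and splitting on the last "!" is an arbitrary choice for this sketch, since the browser's actual handling of multiple delimiters is not documented.

```python
from typing import Tuple

SCHEME = "mhtml:"


def build_mhtml_url(container_url: str, resource: str) -> str:
    """Join a container URL and a resource name with the "!" delimiter."""
    return f"{SCHEME}{container_url}!{resource}"


def split_mhtml_url(url: str) -> Tuple[str, str]:
    """Split an mhtml: URL into (container URL, resource name)."""
    if not url.lower().startswith(SCHEME):
        raise ValueError("not an mhtml: URL")
    rest = url[len(SCHEME):]
    # Assumption: the last "!" separates the container from the resource.
    container_url, delimiter, resource = rest.rpartition("!")
    if not delimiter:
        raise ValueError('missing "!" delimiter')
    return container_url, resource
```

For example, `build_mhtml_url("http://example.com/page.html", "part1")` yields `mhtml:http://example.com/page.html!part1`, and `split_mhtml_url` reverses the operation.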
Based on this 2007 advisory, it appears that a variant of this issue first appeared in 2004, and has been independently re-discovered several times in that timeframe. In 2006, the vendor reportedly acknowledged the behavior as "by design"; but in 2007, partial mitigations against the attack were rolled out as part of MS07-034 (CVE-2007-2225). Unfortunately, these mitigations did not extend to a slightly modified attack published in a January 2011 post to the full-disclosure@ mailing list.
It appears that the affected sites generally have very little recourse to stop the attack: it is very difficult to block the offending input patterns perfectly, and there may be no reliable way to distinguish between MHTML-related requests and certain other types of navigation (e.g., <embed> loads). A highly experimental server-side workaround devised by Robert Swiecki may involve returning HTTP code 201 Created rather than 200 OK when encountering vulnerable User-Agent strings - as these codes are recognized by most browsers, but seem to confuse the MHTML fetcher itself.
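For readers who want to experiment with the status code idea, a rough sketch as WSGI middleware might look like the following. This is not Robert Swiecki's actual implementation; the function name is hypothetical, and the default check for "MSIE" in the User-Agent header is only a placeholder assumption, since the post does not say which strings should count as vulnerable.

```python
def mhtml_status_workaround(app, is_vulnerable=lambda ua: "MSIE" in ua):
    """Wrap a WSGI app so that 200 OK becomes 201 Created for
    user agents matched by is_vulnerable (an assumed heuristic)."""

    def wrapped(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")

        def patched_start_response(status, headers, exc_info=None):
            # Only plain 200 responses are rewritten; everything else
            # (redirects, errors) passes through unchanged.
            if status.startswith("200") and is_vulnerable(user_agent):
                status = "201 Created"
            if exc_info is not None:
                return start_response(status, headers, exc_info)
            return start_response(status, headers)

        return app(environ, patched_start_response)

    return wrapped
```

Wrapping an existing application is then a one-liner, e.g. `application = mhtml_status_workaround(application)`. Given how experimental the workaround is, it would be wise to test for side effects on caching proxies and other clients before relying on it.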
Update: see this announcement for more.