June 22, 2011
We have recently presented our paper Peering through the iFrame at the INFOCOM mini-conference.
In this paper, we take an in-depth look at a drive-by-download campaign: the one used to spread the Mebroot malware. In a way, this paper is an ideal continuation of our earlier investigations of the Mebroot/Torpig botnet; more generally, however, it aims to provide a snapshot (as comprehensive as possible) of a modern drive-by-download campaign. Mebroot is not the most pervasive or widespread drive-by-download campaign (during our monitoring, it affected "only" several thousand domains), but it is long-lasting and quite successful, and therefore makes for an interesting subject of study.
We started off our study with the goal of gaining a better understanding of all the parties involved in a drive-by-download campaign: the attackers (what is their modus operandi? what infrastructure do they rely on for running the campaign?); the legitimate web sites that get compromised to drive traffic to exploit sites (which sites are targeted? do they notice they have been compromised? how long does it take them to clean up?); and the final potential victims of the attacks (are they indeed vulnerable to the attacks? what is the actual infection rate?).
To answer these questions, we needed to get visibility into the
operations of the drive-by-download campaign, and, as in our previous
studies on Mebroot, we obtained it by infiltrating Mebroot's
infrastructure. A little bit of background is necessary to understand
how this worked in practice.
As in other drive-by-download attacks, the Mebroot campaign compromises
legitimate web sites with code that redirects the visitors of these
sites to the campaign's exploit sites (where the actual exploits are
launched). In the Mebroot case, the injected code uses domain
generation algorithms (DGAs) to dynamically generate the name of the
exploit sites to which victims are sent (instead of having those names
statically hard-coded). In practice, every so often (from one day to a
few days), the DGAs generate a different domain name, thereby redirecting
victims to a different exploit site.
This presumably is done to be more resilient to take-down attempts: in
the traditional model (hard-coded exploit sites), whenever the current
exploit site is blocked, the campaign is effectively disabled: all the
legitimate web sites that an attacker has compromised suddenly become
useless, because they point to a disabled domain.
On the contrary, in the Mebroot case, the disruption caused by taking
down the current exploit sites is only temporary: as soon as the DGAs
generate a new exploit site, the campaign is active again and the sites
that were compromised in the past resume sending their victims to the
new exploit site.
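The date-driven rotation described above can be sketched in a few lines. This is a hypothetical illustration of how a DGA works in general, not Mebroot's actual algorithm; the seed string, hash function, and domain format are all assumptions made for the example.

```python
import hashlib
from datetime import date, timedelta

def generate_domain(day: date, seed: str = "campaign-seed", tld: str = ".com") -> str:
    """Derive a deterministic, date-dependent domain name.

    Hypothetical sketch of a DGA; NOT Mebroot's real algorithm.
    """
    # Hash the seed together with the current date so the output
    # changes from day to day but is reproducible by anyone who
    # knows the algorithm and the seed.
    digest = hashlib.md5(f"{seed}-{day.isoformat()}".encode()).hexdigest()
    # Map hex digits to lowercase letters so the label looks like
    # a plausible hostname rather than raw hex.
    label = "".join(chr(ord("a") + int(c, 16) % 26) for c in digest[:12])
    return label + tld

# The injected code on compromised sites and the attackers' servers
# run the same function, so they agree on each period's exploit domain
# without any hard-coded name. A defender who reverses the algorithm
# can run it forward and pre-register future domains.
today = date.today()
print(generate_domain(today))
print(generate_domain(today + timedelta(days=1)))  # tomorrow's domain differs
```

Running the algorithm forward is exactly the window of opportunity exploited in the study: future domain names are known in advance, so they can be registered before the attackers use them.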
However, DGAs also open a window of opportunity for defenders. In particular, we were able to register some of the domain names that were to be used in the campaign. As a consequence, for several days over a period of almost a year, our own servers were used in the campaign in place of the actual exploit sites. Of course, our servers simply monitored the traffic they received and performed several measurements of their visitors.
This monitoring gave us a lot of interesting information; for all the results, refer to the full paper. Here are two findings (on the final target of the attacks and on the compromised web sites) that I think are particularly interesting.
How vulnerable are the users who get redirected to exploit sites? Quite vulnerable. During our study, we found that roughly between 60% and 80% of the visitors used at least one browser plugin that was known to be vulnerable, and between 30% and 40% of the users we observed were vulnerable to one of the exploits used in the Mebroot drive-by-download campaign. Clearly, these are very worrying statistics. To be precise, these are upper bounds on the actual infection rates: from our vantage point, we could not determine whether an exploit was successful (an attack could be blocked by a host-based defense mechanism, such as an anti-virus tool). In any case, the potential for infection (and the lack of updating and patching) is staggering.
Switching our attention to the compromised web sites that expose their users to exploits: do they realize they have been compromised, and, if so, do they clean up and remediate the infection? Not really. Almost 20% of the compromised web sites remained infected during our entire monitoring period. Those that did clean up did so very slowly: after 25 days, only half of the sites had removed the malicious code.
For more results, stats, and graphs, check out the paper. Here is the abstract:
Drive-by-download attacks have become the method of choice for cyber-criminals to infect machines with malware. Previous research has focused on developing techniques to detect web sites involved in drive-by-download attacks, and on measuring their prevalence by crawling large portions of the Internet. In this paper, we take a different approach at analyzing and understanding drive-by-download attacks. Instead of horizontally searching the Internet for malicious pages, we examine in depth one drive-by-download campaign, that is, the coordinated efforts used to spread malware. In particular, we focus on the Mebroot campaign, which we periodically monitored and infiltrated over several months, by hijacking parts of its infrastructure and obtaining network traces at an exploit server.
By studying the Mebroot drive-by-download campaign from the inside, we could obtain an in-depth and comprehensive view into the entire life-cycle of this campaign and the involved parties. More precisely, we could study the security posture of the victims of drive-by attacks (e.g., by measuring the prevalence of vulnerable software components and the effectiveness of software updating mechanisms), the characteristics of legitimate web sites infected during the campaign (e.g., the infection duration), and the modus operandi of the miscreants controlling the campaign.
For Brett's take on the paper, see this post on the iSecLab blog.