Recent large-scale cyberattacks, such as the information leak at SK Communications and the attacks on the network infrastructure of NongHyup Bank in 2011, and the cyberattack that targeted major television stations and banks in 2013, have been categorized as Advanced Persistent Threats (APTs). At the start of an APT, an attacker builds malware and customizes it for the intended attack. The attacker plants the malware on a victim’s PC and can then carry out malicious activities such as accessing the internal network and stealing personal information. In the past, attackers delivered malicious code through e-mail or portable media such as USB drives. More recently, attackers have adopted a more sophisticated method of spreading malware through websites, known as the “drive-by download”. Figure 1 illustrates the anatomy of a drive-by download attack. An attacker compromises a legitimate website and inserts a link to a malware distribution page. If a visitor’s PC has security vulnerabilities, the visitor is automatically redirected to the malware distribution page upon visiting the legitimate, but compromised, website.

Figure 1. Drive-by download attack process

In general, drive-by download attacks are detected by security devices known as honeyclients, which come in two types: low-interaction and high-interaction. A low-interaction honeyclient commonly uses static analysis or web browser emulation, inspecting a webpage based on its source code or the website’s metadata without rendering it in a web browser. Although this method is faster, it is known to be vulnerable to newer types of attacks. A high-interaction honeyclient, on the other hand, uses dynamic analysis based on virtualization technology, rendering the website in a real web browser running on a virtual machine. Unfortunately, its analysis is relatively slow, and it can detect an attack only if the vulnerable component targeted by the exploit is activated inside the detection system.
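To make the low-interaction idea concrete, the following is a minimal sketch of static analysis: it inspects raw page source with pattern matching and never renders the page. The indicator names and regular expressions are assumptions chosen for illustration; they are not taken from any particular honeyclient.

```python
import re

# Illustrative indicators only (assumed for this sketch, not from a real product).
INDICATORS = {
    "eval_call": re.compile(r"\beval\s*\("),
    "unescape_call": re.compile(r"\bunescape\s*\("),
    "from_char_code": re.compile(r"String\.fromCharCode"),
    "document_write": re.compile(r"document\.write\s*\("),
    # A long run of %XX or %uXXXX escapes often marks an encoded payload.
    "long_escape_run": re.compile(r"(?:%u?[0-9a-fA-F]{2,4}){20,}"),
}


def static_scan(source):
    """Return the sorted names of indicators found in raw page source."""
    return sorted(name for name, pattern in INDICATORS.items()
                  if pattern.search(source))
```

Because nothing is executed, a scan like this is fast, but it also shows the weakness noted above: code that hides behind a construct the patterns do not anticipate passes unnoticed.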

The KAIST Cyber Security Research Center (CSRC) developed a system called SIMon, which detects malware distribution networks in order to prevent damage from malware spreading through the Web. SIMon is a web-crawling-based monitoring system that analyzes more than 40,000 websites and 40,000,000 webpages. SIMon follows the low-interaction honeyclient approach. Figure 2 depicts the architecture of SIMon. Management servers schedule workers to crawl websites and extract suspicious web components. These components include hidden iframes, meta tags, and JavaScript code that could cause visitors to be automatically redirected to a malware distribution webpage. Working agents decide whether a website is acting as part of a malware distribution network based on the analysis results, which include the presence of script obfuscation, abnormal web components, and a high degree of text entropy.
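The extraction and decision steps described above can be sketched roughly as follows. This is not SIMon’s implementation: the class and function names, the rules for what counts as a hidden iframe, and the entropy threshold of 5.0 bits per character are all assumptions made for illustration.

```python
import math
from collections import Counter
from html.parser import HTMLParser


class SuspiciousComponentParser(HTMLParser):
    """Collects hidden iframes, meta refresh tags, and inline script bodies."""

    def __init__(self):
        super().__init__()
        self.hidden_iframes = []
        self.meta_refreshes = []
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        a = {name: (value or "") for name, value in attrs}
        if tag == "iframe" and self._is_hidden(a):
            self.hidden_iframes.append(a.get("src", ""))
        elif tag == "meta" and a.get("http-equiv", "").lower() == "refresh":
            self.meta_refreshes.append(a.get("content", ""))
        elif tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data

    @staticmethod
    def _is_hidden(a):
        # Assumed heuristic: zero or one pixel size, or hidden via inline CSS.
        style = a.get("style", "").replace(" ", "").lower()
        tiny = a.get("width") in ("0", "1") or a.get("height") in ("0", "1")
        return tiny or "display:none" in style or "visibility:hidden" in style


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in Counter(text).values())


def analyze(html, entropy_threshold=5.0):
    """Flag a page when any assumed indicator is present (threshold is a guess)."""
    parser = SuspiciousComponentParser()
    parser.feed(html)
    parser.close()
    noisy = [s for s in parser.scripts if shannon_entropy(s) > entropy_threshold]
    return {
        "hidden_iframes": parser.hidden_iframes,
        "meta_refreshes": parser.meta_refreshes,
        "high_entropy_scripts": len(noisy),
        "suspicious": bool(parser.hidden_iframes or parser.meta_refreshes or noisy),
    }
```

Calling `analyze(page_source)` on crawled HTML returns the extracted components together with a crude verdict. A real system would presumably weight these signals rather than flag on any single one, since legitimate pages also use meta refresh and minified scripts.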

Figure 2. SIMon's architecture

The CSRC filed a Korean patent application for SIMon in 2013 and published the research in the Journal of the Korea Institute of Information Security and Cryptology. In addition, as a part of the SIMon system, we have developed a countermeasure against web-based device fingerprinting mechanisms and presented the results at the IEEE Symposium on Security and Privacy in May 2014. The CSRC also publishes “The KAIST Weekly Security Report” every Tuesday. Each report contains a list of malware distribution websites and an analysis of that week’s security trends. Currently, there are about 250 recipients, including security officers at Cheong Wa Dae, the Korea Internet Security Agency (KISA), the National Intelligence Service (NIS), and industrial companies. (You may request a free subscription at csrc@kaist.ac.kr.)

Building on these research achievements, we are proceeding with plans to commercialize SIMon in order to cope with advanced drive-by download attacks. The commercialized product, named WebCure, is a standalone hardware device in which the core functionality of SIMon is embedded. Key features of WebCure include real-time analysis of web traffic, monitoring of designated websites, and blocking of internal users’ access to malicious webpages. Real-time web traffic analysis uses SIMon’s malicious webpage detection engine to check the websites that internal users visit through the Internet. A company’s network typically contains a number of web servers connected to external networks. WebCure can monitor such web servers to check whether they have been compromised and are acting as part of a malware distribution network. WebCure can also block access to webpages that spread malware. Based on its own analysis history and SIMon’s malicious website database, it prevents internal users from accessing malware-spreading websites and alerts them to the danger. We expect WebCure to be the best standalone solution for protecting network infrastructures from drive-by download attacks.
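The blocking decision described above, which combines a device’s own analysis history with a shared malicious website database, can be sketched as a simple hostname lookup. The `AccessFilter` class, its method names, and the hostname-list format are hypothetical; the source does not describe how WebCure stores or matches entries.

```python
from urllib.parse import urlsplit


class AccessFilter:
    """Blocks URLs whose host appears in a shared list or in local history."""

    def __init__(self, shared_blocklist):
        # Hostnames from a shared malicious-site database (assumed format).
        self.shared_blocklist = {host.lower() for host in shared_blocklist}
        # Hostnames this device has flagged during its own analysis.
        self.local_history = set()

    def record_detection(self, url):
        """Remember the host of a URL that local analysis judged malicious."""
        host = urlsplit(url).hostname
        if host:
            self.local_history.add(host)

    def is_blocked(self, url):
        """Return True when the URL's host is known to spread malware."""
        host = urlsplit(url).hostname  # already lowercased by urlsplit
        if host is None:
            return False
        return host in self.shared_blocklist or host in self.local_history
```

In this sketch a gateway would call `is_blocked` before forwarding each request and `record_detection` whenever its own engine flags a page. Matching on hostname alone is a simplification; matching on full URLs or URL prefixes would avoid blocking an entire site over one compromised page.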

By researching countermeasures against drive-by download attacks and commercializing them, the CSRC hopes to make a significant contribution to strengthening national cyber security.

Figure 3. WebCure architecture