Real-Time Detection of Malicious Network Activity Using Stochastic Models

Jaeyeon Jung
Massachusetts Institute of Technology, Cambridge, MA

This dissertation develops approaches to rapidly detect malicious network traffic, including packets sent by portscanners and network worms. The main hypothesis is that stochastic models capturing a host's particular connection-level behavior provide a good foundation for identifying malicious network activity in real time. Using these models, the dissertation shows that a detection problem can be formulated as one of observing a particular "trajectory" of arriving packets and inferring from it the most likely classification for the given host's behavior. This stochastic approach makes it possible not only to estimate an algorithm's performance from the measurable statistics of a host's traffic but also to balance the goals of promptness and accuracy in detecting malicious network activity.

This dissertation presents three detection algorithms based on Wald's mathematical framework of sequential analysis. First, Threshold Random Walk (TRW) rapidly detects remote hosts performing a portscan against a target network. TRW is motivated by an empirically observed disparity: connections to newly visited local addresses succeed far more often for benign hosts than for portscanners. Second, the dissertation presents a hybrid approach that accurately detects scanning worm infections soon after the infected local host begins to engage in worm propagation. Finally, it presents a targeting worm detection algorithm, Rate-Based Sequential Hypothesis Testing (RBS), that promptly identifies high-fan-out behavior by hosts (e.g., targeting worms) based on the rate at which the hosts initiate connections to new destinations. RBS is built on an empirically driven probability model that captures benign network characteristics. The dissertation then presents RBS+TRW, a unified framework for detecting fast-propagating worms independently of their target discovery strategy.
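To make the sequential-analysis idea behind TRW concrete, the following is a minimal sketch of a Wald-style sequential hypothesis test over first-contact connection outcomes. It is an illustration under stated assumptions, not the dissertation's implementation: the function names, the success probabilities `theta0` (benign) and `theta1` (scanner), and the target rates `alpha` (false positive) and `beta` (detection) are hypothetical values chosen for the example and do not come from the abstract.

```python
import math
from typing import Iterable, Tuple


def wald_thresholds(alpha: float, beta: float) -> Tuple[float, float]:
    """Return (lower, upper) likelihood-ratio thresholds from Wald's
    approximations, given a target false positive rate `alpha` and a
    target detection rate `beta` (both assumed inputs)."""
    upper = beta / alpha
    lower = (1.0 - beta) / (1.0 - alpha)
    return lower, upper


def trw_classify(
    outcomes: Iterable[bool],
    theta0: float = 0.8,   # assumed P(success | benign host)
    theta1: float = 0.2,   # assumed P(success | scanner)
    alpha: float = 0.01,   # assumed target false positive rate
    beta: float = 0.99,    # assumed target detection rate
) -> Tuple[str, int]:
    """Walk the log-likelihood ratio over a remote host's first-contact
    connection outcomes (True = connection succeeded) and stop as soon
    as either threshold is crossed.

    Returns (verdict, number_of_observations_used)."""
    lower, upper = wald_thresholds(alpha, beta)
    log_lower, log_upper = math.log(lower), math.log(upper)

    # Per-observation steps of the random walk.
    step_success = math.log(theta1 / theta0)                  # moves toward "benign"
    step_failure = math.log((1.0 - theta1) / (1.0 - theta0))  # moves toward "scanner"

    llr = 0.0
    n = 0
    for success in outcomes:
        n += 1
        llr += step_success if success else step_failure
        if llr >= log_upper:
            return "scanner", n
        if llr <= log_lower:
            return "benign", n
    return "pending", n  # not enough evidence yet


if __name__ == "__main__":
    print(trw_classify([False, False, False, False, False]))
    print(trw_classify([True, True, True, True, True]))
    print(trw_classify([True, False, True, False]))
```

Each failed first-contact connection pushes the walk toward the scanner threshold and each success pushes it toward the benign threshold. Tightening `alpha` and `beta` widens the gap between the thresholds, which trades promptness for accuracy.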

All these schemes have been implemented and evaluated using real packet traces collected from multiple network vantage points.
