Semi-Automated Discovery of Application Session Structure

Jayanthkumar Kannan, Jaeyeon Jung, Vern Paxson, Can E. Koksal
Internet Measurement Conference (IMC), Rio de Janeiro, Brazil, October 2006

While the problem of analyzing network traffic at the granularity of individual connections has seen considerable previous work and tool development, understanding traffic at a higher level---the structure of user-initiated sessions comprised of groups of related connections---remains much less explored. Some types of session structure, such as the coupling between an FTP control connection and the data connections it spawns, have prespecified forms, though the specifications do not guarantee how the forms appear in practice. Other types of sessions, such as a user reading email with a browser, only manifest empirically. Still other sessions might exist without us even knowing of their presence, such as a botnet zombie receiving instructions from its master and proceeding in turn to carry them out. We present algorithms rooted in the statistics of Poisson processes that can mine a large corpus of network connection logs to extract the apparent structure of application sessions embedded in the connections. Our methods are semi-automated in that we aim to present an analyst with high-quality information (expressed as regular expressions) reflecting different possible abstractions of an application's session structure. We develop and test our methods using traces from a large Internet site, finding diversity in the number of applications that manifest, their different session structures, and the presence of abnormal behavior. Our work has applications to traffic characterization and monitoring, source models for synthesizing network traffic, and anomaly detection.

Bibtex Entry:

@inproceedings{kannan2006semi-automated,
   author =       {Jayanthkumar Kannan and Jaeyeon Jung and Vern Paxson and Can E. Koksal},
   title =        {{Semi-Automated Discovery of Application Session Structure}},
   booktitle =    {Internet Measurement Conference (IMC)},
   year =         {2006},
   month =        {October},
   address =      {Rio de Janeiro, Brazil}
}