On the feasibility of utilizing correlations between user populations for traffic inference

Abstract

Network models today are often derived from two different methods. On one hand, detailed traffic models are generated based on traces from a single tap into the network. Alternatively, one can collect higher-level traffic-matrix data with SNMP from many routers. However, inferring flow-level details from such data is still an open research issue. Today it is infeasible to collect a fine-grained, packet-level representation of a complete, multi-router network. Even if it were economically feasible to synchronize and monitor every router in a large network, the amount of data generated would tax storage and computation resources. In this work, we propose a methodology to infer flow-level traffic across a network by exploiting the correlations between user populations across different networks. The contribution of this paper is twofold. First, based on traces of web traffic collected from two different sources, we observe that the user-behavior parameters of the traffic (such as user “think” time in web traffic) are correlated across time, while the application-specific parameters of the traffic (such as object size) are correlated across “similar” networks. Second, by utilizing the correlations between similar networks, we propose a methodology for inferring traffic at places where continuously taking measurements is infeasible. We evaluate the effectiveness of our methodology via simulation.

Citation

Kun-chan Lan and John Heidemann, "On the feasibility of utilizing correlations between user populations for traffic inference," in IEEE LCN 2005, November, 2005.

BibTeX

@INPROCEEDINGS{lan2005:TrafficInference,
AUTHOR = {Kun-chan Lan and John Heidemann},
TITLE = {On the feasibility of utilizing correlations between user populations for traffic inference},
BOOKTITLE = {IEEE LCN 2005},
MONTH = {November},
YEAR = {2005}
}

Download

Full Text (PDF Format)