TY - GEN
T1 - Effective and real-time in-app activity analysis in encrypted internet traffic streams
AU - Liu, Junming
AU - Fu, Yanjie
AU - Ming, Jingci
AU - Ren, Yong
AU - Sun, Leilei
AU - Xiong, Hui
N1 - Funding Information:
This research was supported in part by the Natural Science Foundation of China (71329201) and Futurewei Technologies, Inc.
Publisher Copyright:
© 2017 Association for Computing Machinery.
PY - 2017/8/13
Y1 - 2017/8/13
N2 - The mobile in-App service analysis, aiming at classifying mobile internet traffic into different types of service usages, has become a challenging and emergent task for mobile service providers due to the increasing adoption of secure protocols for in-App services. While some efforts have been made for the classification of mobile internet traffic, existing methods rely on complex feature construction and large storage cache, which lead to low processing speed, and thus not practical for online real-time scenarios. To this end, we develop an iterative analyzer for classifying encrypted mobile traffic in a real-time way. Specifically, we first select an optimal set of most discriminative features from raw features extracted from traffic packet sequences by a novel Maximizing Inner activity similarity and Minimizing Different activity similarity (MIMD) measurement. To develop the online analyzer, we first represent a traffic flow with a series of time windows, which are described by the optimal feature vector and are updated iteratively at the packet level. Instead of extracting feature elements from a series of raw traffic packets, our feature elements are updated when a new traffic packet is observed and the storage of raw traffic packets is not required. The time windows generated from the same service usage activity are grouped by our proposed method, namely, recursive time continuity constrained KMeans clustering (rCKC). The feature vectors of cluster centers are then fed into a random forest classifier to identify corresponding service usages. Finally, we provide extensive experiments on real-world traffic data from Wechat, Whatsapp, and Facebook to demonstrate the effectiveness and efficiency of our approach. The results show that the proposed analyzer provides high accuracy in real-world scenarios, and has low storage cache requirement as well as fast processing speed.
AB - The mobile in-App service analysis, aiming at classifying mobile internet traffic into different types of service usages, has become a challenging and emergent task for mobile service providers due to the increasing adoption of secure protocols for in-App services. While some efforts have been made for the classification of mobile internet traffic, existing methods rely on complex feature construction and large storage cache, which lead to low processing speed, and thus not practical for online real-time scenarios. To this end, we develop an iterative analyzer for classifying encrypted mobile traffic in a real-time way. Specifically, we first select an optimal set of most discriminative features from raw features extracted from traffic packet sequences by a novel Maximizing Inner activity similarity and Minimizing Different activity similarity (MIMD) measurement. To develop the online analyzer, we first represent a traffic flow with a series of time windows, which are described by the optimal feature vector and are updated iteratively at the packet level. Instead of extracting feature elements from a series of raw traffic packets, our feature elements are updated when a new traffic packet is observed and the storage of raw traffic packets is not required. The time windows generated from the same service usage activity are grouped by our proposed method, namely, recursive time continuity constrained KMeans clustering (rCKC). The feature vectors of cluster centers are then fed into a random forest classifier to identify corresponding service usages. Finally, we provide extensive experiments on real-world traffic data from Wechat, Whatsapp, and Facebook to demonstrate the effectiveness and efficiency of our approach. The results show that the proposed analyzer provides high accuracy in real-world scenarios, and has low storage cache requirement as well as fast processing speed.
KW - In-app analytics
KW - Internet traffic analysis
KW - Service usage classification
KW - Time series segmentation
UR - http://www.scopus.com/inward/record.url?scp=85029111041&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85029111041&partnerID=8YFLogxK
U2 - 10.1145/3097983.3098049
DO - 10.1145/3097983.3098049
M3 - Conference contribution
AN - SCOPUS:85029111041
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 335
EP - 344
BT - KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
T2 - 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2017
Y2 - 13 August 2017 through 17 August 2017
ER -