Novel applications of Machine Learning to Network Traffic Analysis

predicting on/off connectivity (a discrete variable), which is a new starting point and a different problem in itself, and which provides novel results.
- A mixed method (ARIMAX) is shown to provide the best accuracy (93%), but it requires a very long training time. ARIMA and some non-time-series methods (logistic regression and random forest) also perform very well, with accuracy over 90%. Given the higher computational requirements of ARIMAX compared with logistic regression, random forest and ARIMA, the latter methods would be a better choice for a production environment, as they provide similar practical accuracy with less computing time.

QoE estimation
- It is demonstrated that a CNN can be applied to a time series of samples (formed by aggregated information from network packets) to predict QoE for video transmission.
- The proposed model can be integrated into a network management system to monitor network quality (as observed by the end user), which is an essential part of a self-adapting network (e.g. SDN, edge computing...). The model is applicable to a real-time environment (in time steps of 1 second) and is able to predict video QoE for current and near-future video transmissions.
- The best proposed model combines CNN and RNN networks, with the CNN being the most critical piece. This is somewhat surprising given the time-series nature of the data, which was formed by adding 3 samples of elementary flows (each consisting of aggregated information from network packets taken in a 1-second period).
- Excellent prediction performance with a small dataset for labels that are not extremely unbalanced.

Synthesize training data to improve classification
- First application, as far as we know, of a conditional VAE to generate fully synthetic network traffic data.
- Application of the synthetic data to an intrusion detection problem, showing that detection results improve substantially when an ML algorithm is trained with the new synthetic data. The improvement is greater with synthetic data generated by the proposed method than with synthetic data generated by alternative SOTA over-sampling methods (SMOTE, ADASYN, ...).
- Innovative methods are proposed to assess the similarity of the feature probability distributions of the real and synthetic data.
- Analysis of different variants of a VAE architecture for the proposed model, providing an extensive study of the alternatives.
- The new method can synthesize new samples knowing only the intrusion label to which the synthetic data should belong, with the advantage of not relying on specific samples associated with the labels. This association is usually noisy, and identifying a canonical set of samples associated with each label can be complex. The proposed model therefore streamlines the data generation process by basing it exclusively on the intrusion label.

Synthesize missing data
- First proposal, as far as we know, of a feature reconstruction model using a conditional VAE, and first application to intrusion detection.
- General framework available for other areas that may need a technique for feature reconstruction and imputation of missing values.
- Generative method that learns the probability distribution of the features conditioned on the label value, so the label value is included when obtaining that distribution.
- The model achieves excellent accuracy for the recovery of features: over 90% accuracy for labels with around 10 values and over 70% for labels with around 70 values. The accuracy is based on the correct prediction of all these values.
- Intrusion data is strongly unbalanced and noisy, with numerous continuous and categorical features, which makes it a challenge to synthesize intrusion data with a probabilistic structure similar to the original one. That is the reason why we generate synthetic samples conditioned on the distribution of labels. That is,

Doctoral Thesis: Novel applications of Machine Learning to NTAP - 57
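
The idea of generating synthetic samples from the intrusion label alone can be illustrated with a minimal sketch. It is not the thesis implementation. The decoder here is a hypothetical stand-in with untrained random weights, and the names `toy_decoder` and `generate`, the latent size, and the choice to draw labels from their empirical distribution are all assumptions made for illustration. In a real conditional VAE the decoder weights would be learned from the training data.

```python
import math
import random
from collections import Counter


def one_hot(index, size):
    """Encode a class index as a one-hot list."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def toy_decoder(z, label_vec, weights):
    """Hypothetical stand-in for a trained CVAE decoder.

    Maps the concatenation [z ; one-hot label] to a feature vector
    through a single tanh layer.
    """
    x = z + label_vec  # list concatenation, not element-wise addition
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def generate(labels, n_samples, latent_dim, n_features, seed=0):
    """Draw labels from their empirical distribution, then decode one
    synthetic feature vector per label from Gaussian latent noise."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    counts = Counter(labels)
    label_weights = [counts[c] for c in classes]
    in_dim = latent_dim + len(classes)
    # Assumption: random, untrained weights used only to make the sketch runnable.
    weights = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(n_features)]
    samples = []
    for _ in range(n_samples):
        label = rng.choices(classes, weights=label_weights, k=1)[0]
        z = [rng.gauss(0, 1) for _ in range(latent_dim)]  # z ~ N(0, I)
        features = toy_decoder(z, one_hot(classes.index(label), len(classes)), weights)
        samples.append((features, label))
    return samples


if __name__ == "__main__":
    observed = ["normal"] * 8 + ["dos"] * 2  # toy, unbalanced label set
    for features, label in generate(observed, n_samples=3, latent_dim=4, n_features=6):
        print(label, [round(v, 3) for v in features])
```

No real sample is needed at generation time: the only inputs are a label and random noise. To over-sample rare intrusion classes instead of following the observed label distribution, `label_weights` could be replaced with uniform or inverse-frequency weights.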
