Novel applications of Machine Learning to Network Traffic Analysis


classifier for intrusion detection. The work carries out an extensive comparison of the synthetic data produced by the new method with data produced by classic over-sampling techniques, showing that the proposed method performs better when its output is used as synthetic training data.

7.5.2 Datasets

For this work we have used the NSL-KDD [67] dataset, a classic intrusion detection dataset. It has 32 continuous and 3 categorical features, with an intrusion label of 5 values (Normal, DoS, Probe, R2L and U2R). The dataset is quite unbalanced. We have performed an additional data transformation: scaling all NSL-KDD continuous features to the range [0,1] and one-hot encoding all categorical features. This yields a final dataset with 116 features: 32 continuous and 84 with values in {0,1} associated with the three one-hot encoded categorical features. The three categorical features (protocol, flag and service) have 3, 11 and 70 distinct values, respectively. The accuracy obtained when synthesizing these discrete features (taking the original ones as reference) depends heavily on the cardinality of the feature. We provide all results using the full original training dataset of 125973 samples and the full original test dataset of 22544 samples.

7.5.3 Models

The proposed architecture consists of a VAE that tries to recover an output identical to its inputs (the inputs being the network traffic features used to detect the intrusion class), with one variation on the standard VAE: an additional input to the decoder. This additional input is the one-hot encoded class label. Adding this input is critical to improving the model in two directions: it makes the data generation process easier (since generation is now conditioned on the class label), and it produces better synthetic data, whose probability distribution conditioned on the class label is closer to that of the original data.
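As a rough illustration of the preprocessing in Section 7.5.2 and the decoder conditioning in Section 7.5.3, the following NumPy sketch builds a 116-dimensional feature vector and runs an untrained forward pass in which the one-hot class label is concatenated with the latent vector before decoding. It is not the thesis implementation: the toy random data, the layer sizes (latent_dim, hidden_dim), the single hidden layer, the ReLU and sigmoid activations, the label ordering, and all function names are assumptions made for illustration, and no loss or training loop is shown.

```python
import numpy as np

def min_max_scale(x):
    """Scale each column of x to [0, 1]; constant columns map to 0."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (x - lo) / span

def one_hot(codes, n_values):
    """One-hot encode integer codes in the range [0, n_values)."""
    return np.eye(n_values)[codes]

rng = np.random.default_rng(0)
n = 8  # toy batch size (assumption)

# Random stand-ins for the NSL-KDD columns; only the cardinalities
# (32 continuous, 3/11/70 categorical values, 5 classes) come from the text.
continuous = rng.normal(size=(n, 32))
protocol = rng.integers(0, 3, size=n)
flag = rng.integers(0, 11, size=n)
service = rng.integers(0, 70, size=n)
labels = rng.integers(0, 5, size=n)

# 32 scaled continuous columns + 3 + 11 + 70 one-hot columns = 116 features.
features = np.hstack([
    min_max_scale(continuous),
    one_hot(protocol, 3),
    one_hot(flag, 11),
    one_hot(service, 70),
])

latent_dim, hidden_dim = 10, 64  # hypothetical sizes, not from the thesis

def dense(n_in, n_out):
    """Randomly initialised (untrained) weights and zero biases."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

W_enc, b_enc = dense(116, hidden_dim)
W_mu, b_mu = dense(hidden_dim, latent_dim)
W_lv, b_lv = dense(hidden_dim, latent_dim)
W_dec, b_dec = dense(latent_dim + 5, hidden_dim)  # +5 for the class label
W_out, b_out = dense(hidden_dim, 116)

def encode(x):
    """Encoder sees only the features; returns mean and log-variance."""
    h = np.maximum(x @ W_enc + b_enc, 0.0)
    return h @ W_mu + b_mu, h @ W_lv + b_lv

def decode(z, y_onehot):
    """Decoder receives the latent vector plus the one-hot class label."""
    zy = np.concatenate([z, y_onehot], axis=1)
    h = np.maximum(zy @ W_dec + b_dec, 0.0)
    return 1.0 / (1.0 + np.exp(-(h @ W_out + b_out)))  # outputs in (0, 1)

# Reconstruction path: encode, sample z (reparameterisation), decode with label.
mu, log_var = encode(features)
z = mu + np.exp(0.5 * log_var) * rng.normal(size=mu.shape)
reconstruction = decode(z, one_hot(labels, 5))

# Generation path: sample z from the prior and fix the desired class.
# Index 1 is taken to mean DoS here, which is an assumed label ordering.
z_prior = rng.normal(size=(4, latent_dim))
synthetic = decode(z_prior, one_hot(np.full(4, 1), 5))
```

At generation time only the decode step is needed: drawing z from a standard normal and fixing the label produces samples for a chosen class, which is what makes a conditional model convenient for synthesizing data for under-represented classes.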
To arrive at the proposed model, we have analyzed different VAE architecture variants, providing an extensive study of the alternatives. Besides proposing a new architecture based on a conditional VAE, we have used several machine learning techniques to demonstrate that the generated synthetic data can be used to improve the intrusion detection results of several classifiers (Random Forest, Logistic Regression, SVM and MLP). The work also shows that the synthetic data has a similar probability distribution for the features depending on their intrusion classes. We have developed several approaches to verify the similarity: (1) extended histograms of the original and synthesized features; and (2)

Doctoral Thesis: Novel applications of Machine Learning to NTAP - 77
