Novel applications of Machine Learning to Network Traffic Analysis


Another way to create synthetic data is offered by generative models that learn the latent joint probability distribution of the data. Sampling from the learned distribution then produces synthetic data whose joint probability distribution is similar to that of the original data. This is the approach followed in this thesis [5], using a variational autoencoder.

The authors of [121] present work of a similar nature, in which a generative model is constructed to capture the joint probability distribution of the data. In this case, the data to be synthesized is relational data (contained in a database). The joint probability distribution for the complete dataset is obtained through a complex process that identifies the probability distribution of each column in the database and then estimates the covariance between columns using a Gaussian copula. The covariance estimate is extended to related tables. To synthesize new data, they sample from the resulting (and complex) joint probability distribution. Similar approaches, with different generative models, have also been applied to generate images [10][35][38] and text [36][43].

When generating synthetic data there are two scenarios: (a) creating synthetic samples with all their features [5], or (b) completing partially filled samples in which the values of some features are known but others are missing. In the second case, synthesis is restricted to the missing features, with the important constraint that they must be synthesized conditioned on the values of the known ones [3].

There are several works related to the creation of semi-synthetic data for intrusion detection. In [122] the authors propose a modular synthetic dataset generation framework for web applications, together with a monitoring environment to collect data at multiple protocol layers (e.g. TCP, database queries, system calls).
They can create different types of attacks or reuse existing ones by adopting the Metasploit Framework within their own simulation environment, which they call Wind Tunnel. The approach corresponds to a semi-synthetic model.

The work in [123] proposes a simulated environment to create intrusion data for a vehicular ad hoc network (VANET). They present an experiment using a network simulator with 10 simulated mobility scenarios for VANET hosts and 5 types of emulated security threats, with the ability to define the total number of vehicles and the number of malicious hosts in the VANET.

In [124] a generator architecture (semi-synthetic approach) is proposed for datasets of system calls used for host intrusion detection systems (HIDS). The generator architecture is generic, but it is demonstrated using Ubuntu Linux, with Mozilla Firefox as the profiled application.

The authors of [125] implement a software-simulated environment to create high-level human threats produced by malicious employees or agents inside an organization. They build a complex high-level simulated environment that includes aspects such as human behaviour, relationships and communication models within the organization, and they create synthetic datasets corresponding to complex threat scenarios associated with personal dynamics within the organization.

Synthetic data generation is an interesting research area that will surely be explored further with the arrival of new algorithms (e.g. variational autoencoders and generative adversarial networks).

The following table presents a summary of the main works related to the research carried out for this thesis. It provides a reference to the document, the data set used and the scope of the work. In this case, only similar works (for synthetic data) are presented, in a general sense and coming from different fields.

Doctoral Thesis: Novel applications of Machine Learning to NTAP - 38
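As an illustration of the Gaussian copula idea summarized above for [121] (a marginal distribution per column, plus a covariance estimate between columns), the following minimal sketch may help. It is not the implementation of [121] and does not handle related tables. It assumes purely numeric columns, uses empirical marginals instead of fitted parametric ones, and relies only on the Python standard library. All function names are hypothetical and chosen for this example.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def normal_scores(column):
    """Map a column to standard-normal scores through its empirical CDF."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the CDF value strictly inside (0, 1).
        scores[i] = _STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def correlation_matrix(columns):
    """Pearson correlation between columns (assumes no constant column)."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    d = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j]))
            / (norms[i] * norms[j])
            for j in range(d)
        ]
        for i in range(d)
    ]


def cholesky(matrix):
    """Lower-triangular factor L such that L * L^T approximates the matrix."""
    d = len(matrix)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                # The floor guards against tiny negative values caused by
                # rounding when columns are almost perfectly correlated.
                low[i][j] = math.sqrt(max(matrix[i][i] - s, 1e-12))
            else:
                low[i][j] = (matrix[i][j] - s) / low[j][j]
    return low


def sample_gaussian_copula(columns, n_samples, rng):
    """Draw synthetic rows whose columns keep the observed dependence."""
    n = len(columns[0])
    d = len(columns)
    sorted_columns = [sorted(col) for col in columns]
    low = cholesky(correlation_matrix([normal_scores(c) for c in columns]))
    rows = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Correlate the independent normals with the Cholesky factor.
        correlated = [
            sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(d)
        ]
        row = []
        for j, value in enumerate(correlated):
            u = _STD_NORMAL.cdf(value)  # back to a uniform in [0, 1]
            index = min(int(u * n), n - 1)  # empirical quantile lookup
            row.append(sorted_columns[j][index])
        rows.append(row)
    return rows
```

Calling `sample_gaussian_copula([col_a, col_b], 100, random.Random(0))` with two equally long numeric lists returns 100 synthetic rows. Because the marginals are empirical, every synthetic value is one that was observed in the corresponding original column; only the combinations across columns are new. A fitted parametric marginal per column, as described for [121], would remove that restriction.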
