Novel applications of Machine Learning to Network Traffic Analysis

show a maximum in performance for a 24-hour period and a minimum around a 12-hour period. This behavior is similar for all methods. An explanation for this behavior is given in section 5, which connects the mean global activity of the SIMs with the mean accuracy of the forecasts. Training (i.e. the time the algorithms take to tune their parameters) is very fast for all methods except GBM, which is much slower (see section 3.3).

Figure 3 shows the performance results for non-time-series methods; the upper diagram presents the mean accuracy over a prediction period of 48 hours. The mean accuracy is calculated using the process presented in section 2.3 (Figure 1). The lower diagram presents the standard deviation of the accuracy values used to build the upper diagram.

Figure 3. Performance results for non-time-series methods.

As already mentioned, the non-time-series methods explored were logistic regression, Bayesian logistic regression, Random Forest and GBM. All of them are well-known methods that perform well in several application areas. For each of these methods, a separate model was trained for each SIM, using the day of the week (7 possible values) and the hour of the day (24 values) as predictor variables, and the on/off activity in one-hour periods as the predicted variable. We considered these features the best choice given the time-series nature of the data. We tried adding predictors based on other elapsed-time periods (e.g. 2- or 4-hour periods), but they did not improve the results significantly. Other available features were not used, since they would have increased the computational needs substantially. Intuitively, another interesting feature to explore is the customer, since devices from the same customer may have similar traffic patterns (e.g. a smart meter from a utility, or a connected vehicle from a car manufacturer); this feature could be explored in future work.
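The per-SIM setup described above (day of week and hour of day as categorical predictors, hourly on/off activity as the target) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the history is synthetic (a hypothetical SIM active on weekdays during office hours), the model is a plain logistic regression fitted by batch gradient descent with a small L2 penalty, and the names `train_sim_model` and `predict_active` are made up for this example.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_sim_model(history, epochs=5000, lr=1.0, l2=1e-4):
    """Fit logit(P(active)) = b + w_day[day] + w_hour[hour] by batch gradient descent.

    history: list of (day_of_week 0-6, hour_of_day 0-23, active 0/1) tuples.
    Indexing w_day and w_hour directly is equivalent to one-hot encoding
    the two categorical predictors (7 + 24 indicator columns).
    """
    w_day, w_hour, b = [0.0] * 7, [0.0] * 24, 0.0
    n = len(history)
    for _ in range(epochs):
        g_day, g_hour, g_b = [0.0] * 7, [0.0] * 24, 0.0
        for day, hour, active in history:
            err = sigmoid(b + w_day[day] + w_hour[hour]) - active
            g_day[day] += err
            g_hour[hour] += err
            g_b += err
        # Gradient step; the L2 penalty is applied to the weights, not the bias.
        b -= lr * g_b / n
        w_day = [w - lr * (g / n + l2 * w) for w, g in zip(w_day, g_day)]
        w_hour = [w - lr * (g / n + l2 * w) for w, g in zip(w_hour, g_hour)]
    return w_day, w_hour, b

def predict_active(model, day, hour):
    """Predicted probability that the SIM is active in the given one-hour slot."""
    w_day, w_hour, b = model
    return sigmoid(b + w_day[day] + w_hour[hour])

# Synthetic one-week history for a hypothetical SIM (assumption):
# active Monday to Friday (days 0-4), from 08:00 to 17:59.
history = [(d, h, 1 if d < 5 and 8 <= h < 18 else 0)
           for d in range(7) for h in range(24)]

model = train_sim_model(history)
p_monday_10 = predict_active(model, 0, 10)
p_sunday_03 = predict_active(model, 6, 3)
print(f"P(active | Mon 10:00) = {p_monday_10:.3f}")
print(f"P(active | Sun 03:00) = {p_sunday_03:.3f}")
```

With only two categorical predictors the model is additive in the day and hour effects, which keeps training cheap enough to repeat for every SIM.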
The results for logistic regression were quite satisfactory; nevertheless, we ran into a complete separation (also called perfect separation) problem during training. This problem occurs when one or several independent variables can fully predict the outcome, and it usually implies over-fitting (results are good on the training set but not as good on real data).

Doctoral Thesis: Novel applications of Machine Learning to NTAP
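The complete separation problem mentioned above can be reproduced on a toy dataset. The sketch below is general background rather than the thesis code: the four data points, the helper `fit_weight`, and the L2 penalty used as a remedy are all assumptions made for illustration. When a single predictor separates the two classes perfectly, the unpenalized logistic regression weight keeps growing with every additional training step, whereas a penalized fit settles at a finite value.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_weight(xs, ys, steps, l2=0.0, lr=0.5):
    """Gradient descent for logit(P(y=1)) = b + w * x; returns the slope w."""
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(steps):
        errs = [sigmoid(b + w * x) - y for x, y in zip(xs, ys)]
        g_w = sum(e * x for e, x in zip(errs, xs)) / n
        g_b = sum(errs) / n
        w -= lr * (g_w + l2 * w)  # l2 = 0.0 means no penalty
        b -= lr * g_b
    return w

# Toy data (assumption): x < 0 always gives y = 0 and x > 0 always gives y = 1,
# so the single predictor separates the classes perfectly.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w_short = fit_weight(xs, ys, steps=200)
w_long = fit_weight(xs, ys, steps=2000)
w_pen_short = fit_weight(xs, ys, steps=200, l2=0.1)
w_pen_long = fit_weight(xs, ys, steps=2000, l2=0.1)

print(f"no penalty:   w after 200 steps = {w_short:.2f}, after 2000 steps = {w_long:.2f}")
print(f"L2 penalty:   w after 200 steps = {w_pen_short:.2f}, after 2000 steps = {w_pen_long:.2f}")
```

The unbounded weight is why separated fits tend to produce extreme, over-confident probabilities; a penalty or a prior on the weights is one common way to keep the estimate finite.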
