This work was carried out on a desktop PC with an Intel i7 processor and 16 GB of RAM, using R and RStudio. All the R packages used are open source and freely available.

2.2. Algorithms

Regarding the prediction methods, the time-series nature of the data made time-series forecasting methods a natural choice. Additionally, it seemed worthwhile to explore the adequacy of non-time-series methods that use other variables for prediction, e.g. time of day (hour of day), day of the week, customer identity, access point, etc.

In this study we have evaluated the following time-series methods: Hidden Markov Model (HMM), Exponential Smoothing, ARIMA and ARIMAX; and the following non-time-series methods: Logistic Regression, Random Forest, Gradient Boosting Method (GBM) and Bayesian Logistic Regression. Explaining the details of the algorithms is out of the scope of this paper; good references cover them and are indicated throughout the paper, e.g. for Random Forest [1] and GBM [2], both of which are based on decision trees.

2.3. Evaluation

To calculate the prediction performance of the different methods, the last 8 days of data were used for testing, making forecasts for 4 consecutive periods of 48 hours, each based on the available data of the previous days. To make the most of the available data, we have used a variant of cross-validation (CV) adapted to time-series data. We have divided the training and testing data into 4 blocks, and for each consecutive block we have added the test data of the previous block to the training data. We have trained all the methods individually for each Subscriber Identification Module (SIM) and for each of the 4 cross-validation blocks. Finally, we have averaged the results (across SIMs and cross-validation blocks) to obtain a final performance result for the 48-hour forecasting period.
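The block scheme described in the evaluation can be read as an expanding-window split: the last 8 days are cut into 4 test windows of 48 hours, and each training set grows to absorb the test windows that precede it. The following is a minimal sketch of that reading, not the thesis code (which was written in R). It assumes hourly observations, and the function name `expanding_window_splits`, its parameters and the 30-day series length are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def expanding_window_splits(
    n_hours: int, n_blocks: int = 4, horizon: int = 48
) -> List[Tuple[range, range]]:
    """Return (train_idx, test_idx) index ranges for an expanding-window evaluation.

    The last n_blocks * horizon observations are held out. Each block is
    trained on everything before its own test window, so later blocks also
    train on the test windows of earlier blocks.
    """
    holdout = n_blocks * horizon
    if n_hours <= holdout:
        raise ValueError("series too short for the requested hold-out")
    first_test = n_hours - holdout
    splits = []
    for block in range(n_blocks):
        start = first_test + block * horizon
        splits.append((range(0, start), range(start, start + horizon)))
    return splits


# Assumed example: 30 days of hourly data for one SIM (hypothetical length).
splits = expanding_window_splits(n_hours=30 * 24)
for train_idx, test_idx in splits:
    print(f"train on {len(train_idx)} hours, test on hours {test_idx.start}..{test_idx.stop - 1}")
```

Under this reading, every method would be fitted once per SIM and per block on `train_idx` and scored on `test_idx`. Because training data always precedes the test window, no future observation leaks into the fit.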
Figure 1 presents the evaluation process graphically: to obtain the performance results for the different methods, we have used a form of cross-validation applied to the time-series data coming from the different devices, and we have averaged the cross-validated test results over all devices.
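The averaging step can be illustrated with a short sketch. It assumes that each device (SIM) yields one performance score per cross-validation block. The function name `average_performance`, the device labels and the score values are hypothetical and used only for illustration; they are not results from the thesis, and the performance metric itself is left unspecified here.

```python
from statistics import mean
from typing import Dict, List


def average_performance(scores: Dict[str, List[float]]) -> float:
    """Average the per-block scores within each SIM, then average across SIMs."""
    if not scores:
        raise ValueError("no scores to average")
    per_sim = [mean(block_scores) for block_scores in scores.values()]
    return mean(per_sim)


# Hypothetical scores: one value per cross-validation block for each SIM.
example_scores = {
    "sim_a": [0.80, 0.82, 0.78, 0.84],
    "sim_b": [0.70, 0.72, 0.74, 0.68],
}
overall = average_performance(example_scores)
print(f"overall score: {overall:.3f}")
```

When every SIM contributes the same number of blocks, as in the 4-block scheme described above, this mean of per-SIM means equals the plain mean over all SIM and block pairs, so the order of averaging does not change the result.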
Doctoral Thesis: Novel applications of Machine Learning to Network Traffic Analysis