Incremental HMM with an improved Baum-Welch Algorithm
There is an increasing demand for systems which handle higher density, additional loads as seen in storage workload modelling, where workloads can be characterized on-line. This paper aims to find a workload model which processes incoming data and then updates its parameters "on-the-fly." Essentially, this will be an incremental hidden Markov model (IncHMM) with an improved Baum-Welch algorithm. Thus, the benefit will be obtaining a parsimonious model which updates its encoded information whenever more real time workload data becomes available. To achieve this model, two new approximations of the Baum-Welch algorithm are defined, followed by training our model using discrete time series. This time series is transformed from a large network trace made up of I/O commands, into a partitioned binned trace, and then filtered through a K-means clustering algorithm to obtain an observation trace. The IncHMM, together with the observation trace, produces the required parameters to form a discrete Markov arrival process (MAP). Finally, we generate our own data trace (using the IncHMM parameters and a random distribution) and statistically compare it to the raw I/O trace, thus validating our model.
hidden Markov model
Baum-Welch algorithm
Backward algorithm
discrete Markov arrival process
incremental workload model
29-34
Regular Paper
Tiberiu S.
Chis
Tiberiu S. Chis
Peter G.
Harrison
Peter G. Harrison
10.4230/OASIcs.ICCSW.2012.29
Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported license
https://creativecommons.org/licenses/by-nc-nd/3.0/legalcode