
37 jmlr-2008-Forecasting Web Page Views: Methods and Observations


Source: pdf

Author: Jia Li, Andrew W. Moore

Abstract: Web sites must forecast Web page views in order to plan computer resource allocation and estimate upcoming revenue and advertising growth. In this paper, we focus on extracting trends and seasonal patterns from page view series, the two dominant factors in the variation of such series. We investigate the Holt-Winters procedure and a state space model for making relatively short-term predictions. We find that Web page views occasionally exhibit strong impulsive changes. These impulses cause large prediction errors long after they occur. We develop a method to identify impulses and to mitigate their damage to prediction. We also develop a long-range trend and season extraction method, the Elastic Smooth Season Fitting (ESSF) algorithm, to compute scalable and smooth yearly seasons. ESSF derives the yearly season by minimizing the residual sum of squares under smoothness regularization, which is a quadratic optimization problem. We show that for long-term prediction, ESSF improves accuracy significantly over other methods that ignore the yearly seasonality.

Keywords: web page views, forecast, Holt-Winters, Kalman filtering, elastic smooth season fitting
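As a rough illustration of the Holt-Winters procedure named in the abstract, the following is a minimal sketch of the textbook additive form (level, trend, and seasonal components updated by exponential smoothing). It is not taken from the paper: the function name, the parameter names (alpha, beta, gamma), and the simple two-period initialization are all assumptions made for this example, and the paper's own variant may differ.

```python
from typing import List, Sequence


def holt_winters_additive(
    y: Sequence[float],
    period: int,
    alpha: float,
    beta: float,
    gamma: float,
    horizon: int,
) -> List[float]:
    """Textbook additive Holt-Winters forecast (illustrative sketch only).

    y       : observed series, e.g. daily page views
    period  : season length, e.g. 7 for a weekly season (an assumption here)
    alpha   : smoothing weight for the level
    beta    : smoothing weight for the trend
    gamma   : smoothing weight for the seasonal terms
    horizon : number of steps ahead to forecast
    """
    if len(y) < 2 * period:
        raise ValueError("need at least two full seasons to initialize")

    # Simple initialization from the first two seasons (an assumption;
    # other initializations are common).
    first_mean = sum(y[:period]) / period
    second_mean = sum(y[period:2 * period]) / period
    level = first_mean
    trend = (second_mean - first_mean) / period
    season = [y[i] - first_mean for i in range(period)]

    # Recursive updates over the remaining observations.
    for t in range(period, len(y)):
        old_season = season[t % period]
        prev_level = level
        level = alpha * (y[t] - old_season) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % period] = gamma * (y[t] - level) + (1 - gamma) * old_season

    # h-step-ahead forecast: extrapolate the level and trend, then add the
    # seasonal term for the matching position in the cycle.
    n = len(y)
    return [
        level + h * trend + season[(n - 1 + h) % period]
        for h in range(1, horizon + 1)
    ]
```

For a series that repeats exactly with no trend, this sketch reproduces the seasonal pattern in its forecasts. A single large impulse in y would be absorbed into the level and seasonal terms and would keep distorting forecasts for many later steps, which is the kind of long-lasting error the abstract describes.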


reference text

B. D. O. Anderson and J. B. Moore. Optimal Filtering, Prentice-Hall, Englewood Cliffs, New Jersey, 1979.

A. Aussem and F. Murtagh. Web traffic demand forecasting using wavelet-based multiscale decomposition. International Journal of Intelligent Systems, 16(2):215-236, 2001.

S. Basu, A. Mukherjee, and S. Klivansky. Time series models for Internet traffic. INFOCOM ’96. Fifteenth Annual Joint Conference of the IEEE Computer Societies, Networking the Next Generation, 611-620, 1996.

G. E. P. Box and G. M. Jenkins. Time-Series Analysis, Forecasting and Control, San Francisco: Holden-Day, 1970.

P. J. Brockwell and R. A. Davis. Time Series: Theory and Methods, 2nd Edition, Springer-Verlag, New York, 1991.

P. J. Brockwell and R. A. Davis. Introduction to Time Series and Forecasting, 2nd Edition, Springer Science+Business Media, Inc., New York, 2002.

C. Chatfield. The Analysis of Time Series, Chapman & Hall/CRC, New York, 2004.

J. Durbin and S. J. Koopman. Time Series Analysis by State Space Methods, Oxford University Press Inc., New York, 2001.

M. Grossglauser and J.-C. Bolot. On the relevance of long-range dependence in network traffic. IEEE/ACM Transactions Networking, 7(5):629-640, 1999.

R. E. Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME - Journal of Basic Engineering, Series D, 82:35-45, 1960.

A. Khotanzad and N. Sadek. Multi-scale high-speed network traffic prediction using combination of neural networks. Proc. Int. Joint Conf. Neural Networks, 2:1071-1075, July 2003.

A. M. Odlyzko. Internet traffic growth: sources and implications. Optical Transmission Systems and Equipment for WDM Networking II, B. B. Dingel, W. Weiershausen, A. K. Dutta, and K.-I. Sato, eds., Proc. SPIE, 5247:1-15, 2003.

K. Papagiannaki, N. Taft, Z.-L. Zhang, and C. Diot. Long-term forecasting of Internet backbone traffic. IEEE Trans. Neural Networks, 16(5):1110-1124, 2005.

K. Park and W. Willinger. Self-Similar Network Traffic and Performance Evaluation, John Wiley & Sons, Inc., 2000.

A. P. Sage and J. L. Melsa. Estimation Theory with Applications to Communication and Control, McGraw Hill, New York, 1971.

A. Sang and S. Li. A predictability analysis of network traffic. Computer Networks, 39(4):329-345, 2002.

W. W. S. Wei. Time Series Analysis, Univariate and Multivariate Methods, 2nd Edition, Pearson Education, Inc., 2006.

C. You and K. Chandra. Time series models for Internet data traffic. Proc. 24th Annual IEEE Int. Conf. Local Computer Networks (LCN’99), 164, 1999.