Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/5078
Title: Time series pattern matching, discovery & segmentation for numeric-to-symbolic conversion
Authors: Fu, Tak-chung
Subjects: Data mining
Time-series analysis
Hong Kong Polytechnic University -- Dissertations
Issue Date: 2001
Publisher: The Hong Kong Polytechnic University
Abstract: Recently, the increasing use of temporal data has prompted various research and development efforts in the field of data mining. Time series are an important class of temporal data objects and can be easily obtained from financial and scientific applications, e.g. daily temperatures and the prices of mutual funds and stocks. They are in fact major sources of temporal databases, and finding useful time series patterns is undoubtedly of primary importance. While most research communities have concentrated on forecasting, the discovery of hidden behavior and relationships within a time series or among a set of time series has not yet been fully addressed. Unlike transactional databases with discrete/symbolic items, time series data are characterized by their numerical, continuous nature and are therefore difficult to manipulate. When they are treated as segments instead of data points, however, interesting patterns can be discovered and it becomes easy to query, understand and mine them. It is therefore suggested to break the sequences down into meaningful subsequences and represent them symbolically. We term this process numeric-to-symbolic (N/S) conversion and consider it one of the most important components of time series data mining systems. In this thesis, various algorithms for N/S conversion are proposed. First, a flexible temporal pattern matching scheme is proposed that attempts to locate the perceptually important points in the data sequence for similarity computation. When humans identify patterns in time series, the frequently used patterns are typically characterized by a few critical points; these points are perceptually important in the human identification process and should also be taken into account in the pattern matching process. The proposed scheme follows this idea by locating those perceptually important points, and attractive results have been obtained.
Based on this scheme, methods for discovering frequently appearing patterns in time series are developed. Raw numerical data sequences of a certain length undergo a clustering process using Kohonen's self-organizing maps, through which similar data sequences or patterns are grouped together and represented by a pattern symbol. With the new time series pattern matching scheme and the pattern discovery algorithm in place, we propose to address the time series segmentation problem in a more flexible way so as to facilitate dynamic N/S conversion, i.e., to segment the time series irregularly. This is achieved by an evolutionary segmentation algorithm that works with the pattern matching scheme to make the cutting decisions. Simulation results on the time series of the Hang Seng Index and of different Hong Kong stocks show that the proposed models are both effective and efficient.
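Illustration (not part of the thesis record): the abstract does not say how the perceptually important points are located. The sketch below shows one plausible heuristic, stated here purely as an assumption: start from the two endpoints of the sequence, then repeatedly add the point that lies farthest, in vertical distance, from the straight line joining its two neighbouring selected points. The function name `perceptually_important_points` and its parameters are hypothetical.

```python
def perceptually_important_points(series, n_points):
    """Return the sorted indices of up to n_points salient points in series.

    Assumed heuristic (not taken from the thesis record): begin with the
    first and last points, then repeatedly add the point with the largest
    vertical distance to the chord joining its two selected neighbours.
    """
    if n_points < 2 or len(series) < 2:
        raise ValueError("need at least two points")
    n_points = min(n_points, len(series))
    selected = [0, len(series) - 1]
    while len(selected) < n_points:
        best_index, best_distance = None, -1.0
        # Scan every gap between consecutive selected points.
        for left, right in zip(selected, selected[1:]):
            for i in range(left + 1, right):
                # Height of the chord between the two neighbours at position i.
                t = (i - left) / (right - left)
                chord = series[left] + t * (series[right] - series[left])
                distance = abs(series[i] - chord)
                if distance > best_distance:
                    best_index, best_distance = i, distance
        if best_index is None:
            break  # no unselected points remain
        selected.append(best_index)
        selected.sort()
    return selected


# Toy sequence with a single peak at index 2.
print(perceptually_important_points([1, 2, 5, 2, 1, 0, 1], 3))
```

Reducing a sequence to a handful of such points gives a compact shape description that can then be compared against a pattern template; vertical distance is only one possible choice of distance measure.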
Description: x, 89 leaves : ill. ; 30 cm.
PolyU Library Call No.: [THS] LG51 .H577M COMP 2001 Fu
Rights: All rights reserved.
Type: Thesis
URI: http://hdl.handle.net/10397/5078
Appears in Collections: COMP Theses
PolyU Electronic Theses

Files in This Item:
File                Description                    Size      Format
b1599529x_link.htm  For PolyU Users                162 B     HTML
b1599529x_ir.pdf    For All Users (Non-printable)  14.72 MB  Adobe PDF


All items in the PolyU Institutional Repository are protected by copyright, with all rights reserved, unless otherwise indicated. No item in the PolyU IR may be reproduced for commercial or resale purposes.