|Title: ||Manifold learning for visual data analysis|
|Authors: ||Liu, Yang|
|Subjects: ||Computer vision|
Hong Kong Polytechnic University -- Dissertations
|Issue Date: ||2011|
|Publisher: ||The Hong Kong Polytechnic University|
|Abstract: ||This thesis proposes a manifold learning framework for visual data analysis that addresses four aspects: 1) extracting the manifold structure of visual data sets; 2) preserving the natural tensor structure of visual data points; 3) measuring the distance between high-order visual data points during data embedding; and 4) designing an effective learning model for visual data analysis tasks. Two novel algorithms, multi-layer isometric feature mapping (ML-Isomap) and hybrid distance isometric embedding (HDIE), are proposed to extract the low-dimensional manifold embedded in the high-dimensional observation space. Unlike traditional manifold learning techniques, which handle only single-manifold embedding, ML-Isomap embeds multiple manifolds in one low-dimensional space while preserving the relationships between data points, whether they lie on the same manifold or on different manifolds. HDIE further improves on ML-Isomap by constructing a more faithful data linkage graph. ML-Isomap has been applied successfully to the rushes editing task, and HDIE has achieved good performance in the dynamic texture analysis task. Two further novel algorithms, multilinear isometric embedding (MIE) and bidirectional visible neighborhood preserving embedding (BVNPE), are proposed to preserve the natural tensor structure of visual data points in manifold embedding. MIE integrates multilinear algebra with a global manifold learning strategy to learn the low-dimensional tensor subspace. Besides keeping the high-order structure of the visual data, BVNPE provides a more reliable neighborhood graph for manifold embedding based on two criteria: bidirectional linkage and visible neighborhood preserving.|
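For orientation, the following is a minimal sketch of the classical single-manifold Isomap pipeline that ML-Isomap and HDIE are described as extending: build a nearest-neighbor graph, approximate geodesic distances by shortest paths, then apply classical multidimensional scaling. It is not the thesis's ML-Isomap or HDIE. The function name `isomap_sketch` and the parameters `n_neighbors` and `n_components` are hypothetical, and the dense Floyd-Warshall step is chosen for brevity, so the sketch only suits small data sets.

```python
import numpy as np

def isomap_sketch(X, n_neighbors=5, n_components=2):
    """Classical Isomap on the rows of X (illustrative, O(n^3))."""
    n = X.shape[0]
    # Pairwise Euclidean distances in the observation space.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # k-nearest-neighbor graph; np.inf marks "no edge".
    G = np.full((n, n), np.inf)
    idx = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]  # skip self at column 0
    rows = np.repeat(np.arange(n), n_neighbors)
    G[rows, idx.ravel()] = D[rows, idx.ravel()]
    G = np.minimum(G, G.T)        # make the graph undirected
    np.fill_diagonal(G, 0.0)

    # Floyd-Warshall: shortest paths approximate geodesic distances.
    for k in range(n):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    if not np.isfinite(G).all():
        raise ValueError("graph is disconnected; increase n_neighbors")

    # Classical MDS on the geodesic distance matrix.
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (G ** 2) @ J
    w, V = np.linalg.eigh(B)                     # eigenvalues ascending
    order = np.argsort(w)[::-1][:n_components]
    w_top = np.clip(w[order], 0.0, None)         # guard tiny negatives
    return V[:, order] * np.sqrt(w_top), G

# Usage: points on a half circle, a 1-D manifold embedded in 2-D.
t = np.linspace(0.0, np.pi, 30)
X = np.column_stack([np.cos(t), np.sin(t)])
Y, geodesic = isomap_sketch(X, n_neighbors=2, n_components=1)
```

In this example the two end points are 2 apart in Euclidean terms, while the graph distance between them follows the arc and comes out close to π, which is the quantity the embedding tries to preserve.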
Two distance metrics, tensor distance (TD) and L₁-norm distance, are proposed to measure the relations between data points in the embedding. Unlike the traditional Euclidean distance, which is constrained by an orthogonality assumption, TD measures the distance between data points by considering the relationships among different coordinates of high-order data. Based on the proposed TD metric, three new algorithms are proposed: TD-based multilinear multidimensional scaling (TD-MMDS), TD-based multilinear isometric embedding (TD-MIE), and TD-based multilinear locality-preserved maximum information embedding (TD-MLPMIE). TD-MMDS finds the transformation matrices by preserving, in the embedded space, the TDs between all pairs of input data. TD-MIE aims to preserve all pairwise distances, calculated from TDs along shortest paths in the neighborhood graph. TD-MLPMIE aims to keep both local and global structures in a manifold model using the TD metric. By integrating tensor distance into tensor embedding, these algorithms achieve consistent performance improvements on various standard data sets. Another novel algorithm, maximum distance embedding (MDE), and its extension, multilinear MDE (M²DE), are proposed based on the L₁-norm distance, which is more robust to outliers than the widely used Euclidean distance in measuring dissimilarity between data points. MDE and M²DE aim to maximize the distances between particular pairs of data points, with the intention of flattening local nonlinearity while keeping the discriminant information in the embedded feature space. Finally, a novel classifier named tensor-based locally maximum margin classifier (TLMMC) is proposed to map the high-dimensional observation space of the raw data to the semantic labels.
By combining the advantages of preserving the tensor structure of high-order data, constructing nonlinear hyperplanes with a local method, and finding the optimal solution with the maximum margin technique, TLMMC not only demonstrates good performance in real-world image and video applications, but also suggests a new approach to learning model design. Unlike recent works that emphasize a more accurate classifier, TLMMC provides a more meaningful model with an understandable physical explanation and clear theoretical support, so that the performance improvement of the proposed algorithm is convincing and extension to other applications is possible. Manifold learning has demonstrated good performance in visual data analysis. Further work will explore two aspects: how to address the multimodal feature space of multimedia data and how to speed up manifold learning algorithms for real-world applications.
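The distance measures named in the abstract can be contrasted with a small sketch. The abstract does not state the TD formula, so `coordinate_aware_distance` below is only an assumed stand-in: a quadratic-form distance whose metric weights pairs of tensor elements by a Gaussian of their distance on the index grid (the `sigma` parameter and all function names are hypothetical). It is meant to illustrate the idea of accounting for relationships among coordinates, not to reproduce the thesis's definition.

```python
import numpy as np

def euclidean_distance(A, B):
    """L2 distance; treats every tensor element as an independent axis."""
    return float(np.sqrt(((A - B) ** 2).sum()))

def l1_distance(A, B):
    """L1-norm distance; large single-element deviations are not squared."""
    return float(np.abs(A - B).sum())

def coordinate_aware_distance(A, B, sigma=1.0):
    """Assumed illustration: sqrt(d^T G d), with G a Gaussian of grid offsets."""
    # Grid position of every element, in the same order as ravel().
    pos = np.stack(np.indices(A.shape), axis=-1).reshape(-1, A.ndim)
    pos = pos.astype(float)
    sq = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)
    G = np.exp(-sq / (2.0 * sigma ** 2))   # nearby elements are correlated
    d = (A - B).ravel()
    return float(np.sqrt(max(d @ G @ d, 0.0)))

def one_hot_image(r, c, shape=(5, 5)):
    """Hypothetical test image with a single bright pixel."""
    img = np.zeros(shape)
    img[r, c] = 1.0
    return img

base = one_hot_image(0, 0)
near = one_hot_image(0, 1)   # bright pixel shifted by one position
far = one_hot_image(4, 4)    # bright pixel moved to the opposite corner

for name, other in (("near", near), ("far", far)):
    print(name,
          round(euclidean_distance(base, other), 3),
          round(coordinate_aware_distance(base, other), 3))
```

Under these assumptions the Euclidean distance cannot tell the one-pixel shift from the corner-to-corner move, whereas the coordinate-aware distance rates the shifted image as closer. As `sigma` shrinks toward zero the metric approaches the identity and the two distances coincide.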
|Degree: ||Ph.D., Dept. of Computing, The Hong Kong Polytechnic University, 2011|
|Description: ||xxi, 184 p. : ill. (some col.) ; 30 cm.|
PolyU Library Call No.: [THS] LG51 .H577P COMP 2011 Liu
|Rights: ||All rights reserved.|
|Appears in Collections:||COMP Theses|
PolyU Electronic Theses
All items in the PolyU Institutional Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
No item in the PolyU IR may be reproduced for commercial or resale purposes.