Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/1023

Title: Neural network techniques for graffiti interpretation and speech recognition
Authors: Leung, Koon-fai
Subjects: Neural networks (Computer science)
Pattern perception
Speech perception
Hong Kong Polytechnic University -- Dissertations
Issue Date: 2004
Publisher: The Hong Kong Polytechnic University
Abstract: This thesis explores neural network classification techniques for an electronic book (eBook) reading device. Two areas of application are addressed: a graffiti interpreter and a Cantonese-speech recognizer. Different structures of neural networks and hybrid neural networks incorporating fuzzy sets are used to realize the applications.
An eBook reading device enhances the reading environment with interactive and multimedia features. Input to the device can be made with a stylus on a touch screen or by voice through a microphone; in practice, the former is a pattern recognition (graffiti interpretation) problem and the latter is a speech recognition problem.
With graffiti interpretation, eBook users can take full advantage of graffiti input to issue commands or enter text. The interpretation is done by the template matching technique. Two approaches are developed to realize the pattern recognition: one applies a self-structured neural network and the other a self-structured neural-fuzzy network. Improved from a 3-layer fully connected neural network/neural-fuzzy network, the self-structured network has a variable structure that adapts to the characteristics of the input patterns by incorporating link switches. By properly determining the states of the link switches through training, the dummy links can be eliminated. Simulation results show that the self-structured network performs better than a fixed-structure network in terms of network size.
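The link-switch idea described above can be illustrated with a minimal sketch. It is not taken from the thesis: the unit-step switch, the tanh hidden activation, the linear output layer, and the names `step`, `forward`, and `active_links` are all assumptions made for illustration. Each link carries a weight and a switch parameter; a link whose switch is off contributes nothing and can be dropped from the network.

```python
import math


def step(s):
    """Hypothetical link switch: 1.0 keeps the link, 0.0 removes it."""
    return 1.0 if s >= 0.0 else 0.0


def forward(x, w_hidden, s_hidden, w_out, s_out):
    """Forward pass of a 3-layer network whose links are gated by switches.

    w_hidden[j][i] and s_hidden[j][i] are the weight and switch parameter of
    the link from input i to hidden node j; w_out and s_out play the same
    role between the hidden and output layers. The tanh hidden activation
    and the linear output are assumptions of this sketch.
    """
    hidden = []
    for w_row, s_row in zip(w_hidden, s_hidden):
        z = sum(step(s) * w * xi for w, s, xi in zip(w_row, s_row, x))
        hidden.append(math.tanh(z))
    return [
        sum(step(s) * w * h for w, s, h in zip(w_row, s_row, hidden))
        for w_row, s_row in zip(w_out, s_out)
    ]


def active_links(*switch_layers):
    """Count the links that remain switched on, a simple measure of size."""
    return sum(int(step(s)) for layer in switch_layers for row in layer for s in row)
```

In this sketch, training would tune the switch parameters together with the weights, and `active_links` gives the size of the resulting network for comparison with a fully connected one.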
With a speech recognizer, eBook users can use natural speech to execute some functions of the eBook and enter characters whenever necessary. Four approaches are proposed to recognize Cantonese speech. Of them, three are feed-forward neural networks, and one is a recurrent neural network.
As the first approach, the self-structured neural-fuzzy network used for graffiti interpretation is also applied to recognize Cantonese-speech commands. Then, a neural-fuzzy network and a neural network are modified by adding associative memory to provide the network parameters. In both of these approaches, the neural-fuzzy network/neural network effectively has variable parameters that change with respect to the input patterns. Thus, the learning ability can be enhanced for the case in which two feature vectors belong to the same class but are sparsely distributed. Results are given to demonstrate the improvement in recognition accuracy, network complexity and learning rate. A discussion comparing the various approaches is also given.
By using a recurrent neural network, the sequential properties of double-syllable Cantonese digits can be modeled. The fourth approach therefore involves an associative memory for a recurrent neural network. Results are given to demonstrate the merits of the proposed approach. A discussion comparing the static approaches with the dynamic approach is also given.
In this thesis, all neural networks are trained by an improved genetic algorithm (GA). Details of this algorithm and its performance on some benchmark test functions are given in the Appendix.
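For readers unfamiliar with GA-based optimization, the following is a minimal sketch of a plain real-coded genetic algorithm applied to the sphere function, a common benchmark. It is not the improved GA of the thesis, whose details are given only in the Appendix; the tournament selection, averaging crossover, Gaussian mutation, elitism, and all names and parameter values here are assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Sphere benchmark function; its minimum is 0 at the origin."""
    return sum(v * v for v in x)


def genetic_minimize(fitness, dim, pop_size=30, generations=100,
                     bound=5.0, mutation_rate=0.1, seed=0):
    """Plain real-coded GA (illustrative only, not the thesis's improved GA)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(pop_size)]
    best = min(pop, key=fitness)
    history = [fitness(best)]

    for _ in range(generations):
        def pick():
            # Binary tournament: the fitter of two random individuals wins.
            a, b = rng.sample(pop, 2)
            return a if fitness(a) < fitness(b) else b

        children = [best[:]]  # elitism: carry the best individual over
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            # Averaging crossover followed by clipped Gaussian mutation.
            child = [(u + v) / 2.0 for u, v in zip(p1, p2)]
            child = [
                min(bound, max(-bound, g + rng.gauss(0.0, 0.5)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)

        pop = children
        best = min(pop, key=fitness)
        history.append(fitness(best))

    return best, history
```

To train a network in this style, the chromosome would hold the network's weights (and, for a self-structured network, its switch parameters), and the fitness would be the classification error on the training patterns.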
Degree: M.Phil., Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University, 2004.
Description: xix, 121 leaves : ill. ; 30 cm.
PolyU Library Call No.: [THS] LG51 .H577M EIE 2004 Leung
Rights: All rights reserved.
Type: Thesis
URI: http://hdl.handle.net/10397/1023
Appears in Collections: EIE Theses
PolyU Electronic Theses

Files in This Item:

File | Description | Size | Format
b17726621_ir.pdf | For All Users (Non-printable) | 3.02 MB | Adobe PDF
b17726621_link.htm | For PolyU Users | 167 B | HTML





All items in the PolyU Institutional Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
No item in the PolyU IR may be reproduced for commercial or resale purposes.