Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/4946
Title: A self-adaptive spectral rotation approach to detection of DNA sequence periodicities and their relationship with molecular mechanisms
Authors: Chen, Bo
Subjects: Nucleotide sequence.
DNA -- Analysis.
Hong Kong Polytechnic University -- Dissertations
Issue Date: 2011
Publisher: The Hong Kong Polytechnic University
Abstract: Computational investigations into the relationships and interactions between DNA sequences and cell components help biologists and medical scientists address many important issues, such as the diagnosis of gene-related diseases, medicine development, and protein design. This study introduces a new approach, Self-Adaptive Spectral Rotation (SASR), to investigate the relationship between periodicities in DNA sequences and various molecular mechanisms in cells, including genetic coding and nucleosome formation. The approach could be useful in areas of bioinformatics such as protein-coding region prediction and nucleosome positioning prediction.
Protein-coding region prediction, in particular the use of computational methods to locate protein-coding regions in uncharacterized DNA sequences, is an important problem in computational molecular biology. In this study, the SASR approach is first developed to visualize a coding-related feature, the Triplet Periodicity (TP), or 3bp (base pair) periodicity, in DNA sequences. Applications to real genomic datasets show that, in SASR's output, the graphic patterns for coding and non-coding regions differ so markedly that the former can be visually distinguished from the latter. This visualization requires no training process and takes advantage of the "auto-scale analysis ability" of human vision. However, as a visualization method, the SASR approach does not provide exact numerical predictions. Therefore, a T-Z-T approach is developed to extract numerical information from SASR's graphic output. The combination of SASR and T-Z-T provides computational predictions of coding regions without any training process. Moreover, the predictions from this SASR-based approach are more robust than those from commonly used methods based on the Hidden Markov Model (HMM), since the new approach is not sensitive to input errors in DNA sequences.
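As a rough illustration of the Triplet Periodicity mentioned in the abstract, the sketch below scores a sequence by its Fourier power at a period of 3 bp. This is a generic textbook-style measure and not the SASR or T-Z-T method of the thesis, whose details are not given in this record. The function name `period3_power` and the toy sequences are assumptions made only for this example.

```python
import cmath


def period3_power(seq: str) -> float:
    """Fourier power at period 3, summed over the four base indicator
    sequences and divided by the sequence length.

    Hypothetical helper for illustration; not the SASR approach.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0
    total = 0.0
    for base in "ACGT":
        # DFT coefficient at frequency 1/3 of the 0/1 indicator of `base`
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k / 3)
            for k, b in enumerate(seq)
            if b == base
        )
        total += abs(coeff) ** 2
    return total / n


# Toy inputs (assumed, not real genomic data)
periodic = "ATG" * 30       # perfect 3 bp repeat
non_periodic = "ACGT" * 24  # 4 bp repeat, no 3 bp component

print(period3_power(periodic))
print(period3_power(non_periodic))
```

The first toy sequence scores high and the second scores close to zero. On real data such a score is usually computed in a sliding window, which is where window-size choices arise.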
Experimental studies on nucleosome positioning have revealed that nucleosomes bind preferentially to certain regions of a DNA sequence. However, it is still not clear whether this binding preference is sequence-specific. The relationship between sequence features and nucleosome formation is therefore of great significance. A major concern here is the ~10bp periodicity in DNA sequences, which appears to be associated with the structure of the DNA helix and the formation of nucleosomes. In this study, the original SASR approach is extended to investigate the relationship between nucleosome formation and the ~10bp periodicity of dinucleotides in DNA sequences. A Genetic Algorithm (GA) based method is developed to identify which dinucleotide combination most strongly connects its ~10bp periodicity with nucleosome formation. The results from the GA support the "sequence-specific" argument for nucleosome formation. They also suggest that some dinucleotides connect their ~10bp periodicity with nucleosome formation only in some local regions. Moreover, the ~10bp periodicity of dinucleotides is associated not only with the occurrence of nucleosome formation, but also with the binding preference for the phase within the ~10bp period.
Besides the TP and the ~10bp periodicity, other unknown periodicity properties may also be present in DNA sequences and may be connected with important molecular mechanisms. Investigating new periodicity properties might help computational studies of sequence-specific molecular mechanisms in organisms. In this study, another extension of the SASR approach, the mature SASR, shows its ability to detect a hypothetical anti-TP property in DNA sequences. Some real DNA fragments are found to have such an anti-TP property by using the mature SASR. However, the universality of this property in genomes and its biological interpretation require further investigation.
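One simple way to look for the ~10bp periodicity of dinucleotides mentioned in the abstract is to count how often chosen dinucleotides recur at each distance. The sketch below does this with a distance histogram. It is not the extended SASR or the GA-based method of the thesis. The function name `dinucleotide_distance_counts`, the default dinucleotide set (AA, TT, TA) and the toy sequence are assumptions made only for this example.

```python
from collections import Counter


def dinucleotide_distance_counts(seq, dinucs=("AA", "TT", "TA"), max_dist=30):
    """Count pairs of dinucleotide occurrences by the distance between them.

    Hypothetical helper for illustration; the dinucleotide set is an
    assumption, not the combination selected in the thesis.
    """
    seq = seq.upper()
    # Start positions of every occurrence of the chosen dinucleotides
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] in dinucs]
    counts = Counter()
    for a, p in enumerate(positions):
        for q in positions[a + 1:]:
            d = q - p
            if d > max_dist:
                break  # positions are sorted, so later ones are farther away
            counts[d] += 1
    return counts


# Toy input (assumed): "AA" placed every 10 bp, no other chosen dinucleotide
toy = "AACGTCGTCG" * 10
counts = dinucleotide_distance_counts(toy)
for dist in sorted(counts):
    print(dist, counts[dist])
```

In the toy sequence, counts appear only at multiples of 10. Real sequences give a much noisier histogram, and any peak near 10 bp would need to be judged against a suitable background.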
Description: xix, 189 leaves : ill. (some col.) ; 30 cm.
PolyU Library Call No.: [THS] LG51 .H577P ISE 2011 Chen
Rights: All rights reserved.
Type: Thesis
URI: http://hdl.handle.net/10397/4946
Appears in Collections: ISE Theses
PolyU Electronic Theses

Files in This Item:
File | Description | Size | Format
b24625413_link.htm | For PolyU Users | 162 B | HTML
b24625413_ir.pdf | For All Users (Non-printable) | 3.94 MB | Adobe PDF

