Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/5200
Title: Mining concepts from Wikipedia for ontology construction
Authors: Cui, Gaoying
Lu, Qin
Li, Wenjie
Chen, Yirong
Subjects: Concept
Ontology construction
Wikipedia
Issue Date: 15-Sep-2009
Publisher: IEEE Computer Society
Source: 2009 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT Workshops 2009 : proceedings : 15-18 Sept. 2009, Università degli Studi di Milano Bicocca, Milano, Italy, v. 3, p. 287-290.
Abstract: An ontology is a structured knowledge base of concepts organized by the relations among them. However, in the corpora used for knowledge extraction, concepts are usually mixed with their instances. Concepts and their corresponding instances share similar features and are difficult to distinguish. This paper proposes a novel approach that comprehensively obtains concepts with the help of definition sentences and Category Labels in Wikipedia pages. N-gram statistics and other NLP knowledge are used to help extract appropriate concepts. The proposed method identified nearly 50,000 concepts from about 700,000 Wikipedia pages. With a precision of 78.5%, it is an effective approach to mining concepts from Wikipedia for ontology construction.
Rights: © 2009 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Type: Conference Paper
URI: http://hdl.handle.net/10397/5200
DOI: 10.1109/WI-IAT.2009.284
ISBN: 978-0-7695-3801-3 (print)
978-1-4244-5331-3 (E-ISBN)
Appears in Collections: COMP Conference Papers & Presentations

Files in This Item:
File: WI_cgy_v3.pdf
Description: Pre-published version
Size: 118.13 kB
Format: Adobe PDF


All items in the PolyU Institutional Repository are protected by copyright, with all rights reserved, unless otherwise indicated. No item in the PolyU IR may be reproduced for commercial or resale purposes.