Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/133097
Title: ENHANCING THE PERFORMANCE OF AN ONTOLOGY BASED INFORMATION RETRIEVAL SYSTEM USING A HYBRID GENETIC ALGORITHM
Researcher: Vanjulavalli, S
Guide(s): Kovalan, Dr. A
Keywords: REPtree, BFtree, J48, CART
University: Periyar Maniammai University
Completed Date: 28/07/2014
Abstract: Information Retrieval (IR) issues have attracted increasing attention due to the growing availability of documents. IR determines relevant documents from a collection of documents based on a query from the user. The retrieval of web pages is more challenging due to the ambiguous nature of the unstructured information found in these pages. Ontologies help to overcome the ambiguous nature of natural language by the use of standard terms that relate to specific concepts. An ontology is a hierarchy of concepts with attributes and relations that defines an agreed terminology to describe semantic networks of interrelated information units. An ontology provides a vocabulary of classes and properties to describe a domain, emphasizing the sharing of knowledge and the consensus about its representation.

One of the key challenges in Web-based information retrieval is the ambiguity of words in the meaning they convey. This challenge can be overcome using semantic interpretation and ontology-based systems. However, the corpus on the web is extremely large, with a good number of attributes not contributing to the information retrieval process. Poor features decrease precision and recall. Features can be selected statistically using techniques like Information Gain (IG), Mutual Information (MI) or Singular Value Decomposition (SVD). However, feature selection is NP-hard. This work investigates techniques for feature selection and soft computing based classifiers for classification.
The work proposed and carried out in this thesis can be broadly classified into:
• Investigation of various Bagging based classifiers for web page classification
• Ontology based feature extraction
• A novel feature selection algorithm using Genetic Algorithm (GA)
• An improved Neural Network

Keywords and features based on ontology are classified through bagging with various decision trees like REPtree, BFtree, J48, and CART. Experiments show that the new feature extraction improves precision and recall satisfactorily.

For effective classification, the extracted features should give valuable information about the categories, and they should be computationally inexpensive. In the proposed feature extraction, the features are extracted based on the ontology and feature selection is achieved by GA. A concept based tree structure is built on a generalization/specialization relationship to conceptualize the domain. Browsing knowledge is made easier if the conceptual architecture of the knowledge base is identified as a whole and information is accessible by intra-conceptual hierarchical links during browsing. The experimental results demonstrate that the proposed feature extraction improves precision and recall satisfactorily. The Hybrid GA based feature selection improves classification accuracy by 0.27% to 1.7% over GA based feature selection.

Using the features selected by GA, a Multilayer Perceptron Neural Network (MLPNN) is trained to classify web pages. It is proposed to use GA for training the MLPNN. Because GA tends to get trapped in local minima, Hill Climbing is used as a local search in the hybrid algorithm.
Numerical results show that the hybrid classifier, a multilayer Neural Network (NN) combined with GA-based selection of IDF and ontology based features, gave 94% accuracy, high precision and recall, and the lowest Root Mean Square Error (RMSE) compared with the other methods.
Pagination: 
URI: http://hdl.handle.net/10603/133097
Appears in Departments: Department of Computer Science and Applications

Files in This Item:
File                     Description    Size       Format
01_title.pdf             Attached File  124.48 kB  Adobe PDF
02_certificate.pdf                      953.67 kB  Adobe PDF
03_declaration.pdf                      953.67 kB  Adobe PDF
04_acknowledgement.pdf                  60.18 kB   Adobe PDF
05_abstract.pdf                         44.77 kB   Adobe PDF
06_table.pdf                            35 kB      Adobe PDF
07_figure.pdf                           37.13 kB   Adobe PDF
08_abbreviation.pdf                     60.33 kB   Adobe PDF
09_contents.pdf                         39.78 kB   Adobe PDF
10_chapter1.pdf                         166.58 kB  Adobe PDF
11_chapter2.pdf                         231.14 kB  Adobe PDF
12_chapter3.pdf                         532.74 kB  Adobe PDF
13_chapter4.pdf                         450.74 kB  Adobe PDF
14_chapter5.pdf                         410.67 kB  Adobe PDF
15_chapter6.pdf                         89.6 kB    Adobe PDF
16_reference.pdf                        170.56 kB  Adobe PDF
17_appendix.pdf                         97.44 kB   Adobe PDF


Items in Shodhganga are protected by copyright, with all rights reserved, unless otherwise indicated.