ISSN ONLINE(2320-9801) PRINT (2320-9798)


Ontology Based Question Answering System

Ankita Singh1 and Nidhi Tyagi2
  1. Assistant Professor, Dept. of CSE, BIT, Meerut, India
  2. Associate Professor, Dept. of CSE, Shobhit University, Meerut, India

Visit for more related articles at International Journal of Innovative Research in Computer and Communication Engineering

Abstract

The demand for data and information grows with the volume of data held in repositories such as the World Wide Web. The question is how to find, within this enormous body of data, the specific information that a user requires. Information retrieval techniques solve the problem to an extent, but they do not help when only the specific information pertaining to a question is required: information retrieval engines return documents containing phrases and paragraphs that may hold an answer to the user's query, rather than the answer itself. This paper addresses the problem by proposing a question answering system that satisfies the user's specific information need.

Keywords

QAS, PQC, Ontology, NE.

INTRODUCTION

Question answering systems are designed to satisfy the user's specific information need. In these systems a question is asked in natural language and analysed to identify keywords, named entities and the question type, which are then used to formulate a database query. An ontology is a conceptualization of knowledge [1]. Ontologies are written in OWL and have a hierarchical structure. This paper presents a way to store ontologies in a database so that they can be queried irrespective of their hierarchical nature, and proposes an architecture for a question answering system that processes a natural language question and retrieves the answer from the knowledge base.

RELATED WORK

The question answering systems proposed so far are based entirely on document analysis. In [6] the author proposed BASEBALL, one of the earliest question answering systems: a program for answering questions about baseball games played in the American League over one season. The system answered narrow-domain questions about statistics compiled over that season by applying shallow parsing techniques to the natural language query to identify the teams and statistics in question. In [7] the author proposed a question answering system named LUNAR, which was also restricted to a narrow domain. In [8] the QA track was first proposed at the 8th TREC conference; it required answering factoid questions by returning a text snippet that contained an answer to the question asked. In [9] the author proposed the first web based question answering system, which differed from earlier systems in using the web as its corpus for extracting answers, whereas earlier systems relied on a fixed-size corpus. Web based systems include START, AnswerBus and AskMSR. In [10] the authors proposed a language-specific question answering system named Chinese QAS; morphological analysis and parsing are more difficult in such a system because of its language-specific annotations and symbols.

PROPOSED ARCHITECTURE

The proposed architecture of the ontology based question answering system (OBQAS) comprises three functional components: the question processing module, the query formulation module and the answer selection module.
A. Question Processing Module
The question processing module consists of two components: the Previous Question Cache (PQC) and the lexical analyser. The lexical analyser in turn consists of two components, the question type identifier and the keyword identifier. A natural language question (NLQ) presented to this module is fed directly to the PQC.
The PQC records previously asked questions. Whenever a natural language question is presented to the system, it first goes to this cache. If the newly arrived question matches a previously asked question, the answer is retrieved directly from the cache; otherwise the question is stored at the top of the list. The cache is maintained as a linear list organised as a stack, with the stack pointer pointing to the most recently asked question.
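The behaviour of the PQC described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `PreviousQuestionCache`, its method names, and the rule that two questions match when they are equal after lower-casing and whitespace folding are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


class PreviousQuestionCache:
    """Most-recent-first list of (question, answer) pairs (hypothetical sketch)."""

    def __init__(self) -> None:
        # Index 0 plays the role of the stack top, i.e. the latest question.
        self._entries: List[Tuple[str, str]] = []

    @staticmethod
    def _normalise(question: str) -> str:
        # Assumption: questions match if equal after case and whitespace folding.
        return " ".join(question.lower().split())

    def lookup(self, question: str) -> Optional[str]:
        """Return the cached answer, or None if the question is new."""
        key = self._normalise(question)
        for stored_question, answer in self._entries:
            if stored_question == key:
                return answer
        return None

    def store(self, question: str, answer: str) -> None:
        """Place the question and its answer at the top of the list."""
        self._entries.insert(0, (self._normalise(question), answer))

    def latest(self) -> Optional[Tuple[str, str]]:
        """Return the entry the stack pointer refers to, if any."""
        return self._entries[0] if self._entries else None
```

A linear scan keeps the sketch close to the list-based description; a dictionary keyed on the normalised question would make lookups faster at the cost of departing from the stack layout.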
The question is then presented to the lexical analyser, which parses it to identify the keywords and named entities in the question. These are then presented to the query formulator.
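A rough sketch of what the lexical analyser produces is given below. The paper does not specify the parsing technique, so everything here is an assumption for illustration: the function name `analyse`, the question-word and stop-word lists, and the idea that named entities are the tokens found in a caller-supplied set of ontology terms.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical word lists, chosen only for this illustration.
QUESTION_WORDS = ("what", "who", "where", "when", "which", "how", "why")
STOP_WORDS = {"is", "the", "of", "with", "a", "an", "in", "for", "to", "are"}


def analyse(question: str, ontology_terms: Set[str]) -> Dict[str, object]:
    """Split a question into question type, named entities and keywords."""
    # A token is either a digit group such as "408 555 9918" or a word.
    tokens = re.findall(r"\d[\d\- ]*\d|[a-z]+", question.lower())

    question_type: Optional[str] = next(
        (t for t in tokens if t in QUESTION_WORDS), None
    )
    named_entities: List[str] = [t for t in tokens if t in ontology_terms]
    keywords: List[str] = [
        t
        for t in tokens
        if t not in QUESTION_WORDS
        and t not in STOP_WORDS
        and t not in ontology_terms
    ]
    return {
        "question_type": question_type,
        "named_entities": named_entities,
        "keywords": keywords,
    }
```

For instance, calling `analyse` on a question about an employee's name with the ontology terms `{"name", "employee"}` separates those two terms from the remaining content words, which are treated as keywords.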
B. Query Formulator
The query formulator consists of a query formulation engine and a query cluster. The named entities, question type and keywords extracted from the NLQ are passed to the query formulation engine. The query cluster holds the query syntax for each question type, and the query is formulated using the syntax that corresponds to the type of the question. Table 2 [2] lists the question types for which query syntax is present in the query cluster.
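One way to picture the query cluster is as a mapping from question type to a query template that the formulation engine fills in. The sketch below is illustrative only: the contents of Table 2 are not reproduced here, so the single "what" template, the names `QUERY_CLUSTER` and `formulate`, the default table and column names, and the use of the SQL/XML functions XMLQUERY and XMLEXISTS are all assumptions.

```python
from typing import Dict

# Hypothetical query cluster: one query template per question type.
QUERY_CLUSTER: Dict[str, str] = {
    "what": "SELECT XMLQUERY('{path}/{target}') FROM {table} "
            "WHERE XMLEXISTS('{path}')",
}


def formulate(
    question_type: str,
    target: str,
    entity: str,
    field: str,
    value: str,
    table: str = "dept",
    column: str = "DEPTDOC",
    root: str = "dept",
) -> str:
    """Fill the template for the given question type and return the query."""
    if question_type not in QUERY_CLUSTER:
        raise ValueError(f"no query syntax for question type {question_type!r}")
    if '"' in value or "'" in value:
        # Quotes would break the string literals used in the template.
        raise ValueError("value must not contain quote characters")
    # XPath selecting the entity whose field equals the keyword value.
    path = f'${column}/{root}/{entity}[{field} = "{value}"]'
    return QUERY_CLUSTER[question_type].format(
        path=path, target=target, table=table
    )
```

Because the query text is assembled by string substitution, a real system would need stricter validation of every substituted name than this sketch performs.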
C. Answer Selector
The query produced by the query formulation module is submitted to the relational database, which returns the answer. The answer is returned to the user and is also stored in the PQC, together with the question, at the top of the list.
D. Database Creation
The system is ontology based, and the ontologies are written in OWL. The documents pertaining to a specific domain are in XML and have a hierarchical structure among the objects they contain. These XML documents are stored in a relational table in the form of an XML schema.

SIMULATION

Step 1: The system is first presented with an XML document [4], from which an ontology is derived:
<dept bldg="101">
  <employee id="901">
    <name>John Doe</name>
    <phone>408 555 1212</phone>
    <office>344</office>
  </employee>
  <employee id="902">
    <name>Peter Pan</name>
    <phone>408 555 9918</phone>
    <office>216</office>
  </employee>
</dept>
Step 2: Hierarchical representation of the above ontology.
[Figure: hierarchical representation of the department ontology]
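The hierarchy of the Step 1 document can also be made visible programmatically. The following sketch parses the document with Python's standard `xml.etree.ElementTree` module and prints an indented outline; the helper name `outline` and the output layout are choices made for this illustration and do not come from the paper.

```python
import xml.etree.ElementTree as ET
from typing import List

# The department document from Step 1.
DEPT_XML = """
<dept bldg="101">
  <employee id="901">
    <name>John Doe</name>
    <phone>408 555 1212</phone>
    <office>344</office>
  </employee>
  <employee id="902">
    <name>Peter Pan</name>
    <phone>408 555 9918</phone>
    <office>216</office>
  </employee>
</dept>
"""


def outline(element: ET.Element, depth: int = 0) -> List[str]:
    """Return one indented line per element, parents before children."""
    attrs = " ".join(f'{k}="{v}"' for k, v in element.attrib.items())
    text = (element.text or "").strip()
    label = element.tag
    if attrs:
        label += f" [{attrs}]"
    if text:
        label += f": {text}"
    lines = ["  " * depth + label]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(ET.fromstring(DEPT_XML))))
```

The outline places `dept` at the root, the two `employee` elements one level below it, and `name`, `phone` and `office` as leaves under each employee.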
Step 3: Database creation for the department ontology
• create table dept (deptID char(8), deptdoc xml);
This command creates a relational table with two columns, deptID and deptdoc. The deptdoc column holds the XML document in its hierarchical structure.
Step 4: User query in natural language
NLQ: What is the name of the employee with phone number 408 555 9918?
This natural language query is parsed to identify:
i) Question type: what
ii) Keyword: phone number 408 555 9918
iii) Named entity: name, employee
Query formulation is an internal process. The query formed by the query formulation engine is:
SELECT XMLQUERY('$DEPTDOC/dept/employee[phone = "408 555 9918"]/name')
FROM dept
WHERE XMLEXISTS('$DEPTDOC/dept/employee[phone = "408 555 9918"]')
The retrieved answer is Peter Pan.
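The retrieval step can be tried out without an XML-enabled database. The sketch below is a stand-in under stated assumptions, not the system described in the paper: it uses SQLite, which has no XML column type, so the document is stored as TEXT and the predicate is evaluated in Python with `xml.etree.ElementTree`. The function name `answer` and the department identifier "D101" are hypothetical, and the phone number is the one recorded for Peter Pan in the Step 1 document.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import List

# The department document from Step 1.
DEPT_XML = """
<dept bldg="101">
  <employee id="901">
    <name>John Doe</name>
    <phone>408 555 1212</phone>
    <office>344</office>
  </employee>
  <employee id="902">
    <name>Peter Pan</name>
    <phone>408 555 9918</phone>
    <office>216</office>
  </employee>
</dept>
"""


def answer(conn: sqlite3.Connection, phone: str) -> List[str]:
    """Return the names of employees whose phone element equals `phone`."""
    names: List[str] = []
    for (doc,) in conn.execute("SELECT deptdoc FROM dept"):
        root = ET.fromstring(doc)
        for employee in root.findall("employee"):
            if employee.findtext("phone") == phone:
                names.append(employee.findtext("name"))
    return names


conn = sqlite3.connect(":memory:")
# TEXT stands in for the xml column type, which SQLite does not provide.
conn.execute("CREATE TABLE dept (deptID CHAR(8), deptdoc TEXT)")
# "D101" is a made-up identifier used only for this example.
conn.execute("INSERT INTO dept VALUES (?, ?)", ("D101", DEPT_XML))

print(answer(conn, "408 555 9918"))  # ['Peter Pan']
```

In a database with native XML support the filtering would happen inside the query itself, so only matching rows would leave the database.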

CONCLUSION

The QAS proposed here serves the purpose of question answering, and the simplicity of its architecture makes answer retrieval efficient. The system could be improved if the ontology were updated automatically, just as web repositories are updated through page refreshing techniques [5].


References