ISSN ONLINE(2320-9801) PRINT (2320-9798)

All submissions of the EM system will be redirected to Online Manuscript Submission System. Authors are requested to submit articles directly to Online Manuscript Submission System of respective journal.

A User’s Feedback Relevant Dynamic Query Forms for Database Queries

Neenu Anna Philip1, S.S Jaya2, Pradeep G3
  1. P.G. Scholar, Department of CSE, R.V.S. College of Engineering and Technology, Coimbatore, India
  2. Assistant Professor, Department of CSE, R.V.S. College of Engineering and Technology, Coimbatore, India2
  3. P.G. Scholar, Department of CSE, R.V.S. College of Engineering and Technology, Coimbatore, India3
Related article at Pubmed, Scholar Google

Visit for more related articles at International Journal of Innovative Research in Computer and Communication Engineering

Abstract

Scientific databases and web databases maintain large and diverse data. Real-world databases contain over hundreds or even thousands of relations and attributes. Old predefined query forms don't seem to beable to satiate various ad-hoc queries from users on those databases. A novel user’s feedback relevant dynamic database query form interface is implemented, which is able to dynamically generate query forms result. The system is to capture a user’s preference and rank query form components, helping him/her to take decisions. The generation of a query form is an iterative process and is guided by the user. The ranking of form components is based on the captured user preference. A user can fill the query form and submit queries to view the query result at each iteration. In this way, a query form results could be dynamically refined till the user satisfies with the query results.



Keywords

Query form, user interaction, query form generation

INTRODUCTION

One of the most extensively used user interfaces for querying databases to access information is query form. Historicquery forms are configured and predefined by developers or Database Administrator in different informationmanagement systems. With the fast development of web information and scientific databases, new databases becomevery huge and difficult. In natural sciences, like genetics and diseases, the databases have variety of entities forchemical and/or biological data resources. Diversekinds of web databases have thousands of structured web entities.Therefore, it is difficult to plan a set of static query forms to answer various ad-hoc database queries on those difficultand complex databases.
Existing tools for database management and development, such as SAP and Microsoft Access,users create customizedqueries on databases. However, the design of customized queries entirely depends on user’s manual editing .If a user isnot familiar with the database schema prior to, those hundreds or thousands of data attributes would confuse him/her.
1.1 Approach
A user’s feedback relevant dynamic query form system is implemented, a query interface which is capable ofdynamically generating query forms for users. Users in database retrieval are often willing to perform many rounds ofactions (i.e., refining query conditions) before identifying the final candidates.
The essence of this is to capture user interests during user interactions and to adapt the query results iteratively. It startswith a basic query form which contains very few primary attributes of the database. Figure 1 shows user’s feedbackrelevant dynamic query form system’sflow. The basic query form is then iteratively enriched via the interactionsbetween the user and our system till the user is content with the query results.

RELATED WORK

Customized Query Form:Existing database clientsand tools make great efforts to help developers design and generatethe query forms, such as EasyQuerySAP, Microsoft Access and so on. They deliver visual interfaces for developers tocreate or customize query forms. The difficulty of those tools is that, they are provided for the professional developerswho are familiar with their databases, not for end-users .Proposed a system which allows end-users to customize theexisting query form at run time. However, an end-user may not be aware with the database. For huge database schemais, it is difficult for them to catch appropriate database entities and attributes and to create desired query forms.
Auto-completion for Database Queries:In user interfaces have been developed to assist the user to type the databasequeries based on the query load, the data distribution and the database schema. Different from this which focuses onquery forms, the queries in their work are in the forms of SQL and keywords.

QUERY FORM

3.1 Interface for Query form

Each query form corresponds to an SQL query template. A query form F is defined as a tuple (AF, RF, σF, (RF)),which represents a database query template as follows:F= (SELECT A1,A2, ...,AkFROM (RF) WHERE σF ),where AF= {A1,A2, ...,Ak} are k attributes for projection, k >0.RF= {R1,R2, ...,Rn} is the set of n relations (or entities) involved in this query, n >0. Each attribute in AFbelongs to onerelation in RF. σFis a conjunction of expressions for selections (or conditions) on relations in RF. (RF) is a joinfunction to generate a conjunction of expressions for joining relations of RF.
In the user interface of a query form F, AFis the set of columns of the result table. σFis the set of input components to befilled by users. Query forms allow users to fill parameters to produce different queries. RFand (RF) are not visible inthe user interface, which are usually created by the system according to the database schema. For a query form F,(RF) is automatically constructed according to the foreign keys among relations in RF. Meanwhile, RFis determined byAFand σF. RFis the union set of relations which contains at least one attribute of AFor σF. Hence, the components of query form F are actually determined by AF and σF. Only AFand σFare visible to the user in the user interface. In this,the focus is on projection and selection components of a query form.
3.2 Ranking Estimation
The user’s desired result is returned by query form. Two measures to evaluate the quality of the query results: precisionand recall . Query forms yield different queries by different inputs, and different queries can output different queryresults and achieve different precisions and recalls, so expected precision and expected recall is used to evaluate theexpected performance of the query form. Expected proportion of the query results which are interested by the currentuser is the expected precision. Expected proportion of user interested data instances which are returned by the currentquery form is the expected recall. The user interest is estimated based on the user’s click-through on query resultsdisplayed by the query form.
3.2.1 Ranking of Attributes
Suggesting projection components is actually suggesting attributes for projection. The current query form be Fi, thenext query form be Fi+1. Let AFi = {A1,A2, ...,Aj}, and AFi+1 = AFi∪{Aj+1}, j+1 ≤ |A|. Aj+1 is the projection attributesuggest for the Fi+1, which maximizes FScoreE(Fi+1). FScoreE(Fi+1) is obtained as follows:
 
image
Adding a projection component Aj+1 does not affect the selection part of Fi. Hence, σFi+1 = σFi and P(σFi+1|d) = P(σFi|d).Since Fi is already used by the user, estimate P(dAFi+1)P(σFi+1|d) as follows. For each query submitted for form Fi, keepthe query results including all columns in RF. Clearly, for those instances not in query results their P(σFi+1|d) = 0 and donot need to consider them. For each instance d in the query results, simply count the number of times they appear in theresults and P(dAFi+1)P(σFi+1|d) equals the occurrence count divided by N.
Pu(dAj+1|dAFi) is not visible in the runtime data, since dAj+1 has not been used before Fi+1.The conditional probabilityPu(dAj+1|dAFi) is estimated from following approach.
• Workload-Driven based approach: The conditional probability of Pu(dAj+1|dAFi) could be estimated from queryresults of historic queries. If a lot of users queried attributes AFi and Aj+1 together on instance d, then Pu(dAj+1|dAFi) mustbe high.
• Schema-Driven based approach: The database schema implies the relations of the attributes. If two attributesare contained by the same entity, then they are more relevant.
The schema graph is utilized to compute the relevance of two attributes. A database schema graph is denoted by G =(R,FK,ξ,A), in which R is the set of nodes representing the relations, A is the set of attributes, FK is the set of edgesrepresenting the foreign keys, and ξ : A → R is an attribute labelling function to indicate which relation contains theattribute
3.2.2 Ranking Entities
The ranking score of an entity is just the averaged FScoreE(Fi+1) of that entity’s attributes. Intuitively, if one entity hasmany high score attributes, then it should have a higher rank.
3.3 Relevant selection based on user feedback
The selection attributes must be relevant to the current projected entities, otherwise that selection would bemeaningless. Therefore, first the relevant attributes is find out for creating the selection components.
3.3.1 Relevant Attribute Selection
The relevance of attributes in system is measured based on the database schema as follows.
Relevant Attributes: For a database query form F with a schema graph G=(R,FK,ξ,A), the relevant attributes is: Ar(F)= {A|A ∈ A, ∃Aj∈ AF, d(A,Aj) ≤ t}, where t is a user-defined threshold and d(A,Aj) is the schema distance. The choice of t depends on how compact of the schema is designed. For instance, some databases put all attributes ofone entity into a relation, then t could be 1. Some databases separate all attributes of one entity into several relations,then t could be greater than 1.
3.3.2 Ranking Selection Components
For enriching selection form components of a query form, the set of projection components AFis fixed, i.e., AFi+1 = AFi.Therefore, FScoreE(Fi+1) only depends on σFi+1. To find the best selection component for the next query form, the firststep is to query the database to retrieve the data instances. P(σFi+1|d) depends on the previous query conditions σFi. IfP(σFi|d) = 0, P(σFi+1|d) must be 0. In order to compute the P(σFi+1|d) for each d ∈ D, no need to retrieve all datainstances in the database. The set of data instances D' ⊆ D is only needed such that each d ∈ D' satisfies P(σFi|d) >0. Sothe selection of One-Query’s query is the union of query conditions executed in Fi. One-Query adds all the selectionattributes into the projections of the query.
Query Creation:
Data: Q = {Q1, Q2, ….} is the set of previous queries executed on Fi
Result: Qone is the query of One-Query

Methods

beginσone← 0for Q ε Q doσone← σone  σQAone← AFi Ar (Fi)Qone← GenerateQuery (Aone, σone)
When the system receives the result of the query Qone from the database engine, it calls the second algorithm of One-Query to find the best query condition the system. The pseudocode for finding best" ≤ " condition is as follows:Find Finest Less Equation Condition
Data: α is the fraction of instances desired by user, DQone is the query result of Qone, As is the selection attribute.Result: S* is the best query condition of As
Begin// Sort by As into an ordered set DsortedDsorted← Sort (DQone, As)s* ← Ø, fscore* ← 0n← 0, d← αβ2for i ← 1 to |Dsorted| dod← Dsorted [i]s← “As<= dAs”// compute fscore of “As<= dAs”n ← n + Pu (dAFi) P (dAFi) P (σFi | d) P(s|d)d ← d+ P (dAFi) P (σFi | d) P(s|d)fscore ← (1+β2). n/dif fscore >= fscore* thens* ← sfscore* ← fscore
3.4 Evaluation
To evaluate the quality of the query results mainly : precision and recall is used. Different inputs through query formsproduce different queries, and different queries produce different query results and achieve different precisions andrecalls, so to evaluate the expected performance of the query form expected precision and expected recall is used. Theexpected proportion of the query results which are interested by the current user is Expected precision
image
The expected proportion of user interested data instances which are returned by the current query form is calledExpected recall
image
Considering both precision and recall, Fscore is derived as follows:
 
image
 
The user interest is based on the user’s click-through on query results displayed by the query form.

RESULTS

The query form captures user’s feedback. User select components and on that selection basis query result is generated.User can refine component selection for a refined query results. The basic query form is enriched iteratively through theinteraction between the user and the system until the user is satisfied with the query results. Ranking of formcomponents is done and query results are generated
In the above figures, user selects desired column component and row component in query form for the desired results. More refined row component selection is done for the refined query results. Finally after making the relavantcomponent seleciton refined query results is obtained.

CONCLUSION AND FUTURE WORK

A user’s feedback relevant dynamic query form generation is done which helps users dynamically generate queryforms. The vitalnotion is to use a probabilistic model to rank form components based on user preferences. Userpreference is captured using both historical queries and run-time feedback such as click. The ranking of formcomponents makes it easier for users to customize query forms. As future work, this approach can be extended to nonrelationaldata. Multiple methods to capture the user’s interest for the queries besides the click feedback is planned todevelop. For instance, a text-box can be added for users to input some keywords queries.

Figures at a glance

Figure 1 Figure 2 Figure 3 Figure 4 Figure 5
Figure 1 Figure 2 Figure 3 Figure 4 Figure 5
 

References