ISSN: Online (2320-9801), Print (2320-9798)

Special Issue Article Open Access

Entity Recognition in a Web Based Join Structure

Abstract

Given a document, the task of entity recognition is to identify predefined entities such as person names, products, or locations in that document. With a potentially large dictionary, entity recognition becomes a dictionary-based membership checking problem. One formulation, Approximate Membership Extraction (AME), aims to find all possible substrings of a document that match any reference in the given dictionary. AME generates many redundant matched substrings, which makes it unsuitable for real-world tasks based on entity extraction. Approximate Membership Localization (AML), in contrast, aims only at locating true mentions of clean references. It rests on an important observation: in real-world situations, a word position within a document generally belongs to only one reference-matched substring, so the true matched substrings should not overlap. AML therefore targets non-overlapping substrings in a given document that approximately match a clean reference. When several substrings overlap, only the one with the highest similarity to a clean reference qualifies as a result. The web-based join structure is a search-based approach that joins two tables using entity recognition from web documents; it is a typical real-world application that relies heavily on membership checking. Membership checking is performed using correlation, Inverse Document Frequency (IDF), Jaccard similarity, and the P-Pruning technique.
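The difference between AME and AML can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes whitespace tokenization, plain (unweighted) Jaccard similarity over token sets, and a simple greedy rule for resolving overlaps, and it omits IDF weighting and P-Pruning entirely. The function names (`ame_candidates`, `aml_select`), the threshold of 0.6, and the sample sentence are all hypothetical and chosen only for illustration.

```python
def tokenize(text):
    """Lowercase whitespace tokenization (an assumption of this sketch)."""
    return text.lower().split()


def jaccard(a, b):
    """Jaccard similarity between two token lists, treated as sets."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def ame_candidates(doc_tokens, dictionary, threshold):
    """AME-style step: return every substring (token window) whose best
    similarity to some reference reaches the threshold.
    Each result is (start, end, reference, score), with end exclusive."""
    refs = [(ref, tokenize(ref)) for ref in dictionary]
    # Allow windows up to one token longer than the longest reference.
    max_len = max(len(tokens) for _, tokens in refs) + 1
    out = []
    for start in range(len(doc_tokens)):
        last_end = min(start + max_len, len(doc_tokens))
        for end in range(start + 1, last_end + 1):
            window = doc_tokens[start:end]
            best_ref, best_score = None, 0.0
            for ref, ref_tokens in refs:
                score = jaccard(window, ref_tokens)
                if score > best_score:
                    best_ref, best_score = ref, score
            if best_ref is not None and best_score >= threshold:
                out.append((start, end, best_ref, best_score))
    return out


def aml_select(candidates):
    """AML-style step (greedy simplification): visit candidates from highest
    to lowest similarity and keep one only if it overlaps nothing kept so far."""
    chosen = []
    for cand in sorted(candidates, key=lambda c: (-c[3], c[0], c[1])):
        start, end = cand[0], cand[1]
        if all(end <= c[0] or start >= c[1] for c in chosen):
            chosen.append(cand)
    return sorted(chosen)


doc = tokenize("the new apple iphone 5 was released in new york city today")
dictionary = ["apple iphone 5", "new york city"]

candidates = ame_candidates(doc, dictionary, threshold=0.6)
mentions = aml_select(candidates)

for start, end, ref, score in mentions:
    print(" ".join(doc[start:end]), "->", ref, round(score, 2))
```

On this sample, the AME step returns several overlapping windows around each mention (for example "apple iphone" and "new apple iphone 5"), while the AML step keeps only one non-overlapping span per mention, the one most similar to a reference.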

J.Kavitha M.Tech, A.Pasca Mary
