ISSN ONLINE(2320-9801) PRINT (2320-9798)
R.Santhya1, S.Latha2, Prof.S.Balamurugan3, S.Charanyaa4
International Journal of Innovative Research in Computer and Communication Engineering
This paper surveys methods from the literature for the efficient discovery of matching dependencies. Matching dependencies (MDs) have recently been proposed for specifying matching rules for object identification. Like functional dependencies (with conditions), MDs can be applied to data quality applications such as detecting violations of integrity constraints. The survey considers the problem of discovering similarity constraints for matching dependencies from a given database instance, and is intended to encourage further research in information mining.
Keywords
Data Anonymization, Matching Dependencies (MDs), Object, Similarity Constraints, Information Mining.
I. INTRODUCTION
The need to publish sensitive data to the public has grown considerably in recent years. At the same time, the database community has paid increasing attention to preserving data quality because of the huge amount of "dirty" data originating from different sources. Such data often contain duplicates, inconsistencies and conflicts caused by human and machine errors. Beyond the cost of dealing with the sheer volume of data, manually detecting and removing "dirty" data is impractical, because manual cleaning may itself introduce new inconsistencies. Data dependencies, which have been widely used in relational database design to set up integrity constraints, are therefore applied to these data quality problems. Protecting the privacy of individuals while ensuring the utility of social network data thus becomes a challenging and interesting research topic. In this paper we investigate the attacks by matching dependencies, the possible solutions proposed in the literature, and their efficiency.
II. APPROXIMATE INFERENCE OF FUNCTIONAL DEPENDENCIES FROM RELATIONS
This paper addresses the functional dependency (FD) inference problem: given a relation r, find a set of FDs that is equivalent to the set of all FDs holding in r. Approximate dependency inference relies on measures of the error of a dependency in a relation. Each error value lies between 0 and 1 and is 0 exactly when the dependency holds.
In database design, integrity constraints define which database states are allowed. Several classes of dependencies exist, and functional dependencies are among the most important. The paper considers only FDs and refers to them simply as dependencies.
The paper also considers approximate dependency inference, where the result need not be exact, and presents two kinds of results for the problem of inferring the functional dependencies that hold in a given relation r. First, it presents three measures of the error of a dependency. Second, it demonstrates an output-polynomial algorithm that achieves any desired accuracy and finds a cover of the set of FDs that hold in the given relation. The results show that approximation techniques can achieve good results for the dependency inference problem.
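To make the idea of an error measure concrete, the following minimal sketch computes one measure that is commonly used for approximate FDs: the smallest fraction of rows that must be removed so that the dependency holds exactly (often written g3). The function name `g3_error` and the zip/city sample relation are hypothetical and chosen only for illustration; this is not the surveyed paper's code.

```python
from collections import Counter, defaultdict


def g3_error(rows, lhs, rhs):
    """Fraction of rows that must be removed for lhs -> rhs to hold exactly."""
    if not rows:
        return 0.0
    # For every combination of left-hand-side values, count the right-hand-side values.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key][row[rhs]] += 1
    # Keeping only the most frequent rhs value in each group repairs the dependency.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1 - kept / len(rows)


# Hypothetical relation: one of the four rows disagrees on the city for zip 10001.
relation = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "60601", "city": "Chicago"},
]
print(g3_error(relation, ["zip"], "city"))  # 0.25
```

An error of 0 means the dependency holds exactly. An approximate dependency is accepted when the error stays below a chosen threshold.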
III. METRIC FUNCTIONAL DEPENDENCIES |
This paper describes metric functional dependencies (MFDs). When data are merged from different sources, small differences in data format often arise. These differences cause traditional FDs to be violated even though no semantic violation has occurred.
FDs define functional relationships between attributes. Key relationships are a special kind of FD, and FDs support database normalization during design. Conditional functional dependencies and approximate FDs may not give the desired result because of an inherent lack of robustness. MFDs are introduced to overcome these problems by capturing small variations in the data.
The paper specifies exact algorithms for verifying MFDs, both for general metrics and for Euclidean distance spaces.
dom(A) denotes the domain of an attribute A. If X is a sequence of attributes X = A1, A2, ..., Ak, then dom(X) = dom(A1) × dom(A2) × ... × dom(Ak).
The paper thus deals with making dependencies robust to data failures and errors by introducing metric FDs, and it reports results that are sound and realistic.
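As an illustration of what verifying an MFD involves, the sketch below assumes the usual reading of an MFD X → Y with tolerance delta: any two rows that agree on X must lie within distance delta of each other on Y. It uses a simple quadratic pairwise check that works for any metric, not the paper's optimized algorithms. The names `mfd_holds` and `euclidean` and the movie data are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def mfd_holds(rows, lhs, rhs, metric, delta):
    """Check that rows agreeing on lhs are within distance delta on rhs."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key].append(tuple(row[a] for a in rhs))
    # Pairwise check inside each group; valid for any metric, quadratic in group size.
    for values in groups.values():
        for u, v in combinations(values, 2):
            if metric(u, v) > delta:
                return False
    return True


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.dist(u, v)


# Hypothetical merged data: two sources report slightly different runtimes for film A.
movies = [
    {"title": "A", "runtime": 120.0},
    {"title": "A", "runtime": 121.0},
    {"title": "B", "runtime": 95.0},
]
print(mfd_holds(movies, ["title"], ["runtime"], euclidean, delta=2.0))  # True
print(mfd_holds(movies, ["title"], ["runtime"], euclidean, delta=0.5))  # False
```

With delta set to 0 the check reduces to an ordinary FD, which is why a traditional FD would reject the merged data in this example.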
IV. DISCOVERY OF FUNCTIONAL AND APPROXIMATE FUNCTIONAL DEPENDENCIES IN RELATIONAL DATABASE
This work develops the foundations of efficient methods for discovering functional and approximate functional dependencies in a given relational database, based on the mathematical theory of partitions. The minimal nontrivial functional dependencies can be found using a level-wise algorithm. An FD describes a relationship between the attributes of a relation: it states that the value of one attribute is uniquely determined by the values of some other attributes.
The paper presents a new algorithmic approach to the discovery of functional and approximate functional dependencies. The approach partitions the row identification numbers of the relation and conducts a breadth-first, or level-wise, search, so that partitions and dependencies can be evaluated efficiently.
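The partition idea can be sketched as follows. Rows are grouped into equivalence classes by their values on an attribute set, and X → A holds exactly when adding A to X does not split any class, that is, when both partitions have the same number of classes. The helper names `partition` and `fd_holds` and the employee data are hypothetical, and the sketch omits the optimizations a real implementation would use.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row identifiers into classes of rows that agree on attrs."""
    classes = defaultdict(set)
    for row_id, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].add(row_id)
    return list(classes.values())


def fd_holds(rows, lhs, rhs):
    """lhs -> rhs holds iff adding rhs to lhs does not split any class."""
    return len(partition(rows, lhs)) == len(partition(rows, list(lhs) + [rhs]))


# Hypothetical relation: department B has two different managers.
employees = [
    {"emp": 1, "dept": "A", "mgr": "X"},
    {"emp": 2, "dept": "A", "mgr": "X"},
    {"emp": 3, "dept": "B", "mgr": "Y"},
    {"emp": 4, "dept": "B", "mgr": "Z"},
]
print(fd_holds(employees, ["dept"], "mgr"))  # False
print(fd_holds(employees, ["mgr"], "dept"))  # True
```

Because only class counts are compared, the test never has to compare individual rows once the partitions are available.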
V. IMPROVING DATA QUALITY THROUGH EFFECTIVE USE OF DATA SEMANTICS |
In this paper, the authors examine data quality, a problem of growing importance in recent years. The paper considers the many "data quality" or "data misinterpretation" problems, i.e., problems with data semantics. COIN (COntext INterchange) technology is proposed for knowledge storage and knowledge processing.
COIN is a knowledge-based mediation technology that enables meaningful use of heterogeneous databases. Besides mediation, COIN also covers wrapping technology and middleware services. Wrappers are physical and logical gateways that provide uniform access to disparate sources over the network.
The paper presents a framework for understanding the corporate householding problem and uses COIN techniques to store and apply the captured knowledge. Future work is, first, to collect data and determine the types of corporate householding knowledge and, second, to explore COIN techniques in corporate householding and extend them for capturing, storing, maintaining and applying householding knowledge.
VI. AUTOMATIC DISCOVERY OF CORRELATION AND SOFT FUNCTIONAL DEPENDENCIES
This paper introduces CORDS, an efficient tool for the automatic discovery of correlations and soft FDs between columns. CORDS searches for column pairs that may have useful dependency relations by enumerating candidate pairs and pruning unpromising candidates with a flexible set of heuristics. CORDS can be used as a data mining tool that produces dependency graphs, but the paper focuses on its use in query optimization. The approach is relatively easy to implement, and CORDS can be used in tandem with query feedback systems such as the LEO learning optimizer.
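A soft FD between two columns can be estimated from a sample rather than the full table. The sketch below assumes the strength of C1 ⇒ C2 is measured as the number of distinct C1 values divided by the number of distinct (C1, C2) pairs, so that a strength of 1.0 means C1 determines C2 in the sample. The function name `soft_fd_strength`, its default parameters and the car data are hypothetical; this is a simplified illustration, not the CORDS implementation.

```python
import random


def soft_fd_strength(rows, c1, c2, sample_size=1000, seed=0):
    """Estimate |C1| / |C1, C2| on a random sample of the rows."""
    rng = random.Random(seed)
    sample = rows if len(rows) <= sample_size else rng.sample(rows, sample_size)
    if not sample:
        return 1.0
    distinct_c1 = {row[c1] for row in sample}
    distinct_pairs = {(row[c1], row[c2]) for row in sample}
    return len(distinct_c1) / len(distinct_pairs)


# Hypothetical table: a model determines its make, but not the other way round.
cars = [
    {"make": "Toyota", "model": "Camry"},
    {"make": "Toyota", "model": "Corolla"},
    {"make": "Honda", "model": "Civic"},
    {"make": "Honda", "model": "Accord"},
]
print(soft_fd_strength(cars, "model", "make"))  # 1.0
print(soft_fd_strength(cars, "make", "model"))  # 0.5
```

A query optimizer could treat column pairs whose strength is close to 1 as candidates for joint statistics, since assuming the columns are independent would misestimate selectivity for them.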
VII. EFFICIENT DISCOVERY OF FUNCTIONAL AND APPROXIMATE DEPENDENCIES USING PARTITION
In this paper the authors give a new approach to finding functional dependencies based on partitioning the set of rows with respect to their attribute values. Working with partitions makes testing a dependency simple and efficient, because rows that agree on a set of attributes are easy to identify. Experiments show that the new algorithm is efficient in practice: running times improve by several orders of magnitude over previously published results. The approach is therefore applicable to larger databases as well.
VIII. AN EFFICIENT ALGORITHM FOR DISCOVERING FUNCTIONAL AND APPROXIMATE DEPENDENCIES
This paper addresses the discovery of functional dependencies, an important database analysis technique. It presents TANE, an efficient algorithm for finding functional dependencies in large databases. TANE is based on partitioning the rows, which makes testing the validity of FDs simple and efficient. On benchmark databases, running times improve by several orders of magnitude over previously published results, so the algorithm is applicable to large datasets as well.
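The level-wise search for minimal nontrivial FDs can be illustrated with a deliberately naive sketch: for each right-hand-side attribute, candidate left-hand sides are tried in order of increasing size, and any candidate that contains an already found left-hand side is skipped because it cannot be minimal. This is not TANE itself, which adds partition-based validity tests and stronger pruning. The names `holds` and `minimal_fds` and the sample relation are hypothetical.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """lhs -> rhs holds iff no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def minimal_fds(rows, attributes):
    """Enumerate minimal nontrivial FDs level by level (by left-hand-side size)."""
    found = []
    for rhs in attributes:
        others = [a for a in attributes if a != rhs]
        minimal_lhs = []
        for size in range(len(others) + 1):
            for lhs in combinations(others, size):
                # A superset of a known left-hand side cannot be minimal.
                if any(set(m) <= set(lhs) for m in minimal_lhs):
                    continue
                if holds(rows, lhs, rhs):
                    minimal_lhs.append(lhs)
        found.extend((lhs, rhs) for lhs in minimal_lhs)
    return found


# Hypothetical relation over attributes a, b, c.
table = [
    {"a": 1, "b": "x", "c": "p"},
    {"a": 2, "b": "x", "c": "p"},
    {"a": 3, "b": "y", "c": "q"},
    {"a": 4, "b": "y", "c": "q"},
]
for lhs, rhs in minimal_fds(table, ["a", "b", "c"]):
    print(lhs, "->", rhs)
```

The search is exponential in the number of attributes in the worst case, which is why efficient validity tests and pruning matter for large relations.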
IX. AN ALGORITHM FOR INFERRING FUNCTIONAL DEPENDENCIES FROM RELATION
This paper describes the dependency inference problem: finding the set of FDs that hold in a given database relation. The problem is exponential in the number of attributes and has applications in database design, query optimization and artificial intelligence. The authors develop two algorithms. The first reduces the problem to computing the transversals of a hypergraph. The second is based on repeatedly sorting the relation with respect to a set of attributes.
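The role of sorting can be seen in a single dependency test: once the relation is sorted on the left-hand-side attributes, rows that agree on them become neighbours, so one linear scan is enough to decide whether the dependency holds. The sketch below shows only this building block, under the assumption that attribute values are mutually comparable. It is not the full inference algorithm, and `fd_holds_by_sorting` and the sample data are hypothetical.

```python
def fd_holds_by_sorting(rows, lhs, rhs):
    """Sort on lhs, then compare neighbours: equal lhs must imply equal rhs."""
    ordered = sorted(rows, key=lambda row: tuple(row[a] for a in lhs))
    for prev, cur in zip(ordered, ordered[1:]):
        same_lhs = all(prev[a] == cur[a] for a in lhs)
        if same_lhs and prev[rhs] != cur[rhs]:
            return False
    return True


# Hypothetical relation: course and term determine the room, course alone does not.
schedule = [
    {"course": "DB", "term": 1, "room": "R1"},
    {"course": "AI", "term": 1, "room": "R2"},
    {"course": "DB", "term": 2, "room": "R3"},
    {"course": "DB", "term": 1, "room": "R1"},
]
print(fd_holds_by_sorting(schedule, ["course", "term"], "room"))  # True
print(fd_holds_by_sorting(schedule, ["course"], "room"))  # False
```

Each test costs one sort plus one scan, and an inference procedure would repeat it for many attribute sets.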
X. CONCLUSION AND FUTURE WORK
This paper surveyed methods from the literature for the efficient discovery of matching dependencies. Matching dependencies (MDs) have recently been proposed for specifying matching rules for object identification. Like functional dependencies (with conditions), MDs can be applied to data quality applications such as detecting violations of integrity constraints. The survey considered the problem of discovering similarity constraints for matching dependencies from a given database instance, and is intended to encourage further research in information mining.