Development of an Enhanced Efficient
Parallel Opinion Mining for Predicting the
Performance of Various Products

K.NATHIYA¹, Dr.N.K.Sakthivel²

¹PG Scholar, Dept. of CSE, VSB Engineering College, Karur-639111, Tamil Nadu, India
²Professor, Dept. of CSE, VSB Engineering College, Karur-639111, Tamil Nadu, India

Visit for more related articles at International Journal of Innovative Research in Computer and Communication Engineering

Abstract

With the rise of Web 2.0, which centers on user participation, posting online reviews has become an increasingly popular way for people to share their opinions and sentiments about products and services with other users. It has become common practice for e-commerce websites to let people publish their reviews and communicate with one another. These online reviews hold a wealth of information about products and services that businesses can use to improve. Hence, a growing number of recent studies have focused on opinion mining. We studied and analyzed several opinion mining methods and observed that the TSCAN-based approach does not produce good results because it treats opinion features with similar meanings as different features. To overcome this issue, a Modified TSCAN algorithm is proposed here. It focuses mainly on expert opinions, which addresses the existing system's weakness in identifying genuine opinions and makes the content easier for readers to understand. With this model, more information can be extracted and associated through temporal closeness, which yields comprehensible content. The model plays a vital role in opinion mining because users can share their opinions about products. Thus, this work proposes and implements an efficient opinion mining method called Enhanced Efficient Parallel Opinion Mining (EEPOM), based on the Modified TSCAN algorithm. It covers more websites and extracts information from them in parallel, so that an optimized result based on expert opinions is obtained. Our results indicate that the scheme provides a solution well suited to users' interests and demands, and that it improves on the existing technique in terms of quality of information, prediction, and the obtaining of genuine opinions.

Keywords

Opinion mining, Opinion classification, Summarization, Review

INTRODUCTION

Before the Web, an individual who needed to make a decision typically asked friends and family for their opinions. An organization that wanted to know the opinions or sentiments of the general public about its products and services conducted opinion polls, surveys, and focus groups. Today, in many cases, opinions are hidden in long forum posts and blogs. It is difficult for a human reader to find relevant sources, extract the sentences that contain opinions, read them, summarize them, and organize them into usable forms. Thus, automated opinion discovery and summarization systems are needed. Sentiment analysis, also known as opinion mining, grows out of this need. Opinion mining is a subfield of data mining concerned with extracting, classifying, understanding, and assessing the opinions expressed on websites, in social media, and in other user-generated content. Customer reviews typically contain the opinions of many customers about a product, expressed in a variety of forms, including natural language sentences. In general, people do not state their opinions directly. For example, a review may comment on product features as follows: “the lens in the camera is good and the lens takes too long to focus on the object”. The main aim of opinion mining is to predict the opinions about products and their features from various web resources. Previous studies on opinion mining have applied TSCAN-based methods for feature extraction and refinement, including NLP and statistical methods. However, these analyses exposed the following problem: they do not focus on expert opinions when drawing opinions from many URLs, which leads to inconsistent data. Instead of predicting from user-based opinions alone, referring to expert-based opinions across many URLs and processing the opinions found there provides a more suitable solution for users.
To resolve these problems, this paper proposes an enhanced method called Enhanced Efficient Parallel Opinion Mining (EEPOM), based on the Modified TSCAN algorithm. The overall process of EEPOM consists of three phases: web information collection, opinion orientation, and application of the WordNet tool. In the web information collection phase, the required input data is supplied and the collected messages are analyzed. The opinion orientation phase then extracts the opinions and determines their types. Finally, a visualization tool presents the results in graph form.

RELATED WORK

The work of Minqing Hu et al. is regarded as the preliminary work on summarization based on features and opinions. It uses association rule mining to find frequent itemsets among the noun phrases obtained from each sentence, and applies several techniques to prune the frequent items. Infrequent features are identified from the opinion words present in the sentence. The final summary consists of each product feature together with the positive and negative opinions about it [1].

Gamgarn Somprasertsri et al. proposed an approach for mining product features and opinions that takes syntactic and semantic information into account. They use dependency relations and ontological knowledge together with a probabilistic model. A product ontology is also used to recognize the same feature expressed in different terminology [2].

Yuanbin Wu et al. constructed their own dependency parser to identify product features, and the opinions on those features, in product reviews. The opinion for a feature is identified using a window of size 5 between the extracted word and the opinion word in the same sentence [3].

Parma Nand proposed an algorithm for resolving anaphora based exclusively on salience weights. He focused on resolving anaphors in short newspaper-style articles, because the work forms part of wider research aimed at building a system for visualizing online newspaper articles. The algorithm is capable of resolving anaphors with a knowledge-poor approach that relies entirely on salience scores [4].

Chih-Ping Wei et al. used a semantic-based approach to mine product features and the opinions about those features. The approach is based on the co-occurrence of noun phrases and opinion words [5].

EXISTING TECHNIQUE

In the existing system, opinions are predicted from users' opinions gathered from many URLs, which can lead to fake information being used. The system does not focus on obtaining expert opinions, from which genuine and correct opinions can be obtained. When opinions are taken from ordinary users, there is a chance of picking up fake and irrelevant data. Expert opinions were introduced to overcome this; they help to obtain genuine and accurate information. The existing technique uses the TSCAN algorithm, which fails to read sentences completely during sentence analysis; that is, it does not perform sentence boundary detection. Sentence boundary detection ensures that each sentence is read in full before further processing, after which a suitable algorithm can be applied for sentence-level prediction. The proposed system overcomes these limitations with a new algorithm, the Modified TSCAN scheme.

A. Existing Technique Diagram

This is the system architecture diagram of the existing system. Processing is carried out using parallel opinion mining and the TSCAN scheme. The first step is web information discovery and collection: the input data is taken first, and then the relevant websites are retrieved. Once the relevant websites have been obtained, their messages are analyzed. In the opinion orientation process, the characteristics of the opinions are analyzed. A visualization tool then produces the resulting graph, which is a useful format for users. The information obtained can be stored in the database for future reference.
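
The parallel message-analysis stage of such a pipeline can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the `PAGES` dictionary is a hypothetical stand-in for text already collected from relevant websites, and the names `analyze_page` and `analyze_all` are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for text already collected from relevant websites.
PAGES = {
    "site-a": "The lens is good. The battery is poor.",
    "site-b": "The screen is excellent.",
}

def analyze_page(item):
    """Split one page into messages (sentences) for later opinion analysis."""
    source, text = item
    messages = [s.strip() for s in text.split(".") if s.strip()]
    return source, messages

def analyze_all(pages, workers=4):
    """Analyze every page concurrently and collect the results by source."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(analyze_page, pages.items()))

results = analyze_all(PAGES)
for source, messages in results.items():
    print(source, messages)
```

A thread pool is a reasonable choice when the per-page work is dominated by waiting on the network; for CPU-heavy parsing, a process pool may be the better fit.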

PROPOSED SYSTEM

The proposed system overcomes the disadvantages of the existing technique by referring to expert opinions. Instead of soliciting opinions from experts directly, it selects the most suitable opinions from those already available, which simplifies the process. Here, the term "expert" denotes the opinions that best match customers' needs and satisfaction. The Modified TSCAN algorithm is used for this purpose. With this algorithm, sentence boundary detection is fully achieved, which the existing system failed to do; in particular, the LingPipe sentence boundary detection method is used. The tool used in this technique is WordNet, a lexical database for the English language that groups nouns, verbs, adjectives, and adverbs into sets of synonyms called synsets and provides the semantic relationships between those synsets. The synsets are matched against the lexicon database, in which sets of positive and negative words are already stored.
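
The idea of treating feature words with similar meanings as one feature can be illustrated with a small sketch. It assumes a hand-made list of synonym groups (`SYNSETS`) in place of real WordNet synsets, and the function name `canonical_feature` is hypothetical; the paper does not specify how the grouping is implemented.

```python
# Hypothetical synonym groups standing in for WordNet synsets.
SYNSETS = [
    {"picture", "photo", "image"},
    {"battery", "cell"},
]

def canonical_feature(word):
    """Map a feature word to one representative per synonym group."""
    word = word.lower()
    for group in SYNSETS:
        if word in group:
            return min(group)  # alphabetically first member serves as the label
    return word  # words outside every group are kept unchanged

mentions = ["photo", "Picture", "battery", "lens", "image"]
merged = sorted({canonical_feature(m) for m in mentions})
print(merged)
```

With this mapping, "photo", "Picture", and "image" are counted as a single feature instead of three separate ones.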

A. Proposed System Architecture

B. Algorithm Used

The algorithm used in the proposed technique is the Modified TSCAN scheme. It overcomes the disadvantages of the existing technique by referring to expert opinions, from which the end results for the user are produced. It also overcomes the inefficiency in reading the full content of online resources: the algorithm reads each sentence completely, from left to right. This step is important because natural language processing normally stops reading a sentence as soon as it encounters a punctuation mark. To overcome this, a sentence boundary detection step is used, after which the Stanford typed dependencies are obtained with the Stanford parser, which parses each sentence. A parser is a program that works out the grammatical structure of a sentence, for example which groups of words go together and which words are the subject or object of a verb. The algorithm has the following steps.

• Preprocessing

• Feature generation and extraction

• Opinion Direction and Recapitulation

• Preprocessing

Preprocessing is a basic and essential step in data mining techniques. The data is analyzed thoroughly so that the result is in a usable format for users, and incomplete, noisy, irrelevant, and inconsistent data is removed. The tasks involved are:

• Sentence Boundary Detection

• Obtaining Stanford Typed Dependencies

• Sentence Boundary Detection

In the sentence boundary detection step, each sentence is read completely, which increases the chance of obtaining the true opinions. The existing system did not identify boundaries in opinion sentences. The result of this step is a complete reading of every sentence, which leads to genuine information being used.
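
To make the role of this step concrete, the following sketch shows a simple rule-based sentence splitter in Python. It is only a stand-in for illustration and is not the LingPipe method: the regular expression, the `ABBREVIATIONS` list, and the function name `split_sentences` are all assumptions of this example. The point is that a period inside a number or after an abbreviation should not end the sentence.

```python
import re

# Abbreviations that end with a period but do not end a sentence (illustrative list).
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "e.g.", "i.e.", "etc."}

def split_sentences(text):
    """Split text at ., ! or ? followed by whitespace and an uppercase letter."""
    sentences, start = [], 0
    for match in re.finditer(r"[.!?]\s+(?=[A-Z])", text):
        candidate = text[start:match.end()].strip()
        last_token = candidate.split()[-1].lower()
        if last_token in ABBREVIATIONS:
            continue  # the period belongs to an abbreviation, keep reading
        sentences.append(candidate)
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

review = "The zoom is 3.5 times better. Dr. Smith liked the lens! Is it worth it?"
sentences = split_sentences(review)
for sentence in sentences:
    print(sentence)
```

Here the decimal point in "3.5" and the period in "Dr." are not treated as boundaries, so the review is split into three complete sentences.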

• Obtaining Stanford Typed Dependencies

The Stanford parser is used to obtain the Stanford typed dependencies. Each sentence whose boundaries have been identified is given as input to the parser, which parses it. The output is either a string or a tree. This format makes it easy to determine the basic structure of the given sentence.
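
As an illustration of the string form of typed dependencies, the sketch below stores each dependency as a small record and prints it in the familiar `relation(governor-index, dependent-index)` notation. The dependencies for "The lens is good" are written by hand for this example rather than produced by running the Stanford parser, and the `Dependency` class is a hypothetical helper.

```python
from typing import NamedTuple

class Dependency(NamedTuple):
    relation: str    # grammatical relation, e.g. "nsubj"
    governor: str    # head word
    gov_index: int   # 1-based position of the head word
    dependent: str   # dependent word
    dep_index: int   # 1-based position of the dependent word

    def __str__(self):
        return (f"{self.relation}({self.governor}-{self.gov_index}, "
                f"{self.dependent}-{self.dep_index})")

# Hand-written analysis of "The lens is good" (not produced by running a parser).
DEPENDENCIES = [
    Dependency("det", "lens", 2, "The", 1),
    Dependency("nsubj", "good", 4, "lens", 2),
    Dependency("cop", "good", 4, "is", 3),
]

lines = [str(d) for d in DEPENDENCIES]
print("\n".join(lines))
```

The `nsubj` relation links the feature word "lens" to the opinion word "good", which is the kind of link the later extraction step relies on.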

• Feature Generation and Extraction

The step after preprocessing is feature generation and extraction, in which the relevant product features are obtained from the content. First, the parser reads the sentence from left to right. The Stanford typed dependencies are then obtained; these dependencies help to derive the form of the opinion. The dependencies obtained from the given sentences are compared with the opinion lexicon already stored in the database. The opinion lexicon is a collection of English words divided into a set of positive words and a set of negative words. Comparing the dependencies with the opinion lexicon yields the opinion types, which are as follows.

• Direct opinion

• Indirect opinion

• Direct opinions are easy for the user to understand; that is, the content is easy to read and follow for users who want to learn about the product.

Example

• Indirect opinions are harder for the user to read and understand; that is, the content is somewhat difficult to follow.

Example

The battery life of this camera is good, and the lens in the camera takes too much time to focus on the object.

By following the above steps, the required candidate product feature-opinion pairs are extracted in an effective and meaningful way using different combinations of dependencies.
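
The comparison of dependencies with the opinion lexicon can be sketched as follows. This is a simplified illustration under stated assumptions: the `POSITIVE` and `NEGATIVE` sets are a tiny made-up lexicon, the dependency triples are hand-written for two short sentences, and only the `nsubj` relation is considered, whereas the paper speaks of different combinations of dependencies without listing them.

```python
# Small illustrative lexicon; the stored lexicon of the paper is not reproduced here.
POSITIVE = {"good", "excellent", "sharp"}
NEGATIVE = {"bad", "poor", "slow"}

# Hand-written (relation, governor, dependent) triples for
# "The battery is good" and "The lens is slow".
TRIPLES = [
    ("nsubj", "good", "battery"),
    ("cop", "good", "is"),
    ("nsubj", "slow", "lens"),
    ("cop", "slow", "is"),
]

def extract_pairs(triples):
    """Return (feature, opinion word, polarity) for nsubj links to lexicon words."""
    pairs = []
    for relation, governor, dependent in triples:
        if relation != "nsubj":
            continue  # only subject links are used in this sketch
        if governor in POSITIVE:
            pairs.append((dependent, governor, "positive"))
        elif governor in NEGATIVE:
            pairs.append((dependent, governor, "negative"))
    return pairs

pairs = extract_pairs(TRIPLES)
print(pairs)
```

Each resulting tuple is one candidate feature-opinion pair, for instance the feature "battery" with the positive opinion word "good".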

• Opinion Direction and Recapitulation

This is the final step in the opinion prediction process. The extracted opinions are classified as positive or negative, and on this basis a graph is produced for the various products.
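
A minimal sketch of this summarization step is given below. The `OPINIONS` list is hypothetical input of the kind an earlier extraction step might yield, and the text chart is only a stand-in for the visualization tool, whose actual output format the paper does not describe.

```python
from collections import Counter, defaultdict

# Hypothetical (feature, polarity) pairs, as an earlier extraction step might yield.
OPINIONS = [
    ("battery", "positive"),
    ("battery", "positive"),
    ("lens", "negative"),
    ("lens", "positive"),
    ("lens", "negative"),
]

def summarize(opinions):
    """Count positive and negative opinions per product feature."""
    summary = defaultdict(Counter)
    for feature, polarity in opinions:
        summary[feature][polarity] += 1
    return {feature: dict(counts) for feature, counts in summary.items()}

def text_chart(summary):
    """Render one bar per feature: '+' for positive, '-' for negative."""
    return [
        f"{feature:<8}{'+' * counts.get('positive', 0)}{'-' * counts.get('negative', 0)}"
        for feature, counts in sorted(summary.items())
    ]

summary = summarize(OPINIONS)
chart = text_chart(summary)
print("\n".join(chart))
```

The per-feature counts are the data a bar chart would plot; any plotting library could replace `text_chart` without changing `summarize`.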

EXPERIMENTAL RESULTS

The following results were obtained with the proposed system. It helps users to get appropriate results based on genuine opinions.

COMPARISON WITH EXISTING SYSTEM AND RESULT

The following results illustrate the comparison with the existing systems in terms of three parameters.

The above figure shows the comparison chart for opinion prediction between the TSCAN algorithm and the Modified TSCAN algorithm. Three tests were carried out, and in these tests the Modified TSCAN algorithm yields better results than the TSCAN algorithm.

This figure shows the comparison chart for obtaining genuine opinions. Three tests were carried out, and in these tests the Modified TSCAN algorithm again yields a better outcome than the TSCAN algorithm.

CONCLUSION

In this research work, genuine information is obtained by referring to expert opinions, which helps users to reach their intended results. Compared with the existing system, the proposed approach offers advantages in terms of quality of information and accurate prediction of opinions.

References

[1] Minqing Hu and Bing Liu, Mining and summarizing customer reviews, In: Proc. of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), Seattle, pp. 168–177.

[2] Gamgarn Somprasertsri and Pattarachai Lalitrojwong, Mining Feature-Opinion in Online Customer Reviews for Opinion Summarization, Journal of Universal Computer Science, vol. 16, no. 6 (2010), pp. 938–955.

[3] Yuanbin Wu, Qi Zhang, Xuanjing Huang, and Lide Wu, Phrase Dependency Parsing for Opinion Mining, Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–7 August 2009, pp. 1533–1541.

[4] Parma Nand, On the use of Salience Weights in Anaphora Resolution, In: Proc. of NZCSRSC 2008, April 2008, Christchurch, New Zealand.

[5] Chih-Ping Wei, Yen-Ming Chen, Chin-Sheng Yang, and Christopher C. Yang, Understanding what concerns consumers: a semantic approach to product feature extraction from consumer reviews, Springer-Verlag, 2009.

[6] Chien Chin Chen and Meng Chang Chen, TSCAN: A Content Anatomy Approach to Temporal Topic Summarization, IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 1, January 2012.

[7] Xiaohui Yu, Yang Liu, Jimmy Xiangji Huang, and Aijun An, Mining Online Reviews for Predicting Sales Performance: A Case Study in the Movie Domain, IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 4, April 2012.

[8] Abbasi et al., Affect Analysis of Web Forums and Blogs using Correlation Ensembles, IEEE Transactions on Knowledge and Data Engineering, vol. 20, no. 9, June 2008.

[9] Identifying features in opinion mining via intrinsic and extrinsic domain relevance, IEEE Transactions on Knowledge and Data Engineering, vol. PP, no. 1, 2013.

[10] An ubiquitous domain driven data mining approach for performance monitoring in virtual organizations using 360 degree data mining & opinion mining, IEEE Conference Publications, 2013.