ScienceDirect
Available online at www.sciencedirect.com
Procedia Computer Science 148 (2019) 80–86
1877-0509 © 2019 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of the scientific committee of the Second International Conference on Intelligent Computing in
Data Sciences (ICDS 2018).
10.1016/j.procs.2019.01.011
Second International Conference on Intelligent Computing in Data Sciences (ICDS 2018)
An Unsupervised Approach for Reputation Generation
Abdessamad Benlahbib a,∗, El Habib Nfaoui a
a LIIAN Laboratory, Faculty of Sciences Dhar EL Mehraz, Sidi Mohammed Ben Abdellah University, Fez, Morocco
∗ Corresponding author. Tel.: +212632561278.
E-mail address: abdessamad.benlahbib@usmba.ac.ma
Abstract
Nowadays, watching a movie, buying a product, making a hotel reservation and other e-commerce transactions are tied to
consulting other people's reviews and recommendations on the target entity. Indeed, Amazon, IMDB (Internet Movie Database)
and several other websites provide a convenient platform where users freely share their opinions and subjective attitudes
towards the target entity with no restrictions. However, those opinions are too numerous to be examined one by one, which is
why a general reputation value makes the task of choosing the right product much easier. In this paper, we propose a
reputation generation approach based on opinion clustering and semantic analysis. In our approach, opinions are grouped into
a number of clusters that contain opinions with the same attitude or preference. By aggregating the ratings attached to the
clusters, we generate the reputation of an entity. Experimental results demonstrate the effectiveness of the proposed
approach in generating a reputation value.
Keywords: Semantic analysis; Opinion mining; Reputation generation; Machine learning.
1. Introduction
Over the past few years, the web has been growing at an incredible rate. Nowadays, people can buy products,
watch movies and make reservations via the Internet. Before making a decision, most people would like to seek
other people's opinions on a target entity in order to judge its performance. In this case, one could scan both entity
descriptions and user comments. Hence, people have generally established a certain online custom: first look at other
users' comments, then make a decision about the target entity. On the other hand, online sellers would like to collect
the user comments with high praise and put them in the description of their products in order to attract more purchases.
According to recent statistics, the number of users of some famous online shopping centers, e.g., Taobao, Jingdong and
Amazon, has exceeded one billion. Each of the above commercial websites contains a huge number of product comments.
These comments contain user opinions on the products. As the opinions show the subjective attitudes, evaluations
and speculations of users expressed in natural language, this kind of user-contributed content has
been well recognized as valuable information. It can be exploited to analyze public opinions on a specific product in
order to figure out user likes or dislikes [5]. In this paper, we propose to apply the LSA (Latent Semantic Analysis) model
and then the K-means algorithm to cluster opinions based on their semantic relations; by aggregating the ratings
attached to the clustered opinions, we generate the reputation of an entity. The paper is organized as follows. Section
2 gives a brief review of related work. In Section 3, we present the details of our approach. We show experimental
results followed by additional analysis and discussions in Section 4. Finally, conclusions are presented in Section 5.
2. Related work
Reputation is a measure that is derived from direct or indirect knowledge on earlier interactions of entities and is
used to assess the level of trust an entity puts into another entity [1].
Reputation systems are typically based on public information in order to reflect the community’s opinion in general
[2]. The simplest way to compute a reputation score is to sum the numbers of positive and negative ratings
separately and to keep the positive count minus the negative count as the total score. This is the principle used in
eBay's reputation forum, which is described in [3]. In [4], a slightly more advanced scheme computes the reputation
score as the average of all ratings; this principle is used in the reputation systems of numerous commercial
websites, such as Epinions and Amazon. More advanced models in this category compute a weighted average of all the ratings,
where the rating weight can be determined by factors such as rater trustworthiness/reputation, age of the rating,
and distance between the rating and the current score.
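The three baseline schemes described above can be sketched in a few lines. This is only an illustration: the function names and the +1/-1 encoding of positive and negative ratings are our assumptions, not taken from [2], [3] or [4].

```python
def net_positive_score(ratings):
    """Positive count minus negative count; ratings are encoded as +1, 0 or -1."""
    positives = sum(1 for r in ratings if r > 0)
    negatives = sum(1 for r in ratings if r < 0)
    return positives - negatives


def average_score(ratings):
    """Plain average of all ratings."""
    return sum(ratings) / len(ratings)


def weighted_average_score(ratings, weights):
    """Weighted average; a weight could encode rater reputation or rating age."""
    return sum(r * w for r, w in zip(ratings, weights)) / sum(weights)
```

For instance, `weighted_average_score([10, 4], [3, 1])` gives 8.5, whereas the plain average of the same two ratings is 7.0, which shows how a trusted rater pulls the score towards their rating.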
Recently, Yan et al. [5] proposed a novel reputation generation approach based on opinion fusion and mining. In their
approach, opinions are filtered to eliminate unrelated ones, and then grouped into a number of fused principal opinion
sets that contain opinions with a similar or the same attitude or preference. By aggregating the ratings attached to the
fused opinions, they normalize the reputation of an entity. They claimed that "No work has explored the opinions
expressed in natural languages, opinion voting, opinion citation and user feedback ratings in a comprehensive way
for reputation generation" [5].
3. Proposed method
In this section, we recall the LSA technique and the K-means algorithm, then we describe in depth our proposed
method for reputation generation.
3.1. Latent Semantic Analysis
Latent Semantic Analysis (LSA) is a natural language processing technique for analyzing relationships between a
set of documents and the terms they contain by producing a set of concepts related to the documents and terms. In [6],
T.K. Landauer, P.W. Foltz and D. Laham describe LSA as follows: "Latent Semantic Analysis (LSA) is a theory and
method for extracting and representing the contextual-usage meaning of words by statistical computations applied to a
large corpus of text (Landauer and Dumais, 1997). The underlying idea is that the aggregate of all the word contexts
in which a given word does and does not appear provides a set of mutual constraints that largely determines the
similarity of meaning of words and sets of words to each other. The adequacy of LSA's reflection of human knowledge
has been established in a variety of ways. For example, its scores overlap those of humans on standard vocabulary and
subject matter tests; it mimics human word sorting and category judgments; it simulates word-word and passage-word
lexical priming data, and, it accurately estimates passage coherence, learnability of passages by individual students,
and the quality of knowledge contained in an essay".
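In matrix terms, LSA is commonly computed as a truncated singular value decomposition of a document-term matrix. The notation below (X for the weighted document-term matrix of D documents and T terms, r for the number of retained concepts, Z for the reduced document vectors) is ours and does not appear in [6]; it is a sketch of the standard formulation rather than a statement of the exact variant used here.

```latex
X \in \mathbb{R}^{D \times T}, \qquad
X \approx U_r \Sigma_r V_r^{\top}, \qquad
Z = U_r \Sigma_r = X V_r \in \mathbb{R}^{D \times r}
```

Each row of Z is one document expressed in the r-dimensional concept space, and these rows are the points that a clustering algorithm can then group.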
3.2. K-means Algorithm
K-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster
analysis in data mining. The K-means algorithm aims to partition M points in N dimensions into K clusters so that the
within-cluster sum of squares is minimized [7][8].
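To make the objective concrete, the sketch below computes the within-cluster sum of squares and reduces it with the simple Lloyd iteration (alternating assignment and mean update). Note that this is not the Hartigan-Wong procedure of [7]; it is a plainer variant chosen for brevity, and the function names, random initialisation and seed are our assumptions.

```python
import random


def wcss(points, labels, centers):
    """Within-cluster sum of squared Euclidean distances."""
    return sum(
        sum((x - c) ** 2 for x, c in zip(p, centers[l]))
        for p, l in zip(points, labels)
    )


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd-style K-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise from k data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers
```

Each iteration can only lower or keep the value returned by `wcss`, but the result is a local minimum that depends on the initial centers, which is one reason the choice of K and of the initialisation matters in practice.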
3.3. System overview
We propose the following procedure to cluster and mine opinions for reputation generation.
1. Opinion data collection and preprocessing. During this step, we collect the opinion data about an entity
(product, movie, etc.) from websites. Because raw opinion data come in many forms and contain many words
and symbols, the collected data must be preprocessed, e.g., by word segmentation, stop-word filtering
and elimination of useless expressions and pictures.
2. Opinion clustering. After applying the LSA model, we group opinions into clusters by using the K-means
algorithm. In this step, statistics for reputation generation are gathered, such as the number of opinions,
the sum of similarity and the sum of ratings in each cluster.
3. Reputation generation. This step aggregates the clustered opinions to generate a reputation value by
considering the popularity and other statistics of the principal opinions.
3.4. Opinion clustering
The opinion clustering algorithm is shown below (Algorithm 1).
Algorithm 1 Opinion clustering
Begin
Step 1: Apply the LSA model (we used TruncatedSVD from the scikit-learn library in Python).
Step 2: Set the number of clusters and apply the K-means clustering algorithm.
Step 3: Acquire the statistics of each cluster: the sum of similarities (using the cosine similarity metric),
the sum of ratings, and the number of reviews in the cluster.
End
By applying Algorithm 1, we group the LSA-transformed opinions into a number of clusters (principal opinion sets).
The opinions in each cluster hold a similar or the same perspective. We also obtain the statistics of each cluster,
i.e., the number of opinions, the sum of their ratings and the sum of their similarities computed with
the cosine similarity metric.
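The three steps of Algorithm 1 could be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the released implementation: the TF-IDF weighting, the number of LSA components, the seed, and in particular the choice to measure each review's cosine similarity against its cluster centroid are ours, since the text does not fix these details.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def cluster_opinions(reviews, ratings, n_clusters, n_components=2, seed=0):
    """Return SimSum, RatSum and Num for each cluster of reviews."""
    # Step 1: LSA, here TF-IDF weighting followed by a truncated SVD.
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(reviews)
    lsa = TruncatedSVD(n_components=n_components, random_state=seed).fit_transform(tfidf)

    # Step 2: K-means on the LSA document vectors.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(lsa)

    # Step 3: per-cluster statistics.
    ratings = np.asarray(ratings, dtype=float)
    stats = []
    for k in range(n_clusters):
        members = km.labels_ == k
        centroid = km.cluster_centers_[k].reshape(1, -1)
        # Assumption: similarity of each member to its cluster centroid.
        sims = cosine_similarity(lsa[members], centroid).ravel()
        stats.append({
            "SimSum": float(sims.sum()),
            "RatSum": float(ratings[members].sum()),
            "Num": int(members.sum()),
        })
    return stats
```

Because a cosine similarity never exceeds 1, SimSum is bounded above by Num under this assumption; other readings of "sum of similarity" (for example, pairwise similarities within a cluster) would give different values.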
3.5. Reputation generation
Based on the result of opinion clustering, we propose a method for generating a single reputation value for an entity.
Overall, it is important to show users a concrete scale of reputation expressed by a single value. Such a
presentation can provide a good user experience, especially for mobile Internet users whose devices have
small screens.
We propose formula (1) to generate the reputation of entity "A" based on the clustering of the opinions on "A".
$$
Rep(A) = \frac{1}{n\_clusters} \sum_{k=1}^{n\_clusters} \frac{V_k \cdot S_k}{N_k \cdot N_k} \tag{1}
$$
We denote:
n_clusters : The number of clusters.
N_k : The number of opinions in cluster k.
S_k : The sum of the similarity in cluster k.
V_k : The sum of ratings in cluster k.
In (1), we assume that each opinion has a rating on the entity attached to it. In our case, the rating is a number
ranging from 1 to 10 to represent a level of satisfaction.
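As a worked example, formula (1) can be evaluated on the cluster statistics reported in Table 2. The function name and the tuple layout below are our choices; the numbers are copied from Table 2. Since each term equals (V_k / N_k) times (S_k / N_k), it can be read as the mean rating of cluster k scaled by its mean similarity.

```python
def reputation(cluster_stats):
    """Formula (1): mean over clusters of (V_k * S_k) / (N_k * N_k)."""
    total = 0.0
    for sim_sum, rat_sum, num in cluster_stats:
        total += (rat_sum * sim_sum) / (num * num)
    return total / len(cluster_stats)


# (S_k, V_k, N_k) for clusters C1..C8, copied from Table 2.
table2 = [
    (12.99, 94, 13), (14.97, 120, 15), (14.94, 114, 15), (9.98, 61, 10),
    (10.95, 91, 11), (14.96, 128, 15), (9.98, 82, 10), (10.99, 90, 11),
]
print(round(reputation(table2), 3))  # 7.746
```

Because every cluster contributes equally to the outer average regardless of its size, a small cluster of low ratings weighs as much as a large cluster of high ratings, which is one way the result can differ from a plain average of all ratings.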
4. Results and discussion
4.1. Dataset
We manually created a dataset containing 600 reviews of six different movies from the IMDB website, which
hosts user reviews and ratings of movies.
The statistical information of the dataset is shown in Table 1.
Table 1. Statistical information of the dataset
Statistic                                  Value
The total number of reviews and ratings    600
The number of reviews per movie            100
4.2. Preprocessing reviews
After collecting all reviews, we applied tokenization, stemming and stop-word removal to the reviews in order to
prepare them for opinion clustering.
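The preprocessing steps can be sketched in plain Python as below. Everything specific here is an assumption made for illustration: the stop-word list is deliberately tiny, the suffix stripper is a crude stand-in for a real stemmer such as Porter's, and stop words are removed before stemming so that the list matches unstemmed tokens.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "was", "it", "this", "of", "to", "in"}


def crude_stem(token):
    """Strip a few common English suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(review):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", review.lower())        # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop-word removal
    return [crude_stem(t) for t in tokens]                # stemming


print(preprocess("The acting was amazing and it moved me"))
# ['act', 'amaz', 'mov', 'me']
```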
4.3. Evaluation measures
To measure the effectiveness of our system, we use AE (Absolute Error) and MAE (Mean Absolute Error), which
are defined as follows:
Absolute Error: The absolute difference between the measured or inferred value of a quantity and its actual value.
Mean Absolute Error: The average of the absolute differences between predictions and actual observations.
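Both measures are straightforward to compute. The sketch below uses our own function names; in this setting the predictions would be the generated reputation values and the actual values the IMDB weighted average votes.

```python
def absolute_error(predicted, actual):
    """AE for a single entity."""
    return abs(predicted - actual)


def mean_absolute_error(predicted, actual):
    """MAE over several entities; both sequences must have the same length."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

For example, `mean_absolute_error([7.5, 9.0], [8.0, 8.5])` is 0.5, since both predictions are off by 0.5 in opposite directions and the absolute value prevents them from cancelling.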
4.4. Opinion clustering
The reviews can be grouped into a number of clusters, and their statistics (the number of similar opinions, the
sum of similarity and the sum of ratings in each cluster) are acquired during clustering. To illustrate this
process, Table 2 shows example results of opinion clustering for the 100 reviews of one movie in the dataset.
We provide a Python implementation of the clustering step on GitHub (see footnote 1).
To determine the best value of n_clusters, we ran the procedure many times with different numbers of clusters.
4.5. Reputation generation
In order to evaluate our approach, we compared the final reputation computed by formula (1) with the users' weighted
average vote computed by the IMDB website (IMDBWAV), a number ranging from 1 to 10 that represents the rating of
a target movie, as shown in Fig. 1. We varied the number of clusters from 2 to 19. Figs. 2 and 3 show the Absolute
Error between IMDBWAV and the reputation value computed by our approach for all movies.
1 https://github.com/abdessamadbenlahbib/Reputation-generation-K-mean/blob/master/Python_Code.txt
Table 2. Example results of opinion clustering (Algorithm 1).
Cluster   SimSum   RatSum   Num
C1        12.99     94      13
C2        14.97    120      15
C3        14.94    114      15
C4         9.98     61      10
C5        10.95     91      11
C6        14.96    128      15
C7         9.98     82      10
C8        10.99     90      11
Legend:
SimSum: the sum of the similarity in a cluster.
RatSum: the sum of ratings in a cluster.
Num: the number of similar reviews in a cluster.
Fig. 1. IMDBWAV (IMDB users weighted average vote) for The Shawshank Redemption movie
Fig. 2. Absolute Error between IMDBWAV and reputation value computed by our approach for movie 1, 2 and 3
Fig. 3. Absolute Error between IMDBWAV and reputation value computed by our approach for movie 4, 5 and 6
As we can see in Figs. 2 and 3, the Absolute Error between IMDBWAV and the reputation value computed by our
approach is high when n_clusters = 2 and n_clusters = 3, then it begins to decrease. As described in
Algorithm 1, different values of n_clusters can lead to different clusterings of the reviews, which in turn
yield different final reputation values. Therefore, choosing a suitable value of n_clusters is particularly
important. To study the influence of n_clusters on reputation generation, we computed the MAE (Mean Absolute
Error) between IMDBWAV and the reputation values computed by our approach over all the reviews of the
dataset for n_clusters = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19}.
Fig. 4 shows the result of our experiments.
Fig. 4. MAE for different n_clusters values
Both Fig. 4 and Table 3 show that our approach performs best when n_clusters = 9, where the MAE between
IMDBWAV and the values computed by formula (1) using all the reviews of the dataset reaches its minimum (0.0419782).
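The selection of n_clusters amounts to taking the value with the smallest MAE. The short sketch below does this on the numbers copied from Table 3; the variable names are ours.

```python
# MAE for each n_clusters value, copied from Table 3.
mae_by_n_clusters = {
    2: 0.43478668, 3: 0.28583178, 4: 0.18399365, 5: 0.1158738,
    6: 0.08081721, 7: 0.06048751, 8: 0.04663486, 9: 0.0419782,
    10: 0.0535229, 11: 0.05158553, 12: 0.05617364, 13: 0.07550483,
    14: 0.05620427, 15: 0.05560771, 16: 0.05956126, 17: 0.08015718,
    18: 0.07338574, 19: 0.04463707,
}

# Pick the key whose MAE is smallest.
best_n_clusters = min(mae_by_n_clusters, key=mae_by_n_clusters.get)
print(best_n_clusters, mae_by_n_clusters[best_n_clusters])  # 9 0.0419782
```

The curve is not monotone beyond the minimum: the MAE at n_clusters = 19 (0.04463707) is the second lowest, so the optimum found on these six movies may not transfer unchanged to other datasets.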
5. Conclusions
In this paper, we have proposed an approach to generate reputation based on opinion clustering. By performing
opinion clustering, we group various opinions into a number of clusters and obtain their popularity, similarity
and rating statistics. Thus, it becomes easy to aggregate all clusters to generate a single reputation value.
The experimental results have shown that, with a suitable number of clusters, our approach yields reputation values
close to the IMDB weighted average vote for the target movies (MAE of 0.0419782 at n_clusters = 9).
Table 3. MAE between IMDBWAV and the reputation values computed by our method using all the reviews of the 6 movies
n_clusters   Mean Absolute Error
2            0.43478668
3            0.28583178
4            0.18399365
5            0.1158738
6            0.08081721
7            0.06048751
8            0.04663486
9            0.0419782
10           0.0535229
11           0.05158553
12           0.05617364
13           0.07550483
14           0.05620427
15           0.05560771
16           0.05956126
17           0.08015718
18           0.07338574
19           0.04463707
6. References
[1] Z. Yan. Trust Management in Mobile Environments - Usable and Autonomic Models. IGI Global, Hershey,
Pennsylvania, USA, 2013.
[2] A. Josang, R. Ismail, C. Boyd. A survey of trust and reputation systems for online service provision.
Decision Support Systems, Volume 43, Issue 2, March 2007, pp. 618-644. DOI: 10.1016/j.dss.2005.05.019.
[3] P. Resnick and R. Zeckhauser. Trust Among Strangers in Internet Transactions: Empirical Analysis of
eBay's Reputation System. In M.R. Baye, editor, The Economics of the Internet and E-Commerce, volume 11 of
Advances in Applied Microeconomics. Elsevier Science, 2002.
[4] J. Schneider et al. Disseminating Trust Information in Wearable Communities. In Proceedings of the 2nd
International Symposium on Handheld and Ubiquitous Computing (HUC2K), September 2000.
[5] Z. Yan, X. Jing, W. Pedrycz. Fusing and Mining Opinions for Reputation Generation.
Information Fusion (2016). DOI: 10.1016/j.inffus.2016.11.011.
[6] T. K. Landauer, P. W. Foltz, D. Laham. Introduction to Latent Semantic Analysis. Discourse Processes,
25 (1998), pp. 259-284.
[7] J. A. Hartigan and M. A. Wong. Algorithm AS 136: A K-Means Clustering Algorithm. Journal of
the Royal Statistical Society. Series C (Applied Statistics), Vol. 28, No. 1 (1979), pp. 100-108.
[8] J. A. Hartigan. Clustering Algorithms. John Wiley & Sons, Inc., New York, NY, USA, 1975.
ISBN: 047135645X.

More Related Content

Similar to An Unsupervised Approach For Reputation Generation

recommendationsystem1-221109055232-c8b46131.pdf
recommendationsystem1-221109055232-c8b46131.pdfrecommendationsystem1-221109055232-c8b46131.pdf
recommendationsystem1-221109055232-c8b46131.pdf13DikshaDatir
 
An Approach To Sentiment Analysis
An Approach To Sentiment AnalysisAn Approach To Sentiment Analysis
An Approach To Sentiment AnalysisSarah Morrow
 
IRJET- Analyzing Sentiments in One Go
IRJET-  	  Analyzing Sentiments in One GoIRJET-  	  Analyzing Sentiments in One Go
IRJET- Analyzing Sentiments in One GoIRJET Journal
 
Framework for Product Recommandation for Review Dataset
Framework for Product Recommandation for Review DatasetFramework for Product Recommandation for Review Dataset
Framework for Product Recommandation for Review Datasetrahulmonikasharma
 
CONTEXTUAL MODEL OF RECOMMENDING RESOURCES ON AN ACADEMIC NETWORKING PORTAL
CONTEXTUAL MODEL OF RECOMMENDING RESOURCES ON AN ACADEMIC NETWORKING PORTALCONTEXTUAL MODEL OF RECOMMENDING RESOURCES ON AN ACADEMIC NETWORKING PORTAL
CONTEXTUAL MODEL OF RECOMMENDING RESOURCES ON AN ACADEMIC NETWORKING PORTALcscpconf
 
Contextual model of recommending resources on an academic networking portal
Contextual model of recommending resources on an academic networking portalContextual model of recommending resources on an academic networking portal
Contextual model of recommending resources on an academic networking portalcsandit
 
Recommendation System Using Social Networking
Recommendation System Using Social Networking Recommendation System Using Social Networking
Recommendation System Using Social Networking ijcseit
 
Analyzing sentiment system to specify polarity by lexicon-based
Analyzing sentiment system to specify polarity by lexicon-basedAnalyzing sentiment system to specify polarity by lexicon-based
Analyzing sentiment system to specify polarity by lexicon-basedjournalBEEI
 
Sentiment Analysis on Twitter Data
Sentiment Analysis on Twitter DataSentiment Analysis on Twitter Data
Sentiment Analysis on Twitter DataIRJET Journal
 
Using NLP Approach for Analyzing Customer Reviews
Using NLP Approach for Analyzing Customer Reviews Using NLP Approach for Analyzing Customer Reviews
Using NLP Approach for Analyzing Customer Reviews cscpconf
 
USING NLP APPROACH FOR ANALYZING CUSTOMER REVIEWS
USING NLP APPROACH FOR ANALYZING CUSTOMER REVIEWSUSING NLP APPROACH FOR ANALYZING CUSTOMER REVIEWS
USING NLP APPROACH FOR ANALYZING CUSTOMER REVIEWScsandit
 
A Survey on Recommendation System based on Knowledge Graph and Machine Learning
A Survey on Recommendation System based on Knowledge Graph and Machine LearningA Survey on Recommendation System based on Knowledge Graph and Machine Learning
A Survey on Recommendation System based on Knowledge Graph and Machine LearningIRJET Journal
 
Study of Recommendation System Used In Tourism and Travel
Study of Recommendation System Used In Tourism and TravelStudy of Recommendation System Used In Tourism and Travel
Study of Recommendation System Used In Tourism and Travelijtsrd
 
Forecasting movie rating using k-nearest neighbor based collaborative filtering
Forecasting movie rating using k-nearest neighbor based  collaborative filteringForecasting movie rating using k-nearest neighbor based  collaborative filtering
Forecasting movie rating using k-nearest neighbor based collaborative filteringIJECEIAES
 
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...IJECEIAES
 
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...IRJET Journal
 
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.comHABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.comHABIB FIGA GUYE
 
Online review mining for forecasting sales
Online review mining for forecasting salesOnline review mining for forecasting sales
Online review mining for forecasting saleseSAT Publishing House
 
Online review mining for forecasting sales
Online review mining for forecasting salesOnline review mining for forecasting sales
Online review mining for forecasting saleseSAT Journals
 
IRJET- Opinion Mining and Sentiment Analysis for Online Review
IRJET-  	  Opinion Mining and Sentiment Analysis for Online ReviewIRJET-  	  Opinion Mining and Sentiment Analysis for Online Review
IRJET- Opinion Mining and Sentiment Analysis for Online ReviewIRJET Journal
 

Similar to An Unsupervised Approach For Reputation Generation (20)

recommendationsystem1-221109055232-c8b46131.pdf
recommendationsystem1-221109055232-c8b46131.pdfrecommendationsystem1-221109055232-c8b46131.pdf
recommendationsystem1-221109055232-c8b46131.pdf
 
An Approach To Sentiment Analysis
An Approach To Sentiment AnalysisAn Approach To Sentiment Analysis
An Approach To Sentiment Analysis
 
IRJET- Analyzing Sentiments in One Go
IRJET-  	  Analyzing Sentiments in One GoIRJET-  	  Analyzing Sentiments in One Go
IRJET- Analyzing Sentiments in One Go
 
Framework for Product Recommandation for Review Dataset
Framework for Product Recommandation for Review DatasetFramework for Product Recommandation for Review Dataset
Framework for Product Recommandation for Review Dataset
 
CONTEXTUAL MODEL OF RECOMMENDING RESOURCES ON AN ACADEMIC NETWORKING PORTAL
CONTEXTUAL MODEL OF RECOMMENDING RESOURCES ON AN ACADEMIC NETWORKING PORTALCONTEXTUAL MODEL OF RECOMMENDING RESOURCES ON AN ACADEMIC NETWORKING PORTAL
CONTEXTUAL MODEL OF RECOMMENDING RESOURCES ON AN ACADEMIC NETWORKING PORTAL
 
Contextual model of recommending resources on an academic networking portal
Contextual model of recommending resources on an academic networking portalContextual model of recommending resources on an academic networking portal
Contextual model of recommending resources on an academic networking portal
 
Recommendation System Using Social Networking
Recommendation System Using Social Networking Recommendation System Using Social Networking
Recommendation System Using Social Networking
 
Analyzing sentiment system to specify polarity by lexicon-based
Analyzing sentiment system to specify polarity by lexicon-basedAnalyzing sentiment system to specify polarity by lexicon-based
Analyzing sentiment system to specify polarity by lexicon-based
 
Sentiment Analysis on Twitter Data
Sentiment Analysis on Twitter DataSentiment Analysis on Twitter Data
Sentiment Analysis on Twitter Data
 
Using NLP Approach for Analyzing Customer Reviews
Using NLP Approach for Analyzing Customer Reviews Using NLP Approach for Analyzing Customer Reviews
Using NLP Approach for Analyzing Customer Reviews
 
USING NLP APPROACH FOR ANALYZING CUSTOMER REVIEWS
USING NLP APPROACH FOR ANALYZING CUSTOMER REVIEWSUSING NLP APPROACH FOR ANALYZING CUSTOMER REVIEWS
USING NLP APPROACH FOR ANALYZING CUSTOMER REVIEWS
 
A Survey on Recommendation System based on Knowledge Graph and Machine Learning
A Survey on Recommendation System based on Knowledge Graph and Machine LearningA Survey on Recommendation System based on Knowledge Graph and Machine Learning
A Survey on Recommendation System based on Knowledge Graph and Machine Learning
 
Study of Recommendation System Used In Tourism and Travel
Study of Recommendation System Used In Tourism and TravelStudy of Recommendation System Used In Tourism and Travel
Study of Recommendation System Used In Tourism and Travel
 
Forecasting movie rating using k-nearest neighbor based collaborative filtering
Forecasting movie rating using k-nearest neighbor based  collaborative filteringForecasting movie rating using k-nearest neighbor based  collaborative filtering
Forecasting movie rating using k-nearest neighbor based collaborative filtering
 
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...
 
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...
 
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.comHABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
 
Online review mining for forecasting sales
Online review mining for forecasting salesOnline review mining for forecasting sales
Online review mining for forecasting sales
 
Online review mining for forecasting sales
Online review mining for forecasting salesOnline review mining for forecasting sales
Online review mining for forecasting sales
 
IRJET- Opinion Mining and Sentiment Analysis for Online Review
IRJET-  	  Opinion Mining and Sentiment Analysis for Online ReviewIRJET-  	  Opinion Mining and Sentiment Analysis for Online Review
IRJET- Opinion Mining and Sentiment Analysis for Online Review
 

More from Kayla Jones

Free Printable Stationery - Letter Size. Online assignment writing service.
Free Printable Stationery - Letter Size. Online assignment writing service.Free Printable Stationery - Letter Size. Online assignment writing service.
Free Printable Stationery - Letter Size. Online assignment writing service.Kayla Jones
 
Critique Response Sample Summary Response Essa
Critique Response Sample Summary Response EssaCritique Response Sample Summary Response Essa
Critique Response Sample Summary Response EssaKayla Jones
 
Definisi Dan Contoh Paragraph Cause And Effe
Definisi Dan Contoh Paragraph Cause And EffeDefinisi Dan Contoh Paragraph Cause And Effe
Definisi Dan Contoh Paragraph Cause And EffeKayla Jones
 
Analysis on A s Laundry Shop A Profit Maximization Approach.pdf
Analysis on A s Laundry Shop  A Profit Maximization Approach.pdfAnalysis on A s Laundry Shop  A Profit Maximization Approach.pdf
Analysis on A s Laundry Shop A Profit Maximization Approach.pdfKayla Jones
 
A Close and Distant Reading of Shakespearean Intertextuality.pdf
A Close and Distant Reading of Shakespearean Intertextuality.pdfA Close and Distant Reading of Shakespearean Intertextuality.pdf
A Close and Distant Reading of Shakespearean Intertextuality.pdfKayla Jones
 
A CRITICAL ANALYSIS OF THE STATUS AND APPLICATION OF THE RESPONSIBILITY TO PR...
A CRITICAL ANALYSIS OF THE STATUS AND APPLICATION OF THE RESPONSIBILITY TO PR...A CRITICAL ANALYSIS OF THE STATUS AND APPLICATION OF THE RESPONSIBILITY TO PR...
A CRITICAL ANALYSIS OF THE STATUS AND APPLICATION OF THE RESPONSIBILITY TO PR...Kayla Jones
 
A Developmental Evolutionary Framework for Psychology.pdf
A Developmental Evolutionary Framework for Psychology.pdfA Developmental Evolutionary Framework for Psychology.pdf
A Developmental Evolutionary Framework for Psychology.pdfKayla Jones
 
A Pedagogical Model for Improving Thinking About Learning.pdf
A Pedagogical Model for Improving Thinking About Learning.pdfA Pedagogical Model for Improving Thinking About Learning.pdf
A Pedagogical Model for Improving Thinking About Learning.pdfKayla Jones
 
5 The epidemiology of obesity.pdf
5 The epidemiology of obesity.pdf5 The epidemiology of obesity.pdf
5 The epidemiology of obesity.pdfKayla Jones
 
Agri-tourism handbook.pdf
Agri-tourism handbook.pdfAgri-tourism handbook.pdf
Agri-tourism handbook.pdfKayla Jones
 
1001 Solved Engineering Fundamentals Problems 3rd Ed..pdf.pdf
1001 Solved Engineering Fundamentals Problems 3rd Ed..pdf.pdf1001 Solved Engineering Fundamentals Problems 3rd Ed..pdf.pdf
1001 Solved Engineering Fundamentals Problems 3rd Ed..pdf.pdfKayla Jones
 
A THEORETICAL FRAMEWORK OF STRESS MANAGEMENT- CONTEMPORARY APPROACHES, MODELS...
A THEORETICAL FRAMEWORK OF STRESS MANAGEMENT- CONTEMPORARY APPROACHES, MODELS...A THEORETICAL FRAMEWORK OF STRESS MANAGEMENT- CONTEMPORARY APPROACHES, MODELS...
A THEORETICAL FRAMEWORK OF STRESS MANAGEMENT- CONTEMPORARY APPROACHES, MODELS...Kayla Jones
 
Assessing Testing Practices with Reference to Communicative Competence in Ess...
Assessing Testing Practices with Reference to Communicative Competence in Ess...Assessing Testing Practices with Reference to Communicative Competence in Ess...
Assessing Testing Practices with Reference to Communicative Competence in Ess...Kayla Jones
 
abstract on climate change.pdf
abstract on climate change.pdfabstract on climate change.pdf
abstract on climate change.pdfKayla Jones
 
A research strategy for text desigbers The role of headings.pdf
A research strategy for text desigbers  The role of headings.pdfA research strategy for text desigbers  The role of headings.pdf
A research strategy for text desigbers The role of headings.pdfKayla Jones
 
An Analysis Of Consumers Perception Towards Rebranding A Study Of Hero Moto...
An Analysis Of Consumers  Perception Towards Rebranding  A Study Of Hero Moto...An Analysis Of Consumers  Perception Towards Rebranding  A Study Of Hero Moto...
An Analysis Of Consumers Perception Towards Rebranding A Study Of Hero Moto...Kayla Jones
 
A Psicologia Da Crian A Jean Piaget
A Psicologia Da Crian A Jean PiagetA Psicologia Da Crian A Jean Piaget
A Psicologia Da Crian A Jean PiagetKayla Jones
 
An Exposition Of The Nature Of Volunteered Geographical Information And Its S...
An Exposition Of The Nature Of Volunteered Geographical Information And Its S...An Exposition Of The Nature Of Volunteered Geographical Information And Its S...
An Exposition Of The Nature Of Volunteered Geographical Information And Its S...Kayla Jones
 
Addressing Homelessness In Public Parks
Addressing Homelessness In Public ParksAddressing Homelessness In Public Parks
Addressing Homelessness In Public ParksKayla Jones
 
A Critical Analysis Of The Academic Papers Written By Experienced Associate A...
A Critical Analysis Of The Academic Papers Written By Experienced Associate A...A Critical Analysis Of The Academic Papers Written By Experienced Associate A...
A Critical Analysis Of The Academic Papers Written By Experienced Associate A...Kayla Jones
 

An Unsupervised Approach For Reputation Generation

ScienceDirect
Available online at www.sciencedirect.com
Procedia Computer Science 148 (2019) 80–86

Second International Conference on Intelligent Computing in Data Sciences (ICDS 2018)

An Unsupervised Approach for Reputation Generation

Abdessamad Benlahbib (a,∗), El Habib Nfaoui (a)
(a) LIIAN Laboratory, Faculty of Sciences Dhar EL Mehraz, Sidi Mohammed Ben Abdellah University, Fez, Morocco
∗ Corresponding author. Tel.: +212632561278. E-mail address: abdessamad.benlahbib@usmba.ac.ma

Abstract

Nowadays, watching a movie, buying a product, making hotel reservations and other e-commerce trades are tied to consulting other people's reviews and recommendations on the target entity. Indeed, Amazon, IMDB (Internet Movie Database) and several other websites provide a convenient platform where users freely share their opinions and their subjective attitudes towards the target entity with no restrictions. However, those opinions are too numerous to be examined one by one, which is why a general reputation value makes the task of choosing the right product much easier. In this paper, we propose a reputation generation approach based on opinion clustering and semantic analysis. In our approach, opinions are grouped into a number of clusters that contain opinions with the same attitude or preference. By aggregating the ratings attached to the clusters, we generate the reputation of an entity. Experimental results demonstrate the effectiveness of the proposed approach in generating a reputation value.

1877-0509 © 2019 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of the scientific committee of the Second International Conference on Intelligent Computing in Data Sciences (ICDS 2018).
10.1016/j.procs.2019.01.011

Keywords: Semantic analysis; Opinion mining; Reputation generation; Machine learning.

1. Introduction

Over the past few years, the web has been growing at an incredible rate. Nowadays, people can buy products, watch movies and make reservations via the Internet. Before making a decision, most people would like to seek other people's opinions on a target entity in order to judge its performance. In this case, one could scan both entity descriptions and user comments. Hence, people have generally established a certain online custom: first look at other users' comments, then make a decision about the target entity. On the other hand, online sellers would like to collect the user comments with high praise and put them in the description of their products in order to attract more purchases. According to recent statistics, the number of users of some famous online shopping centers, e.g., Taobao, Jingdong and Amazon, has exceeded one billion. Each of the above commercial websites contains a huge number of product comments. These comments contain user opinions on the products.

As the opinions show the subjective attitudes, evaluations, and speculations of users expressed in natural language, this kind of content contributed by Internet users has been well recognized as valuable information. It can be exploited to analyze public opinions on a specific product in order to figure out user likes or dislikes [5].

In this paper, we propose to apply the LSA (Latent Semantic Analysis) model and then the K-means algorithm to cluster opinions based on their semantic relations; by aggregating the ratings attached to the clustered opinions, we generate the reputation of an entity.

The paper is organized as follows. Section 2 gives a brief review of related work. In Section 3, we present the details of our approach. We show experimental results followed by additional analysis and discussions in Section 4. Finally, conclusions are presented in Section 5.

2. Related work

Reputation is a measure that is derived from direct or indirect knowledge on earlier interactions of entities and is used to assess the level of trust an entity puts into another entity [1]. Reputation systems are typically based on public information in order to reflect the community's opinion in general [2]. The simplest form of computing reputation scores is to sum the number of positive ratings and negative ratings separately, and to keep a total score as the positive score minus the negative score. This is the principle used in eBay's reputation forum, which is described in [3]. In [4], a more advanced scheme was proposed that computes the reputation score as the average of all ratings; this principle is used in the reputation systems of numerous commercial websites, such as Epinions and Amazon. Advanced models in this category compute a weighted average of all the ratings, where the rating weight can be determined by factors such as rater trustworthiness/reputation, age of the rating, distance between rating and current score, etc.

Recently, Zheng et al. [5] proposed a novel reputation generation approach based on opinion fusion and mining.
In their approach, opinions are filtered to eliminate unrelated ones, and then grouped into a number of fused principal opinion sets that contain opinions with a similar or the same attitude or preference. By aggregating the ratings attached to the fused opinions, they normalize the reputation of an entity. They claimed that: "No work has explored the opinions expressed in natural languages, opinion voting, opinion citation and user feedback ratings in a comprehensive way for reputation generation" [5].

3. Proposed method

In this section, we recall the LSA technique and the K-means algorithm, then we describe in depth our proposed method for reputation generation.

3.1. Latent Semantic Analysis

Latent Semantic Analysis (LSA) is a technique in natural language processing for analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. In [6], T.K. Landauer, P.W. Foltz and D. Laham describe LSA as follows:

"Latent Semantic Analysis (LSA) is a theory and method for extracting and representing the contextual-usage meaning of words by statistical computations applied to a large corpus of text (Landauer and Dumais, 1997). The underlying idea is that the aggregate of all the word contexts in which a given word does and does not appear provides a set of mutual constraints that largely determines the similarity of meaning of words and sets of words to each other. The adequacy of LSA's reflection of human knowledge has been established in a variety of ways. For example, its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word-word and passage-word lexical priming data, and, it accurately estimates passage coherence, learnability of passages by individual students, and the quality of knowledge contained in an essay".

3.2. K-means Algorithm

K-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. The K-means algorithm aims to divide M points in N dimensions into K clusters so that the within-cluster sum of squares is minimized [7][8].
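The quantity that K-means minimizes can be made concrete with a small sketch. The function name `within_cluster_ss` and the toy points are illustrative assumptions, not taken from the paper; the code only evaluates the objective for a given assignment and does not perform the clustering itself.

```python
def within_cluster_ss(points, labels):
    """Return the within-cluster sum of squares for a given assignment."""
    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # The centroid is the coordinate-wise mean of the cluster members.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # Add the squared Euclidean distance of every member to the centroid.
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


# Toy data (hypothetical): two tight groups far apart along the first axis.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = within_cluster_ss(points, [0, 0, 1, 1])  # groups nearby points together
bad = within_cluster_ss(points, [0, 1, 0, 1])   # mixes the two groups
```

An assignment that keeps nearby points together yields a much smaller value than one that mixes distant points, which is the criterion K-means uses to prefer one partition over another.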
3.3. System overview

We propose the following procedure to cluster and mine opinions for reputation generation.

1. Opinion data collection and preprocessing. During this step, we collect the opinion data about an entity (product, movie, etc.) from websites. Because there are many types of raw opinion data that contain many words and symbols, the collected raw data must be preprocessed, for example by word segmentation, stop-word filtering, and elimination of useless expressions and pictures.

2. Opinion clustering. After applying the LSA model, we group opinions into different clusters by using the K-means algorithm. In this step, some statistics can be gained for reputation generation, such as the number of opinions, the sum of similarity and the sum of ratings in each cluster.

3. Reputation generation. This step further aggregates clustered opinions to generate a reputation value by considering the popularity and other statistics of principal opinions.

3.4. Opinion clustering

The opinion clustering algorithm is shown below (Algorithm 1).

Algorithm 1 Opinion clustering
Begin
  Step 1: Apply the LSA model (we used TruncatedSVD from the Sklearn library in Python).
  Step 2: Set a number of clusters and apply the K-means clustering algorithm.
  Step 3: Acquire the statistics of each cluster: the sum of the similarity in the cluster (using the cosine similarity metric), the sum of ratings in the cluster, and the number of similar reviews in the cluster.
End

By applying Algorithm 1, we cluster opinions into several principal opinion sets after applying the LSA model. The opinions in each cluster hold a similar or the same perspective. Once the processing based on Algorithm 1 has been completed, the opinions are grouped into a number of clusters.
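Step 3 of Algorithm 1 can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the review vectors and cluster labels are taken as given (in the paper they come from LSA and K-means), the names `cosine` and `cluster_statistics` are hypothetical, and the similarity of a review is measured against its cluster centroid, which the paper does not specify.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_statistics(vectors, labels, ratings):
    """Return {label: {"num", "rat_sum", "sim_sum"}} for every cluster."""
    members = defaultdict(list)
    for idx, label in enumerate(labels):
        members[label].append(idx)

    stats = {}
    for label, idxs in members.items():
        dim = len(vectors[idxs[0]])
        # Assumption: similarity is taken with respect to the cluster centroid.
        centroid = [sum(vectors[i][d] for i in idxs) / len(idxs) for d in range(dim)]
        stats[label] = {
            "num": len(idxs),                                # number of reviews
            "rat_sum": sum(ratings[i] for i in idxs),        # sum of ratings
            "sim_sum": sum(cosine(vectors[i], centroid) for i in idxs),
        }
    return stats


# Hypothetical toy input: four 2-D review vectors, two clusters, ratings 1-10.
vectors = [(1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (0.0, 3.0)]
labels = [0, 0, 1, 1]
ratings = [8, 9, 3, 4]
stats = cluster_statistics(vectors, labels, ratings)
```

Keeping the three statistics per cluster in one dictionary means the reputation step only needs this summary, not the individual reviews.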
Meanwhile, we also get the statistics of the clusters, i.e., the number of similar opinions in each cluster, the sum of their ratings and the sum of their similarity, computed with the cosine similarity metric.

3.5. Reputation generation

Based on the result of opinion clustering, we propose a method for generating a single reputation value of an entity. Overall, it is important to show users a concrete scale of reputation expressed by a single value. This reputation presentation can provide a good user experience, especially for mobile Internet users who use mobile devices with small screen sizes. We propose formula (1) to generate the reputation of entity "A" based on the clustering of the opinions on "A".

$$ Rep(A) = \frac{1}{n\_clusters} \sum_{k=1}^{n\_clusters} \frac{V_k \cdot S_k}{N_k \cdot N_k} \qquad (1) $$

We denote:
n_clusters : The number of clusters.
N_k : The number of opinions in cluster k.
S_k : The sum of the similarity in cluster k.
V_k : The sum of ratings in cluster k.

In (1), we assume that each opinion has a rating on the entity attached to it. In our case, the rating is a number ranging from 1 to 10 that represents a level of satisfaction.

4. Results and discussion

4.1. Dataset

We manually created a dataset containing 600 reviews for six different movies from the IMDB website, which contains user reviews and ratings of movies. The statistical information of the dataset is shown in Table 1.

Table 1. Statistical information of Datasets
The total number of reviews and ratings    600
The number of reviews per movie            100

4.2. Preprocessing reviews

After collecting all reviews, we applied tokenization, stemming and stop-word removal to the reviews in order to use them for opinion clustering.

4.3. Evaluation measures

To measure the effectiveness of our system, we use AE (Absolute Error) and MAE (Mean Absolute Error), which are defined as follows:

Absolute Error: The difference between the measured or inferred value of a quantity and its actual value.
Mean Absolute Error: The average of the absolute difference between prediction and actual observation.

4.4. Opinion clustering

The reviews can be grouped into a number of clusters. We can also acquire their statistics during clustering, such as the number of similar opinions, the sum of similarity, and the sum of ratings in each cluster. To illustrate this process, we provide example results of opinion clustering based on the 100 reviews of one movie in the dataset, as shown in Table 2. We provide a Python implementation of the clustering step on GitHub (footnote 1). To determine the best value of n_clusters, we performed many runs with different numbers of clusters.

4.5. Reputation generation

In order to evaluate our approach, we compared the final reputation computed by formula (1) with the users' weighted average vote computed by the IMDB website (IMDBWAV) to represent a rating for a target movie, which is a number ranging from 1 to 10, as shown in Fig. 1. We varied the number of clusters from 2 to 19. Fig. 2 and 3 show the Absolute Error between IMDBWAV (IMDB users weighted average vote) and the reputation value computed by our approach for all movies.

Footnote 1: https://github.com/abdessamadbenlahbib/Reputation-generation-K-mean/blob/master/Python_Code.txt
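The two evaluation measures of Section 4.3 and the choice of n_clusters by smallest MAE can be sketched as follows. The function names are illustrative assumptions; the three MAE values are rows 8, 9 and 10 of Table 3, while the ratings passed to `mean_absolute_error` are made-up numbers used only to exercise the function.

```python
def absolute_error(predicted, actual):
    """Absolute difference between a computed reputation and the reference vote."""
    return abs(predicted - actual)


def mean_absolute_error(predicted, actual):
    """Average absolute error over paired sequences of equal, non-zero length."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def best_n_clusters(mae_by_k):
    """Return the number of clusters whose MAE is smallest."""
    return min(mae_by_k, key=mae_by_k.get)


# Hypothetical reputation values and reference votes for two movies.
example_mae = mean_absolute_error([8.0, 7.5], [8.5, 7.0])

# Three rows copied from Table 3 (n_clusters -> MAE).
mae_by_k = {8: 0.04663486, 9: 0.0419782, 10: 0.0535229}
chosen = best_n_clusters(mae_by_k)
```

Sweeping n_clusters and keeping the value with the lowest MAE is a simple selection rule; it relies on having the reference votes available, so it describes the evaluation setup rather than a way to tune the method on unseen entities.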
Table 2. Example results of opinion clustering (Algorithm 1).

Cluster   SimSum   RatSum   Num
C1        12.99    94       13
C2        14.97    120      15
C3        14.94    114      15
C4        9.98     61       10
C5        10.95    91       11
C6        14.96    128      15
C7        9.98     82       10
C8        10.99    90       11

Legend:
SimSum: the sum of the similarity in a cluster.
RatSum: the sum of ratings in a cluster.
Num: the number of similar reviews in a cluster.

Fig. 1. IMDBWAV (IMDB users weighted average vote) for The Shawshank Redemption movie

Fig. 2. Absolute Error between IMDBWAV and reputation value computed by our approach for movie 1, 2 and 3
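As a worked example, formula (1) can be applied directly to the eight rows of Table 2. The sketch below assumes nothing beyond the table values and the formula; the names `TABLE_2` and `reputation` are illustrative.

```python
# Rows of Table 2 as (SimSum, RatSum, Num), i.e. (S_k, V_k, N_k).
TABLE_2 = [
    (12.99, 94, 13),
    (14.97, 120, 15),
    (14.94, 114, 15),
    (9.98, 61, 10),
    (10.95, 91, 11),
    (14.96, 128, 15),
    (9.98, 82, 10),
    (10.99, 90, 11),
]


def reputation(clusters):
    """Formula (1): mean over clusters of V_k * S_k / (N_k * N_k)."""
    if not clusters:
        raise ValueError("at least one cluster is required")
    total = sum(rat_sum * sim_sum / (num * num)
                for sim_sum, rat_sum, num in clusters)
    return total / len(clusters)


rep = reputation(TABLE_2)  # about 7.75 on the 1-10 rating scale
```

Each term V_k·S_k / N_k² is the average rating of cluster k multiplied by its average similarity, so clusters whose reviews are nearly identical in direction (average similarity close to 1) contribute almost exactly their average rating.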
Fig. 3. Absolute Error between IMDBWAV and reputation value computed by our approach for movie 4, 5 and 6

As we can see in Fig. 2 and 3, the Absolute Error between IMDBWAV and the reputation value computed by our approach is high when n_clusters = 2 and n_clusters = 3, then it begins to decrease. As described in Algorithm 1, different values of n_clusters can lead to different clustering results of reviews, which in turn produce different final reputation values. Therefore, choosing a suitable value of n_clusters is particularly important. We conducted experiments to study the influence of the number of clusters n_clusters on reputation generation. We varied the number of clusters from 2 to 19 and computed the MAE (Mean Absolute Error) between IMDBWAV and the reputation values computed by our approach over all the reviews of the dataset for n_clusters = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19}. Fig. 4 shows the result of our experiments.

Fig. 4. MAE for different n_clusters values

Both Fig. 4 and Table 3 show that our approach performs best when n_clusters = 9, since the MAE between IMDBWAV and the values computed by (1) using all the reviews of the dataset reaches its minimum.

5. Conclusions

In this paper, we have proposed an approach to generate reputation based on opinion clustering. By performing opinion clustering, we group various opinions into a number of clusters and obtain their popularity, average similarity and ratings. Thus, it becomes easy to aggregate all clusters to generate a single reputation value. The experimental results have shown that, with a suitable number of clusters, our approach achieves a reputation value close to the IMDB weighted average vote for the target movies.
Table 3. MAE between IMDBWAV and the reputation values computed by our method using all the reviews of the 6 movies

n_clusters   Mean Absolute Error
2            0.43478668
3            0.28583178
4            0.18399365
5            0.1158738
6            0.08081721
7            0.06048751
8            0.04663486
9            0.0419782
10           0.0535229
11           0.05158553
12           0.05617364
13           0.07550483
14           0.05620427
15           0.05560771
16           0.05956126
17           0.08015718
18           0.07338574
19           0.04463707

6. References

[1] Z. Yan. Trust Management in Mobile Environments - Usable and Autonomic Models. IGI Global, Hershey, Pennsylvania, USA, 2013.
[2] Audun Josang, Roslan Ismail, Colin Boyd. A survey of trust and reputation systems for online service provision. In: Decision Support Systems, Volume 43, Issue 2, March 2007, Pages 618-644. DOI: 10.1016/j.dss.2005.05.019.
[3] P. Resnick and R. Zeckhauser. Trust Among Strangers in Internet Transactions: Empirical Analysis of eBay's Reputation System. In M.R. Baye, editor, The Economics of the Internet and E-Commerce, volume 11 of Advances in Applied Microeconomics. Elsevier Science, 2002.
[4] J. Schneider et al. Disseminating Trust Information in Wearable Communities. In Proceedings of the 2nd International Symposium on Handheld and Ubiquitous Computing (HUC2K), September 2000.
[5] Zheng Yan, Xu-yang Jing, Witold Pedrycz. Fusing and Mining Opinions for Reputation Generation. Information Fusion (2016). DOI: 10.1016/j.inffus.2016.11.011.
[6] Landauer, T. K., Foltz, P. W., Laham, D. (1998). Introduction to Latent Semantic Analysis. Discourse Processes, 25, 259-284.
[7] J. A. Hartigan and M. A. Wong. Algorithm AS 136: A K-Means Clustering Algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 28, No. 1 (1979), pp. 100-108.
[8] John A. Hartigan. Clustering Algorithms. 99th John Wiley & Sons, Inc. New York, NY, USA 1975. ISBN: 047135645X.