NLP-Based Heuristics for Assessing Participants in CSCL Chats

Author: Costin-Gabriel Chiru • Scientific advisor: Ştefan Trăuşan-Matu
Costin-Gabriel Chiru, Traian Rebedea and Ştefan Trăuşan-Matu
Politehnica University of Bucharest
Faculty of Automatic Control and Computers, Department of Computers
{costin.chiru, traian.rebedea, stefan.trausan}@cs.pub.ro

Content
• Introduction
• Theoretical Background
• Corpus
• Heuristics
• Evaluation
• Analysis and Results
• Conclusions
NLP-Based Heuristics for Assessing Participants in CSCL Chats, EC-TEL 2013, 18.09.2013

Introduction (I)
• Instant messaging (or chat) is one of the most popular
ways of collaboratively exchanging information over
the Internet and one of the favorite environments for
Computer Supported Collaborative Learning (CSCL)
tasks.
• Chat technologies have been extended with additional
functionalities (e.g. the explicit referencing mechanism
in ConcertChat).
• Most of the existing tools only aim at facilitating
chat conversations, without offering analysis
instruments.

Introduction (II)
• Goals:
– identifying the criteria for automatically assessing chat
conversations
– providing the possibility to compare and rank different
chat conversations that debate the same topics
– offering the means for ranking learners according to their
performance in this kind of conversation
• How?
– Using a set of heuristics for conversation evaluation
– Applying these heuristics to evaluate chat conversations
in two possible scenarios: individual analysis, or analysis in
the context of a corpus of chat conversations debating the
same subjects

Theoretical Background
• Based on:
– Bakhtin’s ideas (inter-animation, dialogism and polyphony in
discourse), used to handle the parallel discussion
threads that arise in multi-party chat conversations.
– Tannen’s observation of the importance of concept repetition in
dialogues.
– Repetitions of a concept (detected using lexical chains) express
the same idea, which corresponds to one of Bakhtin’s voices.
• Using:
– the corpus acquired for the development of PolyCAFe – a
system which analyzes each user contribution and provides
abstraction, visualization and feedback services for supporting
both learners and tutors.
– the heuristics proposed in (Chiru et al., 2011).

Corpus
• Chat conversations:
– Participants: 35 senior-year undergraduate
students enrolled in a Human-Computer
Interaction (HCI) class (5 students/chat)
– Topic: a debate about different web-collaboration
technologies, highlighting the weaknesses and
strengths of the existing tools and eventually
devising a way to combine them
– Details: 7 chat conversations, 2514 utterances
(248 to 524 utterances/chat)
– Built for the evaluation of the LTfLL FP7 project

Heuristics
• Adapted from the heuristics proposed in (Chiru et al.,
2011) and grouped in 3 categories:
– Learners’ involvement – average of 5 quantitative
heuristics: Number of replies, Activity (the average
number of characters per reply), Absence (the average
number of utterances that could be found between two
interventions of the same user), Persistence (the average
number of consecutive utterances), Repetition of other
participants’ concepts
– Learners’ knowledge: the percentage of the on-topic concepts
that were introduced by the participant
– Learners’ innovation: the degree of new information
introduced by a participant
• Overall evaluation: average of the 3 heuristic categories above
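
The four purely quantitative involvement factors listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `involvement_factors`, the `(speaker, text)` input format and the toy chat are hypothetical, and an "intervention" is assumed to be a run of consecutive utterances by the same speaker. The fifth factor (repetition of other participants' concepts) needs concept extraction and is left out, as is the normalization required before averaging the factors, which the slides do not specify.

```python
from collections import defaultdict


def involvement_factors(utterances):
    """Per-speaker involvement factors for one chat.

    `utterances` is a chronologically ordered list of (speaker, text) pairs
    (a hypothetical input format chosen for this sketch).
    """
    replies = defaultdict(int)   # number of utterances per speaker
    chars = defaultdict(int)     # total characters typed per speaker
    gaps = defaultdict(list)     # utterances by others between two interventions
    runs = defaultdict(list)     # lengths of runs of consecutive utterances
    last_index = {}
    run_speaker, run_length = None, 0

    for i, (speaker, text) in enumerate(utterances):
        replies[speaker] += 1
        chars[speaker] += len(text)
        # A gap exists only if somebody else spoke since this speaker's last turn.
        if speaker in last_index and i - last_index[speaker] > 1:
            gaps[speaker].append(i - last_index[speaker] - 1)
        last_index[speaker] = i
        if speaker == run_speaker:
            run_length += 1
        else:
            if run_speaker is not None:
                runs[run_speaker].append(run_length)
            run_speaker, run_length = speaker, 1
    if run_speaker is not None:
        runs[run_speaker].append(run_length)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return {
        s: {
            "replies": replies[s],
            "activity": chars[s] / replies[s],  # average characters per reply
            "absence": mean(gaps[s]),           # average gap between interventions
            "persistence": mean(runs[s]),       # average run of consecutive utterances
        }
        for s in replies
    }


# Toy chat, invented for illustration only.
chat = [
    ("ana", "hi all"),
    ("ana", "wiki is good"),
    ("bob", "forum too"),
    ("cat", "blogs?"),
    ("ana", "agree"),
]
factors = involvement_factors(chat)
```

In the toy chat, "ana" has three replies, one gap of two utterances between her two interventions, and runs of length 2 and 1.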

Evaluation
• Participant evaluation (2 types):
– Single chat: focusing on one chat at a time,
ignoring the content of the others
– Multi-chat: also taking into consideration the
content of the other chats when assessing a single
chat conversation
• Chat evaluation
– compare the conversations to decide which one of
them achieved the best results, based on topic
rhythmicity (how often each of the four concepts,
chat, blog, forum and wiki, ‘came on the floor’)
and on the conversation’s vocabulary

Analysis and Results (I)
• For evaluating the application’s results:
– Chats were annotated by 4 HCI experts.
– Participants ranked their colleagues.
– We also used PolyCAFe to evaluate the chats.
– Correlations between these three assessments: tutors-students
(average correlation r = 0.87, σ = 0.19), tutors-PolyCAFe
(r = 0.94, σ = 0.05) and students-PolyCAFe (r = 0.85, σ = 0.16).
Measure            overall          involvement      knowledge        innovation
Correlation with   single   multi   single   multi   single   multi   single   multi
AVG
  Tutors           0.6      0.37    0.9      0.86    -0.6     -0.6    0.9      0.9
  PolyCAFe         0.66     0.39    0.9      0.87    -0.57    -0.57   0.93     0.93
  Students         0.47     0.24    0.83     0.81    -0.67    -0.67   0.86     0.86
STD DEV
  Tutors           0.22     0.46    0.1      0.15    0.21     0.21    0.06     0.06
  PolyCAFe         0.17     0.52    0.1      0.13    0.22     0.22    0.08     0.08
  Students         0.18     0.36    0.1      0.16    0.23     0.23    0.1      0.1
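
AVG and STD DEV entries like the ones above summarize one correlation value per chat. A minimal sketch of that aggregation follows, under several assumptions that the slides do not confirm: that r is Pearson's coefficient, that the standard deviation is the population one, and that each chat yields one score per participant. The function names and all scores are hypothetical toy values, not data from the study.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def summarize(heuristic_by_chat, reference_by_chat):
    """Mean and population std dev of the per-chat correlations."""
    rs = [
        pearson(heuristic_by_chat[chat], reference_by_chat[chat])
        for chat in heuristic_by_chat
    ]
    return mean(rs), pstdev(rs)


# Invented toy scores: 5 participants per chat, 2 chats.
heuristic_scores = {"chat_a": [1, 2, 3, 4, 5], "chat_b": [1, 2, 3, 4, 5]}
tutor_scores = {"chat_a": [1, 2, 3, 4, 5], "chat_b": [5, 4, 3, 2, 1]}
avg_r, std_r = summarize(heuristic_scores, tutor_scores)
```

With these toy values the two chats give r = 1 and r = -1, so the average is about 0 and the standard deviation about 1. The large spread shows why the table reports the standard deviation next to the average.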

Analysis and Results (II)
• The involvement and innovation heuristics proved to be
strongly correlated with the gold standard
• The knowledge heuristic was anti-correlated with the gold
standard, which makes it responsible for the poor overall results
• We continued the investigation with the 5 factors behind the
involvement heuristic, to identify which are useful and
which can be ignored: Persistence was anti-correlated with
the gold standard, and Activity was only weakly correlated
                 AVG correlation with            STD DEV of the correlation with
Factors          Tutors   PolyCAFe   Students    Tutors   PolyCAFe   Students
No. Utterances   0.83     0.76       0.69        0.21     0.21       0.26
Activity         0.24     0.37       0.37        0.51     0.45       0.58
Absence          0.86     0.8        0.71        0.23     0.22       0.29
Persistence      -0.6     -0.51      -0.43       0.33     0.34       0.42
Repetitions      0.77     0.76       0.81        0.26     0.24       0.23

Analysis and Results (III)
• For content evaluation, we had no comparative evaluation of
the considered chats, from either students or tutors,
so we compared our results with the ones provided by
PolyCAFe. The Spearman’s rank correlation was 0.964, which is
extremely high, especially since the content was evaluated in
2 different ways: Latent Semantic Analysis (LSA) in PolyCAFe
and lexical chains built on top of WordNet in our system.
         Concepts Rhythm                  Overall   PolyCAFe
Chats    chat    blog    forum   wiki     Ranks     Ranks
Chat_1   4.3     3.41    3.79    4.96     3         2
Chat_2   5.23    4.95    4.79    4.75     6         6
Chat_3   6.05    4.49    4.12    4.72     4         4
Chat_4   5.05    3.93    3.43    3.33     2         3
Chat_5   4.58    3.94    4.84    6.1      5         5
Chat_6   4.05    3.29    3.11    2.66     1         1
Chat_7   6.08    5.07    6.47    4.59     7         7
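
The reported Spearman coefficient can be checked directly from the two rank columns of the table. The sketch below uses the standard no-ties formula, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)). It also ranks the chats by the plain mean of their four rhythm values, lowest first. That combination rule is an assumption, since the slides do not say how rhythm and vocabulary are combined, but it happens to reproduce the Overall Ranks column. All function and variable names are illustrative.

```python
# Rhythm values (chat, blog, forum, wiki) and PolyCAFe ranks, copied from the table.
rhythm = {
    "Chat_1": (4.3, 3.41, 3.79, 4.96),
    "Chat_2": (5.23, 4.95, 4.79, 4.75),
    "Chat_3": (6.05, 4.49, 4.12, 4.72),
    "Chat_4": (5.05, 3.93, 3.43, 3.33),
    "Chat_5": (4.58, 3.94, 4.84, 6.1),
    "Chat_6": (4.05, 3.29, 3.11, 2.66),
    "Chat_7": (6.08, 5.07, 6.47, 4.59),
}
polycafe_ranks = {
    "Chat_1": 2, "Chat_2": 6, "Chat_3": 4, "Chat_4": 3,
    "Chat_5": 5, "Chat_6": 1, "Chat_7": 7,
}


def rank_ascending(scores):
    """Rank 1 goes to the lowest score (assumes no ties)."""
    ordered = sorted(scores, key=scores.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation for two tie-free rankings of the same items."""
    n = len(ranks_a)
    d_squared = sum((ranks_a[k] - ranks_b[k]) ** 2 for k in ranks_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Assumption: overall score = mean of the four rhythm values, lower is better.
mean_rhythm = {chat: sum(values) / len(values) for chat, values in rhythm.items()}
our_ranks = rank_ascending(mean_rhythm)
rho = spearman_rho(our_ranks, polycafe_ranks)
print(our_ranks)
print(round(rho, 3))  # 0.964
```

Only Chat_1 and Chat_4 swap places between the two rankings, so the sum of squared rank differences is 2 and rho = 1 - 12/336 ≈ 0.964, matching the value on the slide.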

Conclusions
• A methodology for identifying which heuristics
work best and which should be
avoided when assessing chat conversations
• The overall score was not very reliable, but we
found that most of the heuristics perform well
and that the overall results are affected by a
single factor
• The best heuristics were innovation and involvement,
while the knowledge heuristic performed
poorly

Questions
Thank you very much!