Mia Mohammad Imran
Virginia Commonwealth University
Emotion Classification In Software
Engineering Texts: A Comparative Analysis of
Pre-trained Transformers Language Models
What are Software Engineering Texts?
● Chats
● PR comments
● Issue comments
● Commit messages
● GitHub discussions
● Stack Overflow
● Mailing list
“Programmers Have Feelings Too!”
Appreciation 🙏
@[USER] Thank you, Stephen. I hope in the
future Angular will become even better and
easier to understand. However, first of all, I
am grateful to Angular for making me grow
as a developer.
Anger 🤬
Soooooooooooo you’re setting Angular on
fire and saying bold sh*t in bold like the
Angular team don’t care about you cause
you found relative pathing has an issue is an
odd area
How can Understanding Emotions Help?
Benefits of Emotional Intelligence
01 Awareness: Self-reflect and seek feedback
02 Empathy: Understand and respect diverse perspectives
03 Regulation: Manage emotions to maintain focus
04 Social Skills: Enhance communication and teamwork
05 Motivation: Drive innovation and consistent contribution
Study Design and Goals
● Purpose: To investigate how pre-trained transformer language models (PTMs)
perform on the emotion classification task in software engineering text
● Establish a benchmark against the state-of-the-art tool
● Identify strengths, limitations, and error patterns of PTMs in
this domain
● Propose techniques to improve classification
Research Questions
● RQ1: How accurately can PTMs classify emotions compared to
the state-of-the-art model?
● RQ2: Can integrating polarity features during training improve
PTMs' emotion classification ability?
Shaver’s Emotion Model
Emotion Models
● Theoretical frameworks to represent emotions
● Shaver’s tree-structured model is most commonly used in
Software Engineering Research
○ 6 primary categories, 25 secondary categories and over 100
tertiary categories
Emotion Models: Shaver’s Taxonomy
● 6 primary categories:
○ Anger 😡
○ Love ❤️
○ Fear 😨
○ Joy 😊
○ Sadness 😥
○ Surprise 😲
Shaver’s Taxonomy: Mapping Example
Excitement → Joy
Every time you comment I realize
something new about JS or TS.
This is very exciting. 😊
Worry → Fear
Feel free to file a bug for that -
that code has a history of
breaking :
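The mapping from a fine-grained label to its primary category can be sketched as a simple lookup. This is only an illustration: the two pairs (Excitement → Joy, Worry → Fear) come from the slide, while the helper name `to_primary` and the dictionary layout are assumptions made for this sketch.

```python
# The six primary categories of Shaver's taxonomy, as listed on the slides.
PRIMARY = {"Anger", "Love", "Fear", "Joy", "Sadness", "Surprise"}

# Fine-grained label -> primary category. Only the two pairs shown in the
# mapping example are included; a full table would cover the whole tree.
FINE_TO_PRIMARY = {
    "Excitement": "Joy",
    "Worry": "Fear",
}


def to_primary(label: str) -> str:
    """Return the primary category for a label (hypothetical helper)."""
    if label in PRIMARY:
        return label
    try:
        return FINE_TO_PRIMARY[label]
    except KeyError:
        raise KeyError(f"no primary category known for {label!r}") from None


print(to_primary("Excitement"))  # Joy
print(to_primary("Worry"))       # Fear
```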
RQ1: How accurately can PTMs
classify emotions compared to
the state-of-the-art model?
State-of-the-Art Models
SEntiMoji [3]: Transfer learning, Neural Network
[3] Chen et al. “Emoji-powered sentiment and emotion detection from software developers' communication data.” TOSEM, 2021
● Studies show that general-purpose tools perform poorly on
SE text
● All tools perform one-vs-all predictions for all 6 basic
emotions (Anger, Love, Fear, Joy, Sadness, and Surprise)
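A one-vs-all setup turns the six-emotion problem into six independent yes/no problems, one per emotion. The sketch below shows how the binary targets could be built; the function name `one_vs_all_targets` and the toy annotations are invented for illustration and are not taken from either dataset.

```python
EMOTIONS = ["Anger", "Love", "Fear", "Joy", "Sadness", "Surprise"]


def one_vs_all_targets(labels_per_text):
    """Build one binary target vector per emotion.

    labels_per_text: list of sets of emotion names, one set per utterance.
    """
    return {
        emotion: [1 if emotion in labels else 0 for labels in labels_per_text]
        for emotion in EMOTIONS
    }


# Toy annotations (invented): an utterance may carry zero, one, or several emotions.
annotations = [{"Joy"}, {"Anger", "Sadness"}, set()]
targets = one_vs_all_targets(annotations)
print(targets["Joy"])    # [1, 0, 0]
print(targets["Anger"])  # [0, 1, 0]
```

Each of the six target vectors would then train its own binary classifier.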
Compared Pre-trained Language Models
● BERT: Bidirectional transformer encoder pre-trained with masked language modeling
● RoBERTa: BERT with an optimized pre-training procedure
● ALBERT: Lighter BERT with parameters shared across layers
● DeBERTa: Enhanced BERT with disentangled attention
● CodeBERT: BERT-style model pre-trained on natural language and source code
● GraphCodeBERT: CodeBERT enhanced with data-flow graph information
Evaluating the Models
● Goal: Assess the effectiveness of PTMs against the SotA model
● On 2 datasets
○ Stack Overflow Dataset [1]
○ GitHub Dataset [2]
● 80% train set, 20% test set with stratified sampling
[1] Novielli et al., “A gold standard for emotion annotation in stack overflow.” MSR 2018
[2] Imran et al., “Data augmentation for improving emotion recognition in software engineering communication.” ASE 2022
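Stratified sampling keeps the share of each label roughly the same in the train and test parts, which matters when an emotion is rare. The sketch below is a minimal standard-library version; the helper name, the seed, and the toy labels are assumptions, and in practice scikit-learn's `train_test_split` with its `stratify` argument is the usual tool.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), splitting each label separately."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy binary labels for one emotion: 20 positive and 80 negative utterances.
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))     # 80 20
print(sum(labels[i] for i in test_idx))  # 4
```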
Evaluation Metric
● F1-score: Harmonic mean of precision and recall
○ For overall performance: micro-averaged and macro-averaged
F1-score
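The difference between the two averages can be made concrete with a small calculation. Micro-averaging pools the true positives, false positives, and false negatives of all emotions before computing F1, while macro-averaging computes F1 per emotion and takes the unweighted mean. The counts below are invented to show how a rare, poorly classified emotion pulls the macro average down.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2PR / (P + R), which simplifies to 2TP / (2TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def micro_macro_f1(counts):
    """counts: {emotion: (tp, fp, fn)} from one-vs-all predictions."""
    per_class = {name: f1_from_counts(*c) for name, c in counts.items()}
    macro = sum(per_class.values()) / len(per_class)
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1_from_counts(tp, fp, fn)
    return micro, macro


# Invented counts: a frequent class handled well, a rare class handled poorly.
counts = {"Joy": (90, 10, 10), "Fear": (2, 3, 5)}
micro, macro = micro_macro_f1(counts)
print(round(micro, 3), round(macro, 3))  # 0.868 0.617
```

Class imbalance is one possible reason for a micro-averaged score sitting well above the macro-averaged one.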
Results (Average F1-score)
GitHub Dataset
Model            Micro Avg.   Macro Avg.
SEntiMoji        0.530        0.521
BERT             0.585        0.591
RoBERTa          0.575        0.590
ALBERT           0.538        0.539
DeBERTa          0.610        0.608
CodeBERT         0.545        0.555
GraphCodeBERT    0.549        0.549
Stack Overflow Dataset
Model            Micro Avg.   Macro Avg.
SEntiMoji        0.714        0.530
BERT             0.754        0.588
RoBERTa          0.758        0.599
ALBERT           0.747        0.584
DeBERTa          0.756        0.607
CodeBERT         0.728        0.567
GraphCodeBERT    0.722        0.552
Error Analysis
● Error Categorization by Novielli et al. [1]
○ General Error
○ Implicit Sentiment Polarity
○ Pragmatics
○ Figurative Language
○ Politeness
○ Polar Facts
○ Subjectivity in Annotation
[1] Novielli, Nicole et al. "A benchmark study on sentiment analysis for software engineering research." 2018 MSR.
Error Analysis on GitHub Dataset
General Error: the inability to recognize lexical cues that occur in the text
Nice, this is more slick 👍
Implicit Sentiment Polarity: humans use common knowledge to recognize
emotions that the models miss
Patiently waiting for any updates. […]
Surprisingly, errors persist despite the presence of emojis
And yes, there should be tests 😱😱😱
RQ2: Can integrating polarity
features during training improve
PTMs' emotion classification
ability?
RQ2 Methodology
● Integrate polarity features through token-level attention
adjustment
● Assign greater significance to tokens linked with polarity words
during fine-tuning
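The slides do not spell out how the attention adjustment is implemented. One way the idea could be sketched is to add a fixed bonus to the raw attention score of every token found in a polarity lexicon before the softmax, so those tokens receive a larger share of the attention mass. Everything below is an assumption made for illustration, not the authors' implementation: the lexicon `POLARITY_WORDS`, the `boost` hyperparameter, and the additive form of the adjustment.

```python
import math

# Hypothetical polarity lexicon; the word list used in the study is not shown.
POLARITY_WORDS = {"nice", "slick", "great", "awful", "broken"}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def polarity_adjusted_attention(tokens, scores, boost=1.0):
    """Add `boost` (assumed hyperparameter) to the raw score of each
    polarity token, then normalise with softmax."""
    adjusted = [
        s + boost if tok.lower() in POLARITY_WORDS else s
        for tok, s in zip(tokens, scores)
    ]
    return softmax(adjusted)


tokens = ["Nice", ",", "this", "is", "more", "slick"]
uniform = [0.0] * len(tokens)  # stand-in for raw attention scores
plain = softmax(uniform)
weighted = polarity_adjusted_attention(tokens, uniform)
for tok, before, after in zip(tokens, plain, weighted):
    print(f"{tok:>6}  {before:.3f} -> {after:.3f}")
```

With uniform starting scores, "Nice" and "slick" end up with more weight than the remaining tokens, which is the effect the methodology describes.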
Results (Avg F1-score) - GitHub Dataset
Model                    Micro Avg.       Macro Avg.
BERT                     0.585            0.591
BERT-Polarity            0.619 (+5.99%)   0.621 (+5.04%)
RoBERTa                  0.575            0.590
RoBERTa-Polarity         0.603 (+4.94%)   0.606 (+2.75%)
ALBERT                   0.538            0.539
ALBERT-Polarity          0.580 (+7.86%)   0.581 (+7.65%)
DeBERTa                  0.610            0.608
DeBERTa-Polarity         0.620 (+1.75%)   0.614 (+1.04%)
CodeBERT                 0.545            0.555
CodeBERT-Polarity        0.595 (+9.16%)   0.601 (+8.37%)
GraphCodeBERT            0.549            0.549
GraphCodeBERT-Polarity   0.563 (+2.52%)   0.568 (+3.38%)
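The percentages in parentheses read as relative gains over the baseline model. A minimal sketch of that arithmetic, using the micro-averaged GitHub scores as printed above, follows. Values recomputed from the rounded scores can differ slightly from the printed percentages, which were presumably derived from unrounded scores; the helper name `relative_gain` is an assumption.

```python
def relative_gain(base, improved):
    """Percentage change from the baseline score to the improved score."""
    return (improved - base) / base * 100


# (baseline, polarity-enhanced) micro-averaged F1 on the GitHub dataset,
# copied from the table as rounded to three decimals.
github_micro = {
    "BERT": (0.585, 0.619),
    "RoBERTa": (0.575, 0.603),
    "ALBERT": (0.538, 0.580),
    "DeBERTa": (0.610, 0.620),
    "CodeBERT": (0.545, 0.595),
    "GraphCodeBERT": (0.549, 0.563),
}

for model, (base, polar) in github_micro.items():
    print(f"{model:<14} {relative_gain(base, polar):+.2f}%")
```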
Results (Avg F1-score) - Stack Overflow Dataset
Model                    Micro Avg.       Macro Avg.
BERT                     0.754            0.588
BERT-Polarity            0.762 (+1.0%)    0.607 (+3.17%)
RoBERTa                  0.758            0.599
RoBERTa-Polarity         0.767 (+1.20%)   0.646 (+7.78%)
ALBERT                   0.747            0.584
ALBERT-Polarity          0.757 (+1.36%)   0.616 (+10.23%)
DeBERTa                  0.756            0.607
DeBERTa-Polarity         0.766 (+1.37%)   0.624 (+2.89%)
CodeBERT                 0.728            0.567
CodeBERT-Polarity        0.742 (+1.91%)   0.586 (+3.32%)
GraphCodeBERT            0.722            0.552
GraphCodeBERT-Polarity   0.732 (+1.29%)   0.569 (+3.11%)
Error Analysis on GitHub Dataset
● In RQ1, there were 67 cases in which all models made mistakes
○ After polarity enhancement, at least one model made the correct
prediction in 27 of those 67 cases
● Most improved categories:
○ General error (13/29 cases resolved)
○ Implicit polarity (9/18 cases resolved)
○ Politeness (2/3 cases resolved)
● Least improved categories:
○ Pragmatics (6/7 cases remained unresolved)
○ Figurative language (6/9 cases remained unresolved)
● A considerable number of the misclassified utterances still
contain emojis
Key Takeaways
● General-purpose PTMs outperform SE-specific models at emotion
classification in SE texts
● Polarity features enhance performance consistently
○ Challenges persist, especially with negative emotions
● No single model excels across all emotions and metrics
● Common error categories are usually context dependent:
implicit polarity, figurative language, pragmatics
● Handling emojis remains a challenge
Future Directions
● Establish more benchmark datasets
● Investigate hierarchical emotion classification (two-step)
○ Enhance performance by identifying broad emotional valence before
specific categories
● Investigate aspect-based sentiment analysis (ABSA)-enhanced PTMs
● Fusion of text and emoji cues during pre-training/fine-tuning
● Explore generative language models for emotion detection
○ Utilize zero-shot and few-shot learning for data augmentation and
prompting techniques
● Focus on detecting emotions that may harm productivity (e.g.,
Frustration)
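The two-step idea from the list above (valence first, specific emotion second) could look roughly like the sketch below. The grouping of emotions by valence and the keyword stubs standing in for fine-tuned models are assumptions made purely for illustration; the slides define neither.

```python
# Assumed grouping of the six primary emotions by valence (not from the slides).
VALENCE_TO_EMOTIONS = {
    "positive": ["Love", "Joy"],
    "negative": ["Anger", "Fear", "Sadness"],
    "ambiguous": ["Surprise"],
}


def classify_two_step(text, predict_valence, predict_emotion):
    """Step 1 picks a valence; step 2 chooses only among its emotions."""
    valence = predict_valence(text)
    candidates = VALENCE_TO_EMOTIONS[valence]
    return predict_emotion(text, candidates)


# Keyword stubs standing in for two fine-tuned classifiers (hypothetical).
KEYWORDS = {"Joy": "exciting", "Fear": "breaking"}


def toy_valence(text):
    lowered = text.lower()
    if "exciting" in lowered:
        return "positive"
    if "breaking" in lowered:
        return "negative"
    return "ambiguous"


def toy_emotion(text, candidates):
    lowered = text.lower()
    for emotion in candidates:
        keyword = KEYWORDS.get(emotion)
        if keyword and keyword in lowered:
            return emotion
    return candidates[0]


print(classify_two_step("This is very exciting.", toy_valence, toy_emotion))
print(classify_two_step("that code has a history of breaking", toy_valence, toy_emotion))
```

Restricting the second step to one valence group shrinks the label space each specific classifier has to separate.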
Questions/Thoughts/Collaboration Ideas to: Mia Mohammad Imran, imranm3@vcu.edu
Thank You!
Question?
