When Crowd Meets Persona: Creating a Large-Scale Open-Domain Persona Dialogue Corpus
Nov. 2022. @HCOMP (WiP)
Won Ik Cho¹*, Yoon Kyung Lee¹*, Seoyeon Bae¹, Jihwan Kim¹,
Sangah Park², Moosung Kim³, Sowon Hahn¹ and Nam Soo Kim¹
Seoul National University¹, DeepNatural AI², Smilegate AI³
Motivation
• Creating dialogue dataset
  ◦ Multiple participants
  ◦ High degree of freedom
• Difficulties of crowdsourcing
  ◦ Researchers, moderators, and crowdworkers
  ◦ Considerate scheduling and conflict resolution required
• Persona dialogue
  ◦ Challenging and time-consuming project
  ◦ What should the task managers keep in mind?
Our study
• Setting
  ◦ Persona participants (actors) talk with user participants (workers)
  ◦ Actors are hired, while workers are crowdsourced
  ◦ User initiates the conversation, but persona leads the role
• Collection
  ◦ Recruiting workers from crowdsourcing platform
  ◦ Chat interface developed by the platform
Our study
• Project flow
Discussion
• Overview
  ◦ RQ1: What should be considered in accommodating the construction of a successful dialogue dataset?
    • The organizer should acknowledge that it differs a lot from usual conversation, and that it is crucial to handle unexpected and unwanted situations
  ◦ RQ2: What is the role of the moderator in large-scale dialogue dataset construction?
    • Resolve conflicts after building rapport with participants
    • Be aware of the points where participants feel uncomfortable, empathizing with and understanding their struggles
    • Recruitment and financial support, which affect the atmosphere
  ◦ RQ3: Will such considerations help reach the intended goal of construction?
    • Shown indirectly using survey results, textual analysis, and generative model-based experiments (to be further investigated)
Conclusion
• Dataset
  ◦ https://github.com/smilegate-ai/OPELA
• Acknowledgement
  ◦ Smilegate AI (funding and discussions)
  ◦ DeepNatural AI (crowdsourcing and moderation)
  ◦ Kudos to all our crowdworkers
• Full paper and analyses
  ◦ To be disclosed
Thank you

Editor's Notes

  1. Hi, we are a joint team from Seoul National University, DeepNatural AI, and Smilegate AI in South Korea. Today we are going to present our work-in-progress project on persona dialogue creation with hired persona actors and crowdsourced users.
  2. Our work first considers an innate difficulty of building a dialogue corpus: two or more participants are necessarily involved in the construction process, and the process has such a high degree of freedom that quality control of the output may not be feasible. Also, much corpus creation work these days is done in cooperation with crowdsourcing companies and their moderators, who recruit the workers and manage their overall load and compensation. That is, the roles of researchers, moderators, and crowdworkers all differ slightly with respect to goal and scale, which requires considerate scheduling and conflict resolution. In this light, we arrived at the question of how persona dialogue corpus construction should be managed in practice.
  3. In our study, we let persona participants, namely the actors, talk with user participants, the workers. Actors are hired here, while workers are crowdsourced. For every dialogue, the user initiates the conversation, but persona actors lead the role while they talk. The collection proceeds by recruiting workers from the crowdsourcing platform's community, using the chat interface developed by the platform to check and manage the progress of the conversations. Freedom of conversation was guaranteed as much as possible, but users who made actors feel uneasy or eerie were reported and set aside from the project. After the collection was finished, we analyzed the surveys and interviews conducted with the participants and the moderator, and furthermore analyzed the constructed data.
  4. We demonstrate the overall project flow. First, guidelines for the conversation are created by the researchers, and the platform and moderator recruit actors and workers based on the guidelines. Here, each actor plays the persona they initially decided on, and a user initiates the conversation with the persona based on the profile they are shown, but only if they pass the test prepared for user participants. Once started, the conversation lasts over 15 turns, and it is terminated by the actor or the worker if they feel fatigued or bored. Participants complete a survey after each conversation, and the reward is given afterward according to the amount of dialogue.
  5. After the whole collection phase, we answered our research questions briefly. First, in accommodating the construction of a successful persona dialogue dataset, the organizer should acknowledge that it differs a lot from usual conversation and that it is crucial to handle unexpected and unwanted situations, which can be handled by an expert moderator. To look into this further, the moderator should resolve conflicts after building rapport with the participants so that they can report whatever makes them uncomfortable, while at the same time empathizing with and understanding their struggles. Recruiting participants and managing finances is also a crucial role, in that such conditions can dampen or boost the atmosphere of the project. We have also found that the whole process led to a high-quality persona dialogue dataset, which we recently disclosed online, but our work is to be further investigated with more thorough experimental criteria and presented as a more mature work afterwards.
  6. Our work is currently disclosed on the GitHub of our funding agency, Smilegate AI. Also, we thank DeepNatural AI for building the chat interface, recruiting participants from the worker pool, and moderating the whole process. Finally, we thank all our crowdworkers, including actors and users, who created all the dialogues and went through the surveys and interviews. Since our work is in progress, we will soon disclose the full analysis results with our full paper.
  7. Thank you for listening.