Can crowdsourcing be a reliable source of data for speech technology?
By DefinedCrowd. Presented at Crowdsourcing Week Global 2016. Learn more and join the next event: www.crowdsourcingweek.com
5. April 2016 definedcrowd5
What it takes to get there
Large amounts of data Deep Learning
3000+ hours speech recordings + transcription
200+ words with pronunciations
0.5M natural language variants + semantic annotation
Language and Product dependent!
DefinedCrowd landscape
We serve the data needs of the AI and ML landscape.
We’re a SaaS company that collects and enriches training data for AI, combining crowdsourcing and ML.
The world before DefinedCrowd
Louis, Speech Scientist
User Goal
Wants to test if the Chinese acoustic model works for Mandarin speakers in Singapore
What does he do?
Hires:
• A few vendors
• 1 PM
• 1 Dev
• 1 in-house Chinese LE
What does he get?
50 hours of raw speech with…
• Poor quality (~20% garbage)
• Unknown sources
• Long wait
The world after DefinedCrowd
Andy, Speech Scientist
User Goal
Wants to test if the Chinese acoustic model works for Mandarin speakers in Singapore
What does he do?
Subscribes to our platform
What does he get?
50 hours of pure speech with…
• High quality
• 100% transparency
• 50% faster throughput
How does he do it?
• Picks a template
• Adjusts settings and picks the crowd
• Launches the job
• Collects the data
How we use Machine Learning
We learn from metadata to provide recommendations to customers and crowd members
How we detect spam
Raw data
• Logging system
• Behavior measures
Data Processing
• Clean data
• Transform data
Feature Extraction
• Task-related measures (e.g. average duration)
• Session Duration
• Execution peaks
• Consensus score
• Real-time audits
Classification & Analysis
• Detect outliers/anomalies
• Predict task/job duration
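The outlier-detection step above can be illustrated with a small sketch. This is not DefinedCrowd's actual classifier, which the slides do not describe; it assumes a single feature (average task duration per worker) and uses a median/MAD modified z-score as a stand-in technique. The function name `flag_outliers`, the threshold of 3.5, and the sample durations are all hypothetical.

```python
from statistics import median


def flag_outliers(avg_duration_by_worker, threshold=3.5):
    """Return worker ids whose average task duration is a robust outlier.

    Hypothetical helper: one feature only, scored with a modified z-score.
    """
    values = list(avg_duration_by_worker.values())
    med = median(values)
    # Median absolute deviation: barely moved by the outliers being hunted.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All workers behave (almost) identically; nothing to flag.
        return set()
    flagged = set()
    for worker, value in avg_duration_by_worker.items():
        # 0.6745 rescales the MAD so the score is comparable to a z-score.
        modified_z = 0.6745 * (value - med) / mad
        if abs(modified_z) > threshold:
            flagged.add(worker)
    return flagged


# Made-up average seconds per transcription task, for illustration only.
durations = {"w1": 41.0, "w2": 38.5, "w3": 44.0, "w4": 40.2, "w5": 3.1}
suspects = flag_outliers(durations)
```

Here `w5` finishes tasks far faster than everyone else and is the only worker flagged. A real pipeline would presumably combine several of the listed features (session duration, execution peaks, consensus score, audits) rather than rely on one.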
Quality in our platform
1. Combined score of Qualification Tests
2. Real-time Audits and Reviews
3. Majority Vote
4. Overall Majority
5. Worker Expertise
6. Task Subjectiveness
7. …
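As an illustration of how a majority vote can feed a per-worker agreement score, here is a minimal sketch. It is an assumption about one simple way to compute such scores, not the platform's actual formula; `majority_vote`, `worker_agreement`, and the sample judgments are hypothetical, and each item is assumed to have at least one label.

```python
from collections import Counter


def majority_vote(labels):
    """Return (winning label, agreement ratio); the label is None on a tie."""
    counts = Counter(labels)
    ranked = counts.most_common(2)
    top_label, top_count = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_count:
        # No strict majority between the two most frequent labels.
        return None, top_count / len(labels)
    return top_label, top_count / len(labels)


def worker_agreement(judgments):
    """Map each worker to the fraction of their labels matching the majority.

    judgments: {item_id: {worker_id: label}}. Tied items are skipped.
    """
    agree = Counter()
    total = Counter()
    for by_worker in judgments.values():
        winner, _ = majority_vote(list(by_worker.values()))
        if winner is None:
            continue
        for worker, label in by_worker.items():
            total[worker] += 1
            agree[worker] += int(label == winner)
    return {worker: agree[worker] / total[worker] for worker in total}


# Made-up judgments on three utterances, for illustration only.
judgments = {
    "utt1": {"a": "valid", "b": "valid", "c": "noise"},
    "utt2": {"a": "noise", "b": "noise", "c": "noise"},
    "utt3": {"a": "valid", "b": "noise"},  # tie, so it is skipped
}
scores = worker_agreement(judgments)
```

Worker `c` disagrees with the majority on one of two counted items and ends up with a lower score than `a` and `b`. Weighting votes by worker expertise or task subjectiveness, as the list suggests, would be a natural extension.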
Other predictions using Machine Learning
Best quality / budget tradeoff
Best match between job and crowd member
Expected quality
When will a job finish (even before it starts)
Quality, Time, Cost
Does this sound familiar?
How many of you have tried a speech recognition system?
How many of you are still using it?
Different dialects, age and gender balanced
Speech and NLP are our core expertise, but we’re moving fast into other industry verticals.
Customer side
Students have proven to be more reliable; work flexibility; managed crowd – we validate who they are
School they attend
Course they are enrolled in
City, country where they live
Languages they speak
Age
Gender
Dialect
Performance in the job
- Qualification tests, RTAs, agreement scores
Jobs they took
Speed
Activity in our platform