No Hardware. No Software. No Hassle MT.
New Breakthroughs in Machine Translation Technology
in association with #KantanWebinar
KantanMT.Com
Tony O’Dowd
Founder & Chief Architect
What do we aim to cover today?
What is KantanMT.com?
Challenges of the L10N Industry
  • Making the right Project Management decisions
  • Going beyond the baseline of MT quality
Conclusions
15 minutes
What is KantanMT.com?
Statistical MT System
• Cloud-based, which means:
  • Highly scalable
  • Inexpensive to operate
  • Quick to deploy
Our Vision
• To put Machine Translation:
  • Customization
  • Improvement
  • Deployment
• …into your hands
Active KantanMT Engines
6,191
Training Words Uploaded
28,243,234,615
Member Words Translated
427,526,741
Fully operational for 15 months
Initial steps of any project are:
• Determine scope
  • How long will it take?
  • How much will it cost?
  • What is my margin?
• Determine resources
  • How many translators will I need?
Introducing KantanAnalytics™
• …think Fuzzy-Match report and you’ve got it in one!
Challenge #1
How can Project Managers ‘manage’ Post-Editing Projects?
KantanAnalytics™
Kantan TotalRecall – Advanced TM
% of TM hits in this job
KantanMT – automated translations
% of automated translations for this job
Range of QE Scores
QE range defined to match the existing fuzzy-match ranges used by the L10N industry
Quality Estimation Scores
Segment-level QE scores, akin to fuzzy-match scores
Word Counts – Project Stats
Can be used to develop a project timeline and tiered pricing model for Post-Editing Projects
Placeholder & Tag Counts
Used by PMs for complexity surcharges
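The idea on this slide, grouping segments into QE ranges that mirror fuzzy-match ranges and then pricing each range differently, can be sketched roughly as follows. This is only an illustration: the band boundaries, the rate factors and the function names are assumptions made for the example, not values or APIs taken from KantanAnalytics.

```python
from collections import defaultdict

# Hypothetical bands modelled on common fuzzy-match ranges; the ranges that
# KantanAnalytics actually uses are not given in the slides.
BANDS = [
    ("100%", 100, 100),
    ("95-99%", 95, 99),
    ("85-94%", 85, 94),
    ("75-84%", 75, 84),
    ("0-74%", 0, 74),
]

# Hypothetical share of the full per-word rate charged for each band.
RATE_FACTOR = {
    "100%": 0.1,
    "95-99%": 0.3,
    "85-94%": 0.5,
    "75-84%": 0.7,
    "0-74%": 1.0,
}


def band_for(score):
    """Return the label of the band that contains an integer QE score (0-100)."""
    for label, low, high in BANDS:
        if low <= score <= high:
            return label
    raise ValueError(f"score out of range: {score}")


def word_counts_by_band(segments):
    """Sum word counts per band for (qe_score, word_count) pairs."""
    totals = defaultdict(int)
    for score, words in segments:
        totals[band_for(score)] += words
    return dict(totals)


def project_cost(segments, full_rate):
    """Tiered post-editing cost: words in each band times that band's rate."""
    counts = word_counts_by_band(segments)
    return sum(
        words * full_rate * RATE_FACTOR[label] for label, words in counts.items()
    )


# Invented example data: (QE score, word count) per segment.
segments = [(100, 200), (97, 100), (88, 300), (60, 400)]
print(word_counts_by_band(segments))
print(project_cost(segments, full_rate=0.10))
```

The per-band word counts are the kind of figure a PM could feed into a timeline estimate; the rate factors would in practice come from an agreed pricing model.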
KantanAnalytics embeds QE scores into:
• TRADOS Studio
• MemoQ
• XLIFF
KantanAnalytics™
Helping PMs make the right business decisions!
KantanAnalytics™ - Helping PMs make the right decisions
Challenge #2: Going beyond the baseline and developing production-ready MT!
Easy to build a 1st baseline engine
• Aggregate Training Data: TM, Mono, Stock, Terminology
• Use a Cloud-based platform, like KantanMT.com
Real Challenge:
• How do these platforms go beyond the baseline engine and achieve higher levels of production quality?
Introducing Kantan BuildAnalytics
• Data analytics and visualisation providing insights into the customisation of SMT engines.
Kantan BuildAnalytics™
Rapidly develop production-ready engines
• Summary Report
• Training Rejects Reports
• F-Measure Analysis
• BLEU Analysis
• TER Analysis
• GAP Analysis
• Timeline Report
• Deep Tuning
Kantan BuildAnalytics™
F-Measure Score
Measures word recall & precision of KantanMT engines
Distributions
Provides the distribution of F-Measure scores across all reference translations
Kantan Insight™
Holistic analysis of the score and advice on how to improve it for KantanMT engines
Detailed Analysis
Segment-level F-Measure analysis to help SMT Developers improve training material
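As a rough illustration of what a segment-level F-Measure combines, the sketch below computes word precision, recall and their harmonic mean for one MT output against one reference. It assumes a simple bag-of-words match on lower-cased, whitespace-split tokens; how KantanMT tokenises and scores is not stated in the slides, and the sentences are invented.

```python
from collections import Counter


def f_measure(candidate, reference):
    """Word-level precision, recall and F1 between two sentences."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0, 0.0, 0.0
    # Clipped overlap: a word is matched at most as often as it occurs
    # in both the candidate and the reference.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(cand)  # share of MT words found in the reference
    recall = overlap / len(ref)      # share of reference words the MT produced
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f = f_measure(
    "the pump must be switched off",
    "the pump has to be switched off",
)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

Low precision suggests the engine is producing words the reference does not contain; low recall suggests it is missing reference words, which is one reason to look at both rather than a single score.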
Kantan BuildAnalytics™
Detailed Reports for: F-Measure, BLEU and TER
Kantan BuildAnalytics™
Gap Analysis – quickest way of improving fluency
Kantan BuildAnalytics™
Training Rejects Report – Improve training data rapidly
Kantan BuildAnalytics™
Timeline – Tracks history of KantanMT engines
Kantan BuildAnalytics™ - Rapid MT Customisation
bmmt GmbH and KantanMT:
The Real-World Use
of Machine Translation
Maxim Khalilov
Technical Lead
bmmt GmbH
maxim.khalilov@machine-translation.eu
KantanMT webinar
April 10, 2014
MT in industry: context and rationale
Combining these two technologies, well-established TM and cutting-edge MT, with post-editing allows the creation of a high-quality translation that reads just as well as a “classically” produced translation.
MT in industry: what about cost?
The cost structure changes when machine translation is integrated into the translation pipeline. When machine translation is adopted, the data preparation and quality assurance (editing) costs rise, whereas translation costs fall to as low as zero. Most importantly, the total cost of translation is reduced dramatically, as illustrated.
MT case study
• Customer: a large German machine manufacturer
• Project: 51,000 words of technical documentation, English into German. Approach: hybrid MT/TM.
• Settings: the files were processed through Trados Studio 2011.
• Implementation: KantanMT
• Description: Roughly 7,000 words came from TM as high matches. The remainder went through MT-based pretranslation, followed by a post-editing cycle, with the overall goal of producing the same level of quality as an all-human translation.
• Training material: Our customer had not worked in this language combination before, so there was no TM to go on. But we knew that the English authors based their work on material that the customer had previously translated from German into English. Thus we reversed the language direction of the TM and trained a customer-specific engine with this TM.
• Results: 44,000 words were post-edited to a final quality level that the customer was very happy with.
• Cost savings > 30%.
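The "reversed the language direction of the TM" step could look something like the sketch below if the TM were exported as TMX. This is an assumption for illustration only: the slides do not say which format or tool was used, `reverse_tmx` is a hypothetical helper, and the sample TMX is minimal and invented. Language codes are compared by exact match, so "de" and "de-DE" would be treated as different.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def reverse_tmx(tmx_text, new_source):
    """Return TMX text in which `new_source` is declared and listed as the source."""
    root = ET.fromstring(tmx_text)
    header = root.find("header")
    if header is not None:
        header.set("srclang", new_source)
    for tu in root.iter("tu"):
        tuvs = tu.findall("tuv")
        # Stable sort: variants in the new source language move to the front.
        ordered = sorted(
            tuvs,
            key=lambda tuv: (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            != new_source.lower(),
        )
        for tuv in tuvs:
            tu.remove(tuv)
        for tuv in ordered:
            tu.append(tuv)
    return ET.tostring(root, encoding="unicode")


# A German-to-English unit, to be reused as English-to-German training data.
sample = (
    '<tmx version="1.4"><header srclang="de"/><body>'
    '<tu><tuv xml:lang="de"><seg>Pumpe ausschalten</seg></tuv>'
    '<tuv xml:lang="en"><seg>Switch off the pump</seg></tuv></tu>'
    "</body></tmx>"
)
print(reverse_tmx(sample, "en"))
```

One caveat with any reversed TM is that the new "source" side is itself translated text, so it may read differently from text written natively in that language.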
MT: benefits of KantanMT solution
• Fully automated system training
  • One-click system customization
  • Automatic data pre-processing
• Fully automated translation
  • Automatic pre- and post-processing
• Quality assessment
  • KantanWatch
  • Gap Analysis
  • Reject Report
• No worries about maintenance and infrastructure
MT: benefits of KantanMT solution
• Transparent file format conversion
  • Training material conversion: TM conversion, monolingual material
  • Documents to translate: TMS format into MTable format
    • SDLXliff
• Smooth terminology integration
  • Consistent terminology
• Tag handling and mark-up transfer
Source: <x id="16480"/>SWord1 SWord2 SWord3 SWord4 <g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>
Target: <x id="16480"/>TWord1 TWord2 TWord3 TWord4 <g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>
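In the example above, the target carries the same inline tags as the source, although the two `<g>` pairs appear in a different order. A minimal check for that property might look like the sketch below. It is not KantanMT's tag handling; the regular expression only covers simple tags with an optional `id` attribute, as in the slide, and the function names are made up for the example.

```python
import re
from collections import Counter

# Matches <g id="1">, </g> and <x id="1"/>; other attributes are not handled.
TAG_RE = re.compile(r'<(/?)(\w+)(?:\s+id="([^"]*)")?\s*(/?)>')


def tag_inventory(segment):
    """Count inline tags as (kind, name, id), where kind is open, close or empty."""
    tags = Counter()
    for match in TAG_RE.finditer(segment):
        closing, name, tag_id, self_closing = match.groups()
        kind = "close" if closing else "empty" if self_closing else "open"
        tags[(kind, name, tag_id)] += 1
    return tags


def tags_transferred(source, target):
    """True if the target contains the same tags as the source, in any order."""
    return tag_inventory(source) == tag_inventory(target)


source = (
    '<x id="16480"/>SWord1 SWord2 SWord3 SWord4 '
    '<g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>'
)
target = (
    '<x id="16480"/>TWord1 TWord2 TWord3 TWord4 '
    '<g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>'
)
print(tags_transferred(source, target))
```

Comparing inventories rather than positions is deliberate here, since word order, and therefore tag order, can legitimately change between languages.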
bmmt GmbH
• Founded in 2013 by a group of language industry experts who wanted to offer innovative translation technology solutions
• Three operations centers in Germany: Munich, Berlin and Stuttgart
• bmmt GmbH has relied heavily on KantanMT services since 2013
• Primary industries: Automotive and Trucks, Machine Engineering, Telecommunications, Construction, IT
• Types of documents: workshop texts, product catalogues & other highly repetitive information documents
• Primary source language: German
• Integration: SDL Trados, SDL WorldServer and others
• Find out more: www.machine-translation.eu
Berlin
Alt-Moabit 92
10559 Berlin
Phone: +49 30-3117505-15
Fax: +49 30-3117505-20
Munich
Bernhard-Wicki-Straße 5
80636 Munich
Phone: +49 89 2000037-17
Fax: +49 89 2000037-11
Stuttgart
Ruppmannstraße 33b
70565 Stuttgart
Phone: +49 711 16646-66
Fax: +49 711 16646-50
bmmt GmbH
info@machine-translation.eu
Thank you
Speakers
Tony O’Dowd, tonyod@kantanmt.com
Maxim Khalilov, maxim.khalilov@machine-translation.eu

More Related Content

What's hot

Gestión proyectos traducción - Universitat Autònoma de Barcelona
Gestión proyectos traducción - Universitat Autònoma de BarcelonaGestión proyectos traducción - Universitat Autònoma de Barcelona
Gestión proyectos traducción - Universitat Autònoma de BarcelonaManuel Herranz
 
Machine Translation Quality - Are We There Yet? - Olga Beregovaya (Welocalize)
Machine Translation Quality - Are We There Yet? - Olga Beregovaya (Welocalize)Machine Translation Quality - Are We There Yet? - Olga Beregovaya (Welocalize)
Machine Translation Quality - Are We There Yet? - Olga Beregovaya (Welocalize)TAUS - The Language Data Network
 
State of the Machine Translation by Intento (stock engines, Jan 2019)
State of the Machine Translation by Intento (stock engines, Jan 2019)State of the Machine Translation by Intento (stock engines, Jan 2019)
State of the Machine Translation by Intento (stock engines, Jan 2019)Konstantin Savenkov
 
State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)Konstantin Savenkov
 
Pangeanic Cor-ActivaTM-Neural machine translation Taus Tokyo 2017
Pangeanic Cor-ActivaTM-Neural machine translation Taus Tokyo 2017Pangeanic Cor-ActivaTM-Neural machine translation Taus Tokyo 2017
Pangeanic Cor-ActivaTM-Neural machine translation Taus Tokyo 2017Manuel Herranz
 
What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyIconic Translation Machines
 

What's hot (8)

Gestión proyectos traducción - Universitat Autònoma de Barcelona
Gestión proyectos traducción - Universitat Autònoma de BarcelonaGestión proyectos traducción - Universitat Autònoma de Barcelona
Gestión proyectos traducción - Universitat Autònoma de Barcelona
 
Machine Translation Quality - Are We There Yet? - Olga Beregovaya (Welocalize)
Machine Translation Quality - Are We There Yet? - Olga Beregovaya (Welocalize)Machine Translation Quality - Are We There Yet? - Olga Beregovaya (Welocalize)
Machine Translation Quality - Are We There Yet? - Olga Beregovaya (Welocalize)
 
State of the Machine Translation by Intento (stock engines, Jan 2019)
State of the Machine Translation by Intento (stock engines, Jan 2019)State of the Machine Translation by Intento (stock engines, Jan 2019)
State of the Machine Translation by Intento (stock engines, Jan 2019)
 
State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)
 
Pangeanic Cor-ActivaTM-Neural machine translation Taus Tokyo 2017
Pangeanic Cor-ActivaTM-Neural machine translation Taus Tokyo 2017Pangeanic Cor-ActivaTM-Neural machine translation Taus Tokyo 2017
Pangeanic Cor-ActivaTM-Neural machine translation Taus Tokyo 2017
 
Lesson 1 introduction to programming
Lesson 1 introduction to programmingLesson 1 introduction to programming
Lesson 1 introduction to programming
 
CAN FD Stack Introduction & Related FAQ
CAN FD Stack Introduction & Related FAQCAN FD Stack Introduction & Related FAQ
CAN FD Stack Introduction & Related FAQ
 
What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happy
 

Viewers also liked

Building the DW - ETL
Building the DW - ETLBuilding the DW - ETL
Building the DW - ETLganblues
 
Git, Beginner to Advanced Survey
Git, Beginner to Advanced SurveyGit, Beginner to Advanced Survey
Git, Beginner to Advanced SurveyRafal Rusin
 
Apache HISE + Apache Camel
Apache HISE + Apache CamelApache HISE + Apache Camel
Apache HISE + Apache CamelRafal Rusin
 
Learn BEM: CSS Naming Convention
Learn BEM: CSS Naming ConventionLearn BEM: CSS Naming Convention
Learn BEM: CSS Naming ConventionIn a Rocket
 
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika AldabaLightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldabaux singapore
 
SEO: Getting Personal
SEO: Getting PersonalSEO: Getting Personal
SEO: Getting PersonalKirsty Hulse
 

Viewers also liked (7)

Building the DW - ETL
Building the DW - ETLBuilding the DW - ETL
Building the DW - ETL
 
Git, Beginner to Advanced Survey
Git, Beginner to Advanced SurveyGit, Beginner to Advanced Survey
Git, Beginner to Advanced Survey
 
Apache HISE + Apache Camel
Apache HISE + Apache CamelApache HISE + Apache Camel
Apache HISE + Apache Camel
 
Learn BEM: CSS Naming Convention
Learn BEM: CSS Naming ConventionLearn BEM: CSS Naming Convention
Learn BEM: CSS Naming Convention
 
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika AldabaLightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
 
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job? Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
 
SEO: Getting Personal
SEO: Getting PersonalSEO: Getting Personal
SEO: Getting Personal
 

Similar to New Breakthroughs in Machine Transation Technology

Webinar automotive and engineering content 16.06.16
Webinar   automotive and engineering content 16.06.16Webinar   automotive and engineering content 16.06.16
Webinar automotive and engineering content 16.06.16kantanmt
 
Maximising Machine Translation Return on Investment (KantanMT/Medialocate)
Maximising Machine Translation Return on Investment (KantanMT/Medialocate)Maximising Machine Translation Return on Investment (KantanMT/Medialocate)
Maximising Machine Translation Return on Investment (KantanMT/Medialocate)kantanmt
 
Managing Translation Memories for Engineering and Automotive Translation
Managing Translation Memories for Engineering and Automotive TranslationManaging Translation Memories for Engineering and Automotive Translation
Managing Translation Memories for Engineering and Automotive TranslationPoulomi Choudhury
 
TAUS MT Showcase 2014, Enabling MT for the Everyone! Tony O’Dowd, KantanMT
TAUS MT Showcase 2014, Enabling MT for the Everyone! Tony O’Dowd, KantanMTTAUS MT Showcase 2014, Enabling MT for the Everyone! Tony O’Dowd, KantanMT
TAUS MT Showcase 2014, Enabling MT for the Everyone! Tony O’Dowd, KantanMTTAUS - The Language Data Network
 
How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...kantanmt
 
How to Improve Translation Productivity
How to Improve Translation ProductivityHow to Improve Translation Productivity
How to Improve Translation Productivitykantanmt
 
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...TAUS - The Language Data Network
 
Machine Translation: Latest Innovations and their Impact on Commercial Transl...
Machine Translation: Latest Innovations and their Impact on Commercial Transl...Machine Translation: Latest Innovations and their Impact on Commercial Transl...
Machine Translation: Latest Innovations and their Impact on Commercial Transl...SDL
 
iMT Language Solutions
iMT Language SolutionsiMT Language Solutions
iMT Language SolutionsSDL
 
Gestión proyectos traducción en la Universitat Autònoma de Barcelona
Gestión proyectos traducción en la Universitat Autònoma de BarcelonaGestión proyectos traducción en la Universitat Autònoma de Barcelona
Gestión proyectos traducción en la Universitat Autònoma de BarcelonaManuel Herranz
 
Lexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLoriThicke
 
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)TAUS - The Language Data Network
 
KantanFest: Tony O'Dowd
KantanFest: Tony O'DowdKantanFest: Tony O'Dowd
KantanFest: Tony O'Dowdkantanmt
 
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...Konstantin Savenkov
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...SDL
 
MT Benchmarking and Business Intelligence - Tom Shaw (Capita TI)
MT Benchmarking and Business Intelligence - Tom Shaw (Capita TI)MT Benchmarking and Business Intelligence - Tom Shaw (Capita TI)
MT Benchmarking and Business Intelligence - Tom Shaw (Capita TI)TAUS - The Language Data Network
 

Similar to New Breakthroughs in Machine Transation Technology (20)

Webinar automotive and engineering content 16.06.16
Webinar   automotive and engineering content 16.06.16Webinar   automotive and engineering content 16.06.16
Webinar automotive and engineering content 16.06.16
 
Maximising Machine Translation Return on Investment (KantanMT/Medialocate)
Maximising Machine Translation Return on Investment (KantanMT/Medialocate)Maximising Machine Translation Return on Investment (KantanMT/Medialocate)
Maximising Machine Translation Return on Investment (KantanMT/Medialocate)
 
Managing Translation Memories for Engineering and Automotive Translation
Managing Translation Memories for Engineering and Automotive TranslationManaging Translation Memories for Engineering and Automotive Translation
Managing Translation Memories for Engineering and Automotive Translation
 
TAUS MT Showcase 2014, Enabling MT for the Everyone! Tony O’Dowd, KantanMT
TAUS MT Showcase 2014, Enabling MT for the Everyone! Tony O’Dowd, KantanMTTAUS MT Showcase 2014, Enabling MT for the Everyone! Tony O’Dowd, KantanMT
TAUS MT Showcase 2014, Enabling MT for the Everyone! Tony O’Dowd, KantanMT
 
KantanMT
KantanMT KantanMT
KantanMT
 
How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...
 
How to Improve Translation Productivity
How to Improve Translation ProductivityHow to Improve Translation Productivity
How to Improve Translation Productivity
 
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...
 
KantanMT Brochure
KantanMT BrochureKantanMT Brochure
KantanMT Brochure
 
KantanMT for Automotive
KantanMT for AutomotiveKantanMT for Automotive
KantanMT for Automotive
 
Machine Translation: Latest Innovations and their Impact on Commercial Transl...
Machine Translation: Latest Innovations and their Impact on Commercial Transl...Machine Translation: Latest Innovations and their Impact on Commercial Transl...
Machine Translation: Latest Innovations and their Impact on Commercial Transl...
 
iMT Language Solutions
iMT Language SolutionsiMT Language Solutions
iMT Language Solutions
 
Gestión proyectos traducción en la Universitat Autònoma de Barcelona
Gestión proyectos traducción en la Universitat Autònoma de BarcelonaGestión proyectos traducción en la Universitat Autònoma de Barcelona
Gestión proyectos traducción en la Universitat Autònoma de Barcelona
 
Intento Enterprise MT Hub
Intento Enterprise MT HubIntento Enterprise MT Hub
Intento Enterprise MT Hub
 
Lexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLexcelera MT Breaking Compromises
Lexcelera MT Breaking Compromises
 
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)
 
KantanFest: Tony O'Dowd
KantanFest: Tony O'DowdKantanFest: Tony O'Dowd
KantanFest: Tony O'Dowd
 
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...
EVALUATION IN USE: NAVIGATING THE MT ENGINE LANDSCAPE WITH THE INTENTO EVALUA...
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...
 
MT Benchmarking and Business Intelligence - Tom Shaw (Capita TI)
MT Benchmarking and Business Intelligence - Tom Shaw (Capita TI)MT Benchmarking and Business Intelligence - Tom Shaw (Capita TI)
MT Benchmarking and Business Intelligence - Tom Shaw (Capita TI)
 

More from kantanmt

KantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas KazlauskasKantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas Kazlauskaskantanmt
 
Kantanfest: Dimitar Shterionov - Part 2
Kantanfest: Dimitar Shterionov - Part 2Kantanfest: Dimitar Shterionov - Part 2
Kantanfest: Dimitar Shterionov - Part 2kantanmt
 
Kantanfest: Laura Casanellas
Kantanfest: Laura CasanellasKantanfest: Laura Casanellas
Kantanfest: Laura Casanellaskantanmt
 
Kantanfest: Dimitar Shterionov - Part 1
Kantanfest: Dimitar Shterionov - Part 1Kantanfest: Dimitar Shterionov - Part 1
Kantanfest: Dimitar Shterionov - Part 1kantanmt
 
KantanFest: Andy Way
KantanFest: Andy WayKantanFest: Andy Way
KantanFest: Andy Waykantanmt
 
You Asked, We Will Answer
You Asked, We Will AnswerYou Asked, We Will Answer
You Asked, We Will Answerkantanmt
 
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT SystemsATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systemskantanmt
 
Cross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated TranslationCross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated Translationkantanmt
 
Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...kantanmt
 
IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016kantanmt
 
New Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine TranslationNew Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine Translationkantanmt
 
Improving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine TranslationImproving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine Translationkantanmt
 
How to save 16 million euro for your start up business
How to save 16 million euro for your start up businessHow to save 16 million euro for your start up business
How to save 16 million euro for your start up businesskantanmt
 
What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?kantanmt
 
Tips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine TranslationTips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine Translationkantanmt
 
EAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMTEAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMTkantanmt
 
Breaking Language Barriers: Machine Translation for eCommerce
Breaking Language Barriers: Machine Translation for eCommerceBreaking Language Barriers: Machine Translation for eCommerce
Breaking Language Barriers: Machine Translation for eCommercekantanmt
 
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCU
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCUCloud Computing: IC4 Cloud On-Boarding Clinic, DCU
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCUkantanmt
 
How to set up a high tech business in the Cloud for 2,000 EUR
How to set up a high tech business in the Cloud for 2,000 EURHow to set up a high tech business in the Cloud for 2,000 EUR
How to set up a high tech business in the Cloud for 2,000 EURkantanmt
 
How Does Your MT System Measure Up? tekom/tcworld 2014
How Does Your MT System Measure Up? tekom/tcworld 2014 How Does Your MT System Measure Up? tekom/tcworld 2014
How Does Your MT System Measure Up? tekom/tcworld 2014 kantanmt
 

More from kantanmt (20)

KantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas KazlauskasKantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas Kazlauskas
 
Kantanfest: Dimitar Shterionov - Part 2
Kantanfest: Dimitar Shterionov - Part 2Kantanfest: Dimitar Shterionov - Part 2
Kantanfest: Dimitar Shterionov - Part 2
 
Kantanfest: Laura Casanellas
Kantanfest: Laura CasanellasKantanfest: Laura Casanellas
Kantanfest: Laura Casanellas
 
Kantanfest: Dimitar Shterionov - Part 1
Kantanfest: Dimitar Shterionov - Part 1Kantanfest: Dimitar Shterionov - Part 1
Kantanfest: Dimitar Shterionov - Part 1
 
KantanFest: Andy Way
KantanFest: Andy WayKantanFest: Andy Way
KantanFest: Andy Way
 
You Asked, We Will Answer
You Asked, We Will AnswerYou Asked, We Will Answer
You Asked, We Will Answer
 
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT SystemsATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
 
Cross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated TranslationCross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated Translation
 
Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...
 
IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016
 
New Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine TranslationNew Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine Translation
 
Improving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine TranslationImproving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine Translation
 
How to save 16 million euro for your start up business
How to save 16 million euro for your start up businessHow to save 16 million euro for your start up business
How to save 16 million euro for your start up business
 
What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?
 
Tips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine TranslationTips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine Translation
 
EAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMTEAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMT
 
Breaking Language Barriers: Machine Translation for eCommerce
Breaking Language Barriers: Machine Translation for eCommerceBreaking Language Barriers: Machine Translation for eCommerce
Breaking Language Barriers: Machine Translation for eCommerce
 
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCU
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCUCloud Computing: IC4 Cloud On-Boarding Clinic, DCU
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCU
 
How to set up a high tech business in the Cloud for 2,000 EUR
How to set up a high tech business in the Cloud for 2,000 EURHow to set up a high tech business in the Cloud for 2,000 EUR
How to set up a high tech business in the Cloud for 2,000 EUR
 
How Does Your MT System Measure Up? tekom/tcworld 2014
How Does Your MT System Measure Up? tekom/tcworld 2014 How Does Your MT System Measure Up? tekom/tcworld 2014
How Does Your MT System Measure Up? tekom/tcworld 2014
 


New Breakthroughs in Machine Translation Technology

  • 1. No Hardware. No Software. No Hassle MT. New Breakthroughs in Machine Translation Technology in association with #KantanWebinar
  • 2. KantanMT.Com NO HARDWARE. NO SOFTWARE. NO HASSLE MT Tony O’Dowd Founder & Chief Architect New Breakthroughs in Machine Translation Technology
  • 3. What we aim to cover today?
      - What is KantanMT.com?
      - Challenges of the L10N Industry
          - Making the right Project Management decisions
          - Going beyond the baseline of MT quality
      - Conclusions
      15 minutes
  • 4. What is KantanMT.com?
      Statistical MT System
      - Cloud-based =
          - Highly scalable
          - Inexpensive to operate
          - Quick to deploy
      Our Vision
      - To put Machine Translation:
          - Customization
          - Improvement
          - Deployment
      - …into your hands
      Active KantanMT Engines: 6,191
      Training Words Uploaded: 28,243,234,615
      Member Words Translated: 427,526,741
      Fully Operational: 15 months
  • 5. Challenge #1: How can Project Managers ‘manage’ Post-Editing Projects?
      Initial Steps of any project are:
      - Determine Scope
          - How long will it take?
          - How much will it cost?
          - What is my margin?
      - Determine resources
          - How many Translators will I need?
      Introducing KantanAnalytics™
      - …think Fuzzy-Match report and you’ve got it in one!
  • 6. KantanAnalytics™
      - Kantan TotalRecall – Advanced TM: % of TM hits in this job
      - KantanMT – automated translations: % of automated translations for this job
      - Range of QE Scores: QE range defined to match existing fuzzy match ranges used by L10N industry
      - Quality Estimation Scores: Segment level QE scores – akin to fuzzy match scores
      - Word Counts – Project Stats: Can be used to develop Project TimeLine and Tiered Pricing Model for Post-Editing Projects
      - Placeholder & Tag Counts: Used by PM for complexity surcharges
      - KantanAnalytics embeds QE scores into: TRADOS Studio, MemoQ, XLIFF
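Slide 6 describes segment-level QE scores grouped into ranges that mirror fuzzy-match bands, with per-range word counts feeding a tiered pricing model. The sketch below illustrates that idea only. The band boundaries, the function names, and the sample segments are all assumptions made for illustration; the slides do not state the actual ranges KantanAnalytics uses.

```python
from collections import defaultdict

# Hypothetical QE bands as (lower bound, label), highest first.
# These boundaries are an assumption, not KantanAnalytics' real ranges.
QE_BANDS = [
    (95, "95-100%"),
    (85, "85-94%"),
    (75, "75-84%"),
    (50, "50-74%"),
    (0, "0-49%"),
]


def band_for(score):
    """Return the label of the band that contains a QE score (0-100)."""
    if not 0 <= score <= 100:
        raise ValueError(f"QE score out of range: {score}")
    for lower, label in QE_BANDS:
        if score >= lower:
            return label


def word_counts_by_band(segments):
    """Sum source word counts per QE band.

    `segments` is an iterable of (source_text, qe_score) pairs.
    Words are counted by splitting on whitespace.
    """
    totals = defaultdict(int)
    for text, score in segments:
        totals[band_for(score)] += len(text.split())
    return dict(totals)


# Invented sample segments with invented QE scores.
segments = [
    ("Tighten the four bolts on the cover plate.", 97),
    ("Switch off the machine before cleaning.", 88),
    ("Refer to the safety manual.", 62),
    ("Check the oil level daily.", 90),
]
counts = word_counts_by_band(segments)
```

A project manager could then multiply each band's word count by a per-word post-editing rate, in the same way a fuzzy-match report is priced.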
  • 7. KantanAnalytics™ Helping PMs make the right business decisions!
  • 8. KantanAnalytics™ - Helping PMs make the right decisions
  • 9. Challenge #2: Going beyond the baseline and developing production ready MT!
      Easy to build 1st baseline engine
      - Aggregate Training Data – TM, Mono, Stock, Terminology
      - Use Cloud-based platform, like KantanMT.com
      Real Challenge:
      - How do these platforms go beyond the baseline engine and achieve higher levels of production quality?
      Introducing Kantan BuildAnalytics
      - Data analytics and visualisation providing insights into the customisation of SMT engines.
  • 10. Kantan BuildAnalytics™: Rapidly develop production ready engines
      - Summary Report
      - Training Rejects Reports
      - F-Measure Analysis
      - BLEU Analysis
      - TER Analysis
      - GAP Analysis
      - Timeline Report
      - Deep Tuning
  • 11. Kantan BuildAnalytics™
      - F-Measure Score: Measures word recall & precision of KantanMT engines
      - Distributions: Provides distribution of F-Measure scores across all reference translations
      - Kantan Insight™: Holistic analysis of score and advice on how to improve this for KantanMT engines
      - Detailed Analysis: Segment level F-Measure analysis to help SMT Developers improve training material
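Slide 11 says the F-Measure score reflects word recall and precision against reference translations. As a rough illustration, the sketch below computes a generic word-level precision, recall, and F1 for one segment using clipped bag-of-words overlap. This is a textbook formulation assumed here for illustration; the slides do not say how KantanMT tokenises or weights the score, and the function name and sample sentences are invented.

```python
from collections import Counter


def word_f_measure(hypothesis, reference):
    """Return (precision, recall, f1) over whitespace-separated words.

    Overlap is clipped: a word counts as matched at most as often
    as it appears in the reference.
    """
    hyp = Counter(hypothesis.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(hyp.values())  # share of MT words that are correct
    recall = overlap / sum(ref.values())     # share of reference words recovered
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example: MT output compared with a human reference.
precision, recall, f1 = word_f_measure(
    "the pump must be switched off",
    "the pump has to be switched off",
)
```

Here five words are shared, so precision is 5/6, recall is 5/7, and F1 is 10/13. Computing this per segment is what makes a score distribution and a segment-level drill-down possible.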
  • 12. Kantan BuildAnalytics™ Detailed Reports for: F-Measure, BLEU and TER
  • 13. Kantan BuildAnalytics™ Gap Analysis – quickest way of improving fluency
  • 14. Kantan BuildAnalytics™ Training Rejects Report – Improve training data rapidly
  • 15. Kantan BuildAnalytics™ Timeline – Tracks history of KantanMT engines
  • 16. Kantan BuildAnalytics™ - Rapid MT Customisation
  • 17. bmmt GmbH and KantanMT: The Real-World Use of Machine Translation Maxim Khalilov Technical Lead bmmt GmbH maxim.khalilov@machine-translation.eu KantanMT webinar April 10, 2014
  • 18. MT in industry: context and rationale The combination of these two technologies, well-established TM and cutting-edge MT, plus post-editing allows the creation of a high-quality translation that reads just as well as a “classically” produced translation.
  • 19. MT in industry: what about cost? The cost structure changes when machine translation is integrated into the translation pipeline. When machine translation is adopted, the data preparation and quality assurance (editing) costs rise whereas translation costs fall to as low as zero. Most importantly, the total cost of translation is reduced dramatically as illustrated.
  • 20. MT case study
      - Customer: big German machine manufacturer
      - Project: 51,000 words, technical documentation. English into German. Approach: hybrid MT/TM.
      - Settings: the files were processed through Trados Studio 2011.
      - Implementation: KantanMT
      - Description: Roughly 7,000 words came from TM as high matches. The remainder went through MT-based pretranslation, followed by a post-editing cycle, with the overall goal to produce the same level of quality as in an all-human translation.
      - Training material: Our customer had not worked in this language combination before, so there was no TM to go on. But we knew that the English authors based their work on material that the customer had previously translated from German into English. Thus we reversed the language direction of the TM and trained a customer-specific engine with this TM.
      - Results: As a result, 44,000 words were post-edited to a final quality level that the customer was very happy with.
      - Cost savings > 30%.
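The case study on slide 20 reuses a German-to-English TM as English-to-German training data by reversing its language direction. The sketch below shows one way such a reversal could look, assuming the TM is available as TMX; the slides do not name the file format. The tiny TMX string, the language codes, and the helper name `training_pairs` are invented for illustration, and the header is cut down to the minimum needed for parsing.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml:lang attribute under this name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Minimal illustrative TMX; the segment is invented sample data.
TMX = """<tmx version="1.4">
  <header srclang="de"/>
  <body>
    <tu>
      <tuv xml:lang="de"><seg>Pumpe ausschalten.</seg></tuv>
      <tuv xml:lang="en"><seg>Switch off the pump.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def training_pairs(tmx_text, source_lang, target_lang):
    """Yield (source, target) segment pairs in the requested direction."""
    root = ET.fromstring(tmx_text)
    for tu in root.iter("tu"):
        # Map each language code in this translation unit to its segment text.
        segs = {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.iter("tuv")}
        if source_lang in segs and target_lang in segs:
            yield segs[source_lang], segs[target_lang]


# The TM was created German -> English; read it English -> German instead.
pairs = list(training_pairs(TMX, source_lang="en", target_lang="de"))
```

Because each translation unit stores both languages side by side, reversing direction is a matter of which side is read as the source.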
  • 21. MT: benefits of KantanMT solution
      - Fully automated system training
      - One-click system customization
      - Automatic data pre-processing
      - Fully automated translation
      - Automatic pre- and post-processing
      - Quality assessment
      - KantanWatch
      - Gap Analysis
      - Reject Report
      - No worry about maintenance and infrastructure
  • 22. MT: benefits of KantanMT solution
      - Transparent file format conversion
      - Training material conversion: TM conversion, monolingual material
      - Documents to translate: TMS format into MTable format
      - SDLXliff
      - Smooth terminology integration
      - Consistent terminology
      - Tag handling and mark-up transfer
      Source: <x id="16480"/>SWord1 SWord2 SWord3 SWord4 <g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>
      Target: <x id="16480"/>TWord1 TWord2 TWord3 TWord4 <g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>
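The Source/Target pair on slide 22 shows inline tags being carried over to the translation, with the two `<g>` spans reordered to follow target word order. A simple sanity check for this kind of mark-up transfer is to confirm that both sides contain the same set of tags, regardless of order. The sketch below is an illustrative check written for this example only; the regular expression handles just the `<x>` and `<g>` elements with numeric `id` attributes shown on the slide, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches opening <x ...> and <g ...> tags and captures (tag name, id).
# Closing tags such as </g> are not matched.
TAG_RE = re.compile(r'<(x|g)\s+id="(\d+)"')


def inline_tags(segment):
    """Return a multiset of (tag name, id) pairs found in a segment."""
    return Counter(TAG_RE.findall(segment))


source = ('<x id="16480"/>SWord1 SWord2 SWord3 SWord4 '
          '<g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>')
target = ('<x id="16480"/>TWord1 TWord2 TWord3 TWord4 '
          '<g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>')

# True when every source tag appears in the target the same number of times.
tags_preserved = inline_tags(source) == inline_tags(target)
```

Comparing multisets rather than sequences is deliberate: tags may legitimately move when word order changes, as they do in the slide's example.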
  • 23. bmmt GmbH
      - Founded in 2013 by a group of language industry experts who wanted to offer innovative translation technology solutions
      - Three operations centers in Germany: Munich, Berlin and Stuttgart
      - bmmt GmbH has relied heavily on KantanMT services since 2013
      - Primary industries: Automotive and Trucks, Machine Engineering, Telecommunications, Construction, IT
      - Types of documents: workshop texts, product catalogues & other highly repetitive information documents
      - Primary source language: German
      - Integration: SDL Trados, SDL WorldServer and others
      - Find more: www.machine-translation.eu
  • 24. bmmt GmbH, info@machine-translation.eu
      - Berlin: Alt-Moabit 92, 10559 Berlin. Phone: +49 30-3117505-15, Fax: +49 30-3117505-20
      - Munich: Bernhard-Wicki-Straße 5, 80636 Munich. Phone: +49 89 2000037-17, Fax: +49 89 2000037-11
      - Stuttgart: Ruppmannstraße 33b, 70565 Stuttgart. Phone: +49 711 16646-66, Fax: +49 711 16646-50
      Thank you
  • 25. No Hardware. No Software. No Hassle MT. New Breakthroughs in Machine Translation Technology in association with #KantanWebinar
      Speakers
      - Tony O’Dowd, tonyod@kantanmt.com
      - Maxim Khalilov, maxim.khalilov@machine-translation.eu