Tony O’Dowd takes us through some of the most innovative technologies offered on the KantanMT.com platform which are helping a growing community of KantanMT users to develop and self-manage custom Machine Translation engines in the cloud.
Maxim Khalilov then illustrates bmmt’s journey with Machine Translation on KantanMT. He discusses what they have achieved so far in terms of MT engine development and showcases the value that his team is bringing to their growing international client base through the use of Machine Translation.
New Breakthroughs in Machine Translation Technology
1. No Hardware. No Software. No Hassle MT.
New Breakthroughs in Machine Translation Technology
in association with #KantanWebinar
2. KantanMT.Com
NO HARDWARE. NO SOFTWARE. NO HASSLE MT
Tony O’Dowd
Founder & Chief Architect
3. What do we aim to cover today?
What is KantanMT.com?
Challenges of the L10N Industry
Making the right Project Management decisions
Going beyond the baseline of MT quality
Conclusions
15 minutes
4. What is KantanMT.com?
Statistical MT System
Cloud-based =
Highly scalable
Inexpensive to operate
Quick to deploy
Our Vision
To put Machine Translation:
Customization
Improvement
Deployment
…into your hands
Active KantanMT Engines
6,191
Training Words Uploaded
28,243,234,615
Member Words Translated
427,526,741
Fully Operational 15 months
5. Initial Steps of any project are:
Determine Scope
How long will it take?
How much will it cost?
What is my margin?
Determine resources
How many Translators will I need?
Introducing KantanAnalytics™
…think Fuzzy-Match report and you’ve got it in one!
Challenge #1
How can Project Managers ‘manage’ Post-Editing Projects?
6. KantanAnalytics™
Kantan TotalRecall – Advanced TM
% of TM hits in this job
KantanMT – automated translations
% of automated translations for this job
Range of QE Scores
QE range defined to match existing fuzzy match ranges used by
L10N industry
Quality Estimation Scores
Segment level QE scores – akin to fuzzy match scores
Word Counts – Project Stats
Can be used to develop Project TimeLine and Tiered Pricing Model
for Post-Editing Projects
Placeholder & Tag Counts
Used by PMs for complexity surcharges
KantanAnalytics embeds QE scores into:
TRADOS Studio
MemoQ
XLIFF
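The idea of segment-level QE scores feeding a tiered pricing model can be sketched as follows. This is a minimal illustration only: the band boundaries, the function names and the per-band rate shares are all hypothetical, chosen to resemble common fuzzy-match grids, and are not taken from KantanAnalytics.

```python
from collections import Counter

# Hypothetical bands (lower bound, label), modelled on typical fuzzy-match
# ranges. These are NOT the ranges KantanAnalytics actually uses.
BANDS = [
    (100, "100%"),
    (95, "95-99%"),
    (85, "85-94%"),
    (75, "75-84%"),
    (50, "50-74%"),
    (0, "0-49%"),
]


def band_for(score):
    """Return the label of the first band whose lower bound the score meets."""
    for lower, label in BANDS:
        if score >= lower:
            return label
    raise ValueError(f"score out of range: {score}")


def words_per_band(segments):
    """Sum word counts per band for an iterable of (qe_score, word_count)."""
    totals = Counter()
    for score, words in segments:
        totals[band_for(score)] += words
    return dict(totals)


def project_cost(totals, rate_per_word, rate_share):
    """Price a job: each band is charged a share of the full per-word rate."""
    return sum(
        words * rate_per_word * rate_share[band]
        for band, words in totals.items()
    )
```

A Project Manager could pass the per-band word totals to `project_cost` together with an agreed rate card, in the same way a fuzzy-match report is priced today.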
9. Challenge #2: Going beyond the baseline and developing production-ready MT!
Easy to build 1st baseline engine
Aggregate Training Data – TM, Mono, Stock, Terminology
Use Cloud-based platform, like KantanMT.com
Real Challenge:
How do these platforms go beyond the baseline engine and achieve
higher levels of production quality?
Introducing Kantan BuildAnalytics
Data analytics and visualisation providing insights into the
customisation of SMT engines.
10. Kantan BuildAnalytics™
Rapidly develop production-ready engines
Summary Report
Training Rejects Reports
F-Measure Analysis
BLEU Analysis
TER Analysis
GAP Analysis
Timeline Report
Deep Tuning
11. Kantan BuildAnalytics™
F-Measure Score
Measures word recall & precision of KantanMT engines
Distributions
Provides distribution of F-Measure scores across all reference
translations
Kantan Insight™
Holistic analysis of the score, with advice on how to improve it for
KantanMT engines
Detailed Analysis
Segment level F-Measure analysis to help SMT Developers
improve training material
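As a rough illustration of what a segment-level F-Measure captures, the sketch below computes word precision and recall of an MT output against one reference translation and combines them into F1. It assumes simple whitespace tokenisation and lowercasing, and the function name is hypothetical; it is not necessarily how Kantan BuildAnalytics computes its scores.

```python
from collections import Counter


def f_measure(candidate, reference):
    """Word-level F1 between an MT output and a reference translation."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: a word counts at most as often as it occurs in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())  # share of output words that are correct
    recall = overlap / sum(ref.values())      # share of reference words that were produced
    return 2 * precision * recall / (precision + recall)
```

Scoring every segment of a test set this way gives the kind of distribution the slide describes, with low-scoring segments pointing to gaps in the training material.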
17. bmmt GmbH and KantanMT:
The Real-World Use
of Machine Translation
Maxim Khalilov
Technical Lead
bmmt GmbH
maxim.khalilov@machine-translation.eu
KantanMT webinar
April 10, 2014
18. MT in industry: context and rationale
The combination of these two technologies, well-established TM and cutting-edge MT, plus
post-editing allows the creation of a high-quality translation that reads just as well as a
“classically” produced translation.
19. MT in industry: what about cost?
The cost structure changes when machine translation is integrated into the translation pipeline.
When machine translation is adopted, the data preparation and quality assurance (editing) costs rise
whereas translation costs fall to as low as zero. Most importantly, the total cost of translation is
reduced dramatically as illustrated.
20. MT case study
Customer: big German machine manufacturer
Project: 51,000 words, technical documentation. English into German.
Approach: hybrid MT/TM.
Settings: the files were processed through Trados Studio 2011.
Implementation: KantanMT
Description: Roughly 7,000 words came from TM as high matches. The remainder went through
MT-based pretranslation, followed by a post-editing cycle, with the overall goal to produce the
same level of quality as in an all-human translation.
Training material: Our customer had not worked in this language combination before, so there was
no TM to go on. But we knew that the English authors based their work on material that the
customer had previously translated from German into English. Thus we reversed the language
direction of the TM and trained a customer-specific engine with this TM.
Results: 44,000 words were post-edited to a final quality level that the customer was
very happy with.
Cost savings > 30%.
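Reversing the language direction of a TM, as described under "Training material", might look like the sketch below when the TM is exported as TMX. It assumes a TMX 1.4-style file with `xml:lang` on each `<tuv>`; the function name and the language codes used in the usage note are assumptions for illustration, not details from the case study.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def reverse_tmx(tmx_text, new_source):
    """Return TMX text whose source language is `new_source`."""
    root = ET.fromstring(tmx_text)
    header = root.find("header")
    if header is not None:
        header.set("srclang", new_source)
    for tu in root.iter("tu"):
        tuvs = tu.findall("tuv")
        # Stable sort: the variant in the new source language moves first.
        ordered = sorted(tuvs, key=lambda tuv: tuv.get(XML_LANG) != new_source)
        for tuv in tuvs:
            tu.remove(tuv)
        tu.extend(ordered)
    return ET.tostring(root, encoding="unicode")
```

For a German-to-English TM, calling `reverse_tmx(text, "en-GB")` (language code assumed) yields an English-to-German TM that can be uploaded as training data.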
21. MT: benefits of KantanMT solution
Fully automated system training
One-click system customization
Automatic data pre-processing
Fully automated translation
Automatic pre- and post-processing
Quality assessment
KantanWatch
Gap Analysis
Reject Report
No worry about maintenance and infrastructure
22. MT: benefits of KantanMT solution
Transparent file format conversion
Training material conversion: TM conversion, monolingual material
Documents to translate: TMS format into MTable format
SDLXLIFF
Smooth terminology integration
Consistent terminology
Tag handling and mark-up transfer
Source: <x id="16480"/>SWord1 SWord2 SWord3 SWord4 <g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>
Target: <x id="16480"/>TWord1 TWord2 TWord3 TWord4 <g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>
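One way to sanity-check mark-up transfer on a segment pair like the one above is to compare the inventory of inline tags on both sides, ignoring their order, since tags may legitimately move when word order changes. The sketch below is an assumption-laden illustration: the regular expression only covers simple tags with an optional `id` attribute, and the function names are hypothetical rather than part of KantanMT.

```python
import re
from collections import Counter

# Matches tags such as <x id="1"/>, <g id="2"> and </g>.
TAG = re.compile(r'<(/?)(\w+)(?:\s+id="([^"]*)")?\s*(/?)>')


def tag_inventory(segment):
    """Count opening and standalone tags by (name, id); skip closing tags."""
    inventory = Counter()
    for closing, name, tag_id, _self_closing in TAG.findall(segment):
        if not closing:
            inventory[(name, tag_id)] += 1
    return inventory


def tags_preserved(source, target):
    """True when the target carries exactly the same tags as the source."""
    return tag_inventory(source) == tag_inventory(target)
```

Applied to the Source and Target lines above, the check passes even though the two `<g>` elements swap places in the target.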
23. bmmt GmbH
Founded in 2013 by a group of language industry experts who wanted to offer innovative translation technology
solutions
Three operations centers in Germany: Munich, Berlin and Stuttgart
bmmt GmbH has relied heavily on KantanMT services since 2013
Primary industries: Automotive and Trucks, Machine Engineering, Telecommunications, Construction, IT
Types of documents: workshop texts, product catalogues & other highly repetitive information documents
Primary source language: German
Integration: SDL Trados, SDL WorldServer and others
Find out more: www.machine-translation.eu
25. No Hardware. No Software. No Hassle MT.
New Breakthroughs in Machine Translation Technology
in association with #KantanWebinar
Tony O’Dowd, tonyod@kantanmt.com
Maxim Khalilov, maxim.khalilov@machine-translation.eu
Speakers