Assessing the accuracy and teachers’ impressions of Google
Translate: A study of primary L2 writers in Hong Kong
Paul Stapleton*, Becky Leung Ka Kin
Department of English Language Education, The Education University of Hong Kong, 10 Lo Ping Rd., Taipo, Hong Kong
1. Introduction
Most people have had the experience of encountering a text in a foreign language and wanting to know what it says. Since
the early 2000s machine translation (MT) has been available at the click of a link using providers such as Google Translate
(GT), Babel Fish and the like. Concurrently, these same platforms have been available to international students who may wish
to read and write in their native tongue using MT to assist with their understanding of a text or assignment submissions.
With regard to foreign language education programs and courses, the use of MT has been a particular concern, at least for
courses requiring written assignments, because automated translators have the potential to eliminate the motivation for L2
students to learn to write in a target language. Until recently, however, this concern has been mitigated by the quality of
translations generated by MT, which were either poorly constructed or even gibberish, and as such, instantly recognizable as
the products of MT. Recent advances, however, in the methods used to generate automated translations may be changing this
landscape.
In the present study, two sets of scripts written by the same primary students were compared with each other – one set
written in English, and the other set written in their native Chinese to the same prompt, and then translated into English using
GT. Teachers who were unaware that some of the scripts were products of GT graded them and were interviewed about their
impressions.
2. Machine translation
MT has a lengthy history, emerging during the Cold War in the 1950s with attempts made at automating translations for
the United States to gain an upper hand against the Soviet Union. Initially, these ventures were well funded by the US
government, but by the early 1970s, without any significant advances, funding dried up and most MT projects were
abandoned, although they continued in some countries (Slocum, 1988).
In the 1980s, with growing computer power, new approaches to MT emerged. In a seminal article, Nagao (1984) explained
that rather than continuing the deep linguistic analysis of morphological, semantic and syntactic information that had been
used in prevailing systems, an approach that matches strings of text in a bilingual corpus with parallel sets of texts offered
better quality results. In the 1990s and afterwards, with more powerful computing available, statistical approaches, which still
analyzed phrases in bilingual corpora, were viewed as the way forward for MT.
* Corresponding author
E-mail addresses: paulstapleton@gmail.com (P. Stapleton), rec09nor@gmail.com (B. Leung Ka Kin).
Contents lists available at ScienceDirect
English for Specific Purposes
journal homepage: http://ees.elsevier.com/esp/default.asp
https://doi.org/10.1016/j.esp.2019.07.001
0889-4906/© 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
English for Specific Purposes 56 (2019) 18–34
Most recently, because of intense competition between technology giants such as Google and Facebook whose success
partially depends upon being able to facilitate communication among speakers of different languages, new systems have been
developed largely based on a neural network approach as opposed to the previous phrase-based system, which translated
chunks of text separately. The new neural system takes enormous amounts of human-translated text and trains the system,
creating a digital representation of the word or phrase and its accompanying context. It then statistically chooses the closest
probable match in the target language. GT claims that although their new neural system “can still make significant errors that
a human translator would never make, like dropping words and mistranslating proper names or rare terms, and translating
sentences in isolation rather than considering the context of the paragraph or page” (Google AI Blog, 2016), it is still a
significant improvement over earlier systems. In the meantime, Facebook claims to have developed an alternative neural
network approach for language translation “that achieves state-of-the-art accuracy at nine times the speed of recurrent
neural systems” (Gehring & Auli, 2017, p.1). Because of these recent improvements in the quality of MT’s translations, coupled
with our own positive, albeit informal, experience using GT to translate from English to Chinese in pilot exercises, the present
study focuses on the outputs of GT. Because Facebook is embedded in a social media app, it appears not to have been used by
many as a stand-alone translation tool.
3. MT as a pedagogical tool
The recent advances in MT suggest that pedagogical studies conducted just a few years ago may no longer be as relevant
except as indicators of MT’s earlier inadequacies. Sheppard (2011), for example, describes translations from French into
English as a “risky business” that is “riddled with mistakes or worse” (p. 566). In another study (Kirchhoff, Turner, Axelrod, &
Saavedra, 2011), using GT to translate from English to Spanish, when the fluency of the translation of 385 sentences was rated
on a five-point scale from “flawless” (5) to “incomprehensible” (1) by two native-Spanish speakers, a mean fluency score of
only 3.73 resulted. Similarly, van Rensburg, Snyman, and Lotz (2012) found that the quality of GT translations from Afrikaans
to English and vice versa scored by five raters needed substantial improvement and post-editing. These studies indicate that
until a few years ago, GT had reached a level of quality where the translations may have been useful, but were still far from
perfect.
Two more recent papers, Groves and Mundt (2015) and Mundt and Groves (2016), go some distance towards providing
background into the use of GT by learners of English in an academic context. In the first (2015), they investigated the linguistic
accuracy of texts originally written by students in their L1 (Malay and Chinese) and then translated into English using GT. The
results revealed that although the translations had a relatively high rate of grammatical errors, this rate was similar to the
minimum level of accuracy required for university entrance. Their 2016 paper discussed some of the ramifications of GT,
especially as its translations become more accurate. For example, they weighed whether GT falls into a different category than
other technological advances that have improved writing, such as the spell and grammar checkers that arrive with word
processing programs.
One issue raised by Mundt and Groves (2016) is the matter of discourse competence. They suggest that while MT may be
approaching the grammatical level of competence of certain learners of English, it lacks the human ability to satisfy the norms
of a discourse community in features that go beyond the sentence level. Another issue they raise is plagiarism. If a student
uses GT to translate a source text from another language into English, plagiarism detection software such as Turnitin may not
detect it as matching text. There is also the issue of whether the GT translation of a text written in a student’s native tongue
can really be called the student’s authentic product. On this latter point, Mundt and Groves (2016) conclude that because the
student is producing the original ideas, any MT translation should be considered the student’s own work. All of these vexing
issues, however, remain largely hypothetical if, or until, MT reaches a level of accuracy where readers of its translations are not
immediately aware that the text was translated by a machine.
Apart from the quality of GT’s translations in a pedagogical context, a number of studies have investigated how GT can be
used as a language learning tool. Bahri and Mahadi (2016) in a questionnaire-based study asked 17 Malaysian students what
their attitudes were regarding using GT as a supplementary language learning tool. Their students collectively responded that
GT was a useful tool; additionally, students who had stronger positive feelings towards GT as a language learning tool also
scored higher in an examination used as a benchmark in the study. However, we learn little from this study about the learning
strategies used by the students. In an earlier attitudinal study of 46 Swedish students learning English, Josefsson (2011) found
that 90% of students were using GT as part of their usual practice when dealing with English and most had positive attitudes
towards it, although many complained about its inaccuracies. More recent studies (Farzi, 2016; Lee, 2019; O’Neill, 2019) have
had similar mixed findings regarding students’ use of GT.
As for instructors’ attitudes towards the use of MT, several studies have concluded that teachers are largely skeptical about
using MT in the language classroom. In one of the larger scale studies investigating students’ and teachers’ attitudes towards
MT, Clifford, Merschel, and Munne (2013) found that although students tended to find MT helpful, the 43 instructors they
questioned at an American university were “skeptical of a positive impact on language learning” (p. 116). Similarly, in a study
of 41 instructors at a Spanish university, while most teachers believed using MT to translate individual words was within
ethical bounds, most also believed using MT to translate paragraphs or passages was completely unethical (Jolley & Maimone,
2015).
4. Research inquiry
The studies described above indicate that despite improvements in MT over time, translations still include a large number
of errors. They also highlight students’ and teachers’ skepticism towards MT. However, given the recent improvements in GT
and the lack of studies on the pedagogical implications of these improvements since GT’s latest upgrade, the time appears
appropriate to explore the quality and nature of GT’s translations and consider their implications. The present study, similar to
those described above, sought to recruit learners of a second language because, as noted by others (Bahri & Mahadi, 2016;
Josefsson, 2011; Mundt & Groves, 2016), GT can be used as a language learning tool. For this reason, we designed an
exploratory study that inquired whether GT had reached a level of quality under which students could use their native tongue
(in this case Chinese) to write a passage and then use GT to translate it into English and have it pass unrecognized as a product
of MT by teachers who grade the scripts. We were also interested to know teachers’ reactions to GT translations once they
had been informed, after grading, that some of the scripts were MT-generated. Thus, the following research questions are pertinent:
1. How do teachers grade the grammar, vocabulary and comprehensibility of two sets of parallel essays composed by L2
primary students – one written in English and the other in Chinese and subsequently translated by GT into English?
2. What are teachers’ reactions upon learning that they have read MTs from students?
3. Do teachers believe GT has a role as a pedagogical tool?
4. What areas of language in the GT-generated product, if any, stand out as being unnatural, i.e., especially advanced for L2
primary students, or erroneous?
5. Method
To answer the research questions, we conducted a mixed-method study in three data collection phases: 1) collect
compositions written by Primary 6 students; 2) recruit teachers to grade the translated scripts and the English compositions; 3)
interview teachers and inquire about their general impressions of the scripts and their attitudes towards GT as a pedagogical
tool. Normal ethical procedures were followed throughout including securing parental permission and assuring anonymity.
5.1. Script collection
In the first phase, 26 English compositions and 22 Chinese compositions were collected in June 2018. All the compositions
were written by Primary 6 students from a local school in Hong Kong. All of the students were 11–12 years old and native
Cantonese speakers. At the time of their participation in the study, they had finished six years of primary school education and
were about to enter junior secondary school. They had received daily English language lessons during this time at school and
were able to write short compositions in various genres, such as narratives and descriptions. This group of students was
chosen because they had received instruction in argumentative writing in the previous year and were familiar with the genre.
Choosing from among a list of prompts that were provided by the authors in advance, the students’ teachers decided that
the prompt, “Is half-day school a good idea,” (and its Chinese equivalent, “半日制學校好嗎? 試談談你的看法”) was suitable and
interesting for students. Then, we left teachers to design the writing class in the way it is normally conducted. Before writing,
teachers reviewed the argumentative genre for 30 minutes with students. A worksheet was given to students to draft an
outline. Then, the students were given 60 minutes to complete an English composition. Several days later, the Chinese
teachers conducted a writing class in a similar way. This time, however, the students had their English composition returned
and they were told to write a Chinese composition based on the parallel Chinese prompt (above) in which they could refer to
their English composition. This design had a twofold purpose: 1) by allowing the students to view their original composition,
we felt these 11-year-old students would be more motivated to complete a task that on the surface they may have felt to be
redundant; 2) when comparing the parallel scripts, this method allowed us to identify, compare and analyze the scripts of
those students who decided to translate sentence by sentence and word by word. At the end of the first phase, 22 Chinese
compositions and 26 English compositions were collected.
5.2. Data cleaning
Before the scripts were machine translated, data cleaning was carried out on the Chinese scripts. Two kinds of mis-written
characters were found: 1) non-words (錯字) which are mistaken forms of standard Chinese characters and 2) confused words
(別字).1 Table 1 shows that nearly all the confused words were homophonic to the correct word. The correction of two
confused words, “自你能力” (literally: self-you-ability, from script 11) and “飯箱” (literally: rice-box, from script 12), was
deemed necessary because they are not commonly confused for the respective standard words. Keeping the original would
produce translations that depart from the original meaning in Chinese. Four non-words (as shown in Table 2) were also cleaned
because they could not be generated by any Chinese input methods. Thus, the corresponding standard characters were input
to fill in the gaps. In total, the cleaning rate of the Chinese text was limited to 0.093%.
1 Confused words are standard Chinese characters that are written with a similar wrong character in standard word combinations. They are usually
homophonic, e.g., when “家賓” (home-guest) is confused for “嘉賓” (honorable-guest), or in a similar formation of the correct word, e.g., when “農歷”
(lunar-history) is confused for “農曆” (lunar-calendar). Confused words are wrong because they are semantically incompatible (Li, 2004).
5.3. Script assessment
In the second phase, the 22 Chinese compositions were translated into English by GT and then randomly interspersed with
the students’ 26 English compositions for teachers to grade.
All the scripts were compiled into an online survey (https://www.surveymonkey.com/r/VGGDX3B) on SurveyMonkey
software. Twelve teachers were recruited via personal contacts for this grading exercise. Six teachers were Cantonese-
speaking English teachers and six were native English speaking teachers (called “NETs,” an official term used by the local
Education Bureau) at various primary schools in Hong Kong. When asked to describe the academic level of their school in
relative terms, the teachers claimed their school was about average, i.e., not especially high- or low-level compared to other
schools. All of the teachers had multiple years of experience grading and correcting compositions. They were given a rubric to
grade the scripts based on three criteria: grammar, vocabulary and comprehensibility. For each criterion, teachers gave a
grade of A, B, C or D, based on the rubric which had accompanying descriptors for each grade (Appendix A). They were told in
the instructions to ignore content and organization as these two elements fall outside of MT’s capabilities and could serve as a
distraction from the present study’s focus. The 12 teachers were neither told about the purpose of the study, nor were they
told that some of the scripts were translated. Teachers in the first phase of script collection were not recruited for the grading
exercise as they were aware of the study’s design and the involvement of GT.
Table 1
Confused words found in Chinese scripts.
No. | Confused word | Correct word | Chinese sentence | Meaning
1 | 小 “small” | 少 “little/few” | 學小一點知識 | learn less knowledge
2 | 小 “small” | 少 “little/few” | 太小時間 | too little time
3 | 小 “small” | 少 “little/few” | 課堂就會小一點 | fewer lessons
4 | 小 “small” | 少 “little/few” | 功課都會小一點 | less homework
5 | 工 “efforts” | 功 “task” | 做工課 | do homework
6 | 已 “already” | 以 “before” | 已前香港 | Hong Kong in the past
7 | 應 jing1 “should” | 認 jing6 “deem” | 我應為 | I think.
8 | 許 “allow” | 需 “need” | 許要休息 | need rest
9 | 培 “cultivate” | 陪 “accompany” | 培伴子女 | accompany (their) children
10 | 舒 “relax” | 紓 “relieve” | 舒解壓力 | relieve stress
11 | 你 “you” | 理 “care” | 自你能力 | self-care ability
12 | 箱 “container” | 商 “merchant” | 飯箱 | lunch supplier
Table 2
Non-words found in Chinese scripts.
No. | Non-word | Correct word
1 | (glyph not reproducible as text) | 另 ling6 “another”
2 | (glyph not reproducible as text) | 遲 ci4 “late”
3 | (glyph not reproducible as text) | 駁 bok3 “another”
4 | (glyph not reproducible as text) | 許 heoi2 “quite (many)”
5.4. Interview
In the final phase, within a few days after completing their scoring, semi-structured interviews were conducted
individually with the 12 teachers by telephone using the following guiding questions:
1. What were your general impressions of the scripts?
2. Did you notice anything unusual about the writing in the scripts?
3. (After revealing that about half of the scripts were machine translated from the original Chinese). Did you notice this
and if so, what were the signals that suggested so?
4. What are your feelings about students using GT? Do you think it has a place in L2 pedagogy?
A total of 174 min of recordings were collected and then transcribed by one author. The transcriptions were then
summarized and sent to the respective teachers to check for accuracy and further comment.
5.5. Data analysis
To answer the first research question, we needed to transform the raw, ordinal data into scalable measurements to make
statistical comparison possible. Although a marking rubric was provided, we realized that every teacher would grade scripts
somewhat differently based on their own interpretation of the rubric’s descriptors. To solve these two issues, Rasch modelling
was applied. Rasch measurement transforms ordinal grades, e.g., “A” to “D” in this case, into logits which enables researchers
to compare teachers’ grades on a linear scale (Boone, Staver, & Yale, 2013). The Rasch model also checks for misfits. Raters (our
grading teachers in this case) who are too lenient or strict are indicated by an out-of-range infit mean square. When such
cases occur, the rater’s scores are excluded, and the data can be re-run to generate sufficiently reliable grades for the purpose of
a study. Then, an independent-sample t-test can be conducted to determine whether there is any statistical significance.
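To make the logit transformation concrete, the following is a minimal sketch of how a rating scale Rasch model with two facets turns a script measure and a rater severity into probabilities for four ordered grade categories. It uses the standard textbook form of the model, which is not necessarily the exact parameterisation applied by the Facets software in this study. The function name, the threshold values and the example measures are all hypothetical, and the mapping from category index to letter grade is deliberately left open.

```python
import math

def category_probabilities(script_measure, rater_severity, thresholds):
    """Return P(category k), k = 0..len(thresholds), under a rating scale Rasch model.

    All inputs are in logits. A more severe rater (larger rater_severity)
    lowers the effective measure of every script that rater grades.
    """
    theta = script_measure - rater_severity
    # Cumulative sums of (theta - tau_k); the lowest category has an empty sum of 0.
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical step thresholds for a four-category scale (three steps).
HYPOTHETICAL_THRESHOLDS = [-1.0, 0.0, 1.0]

# The same script graded by a lenient rater (-1 logit) and a severe rater (+1 logit).
lenient = category_probabilities(0.0, -1.0, HYPOTHETICAL_THRESHOLDS)
severe = category_probabilities(0.0, 1.0, HYPOTHETICAL_THRESHOLDS)
neutral = category_probabilities(0.0, 0.0, HYPOTHETICAL_THRESHOLDS)
```

Because rater severity is separated out in this way, two scripts graded by different teachers can still be compared on one linear scale, which is the property the analysis relies on.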
This study adopted a two-faceted design, raters (teachers) and items (scripts). The software Minifac (Facets) Rasch was
used to generate the Rasch scores. Then, these scores were extracted and input into SPSS to perform a significance test. Rasch
measurement was repeated three times to generate data for the three criteria: grammar, vocabulary and comprehensibility.
For the second and third research questions enquiring about the teachers’ reaction to the GT-translated scripts and their
beliefs about students’ use of GT, the transcripts were reviewed by the first author, which led to four codes being generated.
These codes aligned closely with the interview questions, except for the final code, which emerged from the coding process
(List 1).
List 1 Codes emerging from the interviews
1) Beliefs about the general quality of the scripts (before being informed about GT)
2) Reaction upon being informed about GT
3) Beliefs about using GT in the classroom or as a pedagogical tool
4) Beliefs about the accuracy of GT
Once these main themes or codes were established, the second author, after a training session on applying the codes,
independently coded three of the 12 transcripts. Agreement was numerically calculated by noting agreement/disagreement
at the level of each individual exchange between the interviewer and interviewee over the total number of codes assigned
under each of the four codes. Agreement between the two coders (first and second authors) on those three transcripts
reached a level of 70%. To further ensure a satisfactory level of reliability, after further training, two more transcripts were
coded by the second author which resulted in the level of agreement rising to 87%. This level satisfies standard norms;
Smagorinsky (2008, p. 401) suggests reaching a level above 80% over 15% of the data. Differences were resolved via discussion
between the two authors.
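The agreement figures above can be illustrated with a simple percent-agreement calculation. This is a sketch only: it assumes agreement was counted exchange by exchange as matching or non-matching codes, which is one reading of the procedure described, and the code sequences below are hypothetical rather than taken from the transcripts.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of exchanges to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same exchanges.")
    if not codes_a:
        raise ValueError("At least one coded exchange is required.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)

# Hypothetical codes (1 to 4, as in List 1) for ten interview exchanges.
first_coder = [1, 1, 2, 3, 3, 4, 2, 1, 4, 3]
second_coder = [1, 1, 2, 3, 4, 4, 2, 1, 4, 2]
agreement = percent_agreement(first_coder, second_coder)

# Share of the data that was double-coded: five of the 12 transcripts.
double_coded_share = 100.0 * 5 / 12
```

Five of 12 transcripts is roughly 42% of the data, comfortably above the 15% that Smagorinsky (2008) suggests.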
To answer the fourth research question, we reviewed the GT-translated scripts for instances of what we deemed both
exceptionally advanced grammar and vocabulary for the Primary 6 learners of English, as well as instances of misuses or
errors.
6. Results
6.1. RQ1
Table 3 shows teachers’ grading in Rasch scores. One local and one NET teacher’s scores in grammar fell outside the
acceptable limit of fits which is between the range of MnSq 0.5 and 1.5 (Boone et al., 2013: 166), with infit mean squares of 1.7
and 0.4 respectively. This indicated that their scores did not fit the model; thus, those scores were excluded and the analysis
was processed again to generate the results shown in Table 3. The fit statistics of the three data sets (grammar, vocabulary and
comprehensibility) show that the teachers’ scores can be taken as sufficiently reliable for the purposes of this study. As shown
in Table 3, teachers differed in leniency in each data set. The difference for the grammar scale was larger than the other two
scales, with the strictest marker at +2.13 logits and the most lenient at -1.91 logits.
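The exclusion rule described above amounts to a range check on each rater's infit mean square. The sketch below applies the 0.5 to 1.5 range cited from Boone et al. (2013). The rater labels are hypothetical. The two out-of-range values are those reported in the text, while the remaining entries reuse grammar infit values from Table 3 (estimated after the rerun) purely as in-range examples.

```python
# Acceptable infit mean-square range cited in the text (Boone et al., 2013).
INFIT_MIN, INFIT_MAX = 0.5, 1.5

def misfitting_raters(infit_by_rater):
    """Return the labels of raters whose infit mean square is out of range."""
    return sorted(
        rater
        for rater, infit in infit_by_rater.items()
        if not (INFIT_MIN <= infit <= INFIT_MAX)
    )

# Hypothetical labels; 1.7 and 0.4 are the misfitting values reported in the text.
grammar_infit = {
    "local_excluded": 1.7,
    "net_excluded": 0.4,
    "rater_2": 1.24,
    "rater_3": 0.71,
    "rater_4": 1.38,
}
flagged = misfitting_raters(grammar_infit)
```

Raters flagged in this way are dropped from the affected scale and the model is estimated again, as was done for grammar.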
Table 4 compares the scores given by local English teachers and NETs to see if there was an inter-group difference. In all
three scales, t-test results showed there were no significant inter-group differences (p > .05). Moreover, the effect sizes were
small, as indicated by d < 0.3. As the results show, local teachers and NETs judged the scripts similarly in grammar, vocabulary
and comprehensibility.
The table in Appendix E contains the Rasch scores for each script. Data in this table were extracted and entered into SPSS to
perform an independent-sample t-test. The results of the t-test are shown in Table 5.
Table 5 shows non-GT script scores have positive mean measures in both grammar and vocabulary while those for GT
scripts are negative. The positive logits indicate that the teachers graded non-GT scripts lower on average, and the negative
logits mean that teachers gave GT scripts higher grades. The difference for grammar was significant (t(46) = 3.79, p = .000)
and the effect size was large (d = 1.105). Therefore, teachers considered GT-scripts as significantly better than non-GT scripts
in grammar. Although the difference for vocabulary was not significant (t(46) = -1.98, p = .054), the effect size was medium
(d = 0.567).
As the t-test results on comprehensibility show, non-GT scripts had a lower mean measure and they were scored higher.
However, the difference was not statistically significant (t(46) = .093, p = .926). Moreover, the effect size was small
(d = 0.028). Thus, GT and non-GT scripts seemed equally comprehensible to teachers.
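The effect sizes and t statistics in Table 5 can be approximately recovered from the reported means and standard deviations. The sketch below assumes a pooled-variance independent-samples t-test with 22 GT and 26 non-GT scripts, and assumes the GT means for grammar and vocabulary and the non-GT mean for comprehensibility are negative, as the text describes. Function names are illustrative, and only magnitudes are compared because the sign of t depends on the order of subtraction. Results agree with the published figures to within rounding of the reported means.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Absolute standardized mean difference using the pooled SD."""
    return abs(m1 - m2) / pooled_sd(sd1, n1, sd2, n2)

def t_statistic(m1, sd1, n1, m2, sd2, n2):
    """Absolute pooled-variance t statistic for two independent groups."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return abs(m1 - m2) / standard_error

N_GT, N_NON_GT = 22, 26  # script counts from the Method section

# (GT mean, GT SD, non-GT mean, non-GT SD) from Table 5; negative signs are an
# assumption based on the accompanying text.
table5 = {
    "grammar": (-0.78, 1.24, 0.67, 1.38),
    "vocabulary": (-0.40, 1.41, 0.34, 1.19),
    "comprehensibility": (0.02, 1.21, -0.01, 0.95),
}

results = {
    scale: (
        cohens_d(m_gt, sd_gt, N_GT, m_non, sd_non, N_NON_GT),
        t_statistic(m_gt, sd_gt, N_GT, m_non, sd_non, N_NON_GT),
    )
    for scale, (m_gt, sd_gt, m_non, sd_non) in table5.items()
}
```

With 46 degrees of freedom, a t of about 1.98 sits just short of the two-tailed .05 threshold, which is why the vocabulary difference is reported as non-significant despite a medium effect size.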
6.2. RQ2 and RQ3 interview findings
Findings from the interview data are presented here with illustrative excerpts based on the four codes. In themes where
differences were noted between the local and NET teachers, distinctions are made; however, for the most part, few important
differences were observed between the two groups of teachers.
6.2.1. Beliefs about the general quality of the scripts (before being informed about GT)
The main, albeit underlying, purpose for asking the teachers for their general impressions of the scripts was to ascertain
whether they had recognized that GT had been used to generate some of the scripts. Among the 12 teachers (pseudonyms
used throughout), only two mentioned GT without prompting. NET Ryan volunteered, “[there were] phrases I’ve never come
across before. And I think, when I looked at some of them ... what I notice is that a lot of them are using Google Translate.”
Table 3
Leniency-severity level of teachers (N = 12).
Each scale reports Measure (logits), Model Error and Infit MnSq.
Rater | Group | Grammar: Measure | Grammar: Model Error | Grammar: Infit MnSq | Vocabulary: Measure | Vocabulary: Model Error | Vocabulary: Infit MnSq | Comprehensibility: Measure | Comprehensibility: Model Error | Comprehensibility: Infit MnSq
1 | LET |  |  |  | -0.10 | 0.25 | 0.75 | +1.60 | 0.23 | 1.03
2 | LET | -0.24 | 0.23 | 1.24 | -0.22 | 0.25 | 1.17 | +0.18 | 0.22 | 1.09
3 | LET | +0.48 | 0.23 | 0.71 | +0.20 | 0.25 | 0.91 | +0.41 | 0.22 | 1.10
4 | LET | -1.91 | 0.26 | 1.38 | -1.25 | 0.25 | 1.13 | -0.60 | 0.22 | 1.13
5 | LET | +2.13 | 0.25 | 1.01 | +1.10 | 0.25 | 0.77 | +2.12 | 0.25 | 1.39
6 | LET | +0.02 | 0.23 | 1.15 | +0.74 | 0.25 | 1.32 | +0.41 | 0.22 | 0.94
7 | NET | -0.94 | 0.24 | 1.16 | -0.58 | 0.25 | 1.18 | +0.59 | 0.22 | 1.36
8 | NET | +0.43 | 0.23 | 0.74 | +1.16 | 0.25 | 1.39 | +0.88 | 0.22 | 0.80
9 | NET | -0.08 | 0.23 | 0.85 | +0.62 | 0.25 | 0.67 | +1.13 | 0.23 | 0.90
10 | NET | +0.23 | 0.23 | 0.67 | +0.98 | 0.25 | 0.67 | +0.69 | 0.22 | 0.53
11 | NET | +1.11 | 0.23 | 1.04 | +1.66 | 0.25 | 1.20 | +0.64 | 0.22 | 0.95
12 | NET |  |  |  | +1.35 | 0.25 | 0.59 | +1.49 | 0.23 | 0.47
M |  | +0.12 | 0.24 | 1.00 | 0.10 | 0.25 | 0.98 | 0.79 | 0.22 | 0.97
SD |  | 1.03 | 0.01 | 0.23 | 0.22 | 0.25 | 0.27 | 0.69 | 0.01 | 0.27
Table 4
Comparison of scores given by local English teachers and NETs.
Scale | Teacher group | N | M | SD | t | df | p | Cohen’s d
Grammar | Local | 5 | +0.10 | 1.45 | .074 | 8 | .943 | 0.043
Grammar | NET | 5 | +0.15 | .75 |  |  |  |
Vocabulary | Local | 6 | +0.08 | .82 | 1.690 | 10 | .122 | 0.087
Vocabulary | NET | 6 | +0.87 | .79 |  |  |  |
Comprehensibility | Local | 6 | +0.69 | 1.00 | .503 | 10 | .626 | 0.280
Comprehensibility | NET | 6 | +0.90 | .35 |  |  |  |
Likewise local teacher Mandy immediately stated, “After I marked it - after I graded a few of them, I realized that ... you’re
doing something like [using] a translation engine, or like Google Translation.”
In explaining her suspicions about the use of GT, Mandy correctly noted one script used the phrase “talk about the sky”
(original sentence: “Just like playing chess games together, talking about the sky, [an idiomatic expression in Cantonese
meaning “making idle chit-chat”] can make each other’s relationship better.”), which did not make sense to her in English.
However, the Chinese lexical equivalent, “聊天”, occurred to her, and she realized it was a direct translation for “chatting.” She
also noted a student’s use of the term “small homework.” She guessed students might have mis-typed the Chinese word for
“little” (少) (for quantity) as “small” (小) (for size), and as the phrase was entered into GT, “small homework” was generated.
Mandy also noted some odd grammatical structures, which triggered her to suspect the use of GT. NET Ryan also noted
strange phrases like “live to be old and learn to be old” and “school rice merchants” and wondered how Primary 6 students
had come across these phrases, and thus suspected the use of GT.
However, apart from these two, the remaining 10 teachers focused on technical aspects of the texts, particularly the content,
which they were again told was not meant to be part of the assessment. Aside from the teachers’ comments on the content,
which were eliminated from the analysis, the majority of these teachers commented that the overall quality of the scripts
ranged from “typical” (George, Gary, Ryan and Doris) of the level they are familiar with, to “very good” (Rosa, Megan and Steve)
and “impressive” (Justin). It should be noted here that these comments were made about all 48 scripts in general, indicating
that the teachers were unaware, at that point, that there were two distinct sets of scripts – GT-translated and non GT-translated.
Notable in the remarks coded under the “beliefs about the quality” theme, were two teachers besides Mandy and NET Ryan
who mentioned what they thought were expressions translated directly from Chinese. Rosa, for example, noted, “I can
comprehend [the text]. I know they’re translating [in their minds] from Chinese to English.” NET George stated, “it looks like
they’ve been given words that looked at the direct translation and then used the vocabulary incorrectly.” In these two
examples, however, it is uncertain whether their references are to GT-translated texts or not.
6.2.1.1. Vocabulary. Teachers commented with widely divergent views on the range and quality of vocabulary in the scripts, as
well as lexical mistakes made. For example, NET Justin thought the vocabulary range was wide. Among local teachers,
Michelle thought students could write with basic words only, lacking variety. Rosa was impressed to find advanced
vocabulary was being used in five to 10 scripts. In her class, she claimed only two to three students (out of roughly 25) were able to
use such words. However, Gary found some advanced words were situated in poorly constructed sentence structures.
6.2.1.2. Grammar. Generally speaking, the teachers as a whole believed the grammar in the scripts was either at an average or
above-average level compared to their own students. Rosa, for example, reported she did not find many grammatical mistakes
in the scripts. She could see students had a good understanding of basic grammar rules. NET George spotted a few good
compositions with flawless grammar. NET Justin thought the grammar was “pretty good generally.” However, he said there
was considerable disparity in the grammar level. He thought it was due to differences in students’ English ability. NET Steve
found there were some grammatical problems, but he thought they were minor as they did not hinder comprehensibility.
6.2.1.3. Comprehensibility. Local teachers had no problem understanding the scripts. Mandy pointed out that local teachers
may comprehend students’ compositions better than the NETs since many of the sentences resembled direct translations
from Chinese. However, most of the NETs did not report any difficulty with comprehensibility. NET Steve did not give the
lowest grade, D, to any scripts in terms of comprehensibility, although NET Martin found some scripts were ambiguous
because the context and connections between sentences were sometimes lacking, as the writers presumed readers would know what
they were arguing about.
6.2.2. Teachers’ reaction upon being informed about GT
Apart from the two teachers who suspected GT was used for some of the scripts, five local teachers and five NETs did not
suspect the use of GT during the grading exercise. After the methodology of the study was revealed, their reactions ranged
from “surprised” (George and Martin) to “amazed” (Doris) to “shocked” (Megan), although this knowledge appeared to trigger
latent thoughts. NETs Martin and Doris did not suspect the use of GT because they had noticed that their own students would mentally
translate Cantonese to English in a literal manner, rather than formulating their ideas in terms of English. NET Justin recalled
some irregularities (e.g., a disparity in grammar level, strange phrases, and colloquialisms being used in the wrong context),
but he thought these mistakes were also typical of his own students’ untranslated writing.
Table 5
Comparison between non-GT and GT scripts on grammar, vocabulary and comprehensibility.
Scale              Scripts   M       SD     t      df   p      Cohen’s d
Grammar            GT        −0.78   1.24   3.79   46   .000   1.105
                   Non-GT    +0.67   1.38
Vocabulary         GT        −0.40   1.41   1.98   46   .054   0.567
                   Non-GT    +0.34   1.19
Comprehensibility  GT        +0.02   1.21   .093   46   .926   0.028
                   Non-GT    −0.01   .95
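The effect sizes in Table 5 can be checked against the reported means and standard deviations. The following is a minimal sketch rather than the authors' analysis script. It assumes that Cohen's d was computed with the unweighted root-mean-square of the two standard deviations, that the means carry the signs written in the code, and that the group sizes (22 GT and 26 non-GT scripts) can be read off the script lists in Appendix B. The function names are hypothetical.

```python
from math import sqrt

# (mean, SD) per scale as reported in Table 5: GT first, then non-GT.
TABLE_5 = {
    "Grammar": ((-0.78, 1.24), (0.67, 1.38)),
    "Vocabulary": ((-0.40, 1.41), (0.34, 1.19)),
    "Comprehensibility": ((0.02, 1.21), (-0.01, 0.95)),
}

# Assumption: group sizes counted from the script lists in Appendix B.
N_GT, N_NON_GT = 22, 26


def cohens_d(gt, non_gt):
    """Absolute effect size using the unweighted RMS of the two SDs."""
    (m1, s1), (m2, s2) = gt, non_gt
    return abs(m1 - m2) / sqrt((s1 ** 2 + s2 ** 2) / 2)


def t_statistic(gt, non_gt, n1=N_GT, n2=N_NON_GT):
    """Absolute Student's t with a size-weighted pooled variance."""
    (m1, s1), (m2, s2) = gt, non_gt
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return abs(m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))


# Map each scale to (Cohen's d, t), rounded for comparison with Table 5.
results = {
    scale: (round(cohens_d(gt, non), 3), round(t_statistic(gt, non), 2))
    for scale, (gt, non) in TABLE_5.items()
}

for scale, (d, t) in results.items():
    print(f"{scale}: d = {d}, t = {t}")
```

Under these assumptions the sketch gives d values of 1.105, 0.567 and 0.028, which match Table 5, and t values close to the reported ones. Small differences in t are expected because the published means and standard deviations are rounded to two decimals.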
P. Stapleton, B. Leung Ka Kin / English for Specific Purposes 56 (2019) 18–34
Upon being informed about the GT scripts, several NET teachers quickly recalled having seen some irregularities that made
them suspect the use of GT, although such irregularities may not necessarily have been in the GT scripts. NET Steve, for
example, suspected some ungrammatical sentences such as, “They’re at home can play games” and “half-day school have less
lessons,” might have been generated by GT. Both instances recalled by NET Steve were in fact written in English, as
confirmed by a later search, although this does not mean that similar mistakes could not be found in GT scripts. NET
Justin also suspected GT could be responsible for some grammar errors, strange phrases and colloquialisms used in the wrong
context. “Now that you’ve told me Google Translate [was used], I’m thinking back and I can see how some of the scripts were
probably translated through it.” In retrospect, NET George was the only teacher who suspected advanced words, rather than
erroneous phrases, were Google translated. “All of a sudden they have this word that seemed out of place that was very –
quite an advanced word for their writing. My thought was that they probably would have used Google Translate.”
Among local teachers, Michelle recalled that some scripts seemed like direct translations from Chinese and she thought
those might have been generated by GT. Rita also found strange phrases in some scripts; however, GT did not cross her mind
because she thought those mistakes were normal even for her own students. “I didn’t notice, but I did see some strange
phrasing in sentences, [which] is quite normal for students... [but] I didn’t have the thought that it’s Google translated.” After
the GT element was revealed to him, Gary reported that he guessed students might have gotten help from online translators
or from teachers, but it had not occurred to him that any whole script had been machine translated.
These remarks from the teachers, taken as a whole, suggest that the GT-translated scripts did not really stand out from
those written in English by the students.
One point to note is that although the majority of teachers could not distinguish GT-translated scripts from the English scripts
written by students, it did not necessarily mean that the teachers felt the translations were better. Megan suggested that GT scripts
could be as flawed as students’ writing. Mandy had the impression that machine translators and students both tended to commit
the same grammatical mistakes, such as subject-verb agreement, confusion between singular and plural nouns, and tenses.
6.2.3. Teachers’ beliefs about using GT in the classroom or as a pedagogical tool
All of the teachers clearly stated that GT should not be used by students to translate passages or sentences they had written
in their L1. The following excerpts provide a flavor of this belief:
NET Martin: If they’re kind of copying whole passages into [GT], they should be trained to - ideally not to use it.
Gary: I don’t like it when I give them a writing, and they translate the whole thing and give it back to me.
On the other hand, nine out of 12 teachers were not against GT as a learning tool. From his observations, Gary deemed it
normal for students to use machine translators and he appreciated it when GT was used for strengthening students’ language
skills. NET Justin described GT as “a powerful educational tool.” NET George also thought GT was “a great tool.” He thought
technology was something students in this generation were privileged to have and he considered it “silly” not to use it at all.
Mandy thought online translators were acceptable to be used as a tool and she claimed to be delighted to see students make
use of many tools available to them for their own good. “Using it like a tool, I would encourage students to do it because we no
longer rely on paper dictionaries, and if they have the motivation and the will to use some – any tools available to them, then I
will be more than happy.”
Although the teachers generally were in favor of students using online translators to assist learning, they were cautious
about the extent to which GT should be exploited as a learning tool. For many of them, the line should be drawn at the point where
students could usefully benefit. Three teachers (Mandy, Gary, NET Martin) thought the use of GT should be confined to the
word level. For example, Mandy and Gary thought it was acceptable for students to translate single words that they did not
know and put them back into sentences. NET Martin thought it was fine if students had the words in their mind and used GT
to check whether they were correct.
Although NET Justin thought GT was a “powerful tool,” he claimed to be “conflicted” about the use of GT, a belief noted by
several other teachers. He thought it could only be beneficial if it were used correctly. Students had to know the grammar
rules and how to correct the errors generated by GT. Yet, he believed if students were to become proficient in English writing,
they had to be able to “do it from scratch” (aside from using GT for noticing and correcting errors). Two teachers (NET Justin
and Rosa) thought the use of GT should depend on students’ language proficiency. NET Justin claimed GT could be useful as an
initial step for beginners. However, Rosa opposed students of lower ability using GT. For some teachers, the boundaries for
acceptable use of GT depended on the purpose of learning. NET Steve would be against using GT if the purpose was to check
students’ grammar, spelling and sentence structure. If the purpose was to see if students could present their ideas in a
reasonable and logical manner, however, he thought GT could be applied. NET Doris held a similar stance. She thought it was
acceptable for students to use GT to “get an idea.” However, she still worried students would become reliant on GT.
Some teachers suggested that schools and teachers should provide guidance for students to ensure GT was properly used
for learning. NET Martin thought that students should “get some English out of GT if they were to use it.” In order to achieve
this, he believed schools and teachers should train students to check translation results, change them and edit them. Similarly,
Rosa thought students should use what they have learnt in class to check GT translation results. NET Ryan suggested GT could
be used by teachers to discuss mistakes and to help students improve their English. In other words, teachers could explain
why a translated sentence is wrong, how it is wrong and how students can improve it.
On the other hand, three local female teachers (Megan, Michelle and Rita) were completely against students’ using GT.
Their beliefs are best summarized by Rita, who was opposed to the use of GT even for single-word translations. “I would not
recommend... students to use Google Translate even just to know the meaning of an English word in Chinese because there
are Chinese-English dictionaries online.” Rita and Megan worried the students would not be motivated to learn English if they
could get immediate translations. Michelle thought students would copy from GT rather than thinking about how to write in
English. Megan claimed GT made it too easy for students to get translations and this would discourage students from learning
English by themselves. This negative view of GT was partly due to doubts about its ability to generate accurate translations.
6.2.4. Beliefs about the accuracy of GT
Many of the teachers volunteered comments about the accuracy of GT’s translations. Among those who commented, eight
teachers firmly held the view that GT has too many inaccuracies to be trusted. NET Martin’s belief largely summarizes those of
the other seven: “You’re not going to get a direct translation that’s comprehensible to native speakers through platforms like
GT.” George added, “the technology definitely has a long way to go.” However, upon having it brought to the 10 teachers’
attention that they had not noticed that half of the scripts were GT-translated, some teachers were reflective. Justin
commented, “I think Google Translate is fairly powerful right now.” Steve admitted that “perhaps, there have been improvements
made in Google Translate in recent years.” Most teachers, however, held to their original beliefs that GT’s translations were
inaccurate.
In sum, despite indications by most of the teachers that GT could be used as a tool, all teachers had concerns about GT
negatively affecting students’ learning. In general, they worried about the inability of GT to generate grammatically correct
translations, and further, that GT’s convenience would lead students astray. They also worried that students would use it as a
“shortcut” (NET Martin) to get translations right away rather than going through the language learning process. Yet over half
of them saw that GT had a place in L2 teaching. Instead of banning the use of GT outright, some of them suggested training on
the correct use of GT should be provided in schools so that students could benefit from it.
6.3. Areas of language in the GT generated product that stood out
Tables 6 and 7 show instances of vocabulary and grammar found in 12 different GT scripts that we deemed at a level
normally more advanced than would typically be expected for Primary 6 L2 students (taken from 11 different scripts). In the
case of advanced grammar, we especially noted the use of participial clauses (Table 7).
However, despite these examples where GT’s translation appeared to correctly enhance the students’ English, in other
places, errors were made, even when the students’ original Chinese was correct. Table 8 shows examples of these
(questionable translations in italics).
7. Discussion
7.1. Addressing the research questions
The recent advance in the quality of translations generated by MT, particularly GT, was the impetus for the present study.
We surmised that if MT-translations did not stand out as having inferior quality when interspersed among a set of parallel
compositions originally written in the target language, it would provide evidence that MT may be reaching a higher level of
quality than that noted in recent studies, although we are aware that our study was conducted in very confined
circumstances, i.e., relating to students’ level, language and topic. Based on both the quantitative and qualitative results, our
supposition appears to have been confirmed.
Our study was driven by four research questions, all concerning the quality and nature of GT’s product focusing on scripts
written in Chinese by primary school students and then translated into English. The first research question in some senses
was a proxy for another broader question, namely, whether GT’s translations would pass unrecognized as MT. Given that the
grades from teachers (both local and NET) on the GT translations were not significantly different from those of the students’
scripts written in English, and where they were, the GT scripts were actually scored higher, we conclude that the mechanical
inaccuracies and lexical errors reported in earlier studies on GT may no longer be appearing at the same rate, at least for the
level, genre and the two languages involved in the present study.
Table 6
Instances of vocabulary deemed exceptionally advanced.
Original Chinese      GT generated English                  Script #
在課堂上打瞌睡          doze off in class                     15
實施全日制             implement (full-time)                 03
自由度高               high degree of freedom                07
分配時間               allocate time                         08
緩解(. 壓力)           alleviate                             27
跟父母很疏遠            Alienate from their parents           29
*保充體力              Replenish your energy                 13
減少家長負擔            Reduce the burden on their parents    33
Further underscoring this point, and referencing our second research question, only two out of 12 teachers suspected
the use of MT. In other words, broadly speaking, the GT-translated scripts appear to have reached a comparable level (and
in the case of grammar, a higher level) to those written in English. However, despite this, it appeared that several teachers,
both NET and local, continued to believe that GT is still generating poor translations, indicating their beliefs may not have
been updated from earlier weaker versions of GT, and MT in general, which have prevailed for decades.
Our third research question regarding teachers’ beliefs about using GT as a pedagogical tool led to mixed reactions. While
three of the 12 teachers thought GT had no place at all in the classroom, the others saw varying uses. However, all the teachers
drew a firm line against students using GT to generate translations from writing in their native tongue beyond the single word
level. This latter finding aligns with other studies (Kirchhoff et al., 2011; Sheppard, 2011; van Rensburg et al., 2012) that cast
doubt on the quality of MT’s output. Although some of the teachers in the present study had similar reservations about the
quality of GT’s output, another concern was the larger threat posed by advancing artificial intelligence. An undercurrent in the
interviews was a deep-seated resistance towards GT because its use could undermine the students’ motivation and reason to
learn how to write in English, and by extension, even have a negative effect on their chosen teaching career.
Our fourth question, which led us to investigate areas of GT’s output that were either impressively good or error-ridden,
resulted in conflicting findings. There were instances of vocabulary and grammatical constructions generated by GT that
were probably more advanced than the student authors could have produced when writing in English, and this was
confirmed by back checking the students’ original scripts written in English. To cite just one example, in script 15, the
student wrote in English, “[Students] will be tired doing the lessons.” However, the same student’s Chinese was translated
by GT as “Students will doze off in class.” This latter sentence translated by GT from the student’s native tongue presumably
better captures the nuanced meaning the student intended. This finding suggests that at least in some cases, the process of
writing in the native language and then machine translating into English resulted in not only a correct translation, but also
one with more advanced language and nuanced meaning than the students would normally be capable of when writing in
English. This apparent improvement is underscored by the significantly higher scores given by the teachers to the GT scripts
for grammar.
On the other hand, some GT translation errors could be viewed as constructive feedback for improving GT. A few lexical
errors resulted from the mistranslation of Chinese homonyms. For example, “改” carries two different meanings: 1) change
“改變” (verb-object compound); and 2) mark (homework) “批改” (compounding verbs). In script 6 (as referred to in Table 8),
ambiguity arose as the homonym “改” was written with neither a post-modifying object nor a pre-modifying compounding
verb. Despite the presence of “workbook” (習作簿) as a contextual clue, GT mis-selected the translated word, “change,” over
“mark.” This exposed one shortcoming of assigning priority to probability over contextual understanding when machine-
translated choices are made.
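The contrast between probability-driven and context-aware word choice can be pictured with a toy example. This is purely illustrative and is not a description of how GT works internally: the probabilities, the cue words and the function names are all invented for the sketch. It only shows why a choice based on overall likelihood alone would select “change” for “改”, while a choice that gives weight to a nearby noun such as “習作簿” (workbook) would select “mark.”

```python
# Hypothetical prior probabilities for translating the bare verb 改.
PRIOR = {"change": 0.7, "mark": 0.3}

# Hypothetical context cues: nouns that make one sense more plausible.
CONTEXT_CUES = {
    "mark": {"習作簿", "功課", "試卷"},
    "change": {"主意", "計劃"},
}


def pick_by_prior():
    """Choose the sense with the highest prior probability, ignoring context."""
    return max(PRIOR, key=PRIOR.get)


def pick_with_context(sentence, boost=5.0):
    """Multiply a sense's score by `boost` when one of its cue nouns appears."""
    scores = dict(PRIOR)
    for sense, cues in CONTEXT_CUES.items():
        if any(cue in sentence for cue in cues):
            scores[sense] *= boost
    return max(scores, key=scores.get)


sentence = "老師可以在下午時間改同學的習作簿"
print(pick_by_prior())              # context is ignored
print(pick_with_context(sentence))  # the workbook cue is taken into account
```

With these made-up numbers, the prior-only choice returns “change” and the context-weighted choice returns “mark,” mirroring the error observed in script 6.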
Table 7
Instances of grammar deemed exceptionally advanced.
Script 06
Chinese: 老師可以在下午時間改同學的習作簿和準備下一天的課堂內容, 不用在晚上做。
English: The teacher can change the classmate’s workbook and prepare the next day’s class content in the afternoon, without having to do it at night.
Script 07
Chinese: 這樣的學習制度自由度高, 讓我們在課堂上學習更專注.
English: This kind of learning system has a high degree of freedom, allowing us to learn more in the classroom
Script 08
Chinese: 雖然有很多人覺得全日制較好, 因為可以學多點東西和回家不知可以做什麼。
English: Although many people feel that full-time is better, because they can learn more things and go home, I don’t know what to do.
Script 09
Chinese: 還有, 半日制學校的課程較少, 這正正能給學生一個機會來讓他們自學, 養成主動學習、追求新知識的習慣。這能令他們即使在長大後仍然持續學習.
English: Also, there are fewer courses in half-day schools, which is a good opportunity for students to self-study and develop the habit of actively learning and pursuing new knowledge.
Script 27
Chinese: 如果是半日制學校, 課時不夠會導致學生不能完全深入地理解課堂內容, 所有知識都只不過是一知半解。
English: If it is a half-day school, insufficient class time will result in students not being able to fully understand the content of the class. All knowledge is only a half-baked.
Table 8
Instances of GT translation errors.
Script 01
Original Chinese: 很多學生都希望自己的學校採用半日制上課時間。
GT generated English: Many students want their school to use half-day class time.
Script 04
Original Chinese: 有些人又說, 半日制給同學太多時間, 他們會去玩遊戲而不是溫習。
GT generated English: Some people say that half-day gives students too much time, they will go to play games instead of reviewing.
Script 06
Original Chinese: 老師可以在下午時間改同學的習作簿...
GT generated English: The teacher can change the classmate’s workbook ... in the afternoon.
Script 14
Original Chinese: 半日制學校有很多好處
GT generated English: Half-day schools have many benefits.
Script 31
Original Chinese: 但如果是全日制學校, 老師單是開會也要開到七至八點, 還有可能在晚一點。
GT generated English: However, if it is a full-time school, the teacher will only open seven to eight points in the meeting, and it may be late.
Script 31
Original Chinese: 連睡眠時間都不足, 又怎麼會有精神去上課呢?
GT generated English: Even if the sleep time is not enough, how can there be a spirit to go to class?
Certain Cantonese sentence structures also appeared to be difficult for GT to translate into English, even though the
students’ Chinese was correct. In the following example from script 31, mistranslation resulted from the verb repetition
construction of a verb-object (V-O) compound.
(1) 老師 單是 開會 也要 開 到 七至八點.
Teachers only open-meeting (emphatic particle) open until 7 to 8 o’clock.
Meaning: “Just for meetings alone, teachers have meetings until 7 to 8 o’clock.”
GT Translation: “the teacher will only open seven to eight points in the meeting”
The verb “開會” is a V-O compound in which “開” and “會” carry the meanings of “open” and “meeting” respectively. When
combined as a V-O compound, “開會” has a different resulting meaning of “have a meeting,” as with many other V-O
compounds in Cantonese (Matthews & Yip, 1994). In the emphatic sentence structure “verb 也要 verb 到七至八點,” the verb is
repeated in the construction to emphasize the prolongation of the event up to a certain point in time. If the verb is a V-O
compound, as in our case, only the verb component is repeated. This partial reduplication of “V-O 也要 V 到七至八點,” however,
caused parsing difficulties for GT because the words “開會” and “開” are seen as two separate words by the machine, rather
than a verb repetition construction. This also affected its analysis of the complement “到七至八點” (until 7 to 8 o’clock). Thus,
a nonsensical translation was generated.
The fact that Cantonese existential sentences and possessives are both introduced by the word “有” and have the same
sentence structure of “noun + 有 + noun” (Matthews & Yip, 1994) makes them difficult for GT to distinguish. The
following examples from our dataset show that GT tended to translate both sentences into possessives.
(2) Script 14: 半日制學校有很多好處 (existential)
Meaning: “There are many benefits with half-day schooling.”
GT Translation: “Half-day schools have many benefits.”
(3) Script 29: 家長們可以有更多時間去陪伴子女 (possessive)
Meaning and GT translation: Parents can have more time to spend with their children
Since the subject “half-day schooling (半日制學校)” in (2) was inanimate, pairing it with a possessive verb seems
inappropriate, although the comprehensibility of the sentence is not hindered. Similar mistakes were also commonly found in the
students’ scripts when they wrote in English: “One day have 24 h” or “it will just have less homework.”
7.2. Implications
The implications of the broad improvements in GT are difficult to determine at this early stage. Groves and Mundt (2015)
speculate that MT will not replace second language acquisition in the near future. However, they claim that any regulation of
MT “to conform to a previously held world view” (p.119) will not succeed. We agree. Preventing students who are learning foreign
languages from using GT outside classrooms (or even inside) to translate their assignments from their native tongue may become
increasingly fraught with difficulties. Where we disagree with Groves and Mundt (2015) is in their claim that MT is unlikely to be
able to cope with discourse features, such as hedging. Certainly capturing nuances across languages is challenging, if not
impossible in some cases. However, because MT uses existing and growing banks of texts that have been generated by humans,
there should be no reason why MT-generated results cannot be as good as human translations in the coming decade or two.
A greater challenge may be related to language student motivation. If accurate translations into the target language are
instantly available to students upon entering text in their native tongue, the motivation to learn to write (and also read) in a foreign
language could decline. Similar advances in the past that have simplified arduous cognitive tasks have seen the quick adoption of
new technologies. An earlier generation of students quickly graduated from the slide rule (or in the case of Chinese students – the
abacus) to the calculator. Likewise, statistics courses now focus on principles and the variety of tests available while largely
ignoring formulas and calculations (now performed by SPSS) that consumed the time and attention of an earlier generation. In
other words, shifting to new technologies that provide faster and reliable results leads to changes in behavior that may be difficult
to ignore or control. In an era of rapidly advancing artificial intelligence that the young generation has been reared on, there is little
reason to believe that language learners will not take full advantage of the tools available to them.
Thus, teaching strategies for reading and writing in a foreign language that incorporate GT need to be devised assuming
language teaching continues in its present form. These may include using GT as a tool for checking or enhancing both
language and comprehension after students have either written or read a passage (Lee, 2019; O’Neill, 2019). Another possibility,
as noted by some teachers in the present study, is to use GT to look up individual words or phrases; however, such a usage is
only a short step away from the temptation to enter sentences or even whole texts into GT from the native tongue for
translation into the target language. And this again raises the issue of motivation. Thus, teachers may increasingly need to
devise strategies, or new specific purposes to encourage students to learn to write in a foreign language in the face of
technology that can provide instant, accurate translations.
The assumption that language teaching will continue in its present form, mentioned above, is an issue raised by Crossley (2018),
who speculates on GT’s impact. Specifically, he sees the potential for GT to “replac[e] the need and demand for FL [foreign language]
learning in general ... [as well as] FL teachers” (p. 542) because of a lack of instrumental motivation on the part of students in FL (as
opposed to second language) contexts. Although such a scenario would take years, or more likely, decades to transpire given
entrenched curriculums that are wedded to English in FL contexts, in the meantime the disruptive impact MT has on students’
motivation and behavior towards foreign language learning may increasingly need to be addressed.
7.3. Limitations
The design of the present study, which had teachers scoring two sets of randomly distributed scripts, meant that teachers
were unaware of the two sources. Thus, the teachers could grade the scripts naively and we could then observe their fresh
impressions upon revealing the source of the two sets of scripts in order to answer one of our research questions. Inherent in
this design, however, was our inability to gather accurate data about which set of scripts, GT-translated texts or not, the
teachers’ general references referred to. It was entirely possible that during interviews, when teachers referred to a usage they
had recalled in a script, it could have been from either source.
Other shortcomings concern possible advantages that the study design gave to GT’s translations. For example, having the
students respond to the same topic twice raises the likelihood that the second version, which in this case was written in the native
tongue and then translated by GT, would naturally be better. However, we feel that most improvements would have been related to
content, which was specifically excluded from the grading system. Another possible advantage we gave to GT was our cleaning of a
very small number of characters written by the students in Chinese, which if left uncorrected would have resulted in gibberish if
translated by GT. However, one could argue that they were already gibberish in the original version.
Teachers’ grades and interview comments provided valuable evidence for this study. Nevertheless, these two kinds of
qualitative and quantitative data are subjective in nature. As some teachers pointed out, there may be a natural inclination to
compare the quality of the scripts with their own students. Therefore, the grades reflected a relative, rather than an absolute
quality. This may explain why grades differed to a large extent even within the same group of teachers. One teacher found
himself conflicted in his grading between simpler compositions and more complex ones. For the former, it was easy to grade
them higher as fewer mistakes were found. With an intention to reward students who challenged themselves to write in
more complicated language, he found himself trying to grade these scripts higher even though they contained more errors.
The concern of this teacher brings up the problem of equating “fewer errors” with “higher quality” when assessing students’
compositions. Future studies may investigate this issue further and include positive evidence such as language styles,
formality and syntactic diversity when compiling marking rubrics.
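The variation in grades within each teacher group can be seen directly in the raw data. The sketch below takes four grammar rows from Appendix B and summarizes the spread among the six local teachers and the six NETs. The A = 4 to D = 1 scale and the function name are assumptions made for illustration only; the paper does not state this mapping.

```python
from statistics import mean

# Grammar grades copied from Appendix B: six local teachers (L1-L6),
# then six native English teachers (N1-N6).
GRAMMAR_GRADES = {
    1: "A C C D A C C C D C C C",    # Non-GT
    13: "A A A A B C A A C B B A",   # Non-GT
    35: "B A A B A A B B A A A B",   # GT
    43: "D D B D C D D C C C B C",   # GT
}

# Hypothetical numeric scale; not taken from the paper.
POINTS = {"A": 4, "B": 3, "C": 2, "D": 1}


def spread(row):
    """Return the mean and range of points for each teacher group."""
    points = [POINTS[grade] for grade in row.split()]
    local, net = points[:6], points[6:]
    return {
        "local_mean": round(mean(local), 2),
        "net_mean": round(mean(net), 2),
        "local_range": max(local) - min(local),
        "net_range": max(net) - min(net),
    }


summary = {script: spread(row) for script, row in GRAMMAR_GRADES.items()}
for script, stats in summary.items():
    print(script, stats)
```

For script 1, for instance, the six local teachers' grammar grades run from A to D, while the six NETs' grades span only C to D.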
As is customary in small-scale studies such as the present one, the findings are specific to the narrow group of student
participants, as well as the subject area of the prompt and the languages being translated and assessed; thus, the findings are
indicative only. Nevertheless, given these indications, coupled with the rapid improvements in MT, particularly with
companies such as Google, Facebook and now Baidu (Dai, 2018) racing to perfect their translation technology, we believe MT will
continue to have an increasing impact on L2 writing in the coming years.
One further limitation was that we confined our focus to writing without exploring MT’s potential in the L2 reading class.
With an increasing amount of reading material consumed in a digital form, the possibility to instantly use GT to translate large
chunks of text may also be tempting for students.
8. Conclusion
GT uses a corpus comprised of a large number of texts widely ranging from official documents to detective novels that are
readily available in multiple languages to serve its purpose (Bellos, 2011). Thus, in some cases, the language that GT generated
in our study was more formal and sophisticated than what Primary 6 students would normally produce. Comparing GT scripts
with students’ compositions helps us to ask what it is that makes good writing good, and how to take advantage of GT’s
strengths to help L2 students learn to write in their target language.
Finally, our era of rapidly advancing artificial intelligence, which includes MT, could bring disruptive changes to language
learning and teaching. Presently, although it may be understandable that pedagogy lags behind as teachers struggle to learn
how to best use and manage the new tools that appear, the recent improvements in MT as indicated in this study suggest that
teachers of foreign languages need to quickly develop a broader realisation about a form of technology that is likely to have a
significant impact on the teaching and learning of the written word in L2 contexts. Given the likelihood that students will
take advantage of such a useful technology, there may be a need to rethink L2 writing pedagogy, with teachers
forced to consider how to adopt MT as a pedagogical tool.
Acknowledgements
The authors would like to thank the teachers at the Education University of Hong Kong Jockey Club Primary School. We
also thank Jinxin Zhu for his help with statistics.
Appendix A. Rubric
Grammar (assessing the accuracy of usage including common difficulties such as tense, subject-verb agreement, articles, plurals, complex sentences, etc.)
A: Error-free or minimal number of errors
B: Some error-free sentences but only minor errors in most sentences
C: Most sentences include minor errors with some major errors
D: Many errors throughout both minor and major
Vocabulary (assessing the range, appropriateness, and accuracy)
A: Uses a wide range of vocabulary appropriately and accurately
B: Uses a range of vocabulary mostly appropriately and accurately with a few errors
C: Uses a limited range of words with many minor errors and a few major errors
D: Uses a very limited range of words with many errors both minor and major
Comprehensibility (assessing the clarity of message)
A: Completely understandable
B: Mostly understandable with some minor ambiguities
C: Partially understandable with a few major ambiguities or incomprehensible expressions
D: Some sentences are understandable, but much of the script is beyond comprehension
Appendix B. Raw grades given by teachers on grammar
No. Script type Local English teachers Native English teachers
L1 L2 L3 L4 L5 L6 N1 N2 N3 N4 N5 N6
1 Non-GT A C C D A C C C D C C C
2 Non-GT A B B C A A B A B B C B
3 Non-GT B D C D A D D C D C C C
4 Non-GT C C B D B B B B B C B B
5 Non-GT A A C D B B B C C C C C
6 Non-GT B B B B A B C B C B A B
7 Non-GT C C B C B B B B B C A B
8 Non-GT C C B C A C D C C C C B
9 Non-GT A B B D A B C B B B B C
10 Non-GT D D D D C B C C C C C C
11 Non-GT B C C D B C D C C C C C
12 Non-GT C D D D C B D C B C B C
13 Non-GT A A A A B C A A C B B A
14 Non-GT B B A C A B D B B C A B
15 Non-GT C C C D B C C C C C C C
16 Non-GT D D C D A C C C C C C C
17 Non-GT B D B C B C D B B C C C
18 Non-GT C D C D C C D C C C C C
19 Non-GT C B A B B C B B B B B B
20 Non-GT A B C C A C C C C C A B
21 Non-GT D C B D B C C C C C B B
22 Non-GT B C C D A B C C C C C B
23 Non-GT C D C D C C D C D C D C
24 Non-GT C B C D C C D B D C B C
25 Non-GT C C C C C C C C C B D C
26 Non-GT D D D D C D D C D C D C
27 GT B A A A A A B B B A A B
28 GT C C B B A A C B C B A B
29 GT C A C C B C A A A B B B
30 GT C C B D A B C A B C A B
31 GT D C C D C C B B B C B B
32 GT B B B C A B B B A B A B
33 GT C C A C A C B B A B A B
34 GT C C B D B B C B B C A B
35 GT B A A B A A B B A A A B
36 GT C B C D B C D B C C B B
37 GT C C B C A B C C A B A B
38 GT C B B C B C D B C B B C
39 GT B A B A A D A C B A C B
40 GT D B B A A C C C B A B B
41 GT A A B B A B A A B B A B
42 GT B D A C A B C B B A B B
43 GT D D B D C D D C C C B C
44 GT B A C C A B C C C B A B
45 GT D C B C B C C C C A B B
46 GT C C B D A A B B C C A B
47 GT D B C D B C C B C B B C
48 GT D C B D B B D B C C B C
Appendix C. Raw grades given by teachers on vocabulary
No. Script type Local English teachers Native English teachers
L1 L2 L3 L4 L5 L6 N1 N2 N3 N4 N5 N6
1 Non-GT C C C D C C D C C C B C
2 Non-GT A B B B A B B A B B A B
3 Non-GT B D C D C D D C C C B C
4 Non-GT C B B C B A B B B B B B
5 Non-GT B A C C B C B B B B B B
6 Non-GT B C C C B B C B C C A B
7 Non-GT C B B B B A B B B B A A
8 Non-GT C B C B B C C B B B B B
9 Non-GT B C A C A B B C B B B B
10 Non-GT B B C C B A C A B B B B
11 Non-GT C B C D B C C C C C C B
12 Non-GT C C C C B A C B B B B B
13 Non-GT C C C C C B A A C B B B
14 Non-GT B B A B B A D B B C A B
15 Non-GT B B C C B B B C B B B B
16 Non-GT C C C D B C C C C C C B
17 Non-GT C C C C B C C B B B B B
18 Non-GT B C C C D C D B C B C B
19 Non-GT C B A B B C B A C B B B
20 Non-GT B B C D B B B C B B B B
21 Non-GT C B C D C C C B C C B C
22 Non-GT B B B C B B B B B C B B
23 Non-GT C C C D C C D C C C B B
24 Non-GT C C C D B C D B C B B B
25 Non-GT C C C C C B C C D B C B
26 Non-GT C C C D C C D C B C C B
27 GT B A A A B A B A B B A B
28 GT B B B B A A B C B B A A
29 GT C B B B B C A A B A A C
30 GT C C B C B B C A B B B B
31 GT C B C C C B B B B B B B
32 GT C C C C B C A B B B A B
33 GT B C A B A C B A A A A B
34 GT C D B D C A C B B B A B
35 GT A A A B A A B B A A A A
36 GT C D C D C C D B C C B C
37 GT C D C C B C C B B C B C
38 GT C D B B B B C A C B A B
39 GT B B B B A C B C A B C B
40 GT C B C A C C C B B A C B
41 GT B A C B A B B A C B A B
42 GT A B A B A A B C B A C B
43 GT C D C D C C D B C C C B
44 GT B C C C A B C C B B C B
45 GT C B C B B C C C B A B A
46 GT C C B D B A B B B C B B
47 GT C D C D C B D C B B A B
48 GT C C A D B B C A B B C B
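The rows in these raw-grade appendices share one layout: script number, script type, six grades from local English teachers (L1 to L6), then six from native English teachers (N1 to N6). The following is a minimal sketch of how such rows might be read and summarised. The 4-to-1 point scale and the helper names `parse_row` and `mean_points` are assumptions made for illustration only; this is not the Rasch analysis that produced the logit measures in Appendix E.

```python
from collections import Counter

# Assumption: A is the best grade and D the worst; the 4..1 points are illustrative only.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1}


def parse_row(line):
    """Split one appendix row into (number, script type, local grades, native grades)."""
    parts = line.split()
    number, script_type, grades = int(parts[0]), parts[1], parts[2:]
    if len(grades) != 12:
        raise ValueError(f"expected 12 grades, got {len(grades)}")
    # First six grades come from local teachers, last six from native teachers.
    return number, script_type, grades[:6], grades[6:]


def mean_points(grades):
    """Average a list of letter grades on the assumed 4..1 scale."""
    return sum(GRADE_POINTS[g] for g in grades) / len(grades)


# Two rows copied from Appendix C (vocabulary): scripts 2 and 35.
rows = [
    "2 Non-GT A B B B A B B A B B A B",
    "35 GT A A A B A A B B A A A A",
]

for row in rows:
    number, script_type, local, native = parse_row(row)
    tally = Counter(local + native)  # how many teachers gave each letter grade
    print(number, script_type, dict(tally),
          round(mean_points(local), 2), round(mean_points(native), 2))
```

A simple mean of this kind treats every teacher as equally severe, so it may serve only as a rough check on the raw grades rather than a substitute for the reported measures.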
Appendix D. Raw grades given by teachers on comprehensibility
No. Script type Local English teachers Native English teachers
L1 L2 L3 L4 L5 L6 N1 N2 N3 N4 N5 N6
1 Non-GT A B C B A B C B C B B B
2 Non-GT A A B B A A A A A B B A
3 Non-GT B C B C A C C B B C A B
4 Non-GT B B B C A A A B A C B B
5 Non-GT A A B C A C A B B B B B
6 Non-GT B B B B A B C B B B A B
7 Non-GT C B C B A A B B B C B A
8 Non-GT A B B B B C C B B B B A
9 Non-GT A A A C A B C B A B B B
10 Non-GT C C C C C C B B C B B B
11 Non-GT A A C C A B C C B C B B
12 Non-GT B D D B A B C C C C C B
13 Non-GT A B A B C A B A C A A A
14 Non-GT A A A B A B C B B B A A
15 Non-GT B B B C A B A C B B C B
16 Non-GT B D B D A C C C B C B B
17 Non-GT A C B B A C B A A B B B
18 Non-GT A C C C C C C C B B D B
19 Non-GT C C A C B B B B B B C A
20 Non-GT A B C C A B B B B B A B
21 Non-GT B C C D B C A B B C B B
22 Non-GT A C B B A B B B C B B A
23 Non-GT B C C D B B C C C C B B
24 Non-GT B C C D A C D B C C B B
25 Non-GT B B B B C B A C B B C B
26 Non-GT B C D C C C D C B C C B
27 GT B A A A A A A A A A A B
28 GT B C B A A B C C C B B B
29 GT C A C D B B A A A B B B
30 GT B C B D A C C A B B C B
31 GT C B D D C C A B B B C B
32 GT A B C C A B A B A B B B
33 GT C B A D A D B A A B B B
34 GT B D B C C B B B A C C B
35 GT A A A B A A A A B A B B
36 GT C C D C C D D B B C D B
37 GT C C C C A C B B B B B B
38 GT B D B C B C D B B B C B
39 GT A A A A B C A C A A C B
40 GT A B B A B C B B B A B B
41 GT A A C B A B A A B B A B
42 GT B D A C A B C C B B C B
43 GT C D C D C C B B B C B B
44 GT B C C C A B C C B B C B
45 GT A C B B B B B B B A B B
46 GT B C B D B A B C C C A B
47 GT A C C C B C B B B B B B
48 GT C B B D B B C C C C D B
Appendix E. Results of non-GT (N = 26) and GT scripts (N = 22) across three grading criteria
No. Script type Grammar Vocabulary Comprehensibility
Measure (logits) Measure (logits) Measure (logits)
1 Non-GT +0.25 0.48 0.68
2 Non-GT +1.27 +2.13 0.09
3 Non-GT +3.68 +1.88 +1.71
4 Non-GT +1.27 0.24 0.09
5 Non-GT 0.68 0.71 0.89
6 Non-GT +1.55 +1.39 +0.1
7 Non-GT +2.82 +1.88 +1.18
8 Non-GT +0.02 0.01 0.47
9 Non-GT +1.27 +1.63 +1.00
10 Non-GT 1.38 1.96 2.22
11 Non-GT +2.15 +2.39 +0.29
12 Non-GT +0.75 +0.45 0.68
13 Non-GT 1.14 +0.22 0.47
14 Non-GT +1.27 +1.15 +1.18
15 Non-GT +1.55 0.01 +1.18
16 Non-GT 0.68 1.44 0.28
17 Non-GT 1.14 0.71 +0.1
18 Non-GT +0.75 0.48 0.47
19 Non-GT +0.75 0.01 0.28
20 Non-GT 0.91 0.95 1.35
21 Non-GT +0.75 +1.39 +0.64
22 Non-GT 0.22 0.71 0.68
23 Non-GT +2.15 +1.39 +1.18
24 Non-GT +1.84 0.71 +1.00
25 Non-GT 1.90 0.01 1.35
26 Non-GT +1.27 +1.39 +0.10
27 GT +0.02 0.24 0.47
28 GT 1.14 0.01 0.68
29 GT 1.14 1.69 +0.10
30 GT 3.26 3.63 2.22
31 GT 0.68 +0.22 +0.64
32 GT +0.25 +1.15 +0.29
33 GT 0.68 0.01 +0.47
34 GT +1.84 +2.13 +1.35
35 GT 2.49 1.44 1.35
36 GT 0.68 0.24 +0.47
37 GT 1.38 1.19 0.28
38 GT +0.5 0.01 +1.35
39 GT 0.91 1.69 +0.47
40 GT 3.26 2.52 3.11
41 GT 0.22 +0.22 +0.64
42 GT +0.02 0.24 +1.00
43 GT 1.38 1.96 0.09
44 GT +0.75 +2.13 +2.09
45 GT +0.50 0.01 +1.00
46 GT 1.63 0.24 0.68
47 GT 0.91 +1.15 +0.47
48 GT 1.38 0.71 1.11
Appendix F. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.esp.2019.07.001.
References
Bahri, H., & Mahadi, T. S. T. (2016). Google translate as a supplementary tool for learning Malay: A case study at Universiti Sains Malaysia. Advances in Language and Literary Studies, 7(3), 162-167.
Bellos, D. (2011). Is that a fish in your ear?: Translation and the meaning of everything. New York: Farrar, Straus and Giroux.
Boone, W. J., Staver, J. R., & Yale, M. S. (2013). Rasch analysis in the human sciences. Retrieved from https://doi.org/10.1007/978-94-007-6857-4_20.
Clifford, J., Merschel, L., & Munné, J. (2013). Surveying the landscape: What is the role of machine translation in language learning? Revista d’innovació Educativa, 10, 108-121. https://doi.org/10.7203/attic.10.2228.
Crossley, S. A. (2018). Technological disruption in foreign language teaching: The rise of simultaneous machine translation. Language Teaching, 51(4), 141-152.
Dai, S. (2018, October 24). Baidu to debut simultaneous machine translation in latest challenge to Google. Retrieved from https://www.scmp.com/tech/start-ups/article/2169832/baidu-debut-simultaneous-machine-translation-latest-challenge-google.
Farzi, R. (2016). Taming translation technology for L2 writing: Documenting the use of free online translation tools by ESL students in a writing course. Doctoral dissertation. Université d’Ottawa/University of Ottawa.
Gehring, J., & Auli, M. (2017). A novel approach to neural machine translation. Retrieved from https://code.fb.com/ml-applications/a-novel-approach-to-neural-machine-translation/.
Google AI Blog. (2016). A neural network for machine translation, at production scale. Retrieved from https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html.
Groves, M., & Mundt, K. (2015). Friend or foe? Google translate in language for academic purposes. English for Specific Purposes, 37(1), 112-121. https://doi.org/10.1016/j.esp.2014.09.001.
Jolley, J. R., & Maimone, L. (2015). Free online machine translation: Use and perceptions by Spanish students and instructors. In A. J. Moeller (Ed.), Selected papers from the 2015 Central States Conference on the teaching of foreign languages: Learn languages, explore cultures, transform lives (pp. 181-200). WI: Eau Claire.
Josefsson, E. (2011). Contemporary approaches to translation in the classroom: A study of students’ attitudes and strategies. Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:du-5929.
Kirchhoff, K., Turner, M., Axelrod, A., & Saavedra, F. (2011). Application of statistical machine translation to public health information: A feasibility study. Journal of the American Informatics Association, 18(4), 473-478. Retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3128406/.
Lee, S. M. (2019). The impact of using machine translation on EFL students’ writing. Computer Assisted Language Learning. https://doi.org/10.1080/09588221.2018.1553186.
Li, Y. T. (2004). Writing characteristics of Taiwanese students with handwriting difficulties. Journal of National Taiwan Normal University: Education, 49(2), 43-64. https://doi.org/10.3966/2073753X2004104902003.
Matthews, S., & Yip, V. (1994). Cantonese: A comprehensive grammar. London: Routledge.
Mundt, K., & Groves, M. J. (2016). A double edged sword: The merits and the policy implications of Google translate in higher education. European Journal of Higher Education, 6(3), 1-15. https://doi.org/10.1080/21568235.2016.1172248.
Nagao, M. (1984). A framework of a mechanical translation between Japanese and English by analogy principle. In A. Elithorn, & R. Banerji (Eds.), Artificial and human intelligence. Retrieved from http://www.mt-archive.info/Nagao-1984.pdf.
O’Neill, E. M. (2019). Online translator, dictionary, and search engine use among L2 students. CALL-EJ, 20(1), 154-177. Retrieved from http://callej.org/journal/20-1/O’Neill2019.pdf.
van Rensburg, A., Snyman, C., & Lotz, S. (2012). Applying Google Translate in a higher education environment: Translation products assessed. Southern African Linguistics and Applied Language Studies, 3(4), 511-524. Retrieved from https://doi.org/10.2989/16073614.2012.750824.
Sheppard, F. (2011). Medical writing in English: The problem with Google translate. La Presse Médicale, 40(6), 565-566. Retrieved from http://www.em-consulte.com/en/article/293595.
Slocum, J. (1988). Machine translation systems. New York: Cambridge University Press.
Smagorinsky, P. (2008). The method section as conceptual epicenter in constructing social science research reports. Written Communication, 25, 389-411.
Paul Stapleton is an Associate Professor at the Education University of Hong Kong.
Leung Ka Kin Becky is a Research Assistant at the Education University of Hong Kong.
P. Stapleton, B. Leung Ka Kin / English for Specific Purposes 56 (2019) 18–34
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 

Assessing The Accuracy And Teachers Impressions Of Google Translate A Study Of Primary L2 Writers In Hong Kong

Assessing the accuracy and teachers’ impressions of Google Translate: A study of primary L2 writers in Hong Kong
Paul Stapleton*, Becky Leung Ka Kin
Department of English Language Education, The Education University of Hong Kong, 10 Lo Ping Rd., Taipo, Hong Kong

1. Introduction
Most people have had the experience of encountering a text in a foreign language and wanting to know what it says. Since the early 2000s machine translation (MT) has been available at the click of a link using providers such as Google Translate (GT), Babel Fish and the like. Concurrently, these same platforms have been available to international students who may wish to read and write in their native tongue using MT to assist with their understanding of a text or assignment submissions.

With regard to foreign language education programs and courses, the use of MT has been a particular concern, at least for courses requiring written assignments, because automated translators have the potential to eliminate the motivation for L2 students to learn to write in a target language. Until recently, however, this concern has been mitigated by the quality of translations generated by MT, which were either poorly constructed or even gibberish, and as such, instantly recognizable as the products of MT. Recent advances, however, in the methods used to generate automated translations may be changing this landscape.

In the present study, two sets of scripts written by the same primary students were compared with each other – one set written in English, and the other set written in their native Chinese to the same prompt, and then translated into English using GT. Teachers who were unaware that some of the scripts were products of GT graded them and were interviewed about their impressions.

2. Machine translation
MT has a lengthy history, emerging during the Cold War in the 1950s with attempts made at automating translations for the United States to gain an upper hand against the Soviet Union. Initially, these ventures were well funded by the US government, but by the early 1970s, without any significant advances, funding dried up and most MT projects were abandoned, although they continued in some countries (Slocum, 1988).

In the 1980s, with growing computer power, new approaches to MT emerged. In a seminal article, Nagao (1984) explained that rather than continuing the deep linguistic analysis of morphological, semantic and syntactic information that had been used in prevailing systems, an approach that matches strings of text in a bilingual corpus with parallel sets of texts offered better quality results. In the 1990s and afterwards, with more powerful computing available, statistical approaches, which still analyzed phrases in bilingual corpora, were viewed as the way forward for MT.

* Corresponding author. E-mail addresses: paulstapleton@gmail.com (P. Stapleton), rec09nor@gmail.com (B. Leung Ka Kin).
Contents lists available at ScienceDirect. English for Specific Purposes, journal homepage: http://ees.elsevier.com/esp/default.asp
https://doi.org/10.1016/j.esp.2019.07.001
0889-4906/© 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
English for Specific Purposes 56 (2019) 18–34
Most recently, because of intense competition between technology giants such as Google and Facebook, whose success partially depends upon being able to facilitate communication among speakers of different languages, new systems have been developed largely based on a neural network approach as opposed to the previous phrase-based system, which translated chunks of text separately. The new neural system takes enormous amounts of human-translated text and trains the system, creating a digital representation of the word or phrase and its accompanying context. It then statistically chooses the closest probable match in the target language. GT claims that although their new neural system “can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms, and translating sentences in isolation rather than considering the context of the paragraph or page” (Google AI Blog, 2016), it is still a significant improvement over earlier systems. In the meantime, Facebook claims to have developed an alternative neural network approach for language translation “that achieves state-of-the-art accuracy at nine times the speed of recurrent neural systems” (Gehring & Auli, 2017, p.1).

Because of these recent improvements in the quality of MT’s translations, coupled with our own positive, albeit informal, experience using GT to translate from English to Chinese in pilot exercises, the present study focuses on the outputs of GT. Because Facebook is embedded in a social media app, it appears not to have been used by many as a stand-alone translation tool.

3. MT as a pedagogical tool
The recent advances in MT suggest that pedagogical studies conducted just a few years ago may no longer be as relevant except as indicators of MT’s earlier inadequacies. Sheppard (2011), for example, describes translations from French into English as a “risky business” that is “riddled with mistakes or worse” (p. 566).
In another study (Kirchhoff, Turner, Axelrod, & Saavedra, 2011), using GT to translate from English to Spanish, when the fluency of the translation of 385 sentences was rated on a five-point scale from “flawless” (5) to “incomprehensible” (1) by two native-Spanish speakers, a mean fluency score of only 3.73 resulted. Similarly, van Rensburg, Snyman, and Lotz (2012) found that the quality of GT translations from Afrikaans to English and vice versa scored by five raters needed substantial improvement and post-editing. These studies indicate that until a few years ago, GT had reached a level of quality where the translations may have been useful, but were still far from perfect.

Two more recent papers, by Groves and Mundt (2015) and Mundt and Groves (2016), go some distance towards providing background into the use of GT by learners of English in an academic context. In the first (2015), they investigated the linguistic accuracy of texts originally written by students in their L1 (Malay and Chinese) and then translated into English using GT. The results revealed that although the translations had a relatively high rate of grammatical errors, this rate was similar to the minimum level of accuracy required for university entrance. Their 2016 paper discussed some of the ramifications of GT, especially as its translations become more accurate. For example, they weighed whether GT falls into a different category than other technological advances that have improved writing, such as the spell and grammar checkers that arrive with word processing programs.

One issue raised by Mundt and Groves (2016) is the matter of discourse competence. They suggest that while MT may be approaching the grammatical level of competence of certain learners of English, it lacks the human ability to satisfy the norms of a discourse community in features that go beyond the sentence level. Another issue they raise is plagiarism.
If a student uses GT to translate a source text from another language into English, plagiarism detection software such as Turnitin may not detect it as matching text. There is also the issue of whether the GT translation of a text written in a student’s native tongue can really be called the student’s authentic product. On this latter point, Mundt and Groves (2016) conclude that because the student is producing the original ideas, any MT translation should be considered the student’s own work. All of these vexing issues, however, remain largely hypothetical if, or until, MT reaches a level of accuracy where readers of its translations are not immediately aware that the text was translated by a machine.

Apart from the quality of GT’s translations in a pedagogical context, a number of studies have investigated how GT can be used as a language learning tool. Bahri and Mahadi (2016), in a questionnaire-based study, asked 17 Malaysian students what their attitudes were regarding using GT as a supplementary language learning tool. Their students collectively responded that GT was a useful tool; additionally, students who had stronger positive feelings towards GT as a language learning tool also scored higher in an examination used as a benchmark in the study. However, we learn little from this study about the learning strategies used by the students. In an earlier attitudinal study of 46 Swedish students learning English, Josefsson (2011) found that 90% of students were using GT as part of their usual practice when dealing with English and most had positive attitudes towards it, although many complained about its inaccuracies. More recent studies (Farzi, 2016; Lee, 2019; O’Neill, 2019) have had similar mixed findings regarding students’ use of GT.

As for instructors’ attitudes towards the use of MT, several studies have concluded that teachers are largely skeptical about using MT in the language classroom.
In one of the larger scale studies investigating students’ and teachers’ attitudes towards MT, Clifford, Merschel, and Munne (2013) found that although students tended to find MT helpful, the 43 instructors they questioned at an American university were “skeptical of a positive impact on language learning” (p. 116). Similarly, in a study of 41 instructors at a Spanish university, while most teachers believed using MT to translate individual words was within ethical bounds, most also believed using MT to translate paragraphs or passages was completely unethical (Jolley & Maimone, 2015).
4. Research inquiry
The studies described above indicate that despite improvements in MT over time, translations still include a large number of errors. They also highlight students’ and teachers’ skepticism towards MT. However, given the recent improvements in GT and the lack of studies on the pedagogical implications of these improvements since GT’s latest upgrade, the time appears appropriate to explore the quality and nature of GT’s translations and consider their implications.

The present study, similar to those described above, sought to recruit learners of a second language because, as noted by others (Bahri & Mahadi, 2016; Josefsson, 2011; Mundt & Groves, 2016), GT can be used as a language learning tool. For this reason, we designed an exploratory study that inquired whether GT had reached a level of quality under which students could use their native tongue (in this case Chinese) to write a passage and then use GT to translate it into English and have it pass unrecognized as a product of MT by teachers who grade the scripts. We were also interested to know teachers’ reactions to GT translations once they had been informed, after grading, that the scripts were MT-generated. Thus, the following research questions are pertinent:

1. How do teachers grade the grammar, vocabulary and comprehensibility of two sets of parallel essays composed by L2 primary students – one written in English and the other in Chinese and subsequently translated by GT into English?
2. What are teachers’ reactions upon learning that they have read MTs from students?
3. Do teachers believe GT has a role as a pedagogical tool?
4. What areas of language in the GT-generated product, if any, stand out as being unnatural, i.e., especially advanced for L2 primary students, or erroneous?

5. Method
To answer the research questions, we conducted a mixed-method study in three data collection phases: 1) collect compositions written by Primary 6 students; 2) recruit teachers to grade the translated scripts and the English compositions; 3) interview teachers and inquire about their general impressions of the scripts and their attitudes towards GT as a pedagogical tool. Normal ethical procedures were followed throughout, including securing parental permission and assuring anonymity.

5.1. Script collection
In the first phase, 26 English compositions and 22 Chinese compositions were collected in June 2018. All the compositions were written by Primary 6 students from a local school in Hong Kong. All of the students were 11–12 years old and native Cantonese speakers. At the time of their participation in the study, they had finished six years of primary school education and were about to enter junior secondary school. They had received daily English language lessons during this time at school and were able to write short compositions in various genres, such as narratives and descriptions. This group of students was chosen because they had received instruction in argumentative writing in the previous year and were familiar with the genre.

Choosing from among a list of prompts that were provided by the authors in advance, the students’ teachers decided that the prompt, “Is half-day school a good idea,” (and its Chinese equivalent, “半日制學校好嗎? 試談談你的看法”) was suitable and interesting for students. Then, we left teachers to design the writing class in the way it is normally conducted. Before writing, teachers reviewed the argumentative genre for 30 minutes with students. A worksheet was given to students to draft an outline. Then, the students were given 60 minutes to complete an English composition. Several days later, the Chinese teachers conducted a writing class in a similar way.
This time, however, the students had their English composition returned and they were told to write a Chinese composition based on the parallel Chinese prompt (above) in which they could refer to their English composition. This design had a twofold purpose: 1) by allowing the students to view their original composition, we felt these 11-year-old students would be more motivated to complete a task that on the surface they may have felt to be redundant; 2) when comparing the parallel scripts, this method allowed us to identify, compare and analyze the scripts of those students who decided to translate sentence by sentence and word by word. At the end of the first phase, 22 Chinese compositions and 26 English compositions were collected.

5.2. Data cleaning
Before the scripts were machine translated, data cleaning was carried out on the Chinese scripts. Two kinds of mis-written characters were found: 1) non-words (錯字), which are mistaken forms of standard Chinese characters, and 2) confused words (別字; see Note 1). Table 1 shows that nearly all the confused words were homophonic to the correct word. The correction of two confused words, “自你能力” (literally: self-you-ability, from script 11) and “飯箱” (literally: rice-box, from script 12), was deemed necessary because they are not commonly confused for the respective standard words. Keeping the originals would have produced translations that obscured the intended meaning in Chinese. Four non-words (as shown in Table 2) were also cleaned because they could not be generated by any Chinese input methods. Thus, the corresponding standard characters were input to fill in the gaps. In total, the cleaning rate of the Chinese text was limited to 0.093%.

Note 1: Confused words are standard Chinese characters that are written with a similar wrong character in standard word combinations. They are usually homophonic, e.g., when “家賓” (home-guest) is confused for “嘉賓” (honorable-guest), or in a similar formation of the correct word, e.g., when “農歷” (lunar-history) is confused for “農曆” (lunar-calendar). Confused words are wrong because they are semantically incompatible (Li, 2004).

Table 1 Confused words found in Chinese scripts.
No. | Confused word | Correct word | Chinese sentence | Meaning
1 | 小 “small” | 少 “little/few” | 學小一點知識 | learn less knowledge
2 | 小 “small” | 少 “little/few” | 太小時間 | too little time
3 | 小 “small” | 少 “little/few” | 課堂就會小一點 | fewer lessons
4 | 小 “small” | 少 “little/few” | 功課都會小一點 | less homework
5 | 工 “efforts” | 功 “task” | 做工課 | do homework
6 | 已 “already” | 以 “before” | 已前香港 | Hong Kong in the past
7 | 應 jing1 “should” | 認 jing6 “deem” | 我應為 | I think.
8 | 許 “allow” | 需 “need” | 許要休息 | need rest
9 | 培 “cultivate” | 陪 “accompany” | 培伴子女 | accompany (their) children
10 | 舒 “relax” | 紓 “relieve” | 舒解壓力 | relieve stress
11 | 你 “you” | 理 “care” | 自你能力 | self-care ability
12 | 箱 “container” | 商 “merchant” | 飯箱 | lunch supplier

Table 2 Non-words found in Chinese scripts.
No. | Non-word | Correct word
1 | (not typeable) | 另 ling6 “another”
2 | (not typeable) | 遲 ci4 “late”
3 | (not typeable) | 駁 bok3 “another”
4 | (not typeable) | 許 heoi2 “quite (many)”

5.3. Script assessment
In the second phase, the 22 Chinese compositions were translated into English by GT and then randomly interspersed with the students’ 26 English compositions for teachers to grade. All the scripts were compiled into an online survey (https://www.surveymonkey.com/r/VGGDX3B) on SurveyMonkey software.

Twelve teachers were recruited via personal contacts for this grading exercise. Six teachers were Cantonese-speaking English teachers and six were native English speaking teachers (called “NETs,” an official term used by the local Education Bureau) at various primary schools in Hong Kong. When asked to describe the academic level of their school in relative terms, the teachers claimed their school was about average, i.e., not especially high- or low-level compared to other schools. All of the teachers had multiple years of experience grading and correcting compositions. They were given a rubric to grade the scripts based on three criteria: grammar, vocabulary and comprehensibility. For each criterion, teachers gave a grade of A, B, C or D, based on the rubric which had accompanying descriptors for each grade (Appendix A). They were told in the instructions to ignore content and organization as these two elements fall outside of MT’s capabilities and could serve as a distraction from the present study’s focus. The 12 teachers were neither told about the purpose of the study, nor were they told that some of the scripts were translated. Teachers in the first phase of script collection were not recruited for the grading exercise as they were aware of the study’s design and the involvement of GT.

5.4. Interview
In the final phase, within a few days after completing their scoring, semi-structured interviews were conducted individually with the 12 teachers by telephone using the following guiding questions:

1. What were your general impressions of the scripts?
2. Did you notice anything unusual about the writing in the scripts?
3. (After revealing that about half of the scripts were machine translated from the original Chinese.) Did you notice this and if so, what were the signals that suggested so?
4. What are your feelings about students using GT? Do you think it has a place in L2 pedagogy?

A total of 174 min of recordings were collected and then transcribed by one author. The transcriptions were then summarized and sent to respective teachers to check for accuracy and further comment.

5.5. Data analysis
To answer the first research question, we needed to transform the raw, ordinal data into scalable measurements to make statistical comparison possible. Although a marking rubric was provided, we realized that every teacher would grade scripts somewhat differently based on their own interpretation of the rubric’s descriptors. To solve these two issues, Rasch modelling was applied. Rasch measurement transforms ordinal grades, e.g., “A” to “D” in this case, into logits, which enables researchers to compare teachers’ grades on a linear scale (Boone, Staver, & Yale, 2013). The Rasch model also checks for misfits. Raters (our grading teachers in this case) who are too lenient or strict are indicated by an out-of-range infit mean square.
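As an illustration of the Rasch transformation and misfit check described in this section, the following is a minimal sketch only, not the Facets implementation the authors used. It assumes a rating-scale formulation in which the log-odds of adjacent grades depend on a script measure, a rater severity and a set of grade thresholds; the function names, the threshold values in the usage note and the direction of the scale (higher script measure means a higher grade category) are all assumptions made for illustration. The 0.5 to 1.5 infit range is the one the study cites from Boone et al. (2013).

```python
import math


def category_probabilities(script_measure, rater_severity, thresholds):
    """Probabilities of each ordered grade category (lowest category first).

    Assumed rating-scale form: the log-odds of category k over k-1 equal
    script_measure - rater_severity - thresholds[k-1], all in logits.
    """
    log_numerators = [0.0]
    for tau in thresholds:
        step = script_measure - rater_severity - tau
        log_numerators.append(log_numerators[-1] + step)
    peak = max(log_numerators)  # subtract the peak for numerical stability
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [weight / total for weight in weights]


def infit_mean_square(observed, expected, variances):
    """Information-weighted fit: summed squared residuals over summed variance."""
    squared_residuals = sum((o - e) ** 2 for o, e in zip(observed, expected))
    return squared_residuals / sum(variances)


def is_misfit(infit, low=0.5, high=1.5):
    """Flag a rater whose infit mean square falls outside the accepted range."""
    return not (low <= infit <= high)
```

With hypothetical thresholds such as `[-1.0, 0.0, 1.0]` for four grades, raising `rater_severity` shifts probability towards the lowest category, which is the sense in which a stricter rater sits higher on the logit scale in this sketch.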
When such cases occur, that rater’s scores are excluded, and the data can be re-run to generate sufficiently reliable grades for the purpose of a study. Then, an independent-sample t-test can be conducted to determine whether there is any statistical significance.

This study adopted a two-faceted design: raters (teachers) and items (scripts). The software Minifac (Facets) Rasch was used to generate the Rasch scores. Then, these scores were extracted and input into SPSS to perform a significance test. Rasch measurement was repeated three times to generate data for the three criteria: grammar, vocabulary and comprehensibility.

For the second and third research questions enquiring about the teachers’ reaction to the GT-translated scripts and their beliefs about students’ use of GT, the transcripts were reviewed by the first author, which led to four codes being generated. These codes aligned closely with the interview questions, except for the final code, which emerged from the coding process (List 1).

List 1 Codes emerging from the interviews
1) Beliefs about the general quality of the scripts (before being informed about GT)
2) Reaction upon being informed about GT
3) Beliefs about using GT in the classroom or as a pedagogical tool
4) Beliefs about the accuracy of GT

Once these main themes or codes were established, the second author, after a training session on applying the codes, independently coded three of the 12 transcripts. Agreement was numerically calculated by noting agreement/disagreement at the level of each individual exchange between the interviewer and interviewee over the total number of codes assigned under each of the four codes. Agreement between the two coders (first and second authors) on those three transcripts reached a level of 70%. To further ensure a satisfactory level of reliability, after further training, two more transcripts were coded by the second author, which resulted in the level of agreement rising to 87%.
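The agreement figures reported here are simple percentages of matching code assignments. A minimal sketch of that arithmetic follows, assuming each interview exchange receives exactly one code from each coder; the function name and the example codes in the usage note are hypothetical and are not data from the study.

```python
def percent_agreement(codes_first, codes_second):
    """Percentage of exchanges to which two coders assigned the same code."""
    if len(codes_first) != len(codes_second):
        raise ValueError("Both coders must code the same exchanges.")
    if not codes_first:
        raise ValueError("At least one coded exchange is required.")
    matches = sum(a == b for a, b in zip(codes_first, codes_second))
    return 100.0 * matches / len(codes_first)
```

For example, if two coders label ten exchanges with the four codes as `[1, 2, 3, 4, 2, 3, 1, 4, 3, 3]` and `[1, 2, 3, 4, 2, 1, 1, 4, 2, 3]`, they match on eight of ten, so the function returns 80.0. Simple percent agreement does not correct for chance agreement, which is one reason a threshold such as 80% is used as a rule of thumb.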
This level satisfies standard norms; Smagorinsky (2008, p. 401) suggests reaching a level above 80% over 15% of the data. Differences were resolved via discussion between the two authors.

To answer the fourth research question, we reviewed the GT-translated scripts for instances of what we deemed both exceptionally advanced grammar and vocabulary for the Primary 6 learners of English, as well as instances of misuses or errors.

6. Results
6.1. RQ1
Table 3 shows teachers’ grading in Rasch scores. One local and one NET teacher’s scores in grammar fell outside the acceptable fit range of MnSq 0.5 to 1.5 (Boone et al., 2013: 166), with infit mean squares of 1.7 and 0.4 respectively. This indicated that their scores did not fit the model; thus, those scores were excluded and the analysis was processed again to generate the results shown in Table 3. The fit statistics of the three data sets (grammar, vocabulary and comprehensibility) show that the teachers’ scores can be taken as sufficiently reliable for the purposes of this study.

As shown in Table 3, teachers differed in leniency in each data set. The difference for the grammar scale was larger than the other two scales, with the strictest marker at +2.13 logits and the most lenient at -1.91 logits. Table 4 compares the scores given by local English teachers and NETs to see if there was an inter-group difference. In all three scales, t-test results showed there were no significant inter-group differences (p > .05). Moreover, the effect sizes were small, as indicated by d < 0.3. As the results show, local teachers and NETs judged the scripts similarly in grammar, vocabulary and comprehensibility.

The table in Appendix E contains the Rasch scores for each script. Data in this table were extracted and entered into SPSS to perform an independent-sample t-test. The results of the t-test are shown in Table 5. Table 5 shows non-GT script scores have positive mean measures in both grammar and vocabulary while those for GT scripts are negative. The positive logits indicate that the teachers graded non-GT scripts lower on average, and the negative logits mean that teachers gave GT scripts higher grades. The difference for grammar was significant (t(46) = 3.79, p < .001) and the effect size was large (d = 1.105). Therefore, teachers considered GT scripts as significantly better than non-GT scripts in grammar. Although the difference for vocabulary was not significant (t(46) = -1.98, p = .054), the effect size was medium (d = 0.567). As the t-test results on comprehensibility show, non-GT scripts had a lower mean measure and they were scored higher.
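The comparisons in this section rest on an independent-samples t-test and Cohen's d. The sketch below shows that calculation in plain Python rather than SPSS. It assumes the equal-variance (pooled) form of the test, which is consistent with the reported 46 degrees of freedom for 26 and 22 scripts but is not stated explicitly in the text; the function names are illustrative only.

```python
import math
from statistics import mean, variance


def pooled_t_test(group_a, group_b):
    """Return (t, df, d) for an equal-variance independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    pooled_sd = math.sqrt(pooled_var)
    difference = mean(group_a) - mean(group_b)
    t = difference / (pooled_sd * math.sqrt(1 / n_a + 1 / n_b))
    d = difference / pooled_sd  # Cohen's d using the pooled standard deviation
    return t, df, d


def cohens_d_from_t(t, n_a, n_b):
    """Recover Cohen's d from a pooled t statistic and the two group sizes."""
    return t * math.sqrt(1 / n_a + 1 / n_b)
```

As a rough consistency check under that assumption, `cohens_d_from_t(3.79, 26, 22)` gives roughly 1.1, close to the effect size reported for grammar; the small gap is plausibly rounding in the reported t value.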
However, the difference was not statistically significant (t(46) = .093, p = .926). Moreover, the effect size was small (d = 0.028). Thus, GT and non-GT scripts seemed equally comprehensible to teachers.

Table 3 Leniency-severity level of teachers (N = 12).
Rater | Group | Grammar measure (logits) | Grammar model error | Grammar infit MnSq | Vocabulary measure (logits) | Vocabulary model error | Vocabulary infit MnSq | Comprehensibility measure (logits) | Comprehensibility model error | Comprehensibility infit MnSq
1 | LET | | | | -0.10 | 0.25 | 0.75 | +1.60 | 0.23 | 1.03
2 | LET | -0.24 | 0.23 | 1.24 | -0.22 | 0.25 | 1.17 | +0.18 | 0.22 | 1.09
3 | LET | +0.48 | 0.23 | 0.71 | +0.20 | 0.25 | 0.91 | +0.41 | 0.22 | 1.10
4 | LET | -1.91 | 0.26 | 1.38 | -1.25 | 0.25 | 1.13 | -0.60 | 0.22 | 1.13
5 | LET | +2.13 | 0.25 | 1.01 | +1.10 | 0.25 | 0.77 | +2.12 | 0.25 | 1.39
6 | LET | +0.02 | 0.23 | 1.15 | +0.74 | 0.25 | 1.32 | +0.41 | 0.22 | 0.94
7 | NET | -0.94 | 0.24 | 1.16 | -0.58 | 0.25 | 1.18 | +0.59 | 0.22 | 1.36
8 | NET | +0.43 | 0.23 | 0.74 | +1.16 | 0.25 | 1.39 | +0.88 | 0.22 | 0.80
9 | NET | -0.08 | 0.23 | 0.85 | +0.62 | 0.25 | 0.67 | +1.13 | 0.23 | 0.90
10 | NET | +0.23 | 0.23 | 0.67 | +0.98 | 0.25 | 0.67 | +0.69 | 0.22 | 0.53
11 | NET | +1.11 | 0.23 | 1.04 | +1.66 | 0.25 | 1.20 | +0.64 | 0.22 | 0.95
12 | NET | | | | +1.35 | 0.25 | 0.59 | +1.49 | 0.23 | 0.47
M | | +0.12 | 0.24 | 1.00 | 0.10 | 0.25 | 0.98 | 0.79 | 0.22 | 0.97
SD | | +1.03 | 0.01 | 0.23 | 0.22 | 0.25 | 0.27 | 0.69 | 0.01 | 0.27

Table 4 Comparison of scores given by local English teachers and NETs.
Scale | Teacher group | N | M | SD | t | df | p | Cohen’s d
Grammar | Local | 5 | +0.10 | 1.45 | .074 | 8 | .943 | 0.043
Grammar | NET | 5 | +0.15 | .75 | | | |
Vocabulary | Local | 6 | +0.08 | .82 | 1.690 | 10 | .122 | 0.087
Vocabulary | NET | 6 | +0.87 | .79 | | | |
Comprehensibility | Local | 6 | +0.69 | 1.00 | .503 | 10 | .626 | 0.280
Comprehensibility | NET | 6 | +0.90 | .35 | | | |

6.2. RQ2 and RQ3 interview findings
Findings from the interview data are presented here with illustrative excerpts based on the four codes. In themes where differences were noted between the local and NET teachers, distinctions are made; however, for the most part, few important differences were observed between the two groups of teachers.

6.2.1. Beliefs about the general quality of the scripts (before being informed about GT)
The main, albeit underlying, purpose for asking the teachers for their general impressions of the scripts was to ascertain whether they had recognized that GT had been used to generate some of the scripts. Among the 12 teachers (pseudonyms used throughout), only two mentioned GT without prompting. NET Ryan volunteered, “[there were] phrases I’ve never come across before. And I think, when I looked at some of them ... what I notice is that a lot of them are using Google Translate.”
Likewise local teacher Mandy immediately stated, “After I marked it - after I graded a few of them, I realized that ... you’re doing something like [using] a translation engine, or like Google Translation.” In explaining her suspicions about the use of GT, Mandy correctly noted one script used the phrase “talk about the sky” (original sentence: “Just like playing chess games together, talking about the sky, [an idiomatic expression in Cantonese meaning “making idle chit-chat”] can make each other’s relationship better.”), which did not make sense to her in English. However, the Chinese lexical equivalent, “聊天”, occurred to her, and she realized it was a direct translation for “chatting.” She also noted a student’s use of the term “small homework.” She guessed students might have mis-typed the Chinese word for “little” (少) (for quantity) as “small” (小) (for size), and as the phrase was entered into GT, “small homework” was generated. Mandy also noted some odd grammatical structures, which triggered her to suspect the use of GT. NET Ryan also noted strange phrases like “live to be old and learn to be old” and “school rice merchants” and wondered how Primary 6 students had come across these phrases, and thus suspected the use of GT.

However, apart from these two, the remaining 10 teachers focused on technical aspects of the texts, particularly the content, which they were again told was not meant to be part of the assessment. Aside from the teachers’ comments on the content, which were eliminated from the analysis, the majority of these teachers commented that the overall quality of the scripts ranged from “typical” (George, Gary, Ryan and Doris) of the level they are familiar with, to “very good” (Rosa, Megan and Steve) and “impressive” (Justin).
It should be noted here that these comments were made about all 48 scripts in general, indicating that the teachers were unaware, at that point, that there were two distinct sets of scripts – GT-translated and non-GT-translated. Notable in the remarks coded under the “beliefs about the quality” theme were two teachers besides Mandy and NET Ryan who mentioned what they thought were expressions translated directly from Chinese. Rosa, for example, noted, “I can comprehend [the text]. I know they’re translating [in their minds] from Chinese to English.” NET George stated, “it looks like they’ve been given words that looked at the direct translation and then used the vocabulary incorrectly.” In these two examples, however, it is uncertain whether their references are to GT-translated texts or not.

6.2.1.1. Vocabulary. Teachers commented with widely divergent views on the range and quality of vocabulary in the scripts, as well as lexical mistakes made. For example, NET Justin thought the vocabulary range was wide. Among local teachers, Michelle thought students could write with basic words only, lacking variety. Rosa was impressed to find advanced vocabulary was being used in five to 10 scripts. In her class, she claimed only two to three students (out of roughly 25) were able to use such words. However, Gary found some advanced words were situated in poorly constructed sentence structures.

6.2.1.2. Grammar. Generally speaking, the teachers as a whole believed the grammar in the scripts was either at an average or above-average level compared to their own students. Rosa, for example, reported she did not find many grammatical mistakes in the scripts. She could see students had a good understanding of basic grammar rules. NET George spotted a few good compositions with flawless grammar. NET Justin thought the grammar was “pretty good generally.” However, he said there was considerable disparity in the grammar level.
He thought it was due to differences in students’ English ability. NET Steve found there were some grammatical problems, but he thought they were minor as they did not hinder comprehensibility.

6.2.1.3. Comprehensibility. Local teachers had no problem understanding the scripts. Mandy pointed out that local teachers may comprehend students’ compositions better than the NETs since many of the sentences resembled direct translations from Chinese. However, most of the NETs did not report any difficulty with comprehensibility. NET Steve did not give the lowest grade, D, to any scripts in terms of comprehensibility, although NET Martin found some scripts ambiguous because the context and connections between sentences were sometimes lacking, as the writers presumed readers would know what they were arguing about.

6.2.2. Teachers’ reaction upon being informed about GT

Apart from the two teachers who suspected GT was used for some of the scripts, five local teachers and five NETs did not suspect the use of GT during the grading exercise. After the methodology of the study was revealed, their reactions ranged from “surprised” (George and Martin) to “amazed” (Doris) to “shocked” (Megan), although this knowledge appeared to trigger latent thoughts. NETs Martin and Doris did not suspect the use of GT because they noticed their own students would mentally translate Cantonese to English in a literal manner, rather than formulating their ideas in terms of English. NET Justin recalled some irregularities (e.g., a disparity in grammar level, strange phrases, and colloquialisms being used in the wrong context), but he thought these mistakes were also typical of non-MT-translated writing from his students.

Table 5
Comparison between non-GT and GT scripts on grammar, vocabulary and comprehensibility.
Scale              Scripts  M      SD    t     df  p     Cohen’s d
Grammar            GT       −0.78  1.24  3.79  46  .000  1.105
                   Non-GT   +0.67  1.38
Vocabulary         GT       −0.40  1.41  1.98  46  .054  0.567
                   Non-GT   +0.34  1.19
Comprehensibility  GT       +0.02  1.21  .093  46  .926  0.028
                   Non-GT   −0.01  .95
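The effect sizes in Table 5 can be checked from the means and SDs. The sketch below assumes that Cohen’s d was computed with the root mean square of the two group SDs as the standardizer (one common convention; this section does not say which formula was used), and that the means printed without a plus sign (GT grammar 0.78, GT vocabulary 0.40, Non-GT comprehensibility 0.01) are negative logits. Under those two assumptions the reported values are reproduced to three decimals. The function name `cohens_d` is illustrative.

```python
import math


def cohens_d(mean1, sd1, mean2, sd2):
    """Absolute Cohen's d using the root mean square of the two SDs."""
    rms_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return abs(mean1 - mean2) / rms_sd


# Table 5 summary statistics; negative signs are an assumption (see lead-in).
# Tuple layout: (GT mean, GT SD, Non-GT mean, Non-GT SD)
table5 = {
    "Grammar": (-0.78, 1.24, 0.67, 1.38),
    "Vocabulary": (-0.40, 1.41, 0.34, 1.19),
    "Comprehensibility": (0.02, 1.21, -0.01, 0.95),
}

for scale, stats in table5.items():
    print(f"{scale}: d = {cohens_d(*stats):.3f}")
# Grammar: d = 1.105
# Vocabulary: d = 0.567
# Comprehensibility: d = 0.028
```

Read this way, the grammar difference is large (d above 1), the vocabulary difference is moderate, and the comprehensibility difference is negligible.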
Upon being informed about the GT scripts, several NET teachers quickly recalled having seen some irregularities that made them suspect the use of GT, although such irregularities may not necessarily have been in the GT scripts. NET Steve, for example, suspected some ungrammatical sentences such as, “They’re at home can play games” and “half-day school have less lessons,” might have been generated by GT. Although a later search confirmed that both instances recalled by NET Steve were in fact written in English, this does not mean that similar mistakes could not be found in GT scripts. NET Justin also suspected GT could be responsible for some grammar errors, strange phrases and colloquialisms used in the wrong context. “Now that you’ve told me Google Translate [was used], I’m thinking back and I can see how some of the scripts were probably translated through it.” In retrospect, NET George was the only teacher who suspected that advanced words, rather than erroneous phrases, had been translated by GT. “All of a sudden they have this word that seemed out of place that was very – quite an advanced word for their writing. My thought was that they probably would have used Google Translate.”

Among local teachers, Michelle recalled that some scripts seemed like direct translations from Chinese and she thought those might have been generated by GT. Rita also found strange phrases in some scripts; however, GT did not cross her mind because she thought those mistakes were normal even for her own students. “I didn’t notice, but I did see some strange phrasing in sentences, [which] is quite normal for students... [but] I didn’t have the thought that it’s Google translated.” After the GT element was revealed to him, Gary reported that he had guessed students might have gotten help from online translators or from teachers, but it had not occurred to him that any whole script had been machine translated.
These remarks from the teachers, taken as a whole, suggest that the GT-translated scripts did not really stand out from those written in English by the students. One point to note is that although the majority of teachers could not distinguish GT-translated scripts from the English scripts written by students, it did not necessarily mean that the teachers felt the translations were better. Megan suggested that GT scripts could be as flawed as students’ writing. Mandy had the impression that machine translators and students both tended to commit the same grammatical mistakes, such as subject-verb agreement, confusion between singular and plural nouns, and tenses.

6.2.3. Teachers’ beliefs about using GT in the classroom or as a pedagogical tool

All of the teachers clearly stated that GT should not be used by students to translate passages or sentences they had written in their L1. The following excerpts provide a flavor of this belief:

NET Martin: If they’re kind of copying whole passages into [GT], they should be trained to - ideally not to use it.

Gary: I don’t like it when I give them a writing, and they translate the whole thing and give it back to me.

On the other hand, nine out of 12 teachers were not against GT as a learning tool. From his observations, Gary deemed it normal for students to use machine translators and he appreciated it when GT was used for strengthening students’ language skills. NET Justin described GT as “a powerful educational tool.” NET George also thought GT was “a great tool.” He thought technology was something students in this generation were privileged to have and he considered it “silly” not to use it at all. Mandy thought online translators were acceptable as a tool and she claimed to be delighted to see students make use of the many tools available to them for their own good.
“Using it like a tool, I would encourage students to do it because we no longer rely on paper dictionaries, and if they have the motivation and the will to use some – any tools available to them, then I will be more than happy.”

Although the teachers generally were in favor of students using online translators to assist learning, they were cautious about the extent to which GT should be exploited as a learning tool. For many of them, the line was to be drawn at the point up to which students could still usefully benefit. Three teachers (Mandy, Gary, NET Martin) thought the use of GT should be confined to the word level. For example, Mandy and Gary thought it was acceptable for students to translate single words that they did not know and put them back into sentences. NET Martin thought it was fine if students had the words in their mind and used GT to check whether they were correct. Although NET Justin thought GT was a “powerful tool,” he claimed to be “conflicted” about the use of GT, a belief noted by several other teachers. He thought it could only be beneficial if it were used correctly. Students had to know the grammar rules and how to correct the errors generated by GT. Yet, he believed if students were to become proficient in English writing, they had to be able to “do it from scratch” (aside from using GT for noticing and correcting errors).

Two teachers (NET Justin and Rosa) thought the use of GT should depend on students’ language proficiency. NET Justin claimed GT could be useful as an initial step for beginners. However, Rosa opposed students of lower ability using GT. For some teachers, the boundaries for acceptable use of GT depended on the purpose of learning. NET Steve would be against using GT if the purpose was to check students’ grammar, spelling and sentence structure. If the purpose was to see if students could present their ideas in a reasonable and logical manner, however, he thought GT could be applied. NET Doris held a similar stance.
She thought it was acceptable for students to use GT to “get an idea.” However, she still worried students would become reliant on GT. Some teachers suggested that schools and teachers should provide guidance for students to ensure GT was properly used for learning. NET Martin thought that students should “get some English out of GT if they were to use it.” In order to achieve this, he believed schools and teachers should train students to check translation results, change them and edit them. Similarly, Rosa thought students should use what they have learnt in class to check GT translation results. NET Ryan suggested GT could be used by teachers to discuss mistakes and to help students improve their English. In other words, teachers could explain why a translated sentence is wrong, how it is wrong and how students can improve it.
On the other hand, three local female teachers (Megan, Michelle and Rita) were completely against students’ using GT. Their beliefs are best summarized by Rita, who was opposed to the use of GT even for single-word translations. “I would not recommend... students to use Google Translate even just to know the meaning of an English word in Chinese because there are Chinese-English dictionaries online.” Rita and Megan worried the students would not be motivated to learn English if they could get immediate translations. Michelle thought students would copy from GT rather than thinking about how to write in English. Megan claimed GT made it too easy for students to get translations and this would discourage students from learning English by themselves. This negative view of GT was partly due to doubts about its ability to generate accurate translations.

6.2.4. Beliefs about the accuracy of GT

Many of the teachers volunteered comments about the accuracy of GT’s translations. Among those who commented, eight teachers firmly held the view that GT has too many inaccuracies to be trusted. NET Martin’s belief largely summarizes those of the other seven: “You’re not going to get a direct translation that’s comprehensible to native speakers through platforms like GT.” George added, “the technology definitely has a long way to go.” However, when it was pointed out to the 10 teachers that they had not noticed that half of the scripts were GT-translated, some of them became reflective. Justin commented, “I think Google Translate is fairly powerful right now.” Steve admitted that “perhaps, there have been improvements made in Google Translate in recent years.” Most teachers, however, held to their original beliefs that GT’s translations were inaccurate.

In sum, despite indications by most of the teachers that GT could be used as a tool, all teachers had concerns about GT negatively affecting students’ learning.
In general, they worried about the inability of GT to generate grammatically correct translations, and further, that GT’s convenience would lead students astray. They also worried that students would use it as a “shortcut” (NET Martin) to get translations right away rather than going through the language learning process. Yet over half of them saw that GT had a place in L2 teaching. Instead of banning the use of GT outright, some of them suggested training on the correct use of GT should be provided in schools so that students could benefit from it.

6.3. Areas of language in the GT generated product that stood out

Tables 6 and 7 show instances of vocabulary and grammar found in 12 different GT scripts that we deemed at a level normally more advanced than would typically be expected for Primary 6 L2 students (taken from 11 different scripts). In the case of advanced grammar, we especially noted the use of participial clauses (Table 7). However, despite these examples where GT’s translation appeared to correctly enhance the students’ English, in other places, errors were made, even when the students’ original Chinese was correct. Table 8 shows examples of these (questionable translations in italics).

7. Discussion

7.1. Addressing the research questions

The recent advance in the quality of translations generated by MT, particularly GT, was the impetus for the present study. We surmised that if MT-translations did not stand out as having inferior quality when interspersed among a set of parallel compositions originally written in the target language, it would provide evidence that MT may be reaching a higher level of quality than that noted in recent studies, although we are aware that our study was conducted in very confined circumstances, i.e., relating to students’ level, language and topic. Based on both the quantitative and qualitative results, our supposition appears to have been confirmed.
Our study was driven by four research questions, all concerning the quality and nature of GT’s product, focusing on scripts written in Chinese by primary school students and then translated into English. The first research question in some senses was a proxy for another broader question, namely, whether GT’s translations would pass unrecognized as MT. Given that the grades from teachers (both local and NET) on the GT translations were not significantly different from those of the students’ scripts written in English, and where they were, the GT scripts were actually scored higher, we conclude that the mechanical inaccuracies and lexical errors reported in earlier studies on GT may no longer be appearing at the same rate, at least for the level, genre and the two languages involved in the present study.

Table 6
Instances of vocabulary deemed exceptionally advanced.

Original Chinese    GT generated English                  Script #
在課堂上打瞌睡        doze off in class                     15
實施全日制           implement (full-time)                 03
自由度高             high degree of freedom                07
分配時間             allocate time                         08
緩解(. 壓力)         alleviate                             27
跟父母很疏遠          Alienate from their parents           29
*保充體力            Replenish your energy                 13
減少家長負擔          Reduce the burden on their parents    33
Further underscoring this point, and referencing our second research question, only two out of 12 teachers suspected the use of MT. In other words, broadly speaking, the GT-translated scripts appear to have reached a comparable level (and in the case of grammar, a higher level) to those written in English. However, despite this, it appeared that several teachers, both NET and local, continued to believe that GT is still generating poor translations, indicating their beliefs may not have been updated from earlier weaker versions of GT, and MT in general, which have prevailed for decades.

Our third research question regarding teachers’ beliefs about using GT as a pedagogical tool led to mixed reactions. While three of the 12 teachers thought GT had no place at all in the classroom, the others saw varying uses. However, all the teachers drew a firm line against students using GT to generate translations from writing in their native tongue beyond the single word level. This latter finding aligns with other studies (Kirchhoff et al., 2011; Sheppard, 2011; van Rensburg et al., 2012) that cast doubt on the quality of MT’s output. Although some of the teachers in the present study had similar reservations about the quality of GT’s output, another concern was the larger threat posed by advancing artificial intelligence. An undercurrent in the interviews was a deep-seated resistance towards GT because its use could undermine the students’ motivation and reason to learn how to write in English, and by extension, even have a negative effect on their chosen teaching career.

Our fourth question, which led us to investigate areas of GT’s output that were either impressively good or error-ridden, resulted in conflicting findings.
There were instances of vocabulary and grammatical constructions generated by GT that were probably more advanced than the student authors could have produced when writing in English, and this was confirmed by back checking the students’ original scripts written in English. To cite just one example, in script 15, the student wrote in English, “[Students] will be tired doing the lessons.” However, the same student’s Chinese was translated by GT as “Students will doze off in class.” This latter sentence translated by GT from the student’s native tongue presumably better captures the nuanced meaning the student intended. This finding suggests that at least in some cases, the process of writing in the native language and then machine translating into English resulted in not only a correct translation, but also one with more advanced language and nuanced meaning than the students would normally be capable of when writing in English. This apparent improvement is underscored by the significantly higher scores given by the teachers to the GT scripts for grammar.

On the other hand, some GT translation errors could be viewed as constructive feedback for improving GT. A few lexical errors resulted from the mistranslation of Chinese homonyms. For example, “改” carries two different meanings: 1) change “改變” (verb-object compound); and 2) mark (homework) “批改” (compounding verbs). In script 6 (as referred to in Table 8), ambiguity arose as the homonym “改” was written with neither a post-modifying object nor a pre-modifying compounding verb. Despite the presence of “workbook” (習作簿) as a contextual clue, GT mis-selected the translated word, “change,” over “mark.” This exposed one shortcoming of assigning priority to probability over contextual understanding when machine-translated choices are made.

Table 7
Instances of grammar deemed exceptionally advanced.
Script 06
Chinese: 老師可以在下午時間改同學的習作簿和準備下一天的課堂內容, 不用在晚上做。
English: The teacher can change the classmate’s workbook and prepare the next day’s class content in the afternoon, without having to do it at night.

Script 07
Chinese: 這樣的學習制度自由度高, 讓我們在課堂上學習更專注.
English: This kind of learning system has a high degree of freedom, allowing us to learn more in the classroom

Script 08
Chinese: 雖然有很多人覺得全日制較好, 因為可以學多點東西和回家不知可以做什麼。
English: Although many people feel that full-time is better, because they can learn more things and go home, I don’t know what to do.

Script 09
Chinese: 還有, 半日制學校的課程較少, 這正正能給學生一個機會來讓他們自學, 養成主動學習、追求新知識的習慣。這能令他們即使在長大後仍然持續學習.
English: Also, there are fewer courses in half-day schools, which is a good opportunity for students to self-study and develop the habit of actively learning and pursuing new knowledge.

Script 27
Chinese: 如果是半日制學校, 課時不夠會導致學生不能完全深入地理解課堂內容, 所有知識都只不過是一知半解。
English: If it is a half-day school, insufficient class time will result in students not being able to fully understand the content of the class. All knowledge is only a half-baked.

Table 8
Instances of GT translation errors.

Script 01
Original Chinese: 很多學生都希望自己的學校採用半日制上課時間。
GT generated English: Many students want their school to use half-day class time.

Script 04
Original Chinese: 有些人又說, 半日制給同學太多時間, 他們會去玩遊戲而不是溫習。
GT generated English: Some people say that half-day gives students too much time, they will go to play games instead of reviewing.

Script 06
Original Chinese: 老師可以在下午時間改同學的習作簿...
GT generated English: The teacher can change the classmate’s workbook ... in the afternoon.

Script 14
Original Chinese: 半日制學校有很多好處
GT generated English: Half-day schools have many benefits.

Script 31
Original Chinese: 但如果是全日制學校, 老師單是開會也要開到七至八點, 還有可能在晚一點。
GT generated English: However, if it is a full-time school, the teacher will only open seven to eight points in the meeting, and it may be late.

Script 31
Original Chinese: 連睡眠時間都不足, 又怎麼會有精神去上課呢?
GT generated English: Even if the sleep time is not enough, how can there be a spirit to go to class?
Certain Cantonese sentence structures also appeared to be difficult for GT to translate into English, even though the students’ Chinese was correct. In example (1), from script 31, mistranslation resulted from a verb repetition construction involving a verb-object (V-O) compound.

The verb “開會” is a V-O compound in which “開” and “會” carry the meanings of “open” and “meeting” respectively. When combined as a V-O compound, “開會” has a different resulting meaning of “have a meeting,” as with many other V-O compounds in Cantonese (Matthews & Yip, 1994). In the emphatic sentence structure “verb 也要 verb 到七至八點,” the verb is repeated in the construction to emphasize the prolongation of the event up to a certain point in time. If the verb is a V-O compound, as in our case, only the verb component is repeated. This partial reduplication of “V-O 也要V到七至八點,” however, caused parsing difficulties for GT because the words “開會” and “開” are seen as two separate words by the machine, rather than a verb repetition construction. This also affected its analysis of the complement “到七至八點” (until 7 to 8 o’clock). Thus, a nonsensical translation was generated.

The fact that Cantonese existential sentences and possessives are both introduced by the word “有” and have the same sentence structure of “noun + 有 + noun” (Matthews & Yip, 1994) makes them difficult for GT to distinguish. Examples (2) and (3) from our dataset show that GT tended to translate both sentences into possessives.

Since the subject “half-day schooling (半日制學校)” in (2) was inanimate, pairing it with a possessive verb seems inappropriate, although the comprehensibility of the sentence is not hindered. Similar mistakes were also commonly found in the students’ scripts when they wrote in English: “One day have 24 h” or “it will just have less homework.”

7.2. Implications

The implications of the broad improvements in GT are difficult to determine at this early stage.
Groves and Mundt (2015) speculate that MT will not replace second language acquisition in the near future. However, they claim that any regulation of MT “to conform to a previously held world view” (p.119) will not succeed. We agree. Preventing students who are learning foreign languages from using GT outside classrooms (or even inside) to translate their assignments from their native tongue may become increasingly fraught with difficulties. Where we disagree with Groves and Mundt (2015) is in their claim that MT is unlikely to be able to cope with discourse features, such as hedging. Certainly capturing nuances across languages is challenging, if not impossible in some cases. However, because MT uses existing and growing banks of texts that have been generated by humans, there should be no reason why MT-generated results cannot be as good as human translations in the coming decade or two.

A greater challenge may be related to language student motivation. If accurate translations into the target language are instantly available to students upon entering text in their native tongue, the motivation to learn to write (and also read) in a foreign language could decline. Similar advances in the past that have simplified arduous cognitive tasks have seen the quick adoption of new technologies. An earlier generation of students quickly graduated from the slide rule (or in the case of Chinese students – the abacus) to the calculator. Likewise, statistics courses now focus on principles and the variety of tests available while largely ignoring formulas and calculations (now performed by SPSS) that consumed the time and attention of an earlier generation. In other words, shifting to new technologies that provide faster and reliable results leads to changes in behavior that may be difficult to ignore or control.
(1) 老師 單是 開會 也要 開 到 七至八點.
    Teachers only open-meeting (emphatic particle) open until 7 to 8 o’clock.
    Meaning: “Just for meetings alone, teachers have meetings until 7 to 8 o’clock.”
    GT Translation: “the teacher will only open seven to eight points in the meeting”

(2) Script 14: 半日制學校有很多好處 (existential)
    Meaning: “There are many benefits with half-day schooling.”
    GT Translation: “Half-day schools have many benefits.”

(3) Script 29: 家長們可以有更多時間去陪伴子女 (possessive)
    Meaning and GT translation: Parents can have more time to spend with their children

In an era of rapidly advancing artificial intelligence that the young generation has been reared on, there is little reason to believe that language learners will not take full advantage of the tools available to them. Thus, teaching strategies for reading and writing in a foreign language that incorporate GT need to be devised, assuming language teaching continues in its present form. These may include using GT as a tool for checking or enhancing both language and comprehension after students have either written or read a passage (Lee, 2019; O’Neill, 2019). Another possibility, as noted by some teachers in the present study, is to use GT to look up individual words or phrases; however, such a usage is only a short step away from the temptation to enter sentences or even whole texts into GT from the native tongue for translation into the target language. And this again raises the issue of motivation. Thus, teachers may increasingly need to
devise strategies, or new specific purposes, to encourage students to learn to write in a foreign language in the face of technology that can provide instant, accurate translations.

The assumption that language teaching will continue in its present form, mentioned above, is an issue raised by Crossley (2018), who speculates on GT’s impact. Specifically, he sees the potential for GT to “replac[e] the need and demand for FL [foreign language] learning in general ... [as well as] FL teachers” (p. 542) because of a lack of instrumental motivation on the part of students in FL (as opposed to second language) contexts. Although such a scenario would take years, or more likely, decades to transpire given entrenched curriculums that are wedded to English in FL contexts, in the meantime the disruptive impact MT has on students’ motivation and behavior towards foreign language learning may increasingly need to be addressed.

7.3. Limitations

The design of the present study, which had teachers scoring two sets of randomly distributed scripts, meant that teachers were unaware of the two sources. Thus, the teachers could grade the scripts naively and we could then observe their fresh impressions upon revealing the source of the two sets of scripts in order to answer one of our research questions. Inherent in this design, however, was our inability to gather accurate data about which set of scripts, GT-translated texts or not, the teachers’ general references referred to. It was entirely possible that during interviews, when teachers referred to a usage they had recalled in a script, it could have been from either source.

Other shortcomings concern possible advantages that the study design gave to GT’s translations. For example, having the students respond to the same topic twice raises the likelihood that the second version, which in this case was written in the native tongue and then translated by GT, would naturally be better.
However, we feel that most improvements would have been related to content, which was specifically excluded from the grading system. Another possible advantage we gave to GT was our cleaning of a very small number of characters written by the students in Chinese, which if left uncorrected would have resulted in gibberish if translated by GT. However, one could argue that they were already gibberish in the original version.

Teachers’ grades and interview comments provided valuable evidence for this study. Nevertheless, these two kinds of qualitative and quantitative data are subjective in nature. As some teachers pointed out, there may be a natural inclination to compare the quality of the scripts with their own students. Therefore, the grades reflected a relative, rather than an absolute quality. This may explain why grades differed to a large extent even within the same group of teachers. One teacher found himself conflicted in his grading between simpler compositions and more complex ones. For the former, it was easy to grade them higher as fewer mistakes were found. With an intention to reward students who challenged themselves to write in more complicated language, he found himself trying to grade these scripts higher even though they contained more errors. The concern of this teacher brings up the problem of equating “fewer errors” with “higher quality” when assessing students’ compositions. Future studies may investigate this issue further and include positive evidence such as language styles, formality and syntactic diversity when compiling marking rubrics.

As is customary in small-scale studies such as the present one, the findings are specific to the narrow group of student participants, as well as the subject area of the prompt and the languages being translated and assessed; thus, the findings are indicative only.
Nevertheless, given these indications, coupled with the rapid improvements in MT, particularly with companies such as Google, Facebook and now Baidu (Dai, 2018) racing to perfect their translation technology, we believe MT will continue to have an increasing impact on L2 writing in the coming years.

One further limitation was that we confined our focus to writing without exploring MT’s potential in the L2 reading class. With an increasing amount of reading material consumed in a digital form, the possibility of instantly using GT to translate large chunks of text may also be tempting for students.

8. Conclusion

GT uses a corpus comprising a large number of texts, widely ranging from official documents to detective novels, that are readily available in multiple languages to serve its purpose (Bellos, 2011). Thus, in some cases, the language that GT generated in our study was more formal and sophisticated than what Primary 6 students would normally produce. Comparing GT scripts with students’ compositions helps us to ask what it is that makes good writing good, and how to take advantage of GT’s strengths to help L2 students learn to write in their target language.

Finally, our era of rapidly advancing artificial intelligence, which includes MT, could bring disruptive changes to language learning and teaching. Presently, although it may be understandable that pedagogy lags behind as teachers struggle to learn how to best use and manage the new tools that appear, the recent improvements in MT as indicated in this study suggest that teachers of foreign languages need to quickly develop a broader realisation about a form of technology that is likely to have a significant impact on the teaching and learning of the written word in L2 contexts. Given how likely students are to take advantage of such a useful technology, there may be a need to rethink L2 writing pedagogy, in which teachers will be forced to consider how to adopt MT as a pedagogical tool.
Acknowledgements
The authors would like to thank the teachers at the Education University of Hong Kong Jockey Club Primary School. We also thank Jinxin Zhu for his help with statistics.
P. Stapleton, B. Leung Ka Kin / English for Specific Purposes 56 (2019) 18–34 29
Appendix A. Rubric

Grammar (assessing the accuracy of usage including common difficulties such as tense, subject-verb agreement, articles, plurals, complex sentences, etc.)
A  Error-free or minimal number of errors
B  Some error-free sentences but only minor errors in most sentences
C  Most sentences include minor errors with some major errors
D  Many errors throughout both minor and major

Vocabulary (assessing the range, appropriateness, and accuracy)
A  Uses a wide range of vocabulary appropriately and accurately
B  Uses a range of vocabulary mostly appropriately and accurately with a few errors
C  Uses a limited range of words with many minor errors and a few major errors
D  Uses a very limited range of words with many errors both minor and major

Comprehensibility (assessing the clarity of message)
A  Completely understandable
B  Mostly understandable with some minor ambiguities
C  Partially understandable with a few major ambiguities or incomprehensible expressions
D  Some sentences are understandable, but much of the script is beyond comprehension

Appendix B. Raw grades given by teachers on grammar

L1 to L6: local English teachers. N1 to N6: native English teachers.

No.  Script type  L1 L2 L3 L4 L5 L6 N1 N2 N3 N4 N5 N6
1    Non-GT       A  C  C  D  A  C  C  C  D  C  C  C
2    Non-GT       A  B  B  C  A  A  B  A  B  B  C  B
3    Non-GT       B  D  C  D  A  D  D  C  D  C  C  C
4    Non-GT       C  C  B  D  B  B  B  B  B  C  B  B
5    Non-GT       A  A  C  D  B  B  B  C  C  C  C  C
6    Non-GT       B  B  B  B  A  B  C  B  C  B  A  B
7    Non-GT       C  C  B  C  B  B  B  B  B  C  A  B
8    Non-GT       C  C  B  C  A  C  D  C  C  C  C  B
9    Non-GT       A  B  B  D  A  B  C  B  B  B  B  C
10   Non-GT       D  D  D  D  C  B  C  C  C  C  C  C
11   Non-GT       B  C  C  D  B  C  D  C  C  C  C  C
12   Non-GT       C  D  D  D  C  B  D  C  B  C  B  C
13   Non-GT       A  A  A  A  B  C  A  A  C  B  B  A
14   Non-GT       B  B  A  C  A  B  D  B  B  C  A  B
15   Non-GT       C  C  C  D  B  C  C  C  C  C  C  C
16   Non-GT       D  D  C  D  A  C  C  C  C  C  C  C
17   Non-GT       B  D  B  C  B  C  D  B  B  C  C  C
18   Non-GT       C  D  C  D  C  C  D  C  C  C  C  C
19   Non-GT       C  B  A  B  B  C  B  B  B  B  B  B
20   Non-GT       A  B  C  C  A  C  C  C  C  C  A  B
21   Non-GT       D  C  B  D  B  C  C  C  C  C  B  B
22   Non-GT       B  C  C  D  A  B  C  C  C  C  C  B
23   Non-GT       C  D  C  D  C  C  D  C  D  C  D  C
24   Non-GT       C  B  C  D  C  C  D  B  D  C  B  C
25   Non-GT       C  C  C  C  C  C  C  C  C  B  D  C
26   Non-GT       D  D  D  D  C  D  D  C  D  C  D  C
27   GT           B  A  A  A  A  A  B  B  B  A  A  B
28   GT           C  C  B  B  A  A  C  B  C  B  A  B
29   GT           C  A  C  C  B  C  A  A  A  B  B  B
30   GT           C  C  B  D  A  B  C  A  B  C  A  B
31   GT           D  C  C  D  C  C  B  B  B  C  B  B
32   GT           B  B  B  C  A  B  B  B  A  B  A  B
33   GT           C  C  A  C  A  C  B  B  A  B  A  B
34   GT           C  C  B  D  B  B  C  B  B  C  A  B
35   GT           B  A  A  B  A  A  B  B  A  A  A  B
36   GT           C  B  C  D  B  C  D  B  C  C  B  B
37   GT           C  C  B  C  A  B  C  C  A  B  A  B
38   GT           C  B  B  C  B  C  D  B  C  B  B  C
39   GT           B  A  B  A  A  D  A  C  B  A  C  B
40   GT           D  B  B  A  A  C  C  C  B  A  B  B
41   GT           A  A  B  B  A  B  A  A  B  B  A  B
42   GT           B  D  A  C  A  B  C  B  B  A  B  B
43   GT           D  D  B  D  C  D  D  C  C  C  B  C
44   GT           B  A  C  C  A  B  C  C  C  B  A  B
45   GT           D  C  B  C  B  C  C  C  C  A  B  B
46   GT           C  C  B  D  A  A  B  B  C  C  A  B
47   GT           D  B  C  D  B  C  C  B  C  B  B  C
48   GT           D  C  B  D  B  B  D  B  C  C  B  C

Appendix C. Raw grades given by teachers on vocabulary

L1 to L6: local English teachers. N1 to N6: native English teachers.

No.  Script type  L1 L2 L3 L4 L5 L6 N1 N2 N3 N4 N5 N6
1    Non-GT       C  C  C  D  C  C  D  C  C  C  B  C
2    Non-GT       A  B  B  B  A  B  B  A  B  B  A  B
3    Non-GT       B  D  C  D  C  D  D  C  C  C  B  C
4    Non-GT       C  B  B  C  B  A  B  B  B  B  B  B
5    Non-GT       B  A  C  C  B  C  B  B  B  B  B  B
6    Non-GT       B  C  C  C  B  B  C  B  C  C  A  B
7    Non-GT       C  B  B  B  B  A  B  B  B  B  A  A
8    Non-GT       C  B  C  B  B  C  C  B  B  B  B  B
9    Non-GT       B  C  A  C  A  B  B  C  B  B  B  B
10   Non-GT       B  B  C  C  B  A  C  A  B  B  B  B
11   Non-GT       C  B  C  D  B  C  C  C  C  C  C  B
12   Non-GT       C  C  C  C  B  A  C  B  B  B  B  B
13   Non-GT       C  C  C  C  C  B  A  A  C  B  B  B
14   Non-GT       B  B  A  B  B  A  D  B  B  C  A  B
15   Non-GT       B  B  C  C  B  B  B  C  B  B  B  B
16   Non-GT       C  C  C  D  B  C  C  C  C  C  C  B
17   Non-GT       C  C  C  C  B  C  C  B  B  B  B  B
18   Non-GT       B  C  C  C  D  C  D  B  C  B  C  B
19   Non-GT       C  B  A  B  B  C  B  A  C  B  B  B
20   Non-GT       B  B  C  D  B  B  B  C  B  B  B  B
21   Non-GT       C  B  C  D  C  C  C  B  C  C  B  C
22   Non-GT       B  B  B  C  B  B  B  B  B  C  B  B
23   Non-GT       C  C  C  D  C  C  D  C  C  C  B  B
24   Non-GT       C  C  C  D  B  C  D  B  C  B  B  B
25   Non-GT       C  C  C  C  C  B  C  C  D  B  C  B
26   Non-GT       C  C  C  D  C  C  D  C  B  C  C  B
27   GT           B  A  A  A  B  A  B  A  B  B  A  B
28   GT           B  B  B  B  A  A  B  C  B  B  A  A
29   GT           C  B  B  B  B  C  A  A  B  A  A  C
30   GT           C  C  B  C  B  B  C  A  B  B  B  B
31   GT           C  B  C  C  C  B  B  B  B  B  B  B
32   GT           C  C  C  C  B  C  A  B  B  B  A  B
33   GT           B  C  A  B  A  C  B  A  A  A  A  B
34   GT           C  D  B  D  C  A  C  B  B  B  A  B
35   GT           A  A  A  B  A  A  B  B  A  A  A  A
36   GT           C  D  C  D  C  C  D  B  C  C  B  C
37   GT           C  D  C  C  B  C  C  B  B  C  B  C
38   GT           C  D  B  B  B  B  C  A  C  B  A  B
39   GT           B  B  B  B  A  C  B  C  A  B  C  B
40   GT           C  B  C  A  C  C  C  B  B  A  C  B
41   GT           B  A  C  B  A  B  B  A  C  B  A  B
42   GT           A  B  A  B  A  A  B  C  B  A  C  B
43   GT           C  D  C  D  C  C  D  B  C  C  C  B
44   GT           B  C  C  C  A  B  C  C  B  B  C  B
45   GT           C  B  C  B  B  C  C  C  B  A  B  A
46   GT           C  C  B  D  B  A  B  B  B  C  B  B
47   GT           C  D  C  D  C  B  D  C  B  B  A  B
48   GT           C  C  A  D  B  B  C  A  B  B  C  B
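As a rough illustration of how the letter grades in these appendices could be summarised, the sketch below maps the rubric letters to numbers and averages the twelve teachers' grades for two scripts from Appendix C. The 4-to-1 point scale and the names GRADE_POINTS and script_mean are assumptions introduced here for illustration only; they are not taken from the study. A simple mean of ordinal grades is purely descriptive and is not the logit measure reported in Appendix E.

```python
from statistics import mean

# Assumed numeric mapping for the rubric letters (A best, D worst).
# The 4-to-1 scale is an illustrative choice, not taken from the study.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1}


def script_mean(grades: str) -> float:
    """Average the letter grades given to one script by all teachers."""
    return mean(GRADE_POINTS[g] for g in grades.split())


# Vocabulary grades for two scripts, copied from Appendix C
# (six local teachers followed by six native English teachers).
vocabulary = {
    2: ("Non-GT", "A B B B A B B A B B A B"),
    35: ("GT", "A A A B A A B B A A A A"),
}

# Print one line per script: number, script type, mean grade.
for number, (script_type, grades) in vocabulary.items():
    print(number, script_type, round(script_mean(grades), 2))
```

Other rows from Appendices B to D can be added to the dictionary in the same form to compare script types on each criterion.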
Appendix D. Raw grades given by teachers on comprehensibility

L1 to L6: local English teachers. N1 to N6: native English teachers.

No.  Script type  L1 L2 L3 L4 L5 L6 N1 N2 N3 N4 N5 N6
1    Non-GT       A  B  C  B  A  B  C  B  C  B  B  B
2    Non-GT       A  A  B  B  A  A  A  A  A  B  B  A
3    Non-GT       B  C  B  C  A  C  C  B  B  C  A  B
4    Non-GT       B  B  B  C  A  A  A  B  A  C  B  B
5    Non-GT       A  A  B  C  A  C  A  B  B  B  B  B
6    Non-GT       B  B  B  B  A  B  C  B  B  B  A  B
7    Non-GT       C  B  C  B  A  A  B  B  B  C  B  A
8    Non-GT       A  B  B  B  B  C  C  B  B  B  B  A
9    Non-GT       A  A  A  C  A  B  C  B  A  B  B  B
10   Non-GT       C  C  C  C  C  C  B  B  C  B  B  B
11   Non-GT       A  A  C  C  A  B  C  C  B  C  B  B
12   Non-GT       B  D  D  B  A  B  C  C  C  C  C  B
13   Non-GT       A  B  A  B  C  A  B  A  C  A  A  A
14   Non-GT       A  A  A  B  A  B  C  B  B  B  A  A
15   Non-GT       B  B  B  C  A  B  A  C  B  B  C  B
16   Non-GT       B  D  B  D  A  C  C  C  B  C  B  B
17   Non-GT       A  C  B  B  A  C  B  A  A  B  B  B
18   Non-GT       A  C  C  C  C  C  C  C  B  B  D  B
19   Non-GT       C  C  A  C  B  B  B  B  B  B  C  A
20   Non-GT       A  B  C  C  A  B  B  B  B  B  A  B
21   Non-GT       B  C  C  D  B  C  A  B  B  C  B  B
22   Non-GT       A  C  B  B  A  B  B  B  C  B  B  A
23   Non-GT       B  C  C  D  B  B  C  C  C  C  B  B
24   Non-GT       B  C  C  D  A  C  D  B  C  C  B  B
25   Non-GT       B  B  B  B  C  B  A  C  B  B  C  B
26   Non-GT       B  C  D  C  C  C  D  C  B  C  C  B
27   GT           B  A  A  A  A  A  A  A  A  A  A  B
28   GT           B  C  B  A  A  B  C  C  C  B  B  B
29   GT           C  A  C  D  B  B  A  A  A  B  B  B
30   GT           B  C  B  D  A  C  C  A  B  B  C  B
31   GT           C  B  D  D  C  C  A  B  B  B  C  B
32   GT           A  B  C  C  A  B  A  B  A  B  B  B
33   GT           C  B  A  D  A  D  B  A  A  B  B  B
34   GT           B  D  B  C  C  B  B  B  A  C  C  B
35   GT           A  A  A  B  A  A  A  A  B  A  B  B
36   GT           C  C  D  C  C  D  D  B  B  C  D  B
37   GT           C  C  C  C  A  C  B  B  B  B  B  B
38   GT           B  D  B  C  B  C  D  B  B  B  C  B
39   GT           A  A  A  A  B  C  A  C  A  A  C  B
40   GT           A  B  B  A  B  C  B  B  B  A  B  B
41   GT           A  A  C  B  A  B  A  A  B  B  A  B
42   GT           B  D  A  C  A  B  C  C  B  B  C  B
43   GT           C  D  C  D  C  C  B  B  B  C  B  B
44   GT           B  C  C  C  A  B  C  C  B  B  C  B
45   GT           A  C  B  B  B  B  B  B  B  A  B  B
46   GT           B  C  B  D  B  A  B  C  C  C  A  B
47   GT           A  C  C  C  B  C  B  B  B  B  B  B
48   GT           C  B  B  D  B  B  C  C  C  C  D  B

Appendix E. Results of non-GT (N = 26) and GT scripts (N = 22) across three grading criteria

All measures are in logits.

No.  Script type  Grammar  Vocabulary  Comprehensibility
1    Non-GT       +0.25    0.48        0.68
2    Non-GT       +1.27    +2.13       0.09
3    Non-GT       +3.68    +1.88       +1.71
4    Non-GT       +1.27    0.24        0.09
5    Non-GT       0.68     0.71        0.89
6    Non-GT       +1.55    +1.39       +0.1
7    Non-GT       +2.82    +1.88       +1.18
8    Non-GT       +0.02    0.01        0.47
9    Non-GT       +1.27    +1.63       +1.00
10   Non-GT       1.38     1.96        2.22
11   Non-GT       +2.15    +2.39       +0.29
12   Non-GT       +0.75    +0.45       0.68
13   Non-GT       1.14     +0.22       0.47
14   Non-GT       +1.27    +1.15       +1.18
15   Non-GT       +1.55    0.01        +1.18
16   Non-GT       0.68     1.44        0.28
17   Non-GT       1.14     0.71        +0.1
18   Non-GT       +0.75    0.48        0.47
19   Non-GT       +0.75    0.01        0.28
20   Non-GT       0.91     0.95        1.35
21   Non-GT       +0.75    +1.39       +0.64
22   Non-GT       0.22     0.71        0.68
23   Non-GT       +2.15    +1.39       +1.18
24   Non-GT       +1.84    0.71        +1.00
25   Non-GT       1.90     0.01        1.35
26   Non-GT       +1.27    +1.39       +0.10
27   GT           +0.02    0.24        0.47
28   GT           1.14     0.01        0.68
29   GT           1.14     1.69        +0.10
30   GT           3.26     3.63        2.22
31   GT           0.68     +0.22       +0.64
32   GT           +0.25    +1.15       +0.29
33   GT           0.68     0.01        +0.47
34   GT           +1.84    +2.13       +1.35
35   GT           2.49     1.44        1.35
36   GT           0.68     0.24        +0.47
37   GT           1.38     1.19        0.28
38   GT           +0.5     0.01        +1.35
39   GT           0.91     1.69        +0.47
40   GT           3.26     2.52        3.11
41   GT           0.22     +0.22       +0.64
42   GT           +0.02    0.24        +1.00
43   GT           1.38     1.96        0.09
44   GT           +0.75    +2.13       +2.09
45   GT           +0.50    0.01        +1.00
46   GT           1.63     0.24        0.68
47   GT           0.91     +1.15       +0.47
48   GT           1.38     0.71        1.11

Appendix F. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.esp.2019.07.001.

References
Bahri, H., & Mahadi, T. S. T. (2016). Google translate as a supplementary tool for learning Malay: A case study at Universiti Sains Malaysia. Advances in Language and Literary Studies, 7(3), 162-167.
Bellos, D. (2011). Is that a fish in your ear?: Translation and the meaning of everything. New York: Farrar, Straus and Giroux.
Boone, W. J., Staver, J. R., & Yale, M. S. (2013). Rasch analysis in the human sciences. Retrieved from https://doi.org/10.1007/978-94-007-6857-4_20.
Clifford, J., Merschel, L., & Munné, J. (2013). Surveying the landscape: What is the role of machine translation in language learning? Revista d’innovació Educativa, 10, 108-121. https://doi.org/10.7203/attic.10.2228.
Crossley, S. A. (2018). Technological disruption in foreign language teaching: The rise of simultaneous machine translation. Language Teaching, 51(4), 141-152.
Dai, S. (2018, October 24). Baidu to debut simultaneous machine translation in latest challenge to Google. Retrieved from https://www.scmp.com/tech/start-ups/article/2169832/baidu-debut-simultaneous-machine-translation-latest-challenge-google.
Farzi, R. (2016). Taming translation technology for L2 writing: Documenting the use of free online translation tools by ESL students in a writing course. Doctoral dissertation. Université d’Ottawa/University of Ottawa.
Gehring, J., & Auli, M. (2017). A novel approach to neural machine translation. Retrieved from https://code.fb.com/ml-applications/a-novel-approach-to-neural-machine-translation/.
Google AI Blog. (2016). A neural network for machine translation, at production scale. Retrieved from https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html.
Groves, M., & Mundt, K. (2015). Friend or foe? Google translate in language for academic purposes. English for Specific Purposes, 37(1), 112-121. https://doi.org/10.1016/j.esp.2014.09.001.
Jolley, J. R., & Maimone, L. (2015). Free online machine translation: Use and perceptions by Spanish students and instructors. In A. J. Moeller (Ed.), Selected papers from the 2015 Central States Conference on the Teaching of Foreign Languages: Learn languages, explore cultures, transform lives (pp. 181-200). WI: Eau Claire.
Josefsson, E. (2011). Contemporary approaches to translation in the classroom: A study of students’ attitudes and strategies. Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:du-5929.
Kirchhoff, K., Turner, M., Axelrod, A., & Saavedra, F. (2011). Application of statistical machine translation to public health information: A feasibility study. Journal of the American Medical Informatics Association, 18(4), 473-478. Retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3128406/.
Lee, S. M. (2019). The impact of using machine translation on EFL students’ writing. Computer Assisted Language Learning. https://doi.org/10.1080/09588221.2018.1553186.
Li, Y. T. (2004). Writing characteristics of Taiwanese students with handwriting difficulties. Journal of National Taiwan Normal University: Education, 49(2), 43-64. https://doi.org/10.3966/2073753X2004104902003.
Matthews, S., & Yip, V. (1994). Cantonese: A comprehensive grammar. London: Routledge.
Mundt, K., & Groves, M. J. (2016). A double edged sword: The merits and the policy implications of Google translate in higher education. European Journal of Higher Education, 6(3), 1-15. https://doi.org/10.1080/21568235.2016.1172248.
Nagao, M. (1984). A framework of a mechanical translation between Japanese and English by analogy principle. In A. Elithorn, & R. Banerji (Eds.), Artificial and human intelligence. Retrieved from http://www.mt-archive.info/Nagao-1984.pdf.
O’Neill, E. M. (2019). Online translator, dictionary, and search engine use among L2 students. CALL-EJ, 20(1), 154-177. Retrieved from http://callej.org/journal/20-1/O’Neill2019.pdf.
van Rensburg, A., Snyman, C., & Lotz, S. (2012). Applying Google Translate in a higher education environment: Translation products assessed. Southern African Linguistics and Applied Language Studies, 3(4), 511-524. Retrieved from https://doi.org/10.2989/16073614.2012.750824.
Sheppard, F. (2011). Medical writing in English: The problem with Google translate. La Presse Médicale, 40(6), 565-566. Retrieved from http://www.em-consulte.com/en/article/293595.
Slocum, J. (1988). Machine translation systems. New York: Cambridge University Press.
Smagorinsky, P. (2008). The method section as conceptual epicenter in constructing social science research reports. Written Communication, 25, 389-411.

Paul Stapleton is an Associate Professor at the Education University of Hong Kong.
Leung Ka Kin Becky is a Research Assistant at the Education University of Hong Kong.