Enhancing Neural Machine Translation of Low-Resource Languages: Corpus Development, Human Evaluation and Explainable AI Architectures (2403.01580v1)
Abstract: In the current machine translation (MT) landscape, the Transformer architecture stands out as the gold standard, especially for high-resource language pairs. This thesis examines its efficacy for low-resource language pairs, including English$\leftrightarrow$Irish and English$\leftrightarrow$Marathi. The study identifies the hyperparameters and subword model type that significantly improve the translation quality of Transformer models for low-resource language pairs. The scarcity of parallel datasets for low-resource languages can hinder MT development. To address this, gaHealth, the first bilingual corpus of health data for the Irish language, was developed. In the health domain, models developed using this in-domain dataset achieved significantly higher BLEU scores than models from the LoResMT2021 Shared Task. A subsequent human evaluation using the Multidimensional Quality Metrics (MQM) error taxonomy showed that the Transformer system produced fewer accuracy and fluency errors than an RNN-based counterpart. Furthermore, this thesis introduces adaptNMT and adaptMLLM, two open-source applications that streamline the development, fine-tuning, and deployment of neural machine translation models. These tools considerably simplify setup and evaluation, making MT more accessible to both developers and translators. adaptNMT, built on the OpenNMT ecosystem, also promotes eco-friendly natural language processing research by highlighting the environmental footprint of model development. Fine-tuning multilingual language models (MLLMs) with adaptMLLM improved translation performance for two low-resource language pairs, English$\leftrightarrow$Irish and English$\leftrightarrow$Marathi, compared to baselines from the LoResMT2021 Shared Task.
Author: Séamus Lankford