<p>The model checkpoints contained here are associated with an ACL 2023 paper entitled "MTCue: Learning Zero-Shot Control of Extra-Textual Attributes by Leveraging Unstructured Context in Neural Machine Translation" (the full citation will be added once the Proceedings are published).</p>
<p><br></p>
<p>Each .zip file here contains checkpoints for a baseline translation model and for MTCue, for the language pair given in the file name (e.g. en.de is the English-to-German pair). How to use them is described in detail in the associated GitHub repository.</p>
<p><br></p>
<p>The models (checkpoints.zip) were trained with PyTorch using the Fairseq toolkit:</p>
<p>Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., … Chintala, S. (2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. In <em>Advances in Neural Information Processing Systems</em>, 32 (NeurIPS 2019).</p>
<p><br></p>
<p>Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, and Michael Auli. 2019. <a href="https://aclanthology.org/N19-4009" target="_blank">fairseq: A Fast, Extensible Toolkit for Sequence Modeling</a>. In <em>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)</em>, pages 48–53, Minneapolis, Minnesota. Association for Computational Linguistics.</p>
<p><br></p>
<p>Full documentation on how to use the resources is included in the <a href="https://github.com/st-vincent1/MTCue" target="_blank">GitHub repository</a>, which contains a link back to this ORDA page.</p>
Funding
UKRI Centre for Doctoral Training in Speech and Language Technologies and their Applications
Engineering and Physical Sciences Research Council