The University of Sheffield
8 files

  • ru.en.zip (1.47 GB)
  • en.ru.zip (1.47 GB)
  • pl.en.zip (1.47 GB)
  • en.pl.zip (1.47 GB)
  • fr.en.zip (1.47 GB)
  • en.fr.zip (1.47 GB)
  • de.en.zip (1.47 GB)
  • en.de.zip (1.47 GB)

MTCue: Model Checkpoints

dataset
Posted on 2023-06-20, 08:05. Authored by Sebastian Vincent, Robert Flynn, Carolina Scarton

The model checkpoints contained here are associated with an ACL 2023 paper entitled "MTCue: Learning Zero-Shot Control of Extra-Textual Attributes by Leveraging Unstructured Context in Neural Machine Translation" (citation is to be added when the Proceedings are published).


Each .zip file here contains a checkpoint for a baseline translation model and for MTCue, for the language pair given in the file name (e.g. en.de is the English-to-German language pair). Their use is described in detail in the associated GitHub repository.
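The naming convention above can be sketched in a few lines of Python. This is only an illustration, not part of the released code: the helper names (`parse_archive_name`, `describe`, `list_archive`) are hypothetical, the mapping from two-letter codes to language names assumes the archives use standard ISO 639-1 codes, and nothing is assumed about the layout inside each archive, which is documented in the GitHub repository.

```python
import zipfile
from pathlib import Path

# Assumption: the two-letter codes in the archive names are ISO 639-1 codes.
LANGUAGE_NAMES = {
    "en": "English",
    "de": "German",
    "fr": "French",
    "pl": "Polish",
    "ru": "Russian",
}


def parse_archive_name(filename: str) -> tuple[str, str]:
    """Split a name such as 'en.de.zip' into (source, target) language codes."""
    parts = Path(filename).name.split(".")
    if len(parts) != 3 or parts[2] != "zip":
        raise ValueError(f"unexpected archive name: {filename!r}")
    return parts[0], parts[1]


def describe(filename: str) -> str:
    """Return a readable label for the translation direction of an archive."""
    src, tgt = parse_archive_name(filename)
    # Fall back to the raw code if it is not in the table above.
    return f"{LANGUAGE_NAMES.get(src, src)}-to-{LANGUAGE_NAMES.get(tgt, tgt)}"


def list_archive(path: str) -> list[str]:
    """List the members of a downloaded archive without extracting it."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

For example, `describe("en.de.zip")` gives the English-to-German direction named above, and `list_archive` can be used to inspect a downloaded file before following the repository's instructions.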


The models (checkpoints.zip) were trained in PyTorch using the fairseq toolkit:

Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., … Chintala, S. (2019). PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32.


Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, and Michael Auli. 2019. fairseq: A Fast, Extensible Toolkit for Sequence Modeling. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations), pages 48–53, Minneapolis, Minnesota. Association for Computational Linguistics.


Full documentation on how to use the resources is included in the [GitHub repository](https://github.com/st-vincent1/MTCue), which contains a link to this ORDA page.

Funding

UKRI Centre for Doctoral Training in Speech and Language Technologies and their Applications

Engineering and Physical Sciences Research Council


Ethics

  • There is no personal data or any that requires ethical approval

Policy

  • The data complies with the institution and funders' policies on access and sharing

Sharing and access restrictions

  • The uploaded data can be shared openly

Data description

  • The file formats are open or commonly used

Methodology, headings and units

  • There is a file including methodology, headings and units, such as a readme.txt


    School of Computer Science
