European Parliament Proceedings Parallel Corpus 1996-2011 Statistical Machine Translation Corpus

Date: 7 years ago
Size: 3.75 GB
Organization: University of Edinburgh
Publish URL: www.statmt.org

The European Parliament Proceedings Parallel Corpus 1996-2011 (Europarl) is a parallel corpus for statistical machine translation. It is derived from the proceedings of the European Parliament and covers 21 European languages:

  • Romance (French, Italian, Spanish, Portuguese, Romanian)
  • Germanic (English, Dutch, German, Danish, Swedish)
  • Slavic (Bulgarian, Czech, Polish, Slovak, Slovenian)
  • Finno-Ugric (Finnish, Hungarian, Estonian)
  • Baltic (Latvian, Lithuanian)
  • Greek
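
The language groups above map onto two-letter ISO 639-1 codes, which also name the per-language archives in the file listing (for example `de-en.tgz`). The following is a minimal sketch, assuming standard ISO 639-1 codes and English as the pivot language; `FAMILIES`, `all_languages`, and `pair_archives` are illustrative names, not part of the corpus distribution.

```python
# Language families from the list above, keyed to ISO 639-1 codes.
FAMILIES = {
    "Romance": ["fr", "it", "es", "pt", "ro"],
    "Germanic": ["en", "nl", "de", "da", "sv"],
    "Slavic": ["bg", "cs", "pl", "sk", "sl"],
    "Finno-Ugric": ["fi", "hu", "et"],
    "Baltic": ["lv", "lt"],
    "Greek": ["el"],
}


def all_languages(families=FAMILIES):
    """Return a sorted list of every language code across all families."""
    return sorted(code for codes in families.values() for code in codes)


def pair_archives(families=FAMILIES, pivot="en"):
    """Return the expected '<xx>-<pivot>.tgz' name for each non-pivot language."""
    return [f"{code}-{pivot}.tgz" for code in all_languages(families) if code != pivot]


if __name__ == "__main__":
    print(len(all_languages()))  # number of languages in the corpus
    print(pair_archives())       # one archive name per language paired with English
```

Keeping the codes grouped by family makes it easy to select, say, only the Slavic pairs when building a training set.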

The corpus was first released in 2005 by the School of Informatics at the University of Edinburgh, Scotland, with Philipp Koehn as its principal author.

The 7th edition of the dataset was released in 2012. The accompanying paper is "Europarl: A Parallel Corpus for Statistical Machine Translation".

European_Parliament_Proceedings_Parallel_Corpus_1996-2011.torrent
Seeding: 3 | Downloading: 0 | Completed: 1,015 | Total Downloads: 1,558
  • European_Parliament_Proceedings_Parallel_Corpus_1996-2011/
    • README.md (1.55 KB)
    • README.txt (3.11 KB)
    • data/
      • bg-en.tgz (40.62 MB)
      • cs-en.tgz (99.8 MB)
      • da-en.tgz (278.8 MB)
      • de-en.tgz (467.42 MB)
      • el-en.tgz (611.8 MB)
      • es-en.tgz (797.83 MB)
      • et-en.tgz (854.43 MB)
      • europarl.tgz (2.3 GB)
      • fi-en.tgz (2.47 GB)
      • fr-en.tgz (2.66 GB)
      • hu-en.tgz (2.72 GB)
      • it-en.tgz (2.9 GB)
      • lt-en.tgz (2.95 GB)
      • lv-en.tgz (3.01 GB)
      • nl-en.tgz (3.2 GB)
      • pl-en.tgz (3.25 GB)
      • pt-en.tgz (3.44 GB)
      • ro-en.tgz (3.47 GB)
      • sk-en.tgz (3.53 GB)
      • sl-en.tgz (3.58 GB)
      • sv-en.tgz (3.75 GB)

The sizes shown appear to be running totals within the torrent rather than individual file sizes: they increase monotonically down the list, and the last value equals the 3.75 GB total.
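
Each `xx-en.tgz` archive in `data/` is a gzip-compressed tarball for one language pair. The sketch below shows one way to unpack a pair and iterate over aligned sentences using only the Python standard library. It assumes the archive holds two plain-text UTF-8 files whose names end in `.<src>` and `.en` (for example `europarl-v7.de-en.de` and `europarl-v7.de-en.en`), with line i of one file aligned to line i of the other; that layout is not stated on this page, so check the README before relying on it. `extract_pair` and `read_pairs` are hypothetical helper names.

```python
import shutil
import tarfile
from pathlib import Path


def extract_pair(archive_path, out_dir, src, tgt="en"):
    """Copy the two aligned text files out of a pair archive.

    Assumption: the archive has one member ending in '.<src>' and one
    ending in '.<tgt>'. Returns (src_path, tgt_path).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            suffix = member.name.rsplit(".", 1)[-1]
            if member.isfile() and suffix in (src, tgt):
                # Write under a name we choose instead of trusting archive paths.
                target = out_dir / f"{src}-{tgt}.{suffix}"
                with tar.extractfile(member) as fin, open(target, "wb") as fout:
                    shutil.copyfileobj(fin, fout)
                paths[suffix] = target
    if set(paths) != {src, tgt}:
        raise ValueError(f"expected members ending in .{src} and .{tgt}")
    return paths[src], paths[tgt]


def read_pairs(src_path, tgt_path, limit=None):
    """Yield (source, target) sentence pairs, assuming line-by-line alignment."""
    with open(src_path, encoding="utf-8") as fs, open(tgt_path, encoding="utf-8") as ft:
        for i, (s, t) in enumerate(zip(fs, ft)):
            if limit is not None and i >= limit:
                return
            yield s.rstrip("\n"), t.rstrip("\n")
```

A typical call would be `src, tgt = extract_pair("data/de-en.tgz", "extracted", "de")` followed by `for de, en in read_pairs(src, tgt, limit=5): ...`. Extracting to disk first avoids interleaved reads from two members of a compressed tarball, which can be very slow.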
