Language data for AI

At Oxford Languages, our language data specialists build unique text and speech datasets suitable for model training and other natural language processing (NLP) applications.

We offer

Human-curated data

Human-curated language datasets in over 60 languages.

Extensive data features

Dataset features to support a wide variety of use cases.

Data sourcing

Access to specialists in language, data, and product sourcing.

Data support

Support from our Customer Success team, to help you get the most value from our data.

Use cases we support

Machine translation

We offer datasets in over 60 languages, including bilingual datasets, to support machine translation.

AI voice generator

Our pronunciation datasets provide lexical transcriptions and audio to improve text-to-speech and AI dubbing applications.

Conversational AI

Our language databases are designed to help with natural language understanding, enabling models to learn languages and interpret meaning accurately.

Predictive text

Our datasets support predictive text on onscreen keyboards and in AI writing assistant applications. Sensitivity labels in our data can be used to improve the handling of offensive, vulgar, or demeaning language, while dialect labels can improve text prediction for regional dialects and other language varieties.
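As a rough illustration of how sensitivity and dialect labels might be used in a predictive-text pipeline, the Python sketch below filters a list of candidate suggestions. Everything in it is an assumption made for the example: the `LexiconEntry` structure, the label names, the dialect codes, and the sample entries are hypothetical and do not reflect the actual schema of any Oxford Languages dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    """Hypothetical record; not the real dataset schema."""
    word: str
    sensitivity: frozenset = frozenset()  # e.g. {"offensive"}
    dialects: frozenset = frozenset()     # empty means not region-specific


# Labels this sketch treats as unsuitable for suggestions (assumed names).
BLOCKED = frozenset({"offensive", "vulgar", "demeaning"})

# Tiny made-up lexicon, keyed by word, for illustration only.
LEXICON = {
    e.word: e
    for e in [
        LexiconEntry("the"),
        LexiconEntry("colour", dialects=frozenset({"en-GB"})),
        LexiconEntry("color", dialects=frozenset({"en-US"})),
        LexiconEntry("exampleslur", sensitivity=frozenset({"offensive"})),
    ]
}


def filter_suggestions(candidates, lexicon, user_dialect):
    """Keep candidates that are known, unflagged, and fit the user's dialect."""
    kept = []
    for word in candidates:
        entry = lexicon.get(word)
        if entry is None:
            continue  # unknown words are skipped in this sketch
        if entry.sensitivity & BLOCKED:
            continue  # carries at least one blocked sensitivity label
        if entry.dialects and user_dialect not in entry.dialects:
            continue  # region-specific word for a different dialect
        kept.append(word)
    return kept


suggestions = filter_suggestions(
    ["color", "colour", "exampleslur", "the"], LEXICON, "en-GB"
)
print(suggestions)
```

A production keyboard would more likely down-rank flagged or out-of-dialect words than drop them outright; hard filtering is used here only to keep the sketch short.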

AI writing assistant tools

Oxford Languages offers datasets that can aid writing tools in suggesting grammar, spelling, and vocabulary improvements.

Get in touch

If you would like more information about our datasets and services or have any questions, please feel free to contact us.

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide.