XQuAD: A Parallel Benchmark for Cross-Lingual Question Answering

What is XQuAD?

XQuAD (Cross-lingual Question Answering Dataset) is a compact, fully parallel benchmark for measuring how well QA models transfer across languages. It features:

  • 240 English paragraphs and 1,190 question–answer pairs from SQuAD v1.1
  • Professional translations into 10 additional languages: Spanish, German, Greek, Russian, Turkish, Arabic, Vietnamese, Thai, Chinese, and Hindi
  • 11 aligned versions of every example for consistent cross-lingual evaluation

What’s Inside

Every entry has four parts: the context passage, the question, the answer text with its character-level start position, and a unique ID.

Split: one validation set per language (1,190 items each) to keep results directly comparable.
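As a concrete illustration, here is a minimal sketch of reading the entries described above, assuming the aligned files follow the standard SQuAD v1.1 JSON layout that XQuAD inherits (`data → paragraphs → qas`); the file path in the usage comment is hypothetical.

```python
import json

def iter_examples(path):
    """Yield (id, context, question, answer_text, answer_start) tuples
    from a SQuAD-v1.1-format JSON file (the layout XQuAD inherits)."""
    with open(path, encoding="utf-8") as f:
        squad = json.load(f)
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                # Take the first listed answer for each question
                ans = qa["answers"][0]
                yield (qa["id"], context, qa["question"],
                       ans["text"], ans["answer_start"])

# Hypothetical usage, one language file at a time:
# for ex_id, context, question, answer, start in iter_examples("xquad.es.json"):
#     assert context[start:start + len(answer)] == answer
```

Because every language version is aligned, iterating two files in parallel yields the same IDs in the same order, which makes per-example cross-lingual comparison straightforward.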

Why it Matters

  • Cross-Lingual Evaluation: Test zero-shot or fine-tuned models on identical content in multiple languages.
  • Transfer Learning Research: Analyze how English-trained encoders perform when the language of the context and question changes.
  • Lightweight Baseline: With ~1,200 examples per language, a full XQuAD pass runs quickly, making multilingual performance gaps easy to surface.
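Scoring a model on XQuAD typically uses the SQuAD metrics, exact match and token-level F1. The sketch below uses a simplified normalization (lowercasing and whitespace tokenization); the official SQuAD evaluation script additionally strips punctuation and English articles, so treat this as an approximation.

```python
from collections import Counter

def normalize(text):
    """Simplified normalization: lowercase and split on whitespace.
    (The official SQuAD script also strips punctuation and articles.)"""
    return text.lower().split()

def exact_match(prediction, gold):
    """1.0 if the normalized prediction equals the normalized gold answer."""
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction, gold):
    """Token-level F1 between a predicted span and the gold answer."""
    pred_tokens, gold_tokens = normalize(prediction), normalize(gold)
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Averaging these two scores over the 1,190 examples of each language file gives the per-language numbers usually reported for XQuAD.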

License

Released under the Creative Commons Attribution-ShareAlike 4.0 International License.

Get started

Benchmark your multilingual QA model today.

Unlock XQuAD on AIOZ AI and drop the aligned files directly into your evaluation pipeline.