Jan 5, 2026

XStoryCloze: Multilingual Benchmark for Narrative Understanding

XStoryCloze is now available on AIOZ AI, the collaborative marketplace powered by AIOZ DePIN.

This dataset is a professional translation of the English StoryCloze benchmark into 10 non-English languages. It is designed for cross-lingual evaluation of narrative understanding, with a focus on zero- and few-shot performance in multilingual language models.

The benchmark provides aligned story completion tasks that test coherence, commonsense reasoning, and generative performance across diverse languages. It is hosted on AIOZ AI for Text-to-Text Generation and evaluation tasks.

Try it now: https://aiozai.network/datasets/41645d01-e7c4-4f2d-a18a-160eff631265

How It Works

The dataset consists of multiple-choice story cloze tasks: given a four-sentence context, the model must select the correct ending from two candidates.
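The selection step can be sketched as follows. This is a minimal illustration, not the dataset's official tooling: the `ClozeItem` fields are hypothetical names rather than the actual column names, the story is invented for the example, and `word_overlap` is a toy stand-in for a real model score such as the log-likelihood of each ending given the context.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ClozeItem:
    # Field names are illustrative assumptions, not the dataset's actual schema.
    context: List[str]  # the four context sentences
    endings: List[str]  # the two candidate endings
    label: int          # index (0 or 1) of the correct ending


def pick_ending(item: ClozeItem, score: Callable[[str, str], float]) -> int:
    """Return the index of the ending that `score` rates highest."""
    story = " ".join(item.context)
    scores = [score(story, ending) for ending in item.endings]
    return max(range(len(scores)), key=scores.__getitem__)


def word_overlap(story: str, ending: str) -> float:
    """Toy scorer: number of lowercase words shared by story and ending."""
    story_words = set(story.lower().replace(".", "").split())
    ending_words = set(ending.lower().replace(".", "").split())
    return float(len(story_words & ending_words))


# Invented example item, not taken from the dataset.
item = ClozeItem(
    context=[
        "Maya planted tomato seeds in her garden.",
        "She watered them every morning.",
        "After a few weeks, green shoots appeared.",
        "By summer the plants were heavy with fruit.",
    ],
    endings=[
        "Maya picked the ripe tomatoes from her garden.",
        "Maya threw away her bicycle.",
    ],
    label=0,
)

prediction = pick_ending(item, word_overlap)
print("correct" if prediction == item.label else "wrong")
```

In a real evaluation, `score` would wrap a language model call; keeping it as a parameter means the same selection logic can be reused for every model and language.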

Data is aligned line by line across languages, so the same row index refers to the same story in every language. Each language includes a train split (360 examples) for few-shot prompting and an evaluation split (1,511 examples) for testing.

Because every language shares the same items and format, results on this constrained narrative reasoning task can be compared directly across languages.
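One way to use the train split for few-shot prompting is sketched below. The record layout (`context`, `endings`, `label`), the prompt template, and the placeholder stories are all assumptions made for illustration; the dataset itself does not prescribe a prompt format.

```python
import random
from typing import Any, Dict, List

# Hypothetical record layout; the real column names may differ.
Example = Dict[str, Any]


def format_example(ex: Example, with_answer: bool) -> str:
    """Render one item; demonstrations include the answer letter, the query does not."""
    answer = " " + "AB"[ex["label"]] if with_answer else ""
    lines = [
        "Story: " + " ".join(ex["context"]),
        "A: " + ex["endings"][0],
        "B: " + ex["endings"][1],
        "Answer:" + answer,
    ]
    return "\n".join(lines)


def build_prompt(train: List[Example], query: Example, k: int, seed: int = 0) -> str:
    """Sample k demonstrations from the train split and append the unanswered query."""
    rng = random.Random(seed)  # fixed seed keeps the sampled shots reproducible
    shots = rng.sample(train, k)
    blocks = [format_example(s, with_answer=True) for s in shots]
    blocks.append(format_example(query, with_answer=False))
    return "\n\n".join(blocks)


# Placeholder items standing in for rows of the train and evaluation splits.
train = [
    {"context": ["S1.", "S2.", "S3.", "S4."], "endings": ["Good end.", "Bad end."], "label": 0},
    {"context": ["T1.", "T2.", "T3.", "T4."], "endings": ["Odd end.", "Fine end."], "label": 1},
    {"context": ["U1.", "U2.", "U3.", "U4."], "endings": ["Right end.", "Wrong end."], "label": 0},
]
query = {"context": ["Q1.", "Q2.", "Q3.", "Q4."], "endings": ["End one.", "End two."], "label": 1}

prompt = build_prompt(train, query, k=2)
print(prompt)
```

Setting `k=0` yields a zero-shot prompt containing only the query, so the same helper covers both evaluation settings.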

Input: A four-sentence story context and two possible endings
Output: Selection of the coherent, correct ending (multiple-choice selection)
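Given this input and output format, scoring reduces to comparing predicted ending indices with gold labels. The sketch below computes accuracy per language and uses the cross-lingual alignment to find items answered correctly in every language. It assumes that aligned items share the same gold label across languages; the predictions are toy placeholders, not real model results.

```python
from typing import Dict, List


def accuracy(predictions: List[int], labels: List[int]) -> float:
    """Fraction of items where the predicted ending index matches the gold label."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def consistent_items(preds_by_lang: Dict[str, List[int]], labels: List[int]) -> List[int]:
    """Indices of aligned items answered correctly in every language."""
    return [
        i
        for i, y in enumerate(labels)
        if all(preds[i] == y for preds in preds_by_lang.values())
    ]


# Toy placeholder values for illustration only.
labels = [0, 1, 1, 0]
preds_by_lang = {
    "en": [0, 1, 1, 0],
    "es": [0, 1, 0, 0],
    "sw": [0, 0, 0, 0],
}

per_language = {lang: accuracy(p, labels) for lang, p in preds_by_lang.items()}
shared_correct = consistent_items(preds_by_lang, labels)
print(per_language)
print(shared_correct)
```

Item-level comparison of this kind is only meaningful because rows are aligned; with unaligned data, only aggregate accuracies could be compared.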

Ideal Use Cases

Zero- and few-shot evaluation of multilingual language models
Benchmarking cross-lingual transfer in narrative reasoning and coherence
Studying commonsense understanding and story plausibility across languages

License

Released under the Creative Commons CC BY 4.0 license, with credit to Xi Victoria Lin, Todor Mihaylov, and collaborators for creating and maintaining the dataset.

Get Started

With XStoryCloze now available on AIOZ AI, researchers have ready access to a multilingual benchmark for narrative understanding.

Download the dataset and integrate it into your multilingual model evaluations to measure cross-lingual reasoning.