Many languages lack culturally-specific evaluation datasets created by members of the language communities themselves. This year's shared task at the Multilingual Representation Learning (MRL) workshop invited contributors, for example researchers who natively speak non-English language(s), to create a manually-annotated physical commonsense reasoning evaluation dataset for their language(s). The format follows PIQA, a physical commonsense reasoning benchmark in which each example consists of a prompt with two candidate completions ("solutions"). The result is Global PIQA, a collaboratively constructed multilingual physical commonsense reasoning benchmark with broad language coverage and culturally-specific examples for different languages.
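For concreteness, here is a minimal sketch of what a PIQA-style example might look like. The field names (`prompt`, `solution0`, `solution1`, `label`) and the example text are purely illustrative assumptions, not the official Global PIQA schema:

```python
# Illustrative PIQA-style example: a prompt, two candidate solutions,
# and a label marking the physically sensible completion.
# Field names and content are assumptions for illustration only.
example = {
    "prompt": "To keep rice from sticking to the pot, ...",
    "solution0": "rinse the rice in cold water before cooking.",
    "solution1": "add a spoonful of sugar to the boiling water.",
    "label": 0,  # index of the correct (physically plausible) solution
}

def is_correct(prediction: int, ex: dict) -> bool:
    """Check a model's choice (0 or 1) against the annotated label."""
    return prediction == ex["label"]

print(is_correct(0, example))  # True
```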
All authors of accepted submissions had the option to be included on the resulting benchmark paper.
The shared task has concluded; however, there is still an opportunity to contribute to Global PIQA! We will be accepting submissions for any language or variety that is not currently represented in Global PIQA. We especially invite submissions for low-resource languages and non-prestige varieties. Fill out this form to register your interest in contributing.