
FQuAD: French Question Answering Dataset (2002.06071v2)

Published 14 Feb 2020 in cs.CL, cs.AI, and cs.LG

Abstract: Recent advances in the field of language modeling have improved state-of-the-art results on many Natural Language Processing tasks. Among them, Reading Comprehension has made significant progress over the past few years. However, most results are reported in English since labeled resources available in other languages, such as French, remain scarce. In the present work, we introduce the French Question Answering Dataset (FQuAD). FQuAD is a French Native Reading Comprehension dataset of questions and answers on a set of Wikipedia articles that consists of 25,000+ samples for the 1.0 version and 60,000+ samples for the 1.1 version. We train a baseline model which achieves an F1 score of 92.2 and an exact match ratio of 82.1 on the test set. In order to track the progress of French Question Answering models we propose a leader-board and we have made the 1.0 version of our dataset freely available at https://illuin-tech.github.io/FQuAD-explorer/.

Citations (93)

Summary

An Overview of the French Question Answering Dataset (FQuAD)

The paper "FQuAD: French Question Answering Dataset" makes a significant contribution to NLP by developing resources for the French language. While substantial advances have been made in reading comprehension using datasets such as SQuAD for English, comparable resources in other languages, including French, remain underdeveloped. This paper addresses that gap by presenting FQuAD, a French native reading comprehension dataset.

Dataset Description and Collection

FQuAD is constructed to mirror the structure of SQuAD, the dataset widely used for English question answering. It comprises over 25,000 samples in the 1.0 version and over 60,000 in the 1.1 version, with questions and answers derived from Wikipedia articles. The dataset is noteworthy for its extensive annotation process, crowdsourced among university students, which ensures a variety of question types and complexities.
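Since FQuAD mirrors the SQuAD layout, its JSON files can be traversed with the same nesting of articles, paragraphs, and question-answer pairs. The sketch below assumes the standard SQuAD field names (`data`, `paragraphs`, `context`, `qas`, `answers`); the toy record is invented for illustration, not taken from FQuAD:

```python
import json

def iter_qa_samples(squad_json):
    """Yield (context, question, answer_texts) triples from a SQuAD-format dict."""
    for article in squad_json["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                answers = [a["text"] for a in qa["answers"]]
                yield context, qa["question"], answers

# Tiny hand-made record in the SQuAD/FQuAD layout (not real FQuAD data).
sample = {
    "data": [{
        "title": "Paris",
        "paragraphs": [{
            "context": "Paris est la capitale de la France.",
            "qas": [{
                "id": "1",
                "question": "Quelle est la capitale de la France ?",
                "answers": [{"text": "Paris", "answer_start": 0}],
            }],
        }],
    }]
}

triples = list(iter_qa_samples(sample))
```

The same generator works unchanged on a full FQuAD file loaded with `json.load`, provided the release follows the SQuAD schema as the paper describes.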

For FQuAD 1.1, an additional set of samples was collected to increase the dataset's complexity, yielding a richer resource for developing French language models. The dataset's resources, including a leaderboard to track model progress, are publicly available, fostering collaboration and further research in the domain.

Benchmarking and Performance Analysis

The paper benchmarks several NLP models on FQuAD, covering both monolingual (CamemBERT, FlauBERT) and multilingual (mBERT, XLM-RoBERTa) architectures. The experiments show that CamemBERT (large) achieves strong results, with an F1 score of 92.2 and an exact match ratio of 82.1 on the FQuAD test set, metrics that notably surpass the human performance baselines reported for the dataset.
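Exact match and F1 here follow the SQuAD conventions: answers are normalized, then compared either verbatim (EM) or by token overlap (F1). A minimal sketch of those two metrics is below; note that the official SQuAD script also strips English articles ("a", "an", "the"), a step that must be adapted for French and is omitted here:

```python
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace (SQuAD-style)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction, gold):
    """Token-level F1 between a predicted span and a gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, the convention is to score the prediction against each and keep the maximum.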

The paper suggests that while multilingual models such as XLM-R are robust and versatile, they do not match the performance of the monolingual CamemBERT model on French data. This outcome supports continued investment in language-specific NLP models to achieve optimal performance in non-English contexts.

Cross-Lingual and Translation-Based Approaches

Beyond traditional native training, the paper explores cross-lingual transfer efficacy through zero-shot learning. Multilingual models trained on the SQuAD (English) dataset were evaluated on FQuAD, and vice versa. The observed performance degradation in zero-shot settings underscores the challenges inherent in cross-linguistic transfer when equivalent native datasets are unavailable. Moreover, evaluating models trained on a translated SQuAD (French) dataset offers insight into the limitations of this strategy: translation-based models underperformed compared to models trained directly on native datasets.

Implications and Future Directions

Practically, FQuAD serves as a foundation for advancing French-specific NLP applications, encouraging the development and refinement of language models capable of nuanced reading comprehension in French. Theoretically, insights drawn from the performance comparison between monolingual and multilingual models can influence future model architectures, advocating for increased resources in language-specific training datasets.

The introduction of FQuAD signifies an essential step toward democratizing language technology across linguistic boundaries, establishing a methodology that can extend to other under-resourced languages. Future work may explore the integration of adversarial samples akin to SQuAD 2.0, further elevating the challenge and robustness of the FQuAD dataset. Additionally, examining more sophisticated cross-lingual pre-training approaches may enhance model adaptability across languages.
