QANTA

Question Answering is Not a Trivial Activity

University of Maryland

What is QANTA?

QANTA (Question Answering is Not a Trivial Activity) is a question answering dataset composed of questions from Quizbowl, a trivia game that is challenging for both humans and machines. Each question contains 4-5 pyramidally arranged clues: obscure ones at the beginning and obvious ones at the end. Players of Quizbowl (humans and machines) compete to prove their superior mastery of knowledge by trying to answer using the least information possible. More information on QANTA, including offline events, can be found at qanta.org.

On Dec. 15th, the University of Maryland will host a series of competitions including human vs. human, machine vs. human, and machine vs. machine. To find out more about the event and register (required for prizes), visit qanta.org.

Getting Started

Download a copy of the dataset, distributed under the CC BY-SA 4.0 license.

To help you get started and demonstrate our API requirements, we provide a baseline system. The repo below contains code to download the data, train a model in a Docker container, and evaluate the model in a way identical to the CodaLab evaluation. You should be able to reproduce the "Baseline" entry on the leaderboard.

Submissions are made through CodaLab, and all models are tested with the same evaluation script.

Have Questions?

Please send your questions to our Google group, or email pedro@cs.umd.edu and shifeng@cs.umd.edu.

Acknowledgements

We thank the SQuAD team for allowing us to use their code and templates for generating this website.

Leaderboard

We evaluate each system with four metrics: accuracy at the end of the first sentence (first_acc) and at the end of the question (end_acc), and two new metrics: expected wins with system buzzer (EW) and with optimal buzzer (EW_OPT). Ranking is decided by EW.
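The two accuracy metrics can be illustrated with a minimal sketch (this is not the official evaluation script; the variable names and toy data are hypothetical). It simply compares the model's guess after the first sentence, and after the full question, against the gold answers:

```python
# Hedged sketch of first_acc / end_acc, assuming we already have the
# model's guess at two points: after the first sentence of each question
# and after the full question. Not the official QANTA evaluation code.

def accuracy(guesses, answers):
    """Fraction of questions whose guess exactly matches the gold answer."""
    correct = sum(g == a for g, a in zip(guesses, answers))
    return correct / len(answers)

# Toy data for illustration only.
answers       = ["paris", "newton", "entropy"]
guesses_first = ["london", "newton", "entropy"]   # guess after the first sentence
guesses_full  = ["paris", "newton", "entropy"]    # guess after the whole question

first_acc = accuracy(guesses_first, answers)  # 2 of 3 correct
end_acc   = accuracy(guesses_full, answers)   # 3 of 3 correct
```

EW and EW_OPT additionally depend on when the system buzzes in, so they cannot be computed from final guesses alone; they reward answering correctly on earlier, more obscure clues.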
| Rank | Date | Model | Team | first_acc | end_acc | EW | EW_OPT |
|------|--------------|------------------------|---------------------------|-----------|---------|---------|--------|
| 1    | Dec 10, 2018 | BitER_the_dusT         | FYY                       | 0.119     | 0.672   | 0.291   | 0.618  |
| 2    | Dec 10, 2018 | SBQA                   | CMSC723 Technical Wizards | 0.104     | 0.559   | 0.271   | 0.589  |
| 3    | Dec 10, 2018 | DAN-TFIDF Buzzer       | CMSC723 ForwardRethinking | 0.0690    | 0.609   | 0.265   | 0.593  |
| 4    | Dec 03, 2018 | DAN                    | CMSC723 Technical Wizards | 0.0468    | 0.557   | 0.214   | 0.546  |
| 5    | Dec 10, 2018 | GLOVE_300 DAN+TFIDF    | CMSC723 ForwardRethinking | 0.0877    | 0.604   | 0.201   | 0.594  |
| 6    | Dec 11, 2018 | TFIDF Guesser          | CMSC723 Working Title     | 0.0534    | 0.469   | 0.193   | 0.513  |
| 7    | Dec 01, 2018 | TFIDF Buzzer           | CMSC723 Technical Wizards | 0.0595    | 0.468   | 0.183   | 0.514  |
| 8    | Dec 01, 2018 | TF-IDF Thresh          | CMSC723 ForwardRethinking | 0.0363    | 0.558   | 0.147   | 0.551  |
| 9    | Nov 20, 2018 | TF-IDF                 | CMSC723 FYY               | 0.0463    | 0.545   | 0.137   | 0.540  |
| 10   | Dec 12, 2018 | First submission       | CMSC723 AmazingDH         | 0.0543    | 0.540   | 0.136   | 0.543  |
| 11   | Dec 03, 2018 | GLOVE_300 DAN          | CMSC723 ForwardRethinking | 0.0558    | 0.503   | 0.113   | 0.527  |
| 12   | Nov 19, 2018 | Less Than Adequate DAN | CMSC723 Iota              | 0.0317    | 0.437   | 0.0949  | 0.496  |
| 13   | Nov 21, 2018 | TFIDF baseline         | CMSC723 Technical Wizards | 0.0534    | 0.469   | 0.0509  | 0.513  |
| 14   | Nov 14, 2018 | DAN Wiki               | University of Maryland    | 0.0923    | 0.560   | 0.0487  | 0.582  |
| 15   | Nov 21, 2018 | Elmo DAN               | University of Maryland    | 0.102     | 0.508   | 0.0453  | 0.567  |
| 16   | Nov 11, 2018 | DAN Baseline           | University of Maryland    | 0.0736    | 0.432   | 0.0207  | 0.529  |
| 17   | Nov 09, 2018 | Baseline submission    | CMSC723 Team ROJA         | 0.0534    | 0.469   | 0.00253 | 0.513  |
| 17   | Nov 14, 2018 | Baseline System        | CMSC723 AmazingDH         | 0.0534    | 0.469   | 0.00253 | 0.513  |
| 17   | Nov 07, 2018 | TFIDF Baseline         | CMSC723 QAQA-Land         | 0.0534    | 0.469   | 0.00253 | 0.513  |
| 18   | Dec 05, 2018 | Obscurity System (https://arxiv.org/abs/something) | CMSC723 We Qanta Do This | 0.0529 | 0.394 | 0.00 | 0.490 |