Download
Release Notes
Mar 12, 2023: Dataset now contains 13.4k images and 84.9k questions, including 3-hop questions.
Feb 21, 2023: Dataset now contains 14.4k images and 52.9k questions.
Jan 13, 2023: Release of the initial version with 7.7k images and 30k questions.
Format
File name
Description
train.zip
Images for the train split
test.zip
Images for the test split
train.json
Data about questions for the train split in the following format:
- n_questions: [int] Number of questions
- questions: [object array] Array of questions, each with the following fields:
  - question_id: [int] ID of the question
  - question: [str] Content of the question
  - image_id: [str] ID of the image
  - image_name: [str] Name of the image in the CV dataset
  - image_dir: [str] Path to the image in the CV dataset
  - dataset_name: [str] Name of the CV dataset
  - answers: [str array] Array of true answers
  - choices: [str array] Array of choices (including the true answers)
  - choice_scores: [int array] Array of scores, one per choice (1 = true, 0 = false)
  - property_id: [str] Wikidata ID of the property used to construct the question
  - property_label: [str] Wikidata name of the property used to construct the question
test.json
Data about questions for the test split, in the same format as train.json.
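The question files described above can be read with any JSON library. The following is a minimal Python sketch, not an official loader: the helper names `load_questions` and `correct_choices` are hypothetical, and the code assumes that `choices` and `choice_scores` are aligned by index, which the field descriptions suggest but do not state outright.

```python
import json
from pathlib import Path


def load_questions(json_path):
    """Load a split file (e.g. train.json) and return its list of question records."""
    with Path(json_path).open(encoding="utf-8") as f:
        data = json.load(f)
    questions = data["questions"]
    # n_questions is documented as the number of questions, so it should match.
    if data["n_questions"] != len(questions):
        raise ValueError("n_questions does not match the length of questions")
    return questions


def correct_choices(question):
    """Return the choices scored 1 (true).

    Assumption: choices[i] is scored by choice_scores[i].
    """
    return [
        choice
        for choice, score in zip(question["choices"], question["choice_scores"])
        if score == 1
    ]
```

Calling `load_questions("train.json")` on the extracted file should give a list of question objects. If the index-alignment assumption holds, `correct_choices(q)` should return the same strings as `q["answers"]`, which makes a quick sanity check on a fresh download.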