Rank | Model | Hints | Score | 95% CI (±) | Rank Spread | Provider

Confidence Intervals on Model Score

The model score estimates, with a 95% confidence interval, the percentage of answers that a human annotator would have judged correct. Correctness is judged on both the answer and its justification.
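The page does not say how the interval is computed. As a rough illustration only, the sketch below assumes a normal-approximation (Wald) interval on a proportion, which yields a symmetric "±" half-width like the one in the table header. The function name `score_with_ci` and its parameters are hypothetical and are not taken from the leaderboard or its technical report.

```python
import math


def score_with_ci(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Return (score, half_width), both in percentage points.

    Assumption: a normal-approximation (Wald) interval; the leaderboard's
    actual method is not stated on the page. z = 1.96 corresponds to a
    two-sided 95% interval.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")

    p = correct / total                      # observed fraction of correct answers
    std_err = math.sqrt(p * (1 - p) / total)  # standard error of a proportion
    return 100 * p, 100 * z * std_err


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    score, half_width = score_with_ci(correct=50, total=100)
    print(f"Score: {score:.1f}% ± {half_width:.1f}")
```

The Wald interval is easy to read but becomes unreliable for small samples or scores near 0% or 100%; a Wilson or bootstrap interval would be a more robust choice in those cases.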

This work uses material from the Professor Layton wiki at Fandom and is licensed under the Creative Commons Attribution-Share Alike License.

Category Performance Analysis
