Bayesian calculation: A mathematical technique for determining the conditional probability of an event from prior knowledge and new evidence.
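As an illustration of the entry above, here is a minimal sketch of a single Bayesian update using Bayes' theorem. The function name `bayes_update` and the example numbers are hypothetical and are not taken from the actual pipeline.

```python
def bayes_update(prior: float, p_evidence_if_true: float, p_evidence_if_false: float) -> float:
    """Return P(event | evidence) using Bayes' theorem.

    prior               -- P(event) before seeing the evidence
    p_evidence_if_true  -- P(evidence | event)
    p_evidence_if_false -- P(evidence | not event)
    """
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    if denominator == 0:
        raise ValueError("the evidence has zero probability under both hypotheses")
    return numerator / denominator


# Hypothetical numbers: a 10% prior, with evidence that is four times
# more likely if the event is going to happen than if it is not.
posterior = bayes_update(prior=0.10, p_evidence_if_true=0.80, p_evidence_if_false=0.20)
print(f"posterior = {posterior:.3f}")
```

With these inputs the prior of 10% rises to roughly 31%, because the evidence favours the event but the low prior still dominates.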
MAPD: Mean Absolute Percentage Deviation, the average percentage difference between actual and predicted values. It provides a relative measure of error.
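A minimal sketch of the MAPD calculation, assuming the common definition (the mean of |actual - predicted| / |actual|, expressed as a percentage). The function name `mapd` and the sample values are hypothetical.

```python
def mapd(actual, predicted):
    """Mean Absolute Percentage Deviation, in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    if any(a == 0 for a in actual):
        # The percentage deviation is undefined when the actual value is zero.
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical data: two predictions are 10% off, one is exact.
error = mapd(actual=[100, 200, 50], predicted=[110, 180, 50])
print(f"MAPD = {error:.2f}%")
```

Note that this measure is undefined when an actual value is zero, so it may be a poor fit for binary outcomes scored as 0 or 1.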
Median: We collect the output values from the LLMs in an array and take the median. The result is a probability between 0 and 1 (e.g. 0.15 = 15%).
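The aggregation step described above can be sketched with the standard library. The function name `aggregate_median` and the sample outputs are hypothetical.

```python
from statistics import median


def aggregate_median(llm_outputs):
    """Return the median of the LLM probability estimates."""
    if not llm_outputs:
        raise ValueError("at least one LLM output is required")
    if any(not 0.0 <= p <= 1.0 for p in llm_outputs):
        raise ValueError("every output must be a probability between 0 and 1")
    return median(llm_outputs)


# Hypothetical outputs from five LLMs; the outlier (0.40) does not
# shift the median the way it would shift a mean.
median_probability = aggregate_median([0.10, 0.15, 0.12, 0.40, 0.18])
print(median_probability)
```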
Base rate: The historical frequency of comparable events. It serves as a sanity check on whether the median makes sense.
SD: The standard deviation between the median and the base rate.
Confidence: We ask each LLM how confident it is in its prediction (on a scale of 0 to 10) and take the median. Because the LLMs tend to be overconfident, we treat any value below 6 as low confidence. This factors into the overall model.
Conf Mode: A category derived from the confidence value. A value of 9 or above is high confidence; a value below 6 is low confidence (low confidence is also triggered by an exceptionally high SD).
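The two entries above could be combined roughly as follows. This is a sketch under stated assumptions: the thresholds 6 and 9 come from the text, but the label "medium" for values in between, the SD cutoff `HIGH_SD_THRESHOLD`, and the choice to let a high SD override a high confidence value are all hypothetical.

```python
from statistics import median

LOW_CONFIDENCE_BELOW = 6    # from the text: below 6 is low confidence
HIGH_CONFIDENCE_FROM = 9    # from the text: 9 or above is high confidence
HIGH_SD_THRESHOLD = 0.25    # hypothetical: the text gives no number for "exceptionally high"


def conf_mode(confidences, sd):
    """Classify the median self-reported confidence (0 to 10) as low, medium or high."""
    confidence = median(confidences)
    # Assumption: an exceptionally high SD forces low confidence,
    # even when the LLMs report high confidence.
    if confidence < LOW_CONFIDENCE_BELOW or sd > HIGH_SD_THRESHOLD:
        return "low"
    if confidence >= HIGH_CONFIDENCE_FROM:
        return "high"
    return "medium"  # assumption: the text does not name the middle band


print(conf_mode([9, 10, 9], sd=0.05))   # median 9, small SD
print(conf_mode([9, 10, 9], sd=0.50))   # same confidence, very large SD
print(conf_mode([7, 8, 6], sd=0.05))    # median 7
```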
Mellers: This refers to Barbara Mellers, specifically a paper she wrote that includes a formula for moving values towards an extreme (i.e. 0 or 1), a step known as extremizing.
Reverse Mellers: This uses the Mellers formula, but with a coefficient below 1, which moves the values closer to 50%.
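The text does not reproduce the formula itself. The sketch below assumes the widely used extremizing transform p^a / (p^a + (1 - p)^a); whether this is the exact form used in the pipeline is an assumption, as are the function name `extremize` and the example coefficients.

```python
def extremize(p: float, a: float) -> float:
    """Apply the transform p**a / (p**a + (1 - p)**a).

    a > 1 pushes p towards 0 or 1 (extremizing).
    a < 1 pulls p towards 0.5 (the "reverse" direction).
    a = 1 leaves p unchanged.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if a <= 0:
        raise ValueError("the coefficient must be positive")
    numerator = p ** a
    return numerator / (numerator + (1.0 - p) ** a)


p = 0.70
print(f"extremized    (a=2.0): {extremize(p, 2.0):.3f}")
print(f"de-extremized (a=0.5): {extremize(p, 0.5):.3f}")
```

A value of exactly 0.5 is a fixed point of this transform for any coefficient, so only estimates that already lean one way are moved.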
Theory of Mind: We ask the LLMs what they think other LLMs would predict. We hope that this makes them consider the questions more deeply.
Beta Distribution: Currently unused, but possibly of interest. This is based on the median, the base rate and the SD.
Close Type: We noticed that the appropriate base case for some questions is closer to the extremes, while for others it is closer to 50%. When the confidence value is low, the close type helps us decide whether to extremize or de-extremize the value. ‘A’ implies closer to zero. ‘B’ implies closer to 50%. ‘C’ implies closer to 100%.
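One way the close type could steer the low-confidence adjustment is sketched below. The mapping (types A and C extremize, type B de-extremizes), the transform p^a / (p^a + (1 - p)^a), the function name and both coefficients are assumptions for illustration, not the pipeline's confirmed logic.

```python
def adjust_low_confidence(p, close_type, extremize_coef=1.5, de_extremize_coef=0.7):
    """Adjust a low-confidence probability according to its close type.

    Assumed mapping: 'A' (near zero) and 'C' (near 100%) extremize,
    'B' (near 50%) de-extremizes. Both coefficients are hypothetical.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if close_type in ("A", "C"):
        a = extremize_coef
    elif close_type == "B":
        a = de_extremize_coef
    else:
        raise ValueError(f"unknown close type: {close_type!r}")
    # The transform moves p away from 0.5 when a > 1 and towards it when a < 1.
    # It does not force a direction, so a type 'A' value above 0.5 would move
    # towards 1, not 0; handling that case is left out of this sketch.
    numerator = p ** a
    return numerator / (numerator + (1.0 - p) ** a)


print(f"A: {adjust_low_confidence(0.20, 'A'):.3f}")  # pushed towards 0
print(f"B: {adjust_low_confidence(0.20, 'B'):.3f}")  # pulled towards 0.5
print(f"C: {adjust_low_confidence(0.80, 'C'):.3f}")  # pushed towards 1
```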
I