There are 0 examples not solved by any model.
If these are good problems, solving some of them can be a good signal that your model is indeed better than the leading models.
| example_link | model | min_pass1_of_model |
|---|---|---|
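
As a rough illustration of how "not solved by any model" could be checked, the sketch below assumes a hypothetical `pass1` mapping from model name to per-example pass@1 values. The model names, example ids, and numbers are made up for the example and are not taken from this report.

```python
# Hypothetical data layout: pass1[model][example_id] -> pass@1 in [0, 1].
# All names and values here are illustrative assumptions.
pass1 = {
    "model_a": {0: 0.0, 1: 0.5, 2: 0.0},
    "model_b": {0: 0.0, 1: 0.0, 2: 0.2},
}


def unsolved_examples(pass1):
    """Return ids of examples on which every model has pass@1 equal to 0."""
    # Collect every example id that appears for at least one model.
    example_ids = set()
    for per_example in pass1.values():
        example_ids.update(per_example)
    # An example missing from a model's results is treated as unsolved by it.
    return sorted(
        ex for ex in example_ids
        if all(per_example.get(ex, 0.0) == 0.0 for per_example in pass1.values())
    )


unsolved = unsolved_examples(pass1)
print(f"There are {len(unsolved)} examples not solved by any model.")
```

An empty result from such a check would correspond to the empty table above.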
These are the 10 problems with the lowest correlation (tau) with the overall evaluation. A negative tau means that better models tend to do worse on that problem.
| example_link | pass1_of_ex | tau |
|---|---|---|
| 60 | 0.073 | -0.184 |
| 67 | 0.049 | -0.128 |
| 12 | 0.038 | -0.104 |
| 0 | 0.099 | 0.005 |
| 48 | 0.049 | 0.037 |
| 35 | 0.069 | 0.061 |
| 22 | 0.105 | 0.105 |
| 23 | 0.233 | 0.167 |
| 59 | 0.164 | 0.216 |
| 63 | 0.224 | 0.237 |
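
The report does not say how `tau` is computed. One plausible reading is Kendall's tau between each model's overall score and its pass@1 on the given problem. The sketch below assumes that reading and implements the tau-b variant (which corrects for ties) in plain Python. The function name and the sample numbers are hypothetical.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences (O(n^2) pair scan)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both: contributes to neither factor
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    # Pairs untied in x, and pairs untied in y, form the normalisation.
    untied_x = concordant + discordant + tied_y_only
    untied_y = concordant + discordant + tied_x_only
    denom = math.sqrt(untied_x * untied_y)
    if denom == 0:
        return math.nan  # undefined when one sequence is constant
    return (concordant - discordant) / denom


# Hypothetical numbers: overall accuracy of four models, and their pass@1
# on a single problem where stronger models happen to do worse.
overall_accuracy = [0.2, 0.4, 0.6, 0.8]
pass1_on_problem = [0.3, 0.2, 0.1, 0.0]
tau = kendall_tau_b(overall_accuracy, pass1_on_problem)
```

Under this reading, a problem on which pass@1 falls as overall accuracy rises gets a negative tau, and a problem that tracks the overall ranking gets a tau near 1.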
Histogram of problems by the accuracy on each problem.
Histogram of problems by the minimum win rate to solve each problem.