There are 0 examples not solved by any model.
If these are good problems, solving some of them can be a good signal that your model is indeed better than the leading models.
| example_link | model | min_pass1_of_model |
|---|---|---|
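
As a rough illustration of how such a list could be produced, the sketch below collects the examples on which every model has a pass@1 of zero. The function name `unsolved_examples`, the nested-dict layout, and the sample numbers are assumptions made for illustration only; they are not taken from this report's tooling, and the report may define "not solved" differently.

```python
def unsolved_examples(pass1_by_model):
    """Return sorted example ids whose pass@1 is 0 for every model.

    pass1_by_model: hypothetical dict mapping model name -> {example id: pass@1}.
    An example missing from a model's results is treated as unsolved by that model.
    """
    example_ids = set()
    for per_example in pass1_by_model.values():
        example_ids.update(per_example)
    return sorted(
        ex
        for ex in example_ids
        if all(per_example.get(ex, 0.0) == 0.0 for per_example in pass1_by_model.values())
    )


# Made-up data: only example 1 is failed by both models.
pass1_by_model = {
    "model_a": {1: 0.0, 2: 0.5, 3: 0.0},
    "model_b": {1: 0.0, 2: 0.0, 3: 0.1},
}
unsolved = unsolved_examples(pass1_by_model)
print(unsolved)
```

An empty result corresponds to the "0 examples" case reported here.
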
These are the 10 problems with the lowest correlation (tau) with the overall evaluation, i.e., better models tend to do worse on these.
| example_link | pass1_of_ex | tau |
|---|---|---|
| 568 | 0.258 | -0.509 |
| 598 | 0.050 | -0.470 |
| 590 | 0.130 | -0.458 |
| 703 | 0.104 | -0.338 |
| 558 | 0.068 | -0.299 |
| 67 | 0.035 | -0.289 |
| 683 | 0.013 | -0.288 |
| 195 | 0.079 | -0.215 |
| 172 | 0.170 | -0.176 |
| 274 | 0.113 | -0.163 |
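
The sketch below shows one way a per-problem value like `tau` could be computed, assuming it is a Kendall rank correlation between each model's overall score and that model's pass@1 on the problem. That reading is an assumption, as is the use of the simple tau-a variant (no tie correction); the report may use a different variant. The helper name and the sample numbers are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            score += 1  # pair is ordered the same way in both sequences
        elif product < 0:
            score -= 1  # pair is ordered in opposite ways
        # tied pairs contribute 0 in tau-a
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return score / n_pairs


# Made-up data: four models, sorted by overall score, that get steadily worse
# on one problem as their overall score improves.
overall = [0.20, 0.35, 0.50, 0.65]
pass1_on_problem = [0.40, 0.30, 0.20, 0.10]
tau = kendall_tau_a(overall, pass1_on_problem)
print(tau)
```

Under this reading, a strongly negative value flags a problem on which stronger models score lower, which can be worth checking for labeling or grading issues.
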
Histogram of problems by the accuracy on each problem.
Histogram of problems by the minimum win rate to solve each problem.