There are 0 examples not solved by any model.
If these are well-posed problems, solving some of them is a good signal that your model is indeed better than the leading models.
| example_link | model | min_pass1_of_model |
|---|---|---|
These are the 10 problems with the lowest correlation with the overall evaluation (i.e., better models tend to do worse on these).
| example_link | pass1_of_ex | tau |
|---|---|---|
| 88 | 0.122 | -0.421 |
| 25 | 0.173 | -0.397 |
| 57 | 0.035 | -0.394 |
| 245 | 0.068 | -0.387 |
| 250 | 0.039 | -0.380 |
| 170 | 0.013 | -0.318 |
| 181 | 0.052 | -0.310 |
| 121 | 0.018 | -0.307 |
| 226 | 0.096 | -0.307 |
| 202 | 0.093 | -0.295 |
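
The report does not state how `tau` is computed. As a minimal sketch, assuming it is Kendall's rank correlation (tau-b) between each problem's per-model pass@1 and each model's overall score, a ranking like the one above could be produced as follows. The names `kendall_tau_b`, `lowest_tau`, `overall`, and `pass1_by_example`, and all numbers in the usage example, are hypothetical and not taken from the evaluation.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(xs, ys):
    """Kendall's tau-b between two equal-length sequences (handles ties)."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from both denominator terms
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_y_only) * (untied + tied_x_only))
    if denom == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denom


def lowest_tau(pass1_by_example, overall, k=10):
    """Return the k (example, tau) pairs with the lowest tau, ascending."""
    taus = [(ex, kendall_tau_b(overall, p)) for ex, p in pass1_by_example.items()]
    return sorted(taus, key=lambda item: item[1])[:k]


# Hypothetical data: four models, ordered the same way in every list.
overall = [0.30, 0.45, 0.60, 0.75]       # assumed overall score per model
pass1_by_example = {                      # assumed pass@1 per model, per example
    "ex_a": [0.1, 0.2, 0.4, 0.8],         # stronger models do better
    "ex_b": [0.5, 0.3, 0.2, 0.0],         # stronger models do worse
    "ex_c": [0.2, 0.2, 0.1, 0.3],         # weak, mixed relationship
}

for example, tau in lowest_tau(pass1_by_example, overall, k=2):
    print(f"{example}: tau = {tau:.3f}")
```

A strongly negative `tau` under this assumption means the problem's pass rate falls as overall model quality rises, which often points to a mislabeled or ambiguous problem worth inspecting by hand.
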
Histogram of problems by the accuracy on each problem.
Histogram of problems by the minimum win rate to solve each problem.