There are 0 examples not solved by any model.
Solving some of these can be a good signal that your model is indeed better than the leading models, provided these are good problems.
| example_link | model | min_pass1_of_model |
|---|---|---|
These are the 10 problems with the lowest correlation with the overall evaluation (i.e., better models gain the least on these, and a negative tau means they tend to do worse).
| example_link | pass1_of_ex | tau |
|---|---|---|
| 46 | 0.084 | -0.042 |
| 17 | 0.111 | 0.069 |
| 87 | 0.203 | 0.093 |
| 70 | 0.289 | 0.149 |
| 86 | 0.195 | 0.155 |
| 91 | 0.190 | 0.165 |
| 34 | 0.324 | 0.171 |
| 29 | 0.475 | 0.244 |
| 79 | 0.207 | 0.247 |
| 89 | 0.309 | 0.262 |
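As a rough illustration of how a per-problem `tau` like the one in the table might be computed, here is a minimal sketch. It assumes `tau` is Kendall's rank correlation between each model's overall score and its pass rate on a single problem; the report does not state which correlation is used. The function name `kendall_tau_b`, the variable names, and the sample numbers are all hypothetical.

```python
from itertools import combinations


def kendall_tau_b(xs, ys):
    """Kendall's tau-b between two equal-length sequences (tie-aware)."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from every count
        elif dx == 0:
            ties_x += 1  # tied only in x
        elif dy == 0:
            ties_y += 1  # tied only in y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = ((untied + ties_x) * (untied + ties_y)) ** 0.5
    return (concordant - discordant) / denom if denom else 0.0


# Hypothetical data: one entry per model, ordered the same way in both lists.
overall_scores = [0.2, 0.4, 0.6, 0.8]      # assumed overall benchmark score
problem_pass_rates = [0.5, 0.3, 0.2, 0.1]  # assumed pass@1 on one problem

tau = kendall_tau_b(overall_scores, problem_pass_rates)
print(tau)  # -1.0: the stronger the model overall, the worse it does here
```

A problem whose `tau` is close to zero or negative, like the first rows of the table, is one where overall model strength does not predict success on that problem.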
Histogram of problems by the accuracy on each problem.
Histogram of problems by the minimum win rate to solve each problem.
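The accuracy histogram could be tabulated along the following lines. This is only a sketch: it assumes per-problem accuracy corresponds to the `pass1_of_ex` column, and it bins just the ten values listed in the table above, not the full benchmark, so the counts do not reproduce the report's figure. The helper name `bin_counts` and the choice of five bins are hypothetical.

```python
def bin_counts(values, n_bins=5, lo=0.0, hi=1.0):
    """Count how many values fall into each of n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if not lo <= v <= hi:
            raise ValueError(f"value {v} is outside [{lo}, {hi}]")
        # A value equal to hi goes into the last bin instead of overflowing.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# pass1_of_ex values copied from the ten rows of the low-correlation table.
pass1_of_ex = [0.084, 0.111, 0.203, 0.289, 0.195,
               0.190, 0.324, 0.475, 0.207, 0.309]

counts = bin_counts(pass1_of_ex)
print(counts)  # [4, 5, 1, 0, 0] for bins [0, 0.2), [0.2, 0.4), ..., [0.8, 1.0]
```

All ten low-correlation problems fall below 0.6 accuracy, which is consistent with them being hard for most models.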