There are 0 examples not solved by any model.
If these are good problems, solving some of them can be a strong signal that your model is indeed better than the leading models.
| example_link | model | min_pass1_of_model |
|---|---|---|
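
As a rough illustration of how the "not solved by any model" set could be derived, the sketch below assumes a hypothetical nested mapping `pass1[model][example]` of pass@1 scores. The layout, the function name `unsolved_examples`, and the toy numbers are assumptions for illustration, not the report's actual data format.

```python
from typing import Dict, List


def unsolved_examples(pass1: Dict[str, Dict[str, float]]) -> List[str]:
    """Return ids of examples whose pass@1 is 0 for every model.

    `pass1[model][example]` is a hypothetical layout (an assumption).
    A missing score is treated as 0.0, i.e. as unsolved by that model.
    """
    examples = set()
    for per_example in pass1.values():
        examples.update(per_example)
    return sorted(
        ex
        for ex in examples
        if all(scores.get(ex, 0.0) == 0.0 for scores in pass1.values())
    )


# Made-up scores, for illustration only.
toy = {
    "model_a": {"ex1": 0.0, "ex2": 0.4},
    "model_b": {"ex1": 0.0, "ex2": 0.0},
}
print(unsolved_examples(toy))
```

An empty result corresponds to the "0 examples" case reported here, which is why the table has no rows.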
These are the 10 problems with the lowest correlation with the overall evaluation (i.e., results on these problems track overall model quality least; where tau is negative, better models tend to do worse on them).
| example_link | pass1_of_ex | tau |
|---|---|---|
| 87 | 0.158 | -0.003 |
| 17 | 0.107 | 0.042 |
| 46 | 0.110 | 0.133 |
| 86 | 0.189 | 0.146 |
| 70 | 0.209 | 0.163 |
| 29 | 0.529 | 0.217 |
| 89 | 0.266 | 0.237 |
| 30 | 0.343 | 0.280 |
| 57 | 0.198 | 0.284 |
| 91 | 0.202 | 0.296 |
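
The `tau` column can be read as a rank correlation between how models score on one problem and how they score overall. The sketch below assumes it is Kendall's tau (the report does not say which variant) and implements tau-b in plain Python. The function name and the toy scores are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def kendall_tau_b(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-b between two equally long score lists.

    Returns 0.0 when tau is undefined (e.g. one list is constant);
    that fallback is a choice made for this sketch.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both lists: excluded from every term
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else 0.0


# Made-up scores for four models, for illustration only.
overall_accuracy = [0.2, 0.4, 0.6, 0.8]  # one entry per model
problem_pass1 = [0.1, 0.3, 0.2, 0.5]     # same models, one problem
tau = kendall_tau_b(overall_accuracy, problem_pass1)
print(round(tau, 3))
```

A value near 1 means the problem ranks models the same way the full evaluation does. Values near 0 or below, like the first rows of the table, mean the problem carries little or even inverted signal about overall model quality.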

Histogram of problems by the accuracy on each problem.

Histogram of problems by the minimum win rate to solve each problem.
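
A histogram of problems by accuracy can be approximated by binning per-problem pass@1 values into equal-width buckets. The sketch below is a minimal version that assumes 10 bins over [0, 1]. For lack of the full data it uses only the ten `pass1_of_ex` values from the low-correlation table, so it is not the full distribution; `accuracy_histogram` is a hypothetical helper name.

```python
from typing import List, Sequence


def accuracy_histogram(values: Sequence[float], n_bins: int = 10) -> List[int]:
    """Count values in n_bins equal-width bins over [0, 1].

    A value of exactly 1.0 is placed in the last bin.
    """
    counts = [0] * n_bins
    for v in values:
        if not 0.0 <= v <= 1.0:
            raise ValueError(f"accuracy out of range: {v}")
        counts[min(int(v * n_bins), n_bins - 1)] += 1
    return counts


# pass1_of_ex values copied from the low-correlation table (10 problems only).
pass1_values = [0.158, 0.107, 0.110, 0.189, 0.209,
                0.529, 0.266, 0.343, 0.198, 0.202]
counts = accuracy_histogram(pass1_values)

# Simple text rendering: one row per bin, one '#' per problem.
for i, c in enumerate(counts):
    print(f"[{i / 10:.1f}, {(i + 1) / 10:.1f}) {'#' * c}")
```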