Without clear ways to measure AI, we risk two extremes: overhyping capabilities or deploying systems irresponsibly. This ...