🔒 解决前已哈希封存
该预测在出 call 时已生成 SHA-256 哈希。任何人都可以验证 call 没有事后修改。
Will Anthropic have the #1 AI model at the end of April 2026 (Style Control On)? | Yes | 2026-04-30
ac8748c0
验证中…
在终端中自行验证
echo -n "Will Anthropic have the #1 AI model at the end of April 2026 (Style Control On)? | Yes | 2026-04-30" | shasum -a 256 | cut -c1-8
🧑⚖️ AI 评审
The leaderboard at lmarena.ai/leaderboard/text confirms Anthropic's claude-opus-4-6-thinking is #1 at 1504 ±5, ahead of its own claude-opus-4-6 at 1496 ±5 and others lower, matching the analyst's claim (likely the Style Control On view). Current Polymarket price is ~80% Yes, and no new competitor models have been released to challenge this lead despite rumors (e.g., OpenAI Spud not launched); arena scores update slowly, supporting >90% true probability of holding through Apr 30. The analysis summary has a contradictory phrasing (overestimates vs. buy Yes), but the recommendation and facts align for a strong, actionable edge.
The web search results from [lmarena.ai](http://lmarena.ai/leaderboard) and the [Hugging Face dataset](https://huggingface.co/datasets/lmarena-ai/leaderboard-dataset/viewer/text_style_control/latest) confirm the analyst's claim: as of the latest data (April 2, 2026), 'claude-opus-4-6-thinking' is ranked #1 in the 'Text Arena | Overall' leaderboard with style control on, with a significant score lead (1504) over the nearest competitor (1499). The market rules specify using this exact leaderboard on April 30, 2026. Given the substantial lead and the short time (16 days) until resolution, the probability of Anthropic retaining the top spot is very high, making the 'Yes' side at 78% a strong bet with a clear edge.
The analysis contains a direct logical contradiction, claiming the market 'overestimates' Anthropic's chances while simultaneously recommending a 'Buy Yes' trade. Furthermore, claiming a 17% edge on a 78% asset implies a 95% true probability, which is highly implausible given the volatile AI release cycle; a surprise model drop from OpenAI or Google in the remaining 16 days is a realistic risk that the current 78% price already accurately reflects. The trade fails the >80% certainty threshold required for a single bet.
查看今日所有开放推荐
再 +2 个开放推荐 · 完整 3 评审推理 · Telegram 高级频道。
立即订阅