{"id":72950,"date":"2026-04-17T09:04:37","date_gmt":"2026-04-17T02:04:37","guid":{"rendered":"https:\/\/hbbgroup.net\/claude-opus-4-7-is-here-anthropics-latest-model-delivers-but-its-a-token-eating-machine\/"},"modified":"2026-04-17T09:04:37","modified_gmt":"2026-04-17T02:04:37","slug":"claude-opus-4-7-is-here-anthropics-latest-model-delivers-but-its-a-token-eating-machine","status":"publish","type":"post","link":"https:\/\/hbbgroup.net\/zh\/claude-opus-4-7-is-here-anthropics-latest-model-delivers-but-its-a-token-eating-machine\/","title":{"rendered":"Claude Opus 4.7 Is Here: Anthropic\u2019s Latest Model Delivers, But It\u2019s a Token Eating Machine"},"content":{"rendered":"<div>\n<div>\n<h4 color=\"#333\">In brief<\/h4>\n<ul>\n<li>Anthropic just released its most capable Opus model yet, Claude Opus 4.7.<\/li>\n<li>The model delivers strong benchmark gains across coding and reasoning, but is not the controversial Mythos model that Anthropic offers to select partners.<\/li>\n<li>Claude Opus 4.7 shows visible chain-of-thought and unusually high token usage.<\/li>\n<\/ul>\n<\/div>\n<p>Anthropic shipped <a href=\"https:\/\/www.anthropic.com\/news\/claude-opus-4-7\" target=\"_blank\" rel=\"nofollow external noopener\">Claude Opus 4.7<\/a> today, calling it the company\u2019s most capable Opus model yet. We tested it, and the marketing lines up with the results.<\/p>\n<p>&#8220;Our latest model, Claude Opus 4.7, is now generally available,&#8221; the company said in its official announcement. &#8220;Users report being able to hand off their hardest coding work\u2014the kind that previously needed close supervision\u2014to Opus 4.7 with confidence.&#8221;<\/p>\n<p>The model arrives on the heels of weeks of user complaints about Opus 4.6 allegedly losing its edge. 
Developers across <a href=\"https:\/\/github.com\/anthropics\/claude-code\/issues\/42796\" target=\"_blank\">GitHub<\/a>, Reddit, and <a href=\"https:\/\/x.com\/petergyang\/status\/2043466663258140685?s=20\" target=\"_blank\" rel=\"nofollow external noopener\">X<\/a> <a href=\"https:\/\/www.reddit.com\/r\/ClaudeAI\/comments\/1ses1qm\/anthropic_stayed_quiet_until_someone_showed\/?utm_source=share&#038;utm_medium=web3x&#038;utm_name=web3xcss&#038;utm_term=1&#038;utm_content=share_button\" target=\"_blank\">documented<\/a> what they called &#8220;<a href=\"https:\/\/www.reddit.com\/r\/Layout_dev\/comments\/1shh317\/ai_shrinkflation_is_real_and_anthropic_just_got\/?utm_source=share&#038;utm_medium=web3x&#038;utm_name=web3xcss&#038;utm_term=1&#038;utm_content=share_button\" target=\"_blank\">AI shrinkflation<\/a>&#8221;\u2014the feeling that the model they&#8217;d been paying for had quietly gotten worse. <a href=\"https:\/\/decrypt.co\/364483\/anthropic-opus-47-full-stack-ai-studio-mythos\" target=\"_blank\">As we reported yesterday<\/a>, Anthropic was already preparing 4.7 while sitting on something far more powerful that it can&#8217;t release publicly: Claude Mythos.<\/p>\n<p>When the announcement <a href=\"https:\/\/x.com\/claudeai\/status\/2044785261393977612?s=20\" target=\"_blank\" rel=\"nofollow external noopener\">dropped<\/a> this morning, X users who had been loudest about 4.6&#8217;s degradation were quick to reply with sarcasm: Opus 4.7, some joked, felt like &#8220;early Opus 4.6&#8221;\u2014the version people actually liked, before, in their telling, Anthropic quietly turned the dials down. Anthropic, of course, has <a href=\"https:\/\/x.com\/trq212\/status\/2043023892579766290?s=20\" target=\"_blank\" rel=\"nofollow external noopener\">denied<\/a> ever degrading model weights to manage compute demand.<\/p>\n<p>Benchmarks back up Anthropic&#8217;s claims. 
On SWE-bench Multilingual, a benchmark that measures coding skills, Opus 4.7 scored 80.5% against 4.6&#8217;s 77.8%.<\/p>\n<p>On GDPVal-AA, a third-party evaluation of economically valuable knowledge work across finance and legal domains, 4.7 scored 1,753 Elo against GPT-5.4&#8217;s 1,674\u2014a clear margin over the closest competitor.<\/p>\n<p>Document reasoning via OfficeQA Pro showed the starkest jump: 80.6% for 4.7 versus 57.1% for 4.6, with GPT-5.4 and Gemini 3.1 Pro trailing at 51.1% and 42.9% respectively. Long-term coherence on Vending-Bench 2, a benchmark that measures how well models handle long-context and reasoning tasks like running a vending business, came in at a final balance of $10,937 versus $8,018 for 4.6\u2014a proxy for how well the model sustains useful behavior over long autonomous runs.<\/p>\n<div>\n<figure><img loading=\"lazy\" alt width=\"2600\" height=\"2638\" decoding=\"async\" data-nimg=\"1\" src=\"https:\/\/img.decrypt.co\/insecure\/rs:fit:3840:0:0:0\/plain\/https:\/\/cdn.decrypt.co\/wp-content\/uploads\/2026\/04\/d434d15757c6abac1122af483617741776d5a114-2600x2638-1.png@webp\"><\/figure>\n<\/div>\n<p>Cybersecurity is the one area where Anthropic deliberately held back. Opus 4.7 launches with automated safeguards that detect and block prohibited or high-risk cybersecurity requests. Anthropic confirmed it &#8220;experimented with efforts to differentially reduce&#8221; 4.7&#8217;s cyber capabilities during training.<\/p>\n<p>Security professionals can apply to a new <a href=\"https:\/\/claude.com\/form\/cyber-use-case\" target=\"_blank\" rel=\"nofollow external noopener\">Cyber Verification Program<\/a> for access to the restricted capabilities. This is the company&#8217;s test run for the safeguards it will eventually need to deploy with Mythos-class models at scale.<\/p>\n<p>Opus 4.7 is the most powerful model <i>publicly<\/i> available. Mythos Preview, Anthropic&#8217;s true frontier model, remains restricted to vetted security firms. 
<a href=\"https:\/\/decrypt.co\/364141\/anthropic-claude-mythos-serious-threat-overhyped-ai-security-institute\" target=\"_blank\">As the UK&#8217;s AI Security Institute reported in its evaluation last week<\/a>, Mythos was the first AI to complete &#8220;The Last Ones,&#8221; a 32-step corporate network attack simulation that typically takes human red teams 20 hours.<\/p>\n<p>Opus 4.7 is not that. But it&#8217;s the public-facing model that Anthropic will use to learn how those safety guardrails hold up in the wild before it dares release anything scarier.<\/p>\n<p>On the token side, Opus 4.7 uses an updated tokenizer that can map the same input to roughly 1.0x\u20131.35x as many tokens, depending on content type. The model also reasons more at higher effort levels, particularly on later turns in agentic workflows. Anthropic published a migration guide for developers planning to upgrade from 4.6.<\/p>\n<p>We ran our own test\u2014the same game-building prompt we&#8217;ve used to evaluate every major model release. Opus 4.7 produced the best result we&#8217;ve ever gotten from any model: the most visually polished game, the most genuinely challenging difficulty curve, the best mechanics, and the most creative win and loss screens. It appeared to generate levels procedurally, and none of them felt impossible\u2014a balance that has tripped up other models repeatedly.<\/p>\n<p>You can test the game <a href=\"https:\/\/jaldps.itch.io\/emerge-the-game-claude-opus-47\" target=\"_blank\">here<\/a>.<\/p>\n<figure><img loading=\"lazy\" alt=\"Emerge: The Game, created by Claude Opus 4.7\" width=\"3016\" height=\"1690\" decoding=\"async\" data-nimg=\"1\" src=\"https:\/\/img.decrypt.co\/insecure\/rs:fit:3840:0:0:0\/plain\/https:\/\/cdn.decrypt.co\/wp-content\/uploads\/2026\/04\/Captura-de-pantalla-2026-04-16-a-las-13.49.26.png@webp\"><figcaption>Emerge: The Game, created by Claude Opus 4.7<\/figcaption><\/figure>\n<p>It wasn&#8217;t zero-shot. 
Opus 4.6 had cleared that same test without any fixes. Opus 4.7 needed one round of bug fixes. That could be bad luck\u2014a single iteration is a thin sample\u2014but it&#8217;s worth noting. What struck us more was how the model handled that round: It spotted additional bugs on its own, without being guided toward them. Opus 4.6 typically waited to be told where to look.<\/p>\n<p><a href=\"https:\/\/decrypt.co\/362633\/xiaomi-mimo-v2-pro-review-so-good-mistaken-deepseek-v4\" target=\"_blank\">Xiaomi MiMo v2 Pro<\/a> had produced <a href=\"https:\/\/jaldps.itch.io\/emerge-the-game-xiaomi-mimo\" target=\"_blank\">the best results<\/a> until now and, unlike Opus 4.7, delivered a working game in a single iteration. Some may argue its game was more visually pleasing, and it had a soundtrack, which was an advantage, but its logic and physics fell short of what Opus delivered after one round of bug fixes.<\/p>\n<figure><img loading=\"lazy\" alt=\"Emerge: The Game, created by Xiaomi MiMo v2 Pro\" width=\"3024\" height=\"2002\" decoding=\"async\" data-nimg=\"1\" src=\"https:\/\/img.decrypt.co\/insecure\/rs:fit:3840:0:0:0\/plain\/https:\/\/cdn.decrypt.co\/wp-content\/uploads\/2026\/04\/Captura-de-pantalla-2026-04-16-a-las-14.13.20.png@webp\"><figcaption>Emerge: The Game, created by Xiaomi MiMo v2 Pro<\/figcaption><\/figure>\n<p>Also, Xiaomi\u2019s model produces these results at a fraction of what Anthropic charges, which could be a major consideration for serious projects.<\/p>\n<p>At first glance, the chain-of-thought behavior was different too. Unlike 4.6, which tucked its reasoning into a separate thinking box (meaning it was not part of the final answer), Opus 4.7 surfaced its chain of thought as part of the main text output. The reasoning was visible and traceable, not hidden behind a UI abstraction, which is a plus for those who value transparency. 
Whether Anthropic will keep that behavior or eventually collapse it into a hidden block again is unclear.<\/p>\n<div>\n<figure><img loading=\"lazy\" alt width=\"3018\" height=\"1714\" decoding=\"async\" data-nimg=\"1\" src=\"https:\/\/img.decrypt.co\/insecure\/rs:fit:3840:0:0:0\/plain\/https:\/\/cdn.decrypt.co\/wp-content\/uploads\/2026\/04\/Captura-de-pantalla-2026-04-16-a-las-13.13.14.png@webp\"><\/figure>\n<\/div>\n<p>The token usage was unlike anything we&#8217;d seen before. For the first time in our testing, a single session depleted our entire token quota. Watching the model work, we saw it complete a full draft\u2014then write what appeared to be the entire game again from scratch under the label &#8220;Rewrite Emerge with bug fixes and improvements,&#8221; followed by a second pass labeled &#8220;Create a rewritten Emerge with bug fixes and improvements.&#8221;<\/p>\n<p>This means that if you\u2019re into serious coding, you\u2019ll be forced to upgrade your plan, pay heavily for API tokens, or wait a long time until Anthropic resets your usage quotas. Or you could just use a comparable model that charges a lot less.<\/p>\n<div>\n<figure><img loading=\"lazy\" alt width=\"3024\" height=\"1829\" decoding=\"async\" data-nimg=\"1\" src=\"https:\/\/img.decrypt.co\/insecure\/rs:fit:3840:0:0:0\/plain\/https:\/\/cdn.decrypt.co\/wp-content\/uploads\/2026\/04\/Captura-de-pantalla-2026-04-16-a-las-13.25.31-e1776360505649.png@webp\"><\/figure>\n<\/div>\n<p>Opus 4.6 had never done this. However, it&#8217;s consistent with what Anthropic warns of in the migration guide: more output tokens, especially on agentic tasks at higher effort levels.<\/p>\n<p>Opus 4.7 is available today at <a href=\"https:\/\/claude.ai\/\" target=\"_blank\" rel=\"nofollow external noopener\">Claude.ai<\/a>, the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. Pricing is unchanged from 4.6: $5 per million input tokens, $25 per million output tokens. 
Developers can access it via the string claude-opus-4-7.<\/p>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>In brief Anthropic just released its most capable Opus model yet, Claude Opus 4.7. The model delivers strong benchmark gains [&hellip;]<\/p>","protected":false},"author":5,"featured_media":72951,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[220],"tags":[],"class_list":["post-72950","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tien-dien-tu"],"acf":[],"_links":{"self":[{"href":"https:\/\/hbbgroup.net\/zh\/wp-json\/wp\/v2\/posts\/72950","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hbbgroup.net\/zh\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hbbgroup.net\/zh\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hbbgroup.net\/zh\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/hbbgroup.net\/zh\/wp-json\/wp\/v2\/comments?post=72950"}],"version-history":[{"count":0,"href":"https:\/\/hbbgroup.net\/zh\/wp-json\/wp\/v2\/posts\/72950\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/hbbgroup.net\/zh\/wp-json\/wp\/v2\/media\/72951"}],"wp:attachment":[{"href":"https:\/\/hbbgroup.net\/zh\/wp-json\/wp\/v2\/media?parent=72950"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hbbgroup.net\/zh\/wp-json\/wp\/v2\/categories?post=72950"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hbbgroup.net\/zh\/wp-json\/wp\/v2\/tags?post=72950"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}