{
    "id": 74760,
    "date": "2026-04-22T08:57:46",
    "date_gmt": "2026-04-22T01:57:46",
    "guid": {
        "rendered": "https:\/\/hbbgroup.net\/this-frankenstein-ai-merges-claude-opus-glm-and-qwen-and-outperforms-top-models\/"
    },
    "modified": "2026-04-22T08:57:46",
    "modified_gmt": "2026-04-22T01:57:46",
    "slug": "this-frankenstein-ai-merges-claude-opus-glm-and-qwen-and-outperforms-top-models",
    "status": "publish",
    "type": "post",
    "link": "https:\/\/hbbgroup.net\/en_us\/this-frankenstein-ai-merges-claude-opus-glm-and-qwen-and-outperforms-top-models\/",
    "title": {
        "rendered": "This Frankenstein AI Merges Claude Opus, GLM and Qwen\u2014And Outperforms Top Models"
    },
    "content": {
        "rendered": "<div>\n<div>\n<h4 color=\"#333\">In brief<\/h4>\n<ul>\n<li>AI engineer Kyle Hessling merged two of Jackrong&#8217;s Claude Opus 4.6 and GLM-5.1 distilled finetunes into a single &#8220;frankenmerge.&#8221;<\/li>\n<li>A post-merge &#8220;heal fine-tune&#8221; was required to fix garbled code output caused by the layer boundary between the two independently-trained models.<\/li>\n<li>The model over-reasons on some tasks, but it&#8217;s a solvable problem.<\/li>\n<\/ul>\n<\/div>\n<p>You thought <a href=\"https:\/\/huggingface.co\/Jackrong\" target=\"_blank\" rel=\"nofollow external noopener\">Qwopus<\/a> was cool because it merged Qwen and Opus? Well, Kyle Hessling, an AI engineer with a lot of knowledge and free time just took that recipe and threw GLM\u2014one of the best reasoning models out there\u2014into the mix. The result is an 18 billion parameter frankenmerge that fits on a cheap GPU and outperforms Alibaba&#8217;s newest 35B model.<\/p>\n<p>For those who don&#8217;t know, parameters are the numerical values baked into a neural network during training, like dials that a neural network can adjust \u2014 the more of them, the more knowledge and complexity the model can handle, and the more memory it needs to run.<\/p>\n<p>Hessling, an AI infrastructure engineer, stacked two of Jackrong&#8217;s Qwen3.5 finetunes on top of each other: layers 0 through 31 from <a href=\"https:\/\/huggingface.co\/Jackrong\/Qwopus3.5-9B-v3.5\" target=\"_blank\" rel=\"nofollow external noopener\">Qwopus 3.5-9B-v3.5<\/a>, which distills Claude 4.6 Opus&#8217;s reasoning style into Qwen as a base model, and layers 32 through 63 from <a href=\"https:\/\/huggingface.co\/Jackrong\/Qwen3.5-9B-GLM5.1-Distill-v1\" target=\"_blank\" rel=\"nofollow external noopener\">Qwen 3.5-9B-GLM5.1-Distill-v1<\/a>, trained on reasoning data from z.AI&#8217;s GLM-5.1 teacher model on top of the same Qwen base.<\/p>\n<p>The hypothesis: Give the model Opus-style structured planning in 
the first half of the reasoning and GLM&#8217;s problem decomposition scaffold in the second\u201464 layers total, in one model.<\/p>\n<p>The technique is called a passthrough frankenmerge\u2014no blending, no averaging of weights, just raw layer stacking. Hessling had to write his own merge script from scratch because existing tools don&#8217;t support Qwen 3.5&#8217;s hybrid linear\/full attention architecture. The <a href=\"https:\/\/huggingface.co\/KyleHessling1\/Qwopus-GLM-18B-Merged-GGUF\" target=\"_blank\" rel=\"nofollow external noopener\">resulting model<\/a> passed 40 out of 44 capability tests, beating Alibaba&#8217;s Qwen 3.6-35B-A3B MoE\u2014which requires 22 GB of VRAM\u2014while running on just 9.2 GB in Q4_K_M quantization.<\/p>\n<p>An NVIDIA RTX 3060 handles it fine\u2026 theoretically.<\/p>\n<p>Hessling explains that making this model wasn\u2019t easy. The raw merge spat out garbled code. Even so, the test models he published went kind of viral among enthusiasts.<\/p>\n<p>Hessling&#8217;s final fix was a &#8220;heal fine-tune&#8221;\u2014basically a QLoRA (a small set of adapter weights trained on top of the quantized model that steers its final output) targeting all of the attention and projection layers.<\/p>\n<p>We tried it, and even though the idea of having Qwen, Claude Opus, and GLM 5.1 running locally on our potato is beyond tempting, in reality we found that the model is so good at reasoning through things that it ends up overthinking.<\/p>\n<p>We tested it on an M1 MacBook running an MLX-quantized version (a build optimized to run on Macs). When prompted to generate our usual test game, the reasoning chain ran so long it hit the token limit and gave us a nice long piece of reasoning without a working result in a zero-shot interaction. That&#8217;s a daily-use blocker for anyone wanting to run this locally on consumer hardware for any serious application.<\/p>\n<p>We went a bit softer, and things were still challenging. 
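<\/p>
<p>The passthrough recipe is simple enough to sketch in a few lines. What follows is a minimal, hypothetical illustration of raw layer stacking over two Hugging Face-style checkpoints (keys shaped like model.layers.N.something); it is not Hessling&#8217;s actual merge script, which also has to account for Qwen 3.5&#8217;s hybrid attention layout.<\/p>

```python
def passthrough_merge(state_a, state_b, split_layer=32):
    # Passthrough 'frankenmerge': no blending, no weight averaging,
    # just raw layer stacking. Transformer blocks [0, split_layer)
    # come from model A, blocks [split_layer, ...) from model B.
    def layer_index(key):
        # Hugging Face-style keys look like 'model.layers.12.self_attn...'
        parts = key.split('.')
        if len(parts) > 2 and parts[1] == 'layers' and parts[2].isdigit():
            return int(parts[2])
        return None  # embeddings, final norm, lm_head, etc.

    merged = {}
    for key, value in state_a.items():
        idx = layer_index(key)
        # Non-layer weights (embeddings, norms, head) are taken from
        # model A in this sketch; which donor supplies them is a choice.
        if idx is None or idx < split_layer:
            merged[key] = value
    for key, value in state_b.items():
        idx = layer_index(key)
        if idx is not None and idx >= split_layer:
            merged[key] = value
    return merged
```

<p>In a real merge the values would be tensors loaded from safetensors shards, and the result would be re-serialized with a config updated to the new layer count.<\/p>
<p>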
A simple &#8220;write a Snake game&#8221; prompt took over 40 minutes, nearly all of it spent on reasoning.<\/p>\n<div>\n<figure><img loading=\"lazy\" alt width=\"1942\" height=\"450\" decoding=\"async\" data-nimg=\"1\" src=\"https:\/\/img.decrypt.co\/insecure\/rs:fit:3840:0:0:0\/plain\/https:\/\/cdn.decrypt.co\/wp-content\/uploads\/2026\/04\/Captura-de-pantalla-2026-04-20-a-las-17.25.54.png@webp\"><\/figure>\n<\/div>\n<p>You can see the results in our <a href=\"https:\/\/github.com\/jaldps\/ai-tests\/blob\/main\/Coding\/Qwopus-GLM-18B-Merged-GGUF\" target=\"_blank\">GitHub repository<\/a>.<\/p>\n<p>This is a known tension in the Qwopus lineage: <a href=\"https:\/\/huggingface.co\/Jackrong\/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2\" target=\"_blank\" rel=\"nofollow external noopener\">Jackrong&#8217;s v2 finetunes<\/a> were built to address Qwen 3.5&#8217;s tendency toward repetitive internal loops and to make it &#8220;think more economically.&#8221; Stacking 64 layers of two reasoning distills appears to amplify that behavior on certain prompts.<\/p>\n<p>That&#8217;s a solvable problem, and the open-source community will likely solve it. What matters here is the broader pattern: a pseudonymous developer publishes specialized finetunes with full training guides, another enthusiast stacks them with a custom script, runs 1,000 healing steps, and lands a model that outperforms a 35-billion-parameter release from one of the world&#8217;s largest AI labs. The whole thing fits in a small file.<\/p>\n<p>This is what makes open-source worth watching\u2014not just the big labs releasing weights, but the layer-by-layer solutions, the specialization happening below the radar. 
The gap between a weekend project and a frontier deployment narrows as more developers join the community.<\/p>\n<p>Jackrong has since mirrored Hessling&#8217;s repository, and the model accumulated over three thousand downloads within its first two weeks of availability.<\/p>\n<\/div>",
        "protected": false
    },
    "excerpt": {
        "rendered": "<p>In brief AI engineer Kyle Hessling merged two of Jackrong&#8217;s Claude Opus 4.6 and GLM-5.1 distilled finetunes into a single [&hellip;]<\/p>",
        "protected": false
    },
    "author": 5,
    "featured_media": 74761,
    "comment_status": "open",
    "ping_status": "open",
    "sticky": false,
    "template": "",
    "format": "standard",
    "meta": {
        "_acf_changed": false,
        "footnotes": ""
    },
    "categories": [
        220
    ],
    "tags": [],
    "class_list": [
        "post-74760",
        "post",
        "type-post",
        "status-publish",
        "format-standard",
        "has-post-thumbnail",
        "hentry",
        "category-tien-dien-tu"
    ],
    "acf": [],
    "_links": {
        "self": [
            {
                "href": "https:\/\/hbbgroup.net\/en_us\/wp-json\/wp\/v2\/posts\/74760",
                "targetHints": {
                    "allow": [
                        "GET"
                    ]
                }
            }
        ],
        "collection": [
            {
                "href": "https:\/\/hbbgroup.net\/en_us\/wp-json\/wp\/v2\/posts"
            }
        ],
        "about": [
            {
                "href": "https:\/\/hbbgroup.net\/en_us\/wp-json\/wp\/v2\/types\/post"
            }
        ],
        "author": [
            {
                "embeddable": true,
                "href": "https:\/\/hbbgroup.net\/en_us\/wp-json\/wp\/v2\/users\/5"
            }
        ],
        "replies": [
            {
                "embeddable": true,
                "href": "https:\/\/hbbgroup.net\/en_us\/wp-json\/wp\/v2\/comments?post=74760"
            }
        ],
        "version-history": [
            {
                "count": 0,
                "href": "https:\/\/hbbgroup.net\/en_us\/wp-json\/wp\/v2\/posts\/74760\/revisions"
            }
        ],
        "wp:featuredmedia": [
            {
                "embeddable": true,
                "href": "https:\/\/hbbgroup.net\/en_us\/wp-json\/wp\/v2\/media\/74761"
            }
        ],
        "wp:attachment": [
            {
                "href": "https:\/\/hbbgroup.net\/en_us\/wp-json\/wp\/v2\/media?parent=74760"
            }
        ],
        "wp:term": [
            {
                "taxonomy": "category",
                "embeddable": true,
                "href": "https:\/\/hbbgroup.net\/en_us\/wp-json\/wp\/v2\/categories?post=74760"
            },
            {
                "taxonomy": "post_tag",
                "embeddable": true,
                "href": "https:\/\/hbbgroup.net\/en_us\/wp-json\/wp\/v2\/tags?post=74760"
            }
        ],
        "curies": [
            {
                "name": "wp",
                "href": "https:\/\/api.w.org\/{rel}",
                "templated": true
            }
        ]
    }
}