{
  "type": "article",
  "title": "The Chinese Model That Ran Undercover on OpenRouter for Two Months Finally Has a Name, and a Rock-Bottom Price",
  "summary": "Meituan has confirmed that LongCat-2.0, its 1.6-trillion-parameter open-license MoE model, is the anonymous \"Owl Alpha\" that topped OpenRouter charts for two months, and it undercuts rivals on price while running entirely on Chinese chips.",
  "content": "Chinese tech firm Meituan pulled back the curtain on LongCat-2.0 on June 30, confirming that the open-license, 1.6-trillion-parameter mixture-of-experts model is the very same system that had been quietly logging traffic on OpenRouter for two months under the codename Owl Alpha. For weeks, developers were routing real work through it without knowing who built it.\n\nA giant that only flexes part of itself\nParameters are the full set of dials a model can tune during training, and LongCat-2.0 carries a staggering 1.6 trillion of them. It does not fire all of those at once, though. For every token, the smallest chunk of data an AI reads, the model switches on roughly 48 billion parameters, a figure that slides anywhere from 33 billion to 56 billion depending on how hard the request is to answer. That selective activation is what keeps a model this size fast enough to be practical.\n\n \n\nTwo months undercover, and it paid off\nRunning anonymously turned out to be a smart move. By the time Meituan put its name on the model, Owl Alpha had already climbed to the top of the Hermes Agent workspace, taken second place on Claude Code, and landed third across OpenClaw deployments, with all three rankings based on monthly call volume. In other words, people were voting for it with their usage long before they knew whose model it was.\n\nThe first trillion-parameter model built entirely on Chinese chips\nWhat sets this release apart is where it was made. LongCat-2.0 is the first trillion-parameter model to be trained and deployed from start to finish on domestic Chinese ASICs, rather than merely being served on them after training somewhere else. 
The contrast with DeepSeek V4-Pro is telling: that model leaned on Huawei chips only for inference, while its pretraining still ran on Nvidia hardware.\n\nMeituan says the pretraining run chewed through more than 35 trillion tokens on a cluster of over 50,000 domestically built accelerators and crossed the finish line with \"no rollbacks or irrecoverable loss spikes.\" That kind of stability is a genuine flex, because massive training runs on untested hardware stacks routinely collapse partway through, and it fits a broader pattern of China trying to wean its AI ambitions off U.S. silicon.\n\nWhere the price makes its argument\nThe strongest pitch, though, is the bill. Standard API access costs $0.75 per million input tokens and $2.95 per million output tokens, dropping to $0.30 and $1.20 during the current launch promotion, with reads from cached context thrown in for free. That badly undercuts GPT-5.5, which runs $5 and $30 per million tokens, beats Claude Sonnet 5's introductory $2 and $10, and sits right alongside DeepSeek V4-Pro's permanent $0.435 and $0.87, as well as Xiaomi's MiMo-V2.5 Pro, which fell to that same rate after its own May price cuts.\n\nFor the heaviest users, there is an even cheaper route. Meituan also sells token packs, with a bundle of 1 billion tokens going for around $60, which makes it especially attractive to coders burning through volume.\n\nHow it actually holds up\nPut to work on a quick game-building task, LongCat-2.0 got the job done, and the output held together reasonably well after a few rounds of iteration. The finished result sat visibly below what Claude Fable and Opus 4.8 produce, which places it closer to Sonnet 4.6, yet the quality you get per dollar at these prices is tough to argue against.\n\nThe game it built sent waves of enemies in from different angles while the camera auto-centered on whichever one was closest. The logic broke down, however, once difficulty ramped up the enemy count. 
At higher speeds the target-switching went haywire, with the focus snapping to a nearer enemy in the middle of a typing prompt, which made the game maddening to play.\n\nThat is par for the course in vibe-coding sessions, where models rarely anticipate the downstream consequences of a choice and instead deliver exactly what the prompt literally asks for. It is also the best argument for a cheap model: the lower the cost, the more freely a user can keep iterating until the result finally matches what they had in mind. On the whole, without any extra hand-holding, the quality lands somewhere between DeepSeek V4-Flash and DeepSeek V4-Pro on quick coding tasks.\n\nWhat makes it tick\nLongCat-2.0 leans on a handful of tricks to run faster and smarter without ballooning in size. Its attention system, built on DeepSeek's design, zeroes in only on the most relevant slices of very long conversations instead of weighing everything equally, which helps it answer more quickly.\n\nThere is also a new N-gram embedding system, a way of grasping groups of words or subwords as a single unit, that hands the model roughly 100 times more possible representations without piling on extra components. In effect, it teaches the AI to recognize common phrases rather than lone words. Instead of reading \"New,\" \"York,\" and \"City\" as three separate fragments, it can treat \"New York City\" as a single meaningful idea, giving it a far richer grip on language without growing dramatically bigger.\n\nAfter training, Meituan fuses three specialized systems, one tuned for using tools (Agent), one for solving problems (Reasoning), and one for conversation (Interaction). 
A routing mechanism then picks which blend of specialists should take on each request, much like handing the right job to the right team.\n\nThe benchmark scorecard\nOn SWE-bench Pro, which measures how often a model actually resolves real GitHub issues drawn from production codebases, LongCat-2.0 scored 59.5, edging out GPT-5.5's 58.6 and Gemini 3.1 Pro's 54.2, though it still trails Claude Opus 4.7 and 4.8. On FORTE, which grades agents on everyday office tasks across 15 professions under a 45-minute cap, it managed 73.2, level with Claude Opus 4.6 but behind GPT-5.5's 77.8.\n\nWho wins, and the one catch\nThe clearest beneficiaries are teams building coding agents on a tight budget, along with anyone running high-volume, repository-scale work where those free cached-context reads pile up in your favor. The model is live right now through Meituan's OpenAI- and Anthropic-compatible API endpoints, or through agent harnesses like Hermes, Claude Code, and OpenClaw that already support it. It also ships with a 1M-token context window and its own LongCat Sparse Attention (LSA), which scales efficiently across that million-token span.\n\nThere is one group left out in the cold: anyone who wants to self-host. Both the GitHub and Hugging Face repositories still say \"model weights coming soon,\" and Meituan has yet to name a date for when those files will actually land.\n\nWhat this means for you\n• For budget-conscious developers: Very cheap API access (a launch-promo $0.30 input and $1.20 output per million tokens, with cached reads free) sharply lowers the cost of running coding agents.\n• For heavy users: A 1 billion-token pack for around $60 makes large repository-scale work even cheaper.\n• For self-hosters: You cannot download the weights yet; GitHub and Hugging Face only say \"coming soon,\" with no date set.\n\nQuestions & Answers\n\n1. 
What is LongCat-2.0?\nIt is Meituan's 1.6-trillion-parameter open-license mixture-of-experts AI model, unveiled on June 30.\n\n2. What was Owl Alpha?\nIt was the anonymous codename under which this model ran on OpenRouter for two months.\n\n3. How much does it cost?\nStandard rates are $0.75 input and $2.95 output per million tokens, cut to $0.30 and $1.20 during the launch promo; cached reads are free and a 1 billion-token pack costs around $60.\n\n4. What chips was it trained on?\nIt is the first trillion-parameter model trained and deployed end-to-end on domestic Chinese ASICs, using over 50,000 accelerators and more than 35 trillion tokens.\n\n5. How does it score on benchmarks?\nIt scored 59.5 on SWE-bench Pro and 73.2 on FORTE.\n\n6. Can I self-host it?\nNot yet. GitHub and Hugging Face both say the model weights are \"coming soon,\" with no date given.\n\n7. How does its price compare to GPT-5.5?\nIt is far cheaper: LongCat-2.0's standard rates are $0.75 input and $2.95 output per million tokens, while GPT-5.5 charges $5 and $30.",
  "url": "https://trendkia.com/en/ai/do-mahine-taka-chupachapa-openrouter-para-chhaya-raha-yaha-chini-ai-modala-aba-samane-aya-meituan-ka-longcat-2-0-3925",
  "category": "AI",
  "publishedAt": "2026-07-01",
  "tags": [
    "LongCat-2.0",
    "Meituan",
    "AI model",
    "OpenRouter",
    "Owl Alpha",
    "Chinese AI",
    "coding AI",
    "open-license model"
  ],
  "language": "en",
  "site": "TrendKia"
}