Moonshot AI

Chinese artificial intelligence company From Wikipedia, the free encyclopedia

Moonshot AI (Moonshot; Chinese: 月之暗面; pinyin: Yuè Zhī Ànmiàn; lit. 'Dark Side of the Moon') is an artificial intelligence (AI) company based in Beijing, China. Investors have dubbed it one of China's "AI Tiger" companies,[1] and it focuses on developing large language models.

Background

Moonshot was founded in March 2023 by Yang Zhilin, Zhou Xinyu and Wu Yuxin. It was launched on the 50th anniversary of Pink Floyd's The Dark Side of the Moon, Yang's favorite album and the inspiration for the company's name.[2][3]

Yang has stated that his goal in founding Moonshot AI is to build foundation models that achieve AGI.[4] His three milestones are long context length, a multimodal world model, and a scalable general architecture capable of continuous self-improvement without human input.[4]

In October 2023, the company released the first version of its chatbot, Kimi, which was capable of processing up to 200,000 Chinese characters per conversation.[5]

In June 2024, it was reported that Moonshot was planning to enter the US market. An insider revealed Moonshot was developing products for the US market, including an AI role-playing chat application called Ohai as well as a music video generator called Noisee. In response, Moonshot stated it had no plans to develop and release overseas products.[6]

In July 2025, Moonshot released Kimi K2, a new model for its chatbot with 1 trillion total parameters.[7]

Funding and investments

When Moonshot received its initial funding of $60 million, it was valued at $300 million and had 40 employees.[3][8]

In February 2024, Alibaba Group led a $1 billion funding round for Moonshot, which gave it a valuation of $2.5 billion.[8]

In August 2024, Tencent and Gaorong Capital joined as investors in a $300 million funding round that valued Moonshot at $3.3 billion.[9]

In October 2025, Moonshot was reportedly nearing the completion of a new funding round of approximately $600 million, led by IDG Capital with participation from existing investors including Tencent, valuing the company at $3.8 billion pre-money.[10][11]

Products and research

Kimi

In October 2023, Moonshot launched its first AI chatbot, Kimi, whose name comes from Yang's English nickname. It emerged as the closest rival to Baidu's Ernie Bot.[2][12]

In March 2024, Moonshot claimed Kimi could handle 2 million Chinese characters in a single prompt, a significant upgrade from the previous version, which could handle only 200,000. On 21 March, a surge in users caused a two-day outage, for which Moonshot issued an apology.[12][13]

As of August 2024, Kimi ranked third in monthly active users according to aicpb.com.[14]

On 20 January 2025, Kimi K1.5 was released. Moonshot claimed it matched the performance of OpenAI o1 in mathematics, coding, and multimodal reasoning capabilities.[15]

In June 2025, Kimi dropped to seventh place in monthly active users.[14]

In July 2025, the company released the weights for Kimi K2, a large language model with 1 trillion total parameters.[16] The model uses a mixture-of-experts (MoE) architecture in which 32 billion parameters are active during inference. K2 was trained on 15.5 trillion tokens of data and is released under a modified MIT license.[17][18] Kimi K2 is an open-source LLM, meaning that users can download it and build upon it.[7] The day after its release, Kimi K2 had the most downloads on the platform, an increase in popularity from previous months.[7] Moonshot claims that the model excels in coding tasks, having passed tests like LiveCodeBench.[7] In certain instances, the model performed on par with or better than its Western counterparts.[7] It has also been praised for its writing skills.[7]

On 9 September 2025, Moonshot AI released an updated version of K2, Kimi-K2-Instruct-0905, which further improved its performance in agentic coding tasks and doubled its context window from 128K tokens to 256K tokens.[19][20]
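The mixture-of-experts design mentioned above, in which only part of the model is active for any given input, can be illustrated with a minimal sketch of top-k routing. This is a generic toy example and not Kimi K2's implementation: the number of experts, the value of `k`, the router scores, and the function names `route_top_k` and `moe_forward` are all assumptions made for illustration. Only the 1 trillion total and 32 billion active parameter figures come from the article.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and normalize their weights."""
    order = sorted(range(len(router_scores)),
                   key=lambda i: router_scores[i], reverse=True)
    top = order[:k]
    weights = softmax([router_scores[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router_scores, k):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))

# Hypothetical toy setup: four scalar "experts", two active per input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
router_scores = [0.1, 2.0, -1.0, 2.0]

selected = route_top_k(router_scores, k=2)
y = moe_forward(10.0, experts, router_scores, k=2)

# Share of parameters active per token, using the figures from the article.
active_fraction = 32e9 / 1e12
```

In this sketch the unselected experts are never evaluated, which is why an MoE model's inference cost tracks its active parameter count rather than its total size. With the article's figures, about 3.2% of K2's parameters are active at a time.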

The release of Kimi K2 follows a trend among Chinese companies of open-sourcing their AI models, likely in an attempt to counter US efforts to limit China's tech growth.[14]

In November 2025, Moonshot released Kimi K2 Thinking, an open-source update to Kimi K2 designed for advanced reasoning and agentic tasks. The model, trained for approximately $4.6 million, features a 1-trillion-parameter MoE architecture with 32 billion active parameters and supports up to 256,000-token contexts. It can execute 200-300 sequential tool calls autonomously and uses native INT4 quantization for efficiency. Benchmarks showed it outperforming GPT-5 and Claude Sonnet 4.5 on tests including Humanity's Last Exam (44.9%), BrowseComp (60.2%), and SWE-Bench Verified (71.3%). It is released under a modified MIT license requiring attribution for products exceeding 100 million monthly users or $20 million in monthly revenue.[21][22][23]
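To give a sense of what INT4 quantization means, the following sketch shows generic symmetric round-to-nearest quantization, in which each weight is stored as an integer between -8 and 7 together with a shared scale. The article does not describe the scheme Kimi K2 Thinking actually uses, so the function names, the single shared scale, and the sample weights are assumptions for illustration only.

```python
def quantize_int4(values):
    """Map floats to integers in [-8, 7] using one shared scale (assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    quantized = [max(-8, min(7, round(v / scale))) for v in values]
    return quantized, scale

def dequantize_int4(quantized, scale):
    """Recover approximate floats from the 4-bit integers."""
    return [q * scale for q in quantized]

# Hypothetical weights, chosen only to make the rounding visible.
weights = [0.7, -0.3, 0.12, 0.0]
q, scale = quantize_int4(weights)
restored = dequantize_int4(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each value needs 4 bits instead of the 16 used by common half-precision formats, at the cost of a rounding error of at most half the scale per weight in this scheme.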

In China, Kimi has six tiers of plans ranging from 5.2 yuan for four days to 399 yuan for a year of priority use.[24]

Mooncake serving platform

Mooncake is the platform that serves Moonshot's Kimi chatbot and processes 100 billion tokens daily.[25] Moonshot was awarded the Erik Riedel Best Paper Award at the USENIX FAST conference for the paper detailing the architecture of Mooncake.[25]

Scaling Muon optimizer

In the Moonshot and UCLA joint paper "Muon is Scalable for LLM Training", the researchers claim to have successfully scaled the Muon optimizer, which was previously known to have strong results in training small language models, to train a 16-billion-parameter mixture-of-experts (MoE) large language model with 3 billion active parameters.[26] The researchers indicate that Muon improves computational efficiency by a factor of 2 compared to the standard optimizer, AdamW, in training large models.[26] The researchers have open-sourced their Muon optimizer implementation and the pretrained and instruction-tuned checkpoints.[4]
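The core idea usually associated with Muon is to accumulate momentum for a weight matrix and then approximately orthogonalize that momentum before applying it as the update. The sketch below illustrates this under stated simplifications: it uses the classic cubic Newton-Schulz iteration rather than the tuned polynomial used in practical implementations, plain Python lists instead of tensors, and hypothetical values for `lr`, `beta`, and `steps`. It does not attempt to reproduce the implementation released with the paper.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def newton_schulz_orthogonalize(g, steps=10):
    """Push all singular values of g toward 1 (cubic Newton-Schulz iteration)."""
    # Frobenius normalization keeps every singular value at or below 1,
    # inside the region where the iteration converges.
    norm = math.sqrt(sum(v * v for row in g for v in row)) or 1.0
    x = [[v / norm for v in row] for row in g]
    for _ in range(steps):
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * x[i][j] - 0.5 * xxt_x[i][j]
              for j in range(len(x[0]))] for i in range(len(x))]
    return x

def muon_step(weight, grad, momentum_buf, lr=0.02, beta=0.95):
    """One simplified Muon-style update; returns (new_weight, new_momentum)."""
    buf = [[beta * m + g for m, g in zip(m_row, g_row)]
           for m_row, g_row in zip(momentum_buf, grad)]
    update = newton_schulz_orthogonalize(buf)
    new_weight = [[w - lr * u for w, u in zip(w_row, u_row)]
                  for w_row, u_row in zip(weight, update)]
    return new_weight, buf

# Hypothetical 2x2 example.
weight = [[1.0, 0.0], [0.0, 1.0]]
grad = [[3.0, 1.0], [0.0, 2.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

new_weight, buf = muon_step(weight, grad, zeros)
ortho = newton_schulz_orthogonalize(grad)
gram = matmul(ortho, transpose(ortho))  # close to the identity matrix
```

Because the orthogonalized update has singular values near 1, every direction of the weight matrix moves by a comparable amount, regardless of how unevenly the raw gradient is scaled.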

Scaling reinforcement learning with LLMs

In their technical report on the Kimi K1.5 model, Moonshot researchers outline their reinforcement learning methods, which they claim enabled the model to achieve state-of-the-art reasoning capabilities on par with OpenAI's o1 model.[27] The researchers note that long context scaling and improved policy optimization methods were key, without relying on complex techniques like Monte Carlo tree search, value functions, and process reward models.[27]
