Paper Download Center | ChinaAI Roadmaps

GLM Series

Paper	Year	Reading page	PDF
GLM: General Language Model Pre-training with Autoregressive Blank Infilling	2021	arXiv	PDF
GLM-130B: An Open Bilingual Pre-trained Model	2022	arXiv	PDF
WebGLM	2023	arXiv	PDF
ChatGLM / GLM-4 All Tools	2024	arXiv	PDF
AutoGLM: Autonomous Foundation Agents for GUIs	2024	arXiv	PDF
GLM-4-Voice	2024	arXiv	PDF
GLM-4.1V-Thinking & GLM-4.5V	2025	arXiv	PDF
GLM-4.5: Agentic, Reasoning, and Coding Foundation Models	2025	arXiv	PDF
GLM-TTS Technical Report	2025	arXiv	PDF
GLM-5: From Vibe Coding to Agentic Engineering	2026	arXiv	PDF
GLM-OCR Technical Report	2026	arXiv	PDF
GLM-5V-Turbo	2026	arXiv	PDF

Paper	Year	Reading page	PDF
Mooncake: A KVCache-Centric Disaggregated Architecture for LLM Serving	2024	arXiv	PDF
Kimi k1.5: Scaling Reinforcement Learning with LLMs	2025	arXiv	PDF
Muon is Scalable for LLM Training	2025	arXiv	PDF
Kimi-VL Technical Report	2025	arXiv	PDF
Kimi-Audio Technical Report	2025	arXiv	PDF
Kimi K2: Open Agentic Intelligence	2025/2026	arXiv	PDF
Kimi Linear: An Expressive, Efficient Attention Architecture	2025	arXiv	PDF
Kimi K2.5: Visual Agentic Intelligence	2026	arXiv	PDF

Paper	Year	Reading page	PDF
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism	2024	arXiv	PDF
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models	2024	arXiv	PDF
DeepSeek-Coder: When the Large Language Model Meets Programming	2024	arXiv	PDF
DeepSeek-Math: Pushing the Limits of Mathematical Reasoning	2024	arXiv	PDF
DeepSeek-VL: Towards Real-World Vision-Language Understanding	2024	arXiv	PDF
DeepSeek-V2: A Strong, Economical and Efficient Mixture-of-Experts Language Model	2024	arXiv	PDF
DeepSeek-Prover	2024	arXiv	PDF
DeepSeek-Coder-V2	2024	arXiv	PDF
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Models	2024	arXiv	PDF
Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts	2024	arXiv	PDF
DeepSeek-Prover V1.5	2024	arXiv	PDF
Janus	2024	arXiv	PDF
JanusFlow	2024	arXiv	PDF
DeepSeek-VL2	2024	arXiv	PDF
DeepSeek-V3 Technical Report	2024	arXiv	PDF
Janus-Pro	2025	arXiv	PDF
DeepSeek-R1	2025	arXiv	PDF
Native Sparse Attention	2025	arXiv	PDF
Inference-Time Scaling for Generalist Reward Modeling	2025	arXiv	PDF
DeepSeek-Prover V2	2025	arXiv	PDF
DeepSeek-OCR	2025	arXiv	PDF
DeepSeek-Math-V2	2025	arXiv	PDF
DeepSeek-V3.2	2025	arXiv	PDF

Paper/Report	Year	Reading page	PDF / Report
MiniMax-01: Scaling Foundation Models with Lightning Attention	2025	arXiv	PDF
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention	2025	arXiv	PDF
MiniMax M2.5: Built for Real-World Productivity	2026	Official	Model page
MiniMax M3: Coding Frontier, 1M Context, Native Multimodality	2026	Official	Model page