45 downloadable PDFs
92 unique external links verified
Last verified: 2026-05-11
GLM Series
| Paper | Year | Reading page | |
|---|---|---|---|
| GLM: General Language Model Pre-training with Autoregressive Blank Infilling | 2021 | arXiv | |
| GLM-130B: An Open Bilingual Pre-trained Model | 2022 | arXiv | |
| WebGLM | 2023 | arXiv | |
| ChatGLM / GLM-4 All Tools | 2024 | arXiv | |
| AutoGLM: Autonomous Foundation Agents for GUIs | 2024 | arXiv | |
| GLM-4-Voice | 2024 | arXiv | |
| GLM-4.1V-Thinking & GLM-4.5V | 2025 | arXiv | |
| GLM-4.5: Agentic, Reasoning, and Coding Foundation Models | 2025 | arXiv | |
| GLM-TTS Technical Report | 2025 | arXiv | |
| GLM-5: From Vibe Coding to Agentic Engineering | 2026 | arXiv | |
| GLM-OCR Technical Report | 2026 | arXiv | |
| GLM-5V-Turbo | 2026 | arXiv | |
Kimi Series
| Paper | Year | Reading page | |
|---|---|---|---|
| Mooncake: A KVCache-Centric Disaggregated Architecture for LLM Serving | 2024 | arXiv | |
| Kimi k1.5: Scaling Reinforcement Learning with LLMs | 2025 | arXiv | |
| Muon is Scalable for LLM Training | 2025 | arXiv | |
| Kimi-VL Technical Report | 2025 | arXiv | |
| Kimi-Audio Technical Report | 2025 | arXiv | |
| Kimi K2: Open Agentic Intelligence | 2025/2026 | arXiv | |
| Kimi Linear: An Expressive, Efficient Attention Architecture | 2025 | arXiv | |
| Kimi K2.5: Visual Agentic Intelligence | 2026 | arXiv | |
DeepSeek Series
| Paper | Year | Reading page | |
|---|---|---|---|
| DeepSeek LLM: Scaling Open-Source Language Models with Longtermism | 2024 | arXiv | |
| DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models | 2024 | arXiv | |
| DeepSeek-Coder: When the Large Language Model Meets Programming | 2024 | arXiv | |
| DeepSeek-Math: Pushing the Limits of Mathematical Reasoning | 2024 | arXiv | |
| DeepSeek-VL: Towards Real-World Vision-Language Understanding | 2024 | arXiv | |
| DeepSeek-V2: A Strong, Economical and Efficient Mixture-of-Experts Language Model | 2024 | arXiv | |
| DeepSeek-Prover | 2024 | arXiv | |
| DeepSeek-Coder-V2 | 2024 | arXiv | |
| Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Models | 2024 | arXiv | |
| Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts | 2024 | arXiv | |
| DeepSeek-Prover V1.5 | 2024 | arXiv | |
| Janus | 2024 | arXiv | |
| JanusFlow | 2024 | arXiv | |
| DeepSeek-VL2 | 2024 | arXiv | |
| DeepSeek-V3 Technical Report | 2024 | arXiv | |
| Janus-Pro | 2025 | arXiv | |
| DeepSeek-R1 | 2025 | arXiv | |
| Native Sparse Attention | 2025 | arXiv | |
| Inference-Time Scaling for Generalist Reward Modeling | 2025 | arXiv | |
| DeepSeek-Prover V2 | 2025 | arXiv | |
| DeepSeek-OCR | 2025 | arXiv | |
| DeepSeek-Math-V2 | 2025 | arXiv | |
| DeepSeek-V3.2 | 2025 | arXiv | |