[COURSE] Add LLM-related courses (#746)
* add CMU11868 * add cmu11-667 * add cmu11711 * update cmu11-868 * update cmu-11667 * nits
This commit is contained in:
parent
2b4ba63b09
commit
a74ddd98d3
@ -0,0 +1,31 @@
# CMU11-667: Large Language Models: Methods and Applications

## Course Overview

- University: Carnegie Mellon University
- Prerequisites: Solid background in machine learning (equivalent to CMU 10-301/10-601) and natural language processing (equivalent to 11-411/11-611); proficiency in Python and familiarity with PyTorch or similar deep learning frameworks.
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Study Hours: 100+ hours

This graduate-level course provides a comprehensive overview of methods and applications of Large Language Models (LLMs), covering a wide range of topics from core architectures to cutting-edge techniques. Course content includes:

1. **Foundations**: Neural network architectures for language modeling, training procedures, inference, and evaluation metrics.
2. **Advanced Topics**: Model interpretability, alignment methods, emergent capabilities, and applications in both textual and non-textual domains.
3. **System & Optimization Techniques**: Large-scale pretraining strategies, deployment optimization, and efficient training/inference methods.
4. **Ethics & Safety**: Addressing model bias, adversarial attacks, and legal/regulatory concerns.
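Among the evaluation metrics covered in the Foundations unit, perplexity is the standard one for language models; as a warm-up it can be computed from per-token log-probabilities in a few lines (a toy sketch, not course-provided code):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# A model that assigns probability 0.25 to each of 4 tokens is exactly
# as uncertain as a uniform choice among 4 options:
logprobs = [math.log(0.25)] * 4
print(perplexity(logprobs))
```

Lower is better: a perplexity of k means the model is, on average, as uncertain as choosing uniformly among k tokens.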
The course blends lectures, readings, quizzes, interactive exercises, assignments, and a final project to offer students a deep and practical understanding of LLMs, preparing them for both research and real-world system development.

**Self-Study Tips**:

- Thoroughly read all assigned papers and materials before each class.
- Become proficient with PyTorch and implement core models and algorithms by hand.
- Complete the assignments diligently to build practical skills and reinforce theoretical understanding.

## Course Resources

- Course Website: <https://cmu-llms.org/>
- Course Videos: Selected lecture slides and materials are available on the website; full lecture recordings may require CMU internal access.
- Course Materials: Curated research papers and supplementary materials, with the full reading list available on the course site.
- Assignments: Six programming assignments covering data preparation, Transformer implementation, retrieval-augmented generation, model evaluation and debiasing, and training efficiency. Details at <https://cmu-llms.org/assignments/>
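The retrieval-augmented generation assignment pairs a retriever with a generator; the overall data flow can be sketched with a toy word-overlap retriever (a stand-in for the dense retrievers used in practice — all names and documents here are illustrative, not from the assignment):

```python
def retrieve(query, docs, k=1):
    """Rank documents by word overlap with the query -- a toy stand-in
    for a learned dense retriever."""
    q_words = set(query.lower().split())
    return sorted(docs,
                  key=lambda d: len(q_words & set(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(query, retrieved):
    """Prepend the retrieved passages to the question, RAG-style."""
    context = "\n".join(retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "The Transformer architecture relies on self-attention.",
    "Perplexity measures how well a language model predicts text.",
]
query = "what is self-attention in the Transformer"
prompt = build_prompt(query, retrieve(query, docs))
print(prompt)  # the prompt an LLM would then complete
```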
@ -0,0 +1,31 @@
# CMU11-667: Large Language Models: Methods and Applications

## Course Overview

- University: Carnegie Mellon University
- Prerequisites: A foundation in machine learning (equivalent to CMU 10-301/10-601) and natural language processing (equivalent to 11-411/11-611); proficiency in Python and familiarity with deep learning frameworks such as PyTorch.
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Study Hours: 100+ hours

This graduate course offers a comprehensive introduction to the methods and applications of large language models (LLMs), covering a broad range of topics from foundational architectures to cutting-edge techniques. Course content includes:

1. **Foundations**: Network architectures for language modeling, training, inference, and evaluation methods.
2. **Advanced Topics**: Model interpretability, alignment methods, emergent capabilities, and applications to both language and non-text tasks.
3. **Scaling Techniques**: Large-scale pretraining, model deployment optimization, and efficient training and inference methods.
4. **Ethics & Safety**: Model bias, attack methods, legal issues, and more.

The course combines lectures, readings, quizzes, interactive activities, assignments, and a project, aiming to give students a deep understanding of LLMs and a solid foundation for further research or applications.

**Self-Study Tips**:

- Carefully read the papers and materials assigned before each class.
- Get familiar with deep learning frameworks such as PyTorch, and implement models and algorithms by hand.
- Complete the course assignments diligently.

## Course Resources

- Course Website: <https://cmu-llms.org/>
- Course Videos: Selected lecture slides and materials are available on the course website; full videos may require access via CMU internal platforms.
- Course Materials: Curated papers and resources; see the course site for the full reading list.
- Assignments: Six assignments covering pretraining data preparation, Transformer implementation, retrieval-augmented generation, model comparison and bias mitigation, training efficiency, and more. Details at <https://cmu-llms.org/assignments/>
@ -0,0 +1,27 @@
# CMU 11-711: Advanced Natural Language Processing (ANLP)

## Course Overview

* University: Carnegie Mellon University
* Prerequisites: No strict prerequisites, but students should have experience with Python programming, as well as a background in probability and linear algebra. Prior experience with neural networks is recommended.
* Programming Language: Python
* Course Difficulty: 🌟🌟🌟🌟
* Estimated Workload: 100 hours

This is a graduate-level course covering both foundational and advanced topics in Natural Language Processing (NLP). The syllabus spans word representations, sequence modeling, attention mechanisms, Transformer architectures, and cutting-edge topics such as large language model pretraining, instruction tuning, complex reasoning, multimodality, and model safety. Compared to similar courses, this course stands out for the following reasons:

1. **Comprehensive and research-driven content**: In addition to classical NLP methods, it offers in-depth discussions of recent trends and state-of-the-art techniques such as LLaMA and GPT-4.
2. **Strong practical component**: Each lecture includes code demonstrations and online quizzes, and the final project requires reproducing and improving upon a recent research paper.
3. **Highly interactive**: Active engagement is encouraged through Piazza discussions, Canvas quizzes, and in-class Q&A, resulting in an immersive and well-paced learning experience.
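The attention mechanisms at the heart of the syllabus reduce to a short computation; here is a dependency-free sketch of scaled dot-product attention over plain Python lists (toy code for orientation, not course material):

```python
import math

def softmax(xs):
    m = max(xs)                              # subtract max for stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention, softmax(Q K^T / sqrt(d)) V,
    where Q, K, V are lists of row vectors."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)            # one weight per key
        out.append([sum(w * row[j] for w, row in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# One query attending over two key/value pairs:
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
print(attention(Q, K, V))  # a convex combination of the two value rows
```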
Self-study tips:

* Read the recommended papers before class and follow the reading sequence step-by-step.
* Set up a Python environment and become familiar with PyTorch and Hugging Face, as many hands-on examples are based on these frameworks.

## Course Resources

* Course Website: [https://www.phontron.com/class/anlp-fall2024/](https://www.phontron.com/class/anlp-fall2024/)
* Course Videos: Lecture recordings are available on Canvas (CMU login required)
* Course Texts: Selected classical and cutting-edge research papers + chapters from *A Primer on Neural Network Models for Natural Language Processing* by Yoav Goldberg
* Course Assignments: [https://www.phontron.com/class/anlp-fall2024/assignments/](https://www.phontron.com/class/anlp-fall2024/assignments/)
@ -0,0 +1,28 @@
# CMU 11-711: Advanced Natural Language Processing (ANLP)

## Course Overview

* University: Carnegie Mellon University
* Prerequisites: No hard prerequisites, but students need Python programming experience plus a foundation in probability and linear algebra; prior experience with neural networks is a plus.
* Programming Language: Python
* Course Difficulty: 🌟🌟🌟🌟
* Estimated Workload: 100 hours

This graduate-level course covers NLP from the fundamentals through advanced material: word representations and sequence modeling, attention mechanisms and the Transformer architecture, and frontier topics such as large-scale language model pretraining, instruction tuning, complex reasoning, multimodality, and safety. Compared with similar courses, this one:

1. **Is comprehensive and tracks current research**: Beyond classical algorithms, it takes a deep look at recent large-model methods (such as LLaMA and GPT-4).
2. **Is highly practical**: Every lecture comes with code demos and online quizzes, and the end-of-term project requires reproducing and improving a recent paper.
3. **Is highly interactive**: Piazza discussions, Canvas quizzes, and in-class Q&A make for an immersive, well-paced learning experience.

Self-study tips:

* Read the recommended papers before each class and follow the reading sequence step by step.
* Set up a Python environment and get familiar with PyTorch/Hugging Face, since most hands-on code examples build on them.
* Complete the course assignments diligently.

## Course Resources

* Course Website: [https://www.phontron.com/class/anlp-fall2024/](https://www.phontron.com/class/anlp-fall2024/)
* Course Videos: Lecture recordings are uploaded to Canvas (CMU login required)
* Course Texts: Selected classical and cutting-edge papers + chapters from Yoav Goldberg's *A Primer on Neural Network Models for Natural Language Processing*
* Course Assignments: [https://www.phontron.com/class/anlp-fall2024/assignments/](https://www.phontron.com/class/anlp-fall2024/assignments/)
@ -0,0 +1,40 @@
# CMU 11-868: Large Language Model Systems

## Course Overview

- University: Carnegie Mellon University
- Prerequisites: Strongly recommended to have taken Deep Learning (11-785) or Advanced NLP (11-611 or 11-711)
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Workload: 120 hours

This graduate-level course focuses on the full stack of large language model (LLM) systems — from algorithms to engineering. The curriculum covers, but is not limited to:

1. **GPU Programming and Automatic Differentiation**: Master CUDA kernel calls, fundamentals of parallel programming, and deep learning framework design.
2. **Model Training and Distributed Systems**: Learn efficient training algorithms, communication optimizations (e.g., ZeRO, FlashAttention), and distributed training frameworks like DDP, GPipe, and Megatron-LM.
3. **Model Compression and Acceleration**: Study quantization (GPTQ), sparsity (MoE), compiler technologies (JAX, Triton), and inference-time serving systems (vLLM, CacheGen).
4. **Cutting-Edge Topics and Systems Practice**: Includes retrieval-augmented generation (RAG), multimodal LLMs, RLHF systems, and end-to-end deployment, monitoring, and maintenance.
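To give a flavor of the compression unit, round-to-nearest symmetric quantization — the naive baseline that methods like GPTQ improve upon — fits in a few lines (a toy sketch on Python floats, not the course's actual implementation):

```python
def quantize(weights, bits=8):
    """Symmetric round-to-nearest quantization: map floats onto a
    signed integer grid scaled to the largest magnitude."""
    qmax = 2 ** (bits - 1) - 1                 # 127 for 8-bit
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Map the integers back to floats (lossy)."""
    return [qi * scale for qi in q]

w = [0.12, -0.5, 0.33, 0.0]
q, s = quantize(w)
w_hat = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q, err)  # integers in [-127, 127]; error bounded by scale / 2
```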
Compared to similar courses, this one stands out for its **tight integration with recent papers and open-source implementations** (hands-on work expanding CUDA support in the miniTorch framework), a **project-driven assignment structure** (five programming assignments + a final project), and **guest lectures from industry experts**, offering students real-world insights into LLM engineering challenges and solutions.

**Self-Study Tips**:

- Set up a CUDA-compatible environment in advance (NVIDIA GPU + CUDA Toolkit + PyTorch).
- Review fundamentals of parallel computing and deep learning (autograd, tensor operations).
- Carefully read the assigned papers and slides before each lecture, and follow the assignments to extend the miniTorch framework from pure Python to real CUDA kernels.

This course assumes a solid understanding of deep learning and is **not suitable for complete beginners**. See the [FAQ](https://llmsystem.github.io/llmsystem2024spring/docs/FAQ) for more on prerequisites.

The assignments are fairly challenging and include:

1. **Assignment 1**: Implement an autograd framework + custom CUDA ops + basic neural networks
2. **Assignment 2**: Build a GPT2 model from scratch
3. **Assignment 3**: Accelerate training with custom CUDA kernels for Softmax and LayerNorm
4. **Assignment 4**: Implement distributed model training (difficult to configure independently for self-study)
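The autograd framework built in Assignment 1 rests on reverse-mode differentiation over a computation graph; a minimal scalar version looks like this (a toy sketch of the idea — not the actual miniTorch API):

```python
class Value:
    """A scalar with reverse-mode autodiff over the graph of +, *."""
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents        # (parent Value, local gradient) pairs

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data,
                     ((self, other.data), (other, self.data)))

    def backward(self):
        # Propagate each path's gradient contribution from the output
        # to the leaves; summing over paths gives the chain rule on a DAG.
        self.grad = 1.0
        stack = [(self, 1.0)]
        while stack:
            node, g = stack.pop()
            for parent, local in node._parents:
                parent.grad += local * g
                stack.append((parent, local * g))

x = Value(3.0)
y = Value(4.0)
z = x * y + x              # z = x*y + x
z.backward()
print(x.grad, y.grad)      # dz/dx = y + 1 = 5.0, dz/dy = x = 3.0
```

Assignment 1 then extends this idea to tensors and backs the operators with CUDA kernels.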
## Course Resources

- Course Website: <https://llmsystem.github.io/llmsystem2025spring/>
- Syllabus: <https://llmsystem.github.io/llmsystem2025spring/docs/Syllabus/>
- Assignments: <https://llmsystem.github.io/llmsystem2025springhw/>
- Course Texts: Selected research papers + selected chapters from *Programming Massively Parallel Processors (4th Edition)*
@ -0,0 +1,39 @@
# CMU 11-868: Large Language Model Systems

## Course Overview

- University: Carnegie Mellon University
- Prerequisites: Strongly recommended to have taken Deep Learning (11-785) or Advanced NLP (11-611 or 11-711)
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Workload: 120 hours

This graduate course focuses on the full "algorithms to engineering" process of building large language model systems. The curriculum covers, but is not limited to:

1. **GPU Programming and Automatic Differentiation**: CUDA kernel calls, parallel programming fundamentals, and deep learning framework design.
2. **Model Training and Distributed Systems**: Efficient training algorithms, communication optimizations (ZeRO, FlashAttention), and distributed training frameworks (DDP, GPipe, Megatron-LM).
3. **Model Compression and Acceleration**: Quantization (GPTQ), sparsification (MoE), compiler technologies (JAX, Triton), and serving systems for inference (vLLM, CacheGen).
4. **Cutting-Edge Topics and Systems Practice**: Retrieval-augmented generation (RAG), multimodal LLMs, RLHF systems, and end-to-end online maintenance and monitoring.

Compared with similar courses, this one stands out for its **tight integration of recent papers with open-source implementations** (hands-on extension of CUDA support in the miniTorch framework), a **project-driven** assignment structure (five programming assignments + a final project), and **industry guest lectures** that give students a close look at the challenges and solutions of real-world LLM engineering.

**Self-Study Tips**:

- Set up a CUDA-capable development environment in advance (NVIDIA GPU + CUDA Toolkit + PyTorch).
- Review the fundamentals of parallel computing and deep learning (automatic differentiation, tensor operations).
- Read the papers and slides assigned before each class, and follow the assignments to extend the miniTorch framework from pure Python to real CUDA kernels.

This course assumes some background in deep learning and is not suitable for complete beginners; see the prerequisites in the [FAQ](https://llmsystem.github.io/llmsystem2024spring/docs/FAQ).

The assignments are challenging overall:

1. Assignment 1: autograd framework + hand-written CUDA operators + building basic neural networks
2. Assignment 2: building a GPT2 model
3. Assignment 3: speeding up training with hand-written CUDA Softmax and LayerNorm operators
4. Assignment 4: distributed model training (the environment may be hard to set up on your own)

## Course Resources

- Course Website: <https://llmsystem.github.io/llmsystem2025spring/>
- Syllabus: <https://llmsystem.github.io/llmsystem2025spring/docs/Syllabus/>
- Assignments: <https://llmsystem.github.io/llmsystem2025springhw/>
- Course Texts: Selected papers + selected chapters from *Programming Massively Parallel Processors (4th Edition)*
@ -114,6 +114,7 @@ plugins:
"国立台湾大学: 李宏毅机器学习": NTU Machine Learning
深度生成模型: Deep Generative Models
学习路线图: Roadmap
"大语言模型": Large Language Models
机器学习进阶: Advanced Machine Learning
学习路线图: Roadmap
后记: Postscript
@ -282,6 +283,10 @@ nav:
- "UCB CS285: Deep Reinforcement Learning": "深度学习/CS285.md"
- 深度生成模型:
  - "学习路线图": "深度生成模型/roadmap.md"
  - "大语言模型":
    - "CMU 11-868: Large Language Model Systems": "深度生成模型/大语言模型/CMU11-868.md"
    - "CMU 11-667: Large Language Models: Methods and Applications": "深度生成模型/大语言模型/CMU11-667.md"
    - "CMU 11-711: Advanced Natural Language Processing": "深度生成模型/大语言模型/CMU11-711.md"
- 机器学习进阶:
  - "学习路线图": "机器学习进阶/roadmap.md"
  - "CMU 10-708: Probabilistic Graphical Models": "机器学习进阶/CMU10-708.md"