[COURSE] Add LLM related courses (#746)

* add CMU11868

* add cmu11-667

* add cmu11711

* update cmu11-868

* update cmu-11667

* nits
Yinmin Zhong 2025-06-08 00:16:52 +08:00 committed by GitHub
parent 2b4ba63b09
commit a74ddd98d3
7 changed files with 201 additions and 0 deletions


@ -0,0 +1,31 @@
# CMU11-667: Large Language Models: Methods and Applications
## Course Overview
- University: Carnegie Mellon University
- Prerequisites: Solid background in machine learning (equivalent to CMU 10-301/10-601) and natural language processing (equivalent to 11-411/11-611); proficiency in Python and familiarity with PyTorch or similar deep learning frameworks.
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Study Hours: 100+ hours
This graduate-level course provides a comprehensive overview of methods and applications of Large Language Models (LLMs), covering a wide range of topics from core architectures to cutting-edge techniques. Course content includes:
1. **Foundations**: Neural network architectures for language modeling, training procedures, inference, and evaluation metrics.
2. **Advanced Topics**: Model interpretability, alignment methods, emergent capabilities, and applications in both textual and non-textual domains.
3. **System & Optimization Techniques**: Large-scale pretraining strategies, deployment optimization, and efficient training/inference methods.
4. **Ethics & Safety**: Addressing model bias, adversarial attacks, and legal/regulatory concerns.
The course blends lectures, readings, quizzes, interactive exercises, assignments, and a final project to offer students a deep and practical understanding of LLMs, preparing them for both research and real-world system development.
**Self-Study Tips**:
- Thoroughly read all assigned papers and materials before each class.
- Become proficient with PyTorch and implement core models and algorithms by hand.
- Complete the assignments diligently to build practical skills and reinforce theoretical understanding.
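To make the "implement core models by hand" tip concrete, here is a minimal scaled dot-product attention sketch in NumPy. The shapes and names are illustrative only, not taken from any course starter code:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d) arrays; returns a (seq_len, d) array
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)       # pairwise query-key similarity
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted average of the values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

The assignment versions add batching, masking, and multiple heads on top of this core computation.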
## Course Resources
- Course Website: <https://cmu-llms.org/>
- Course Videos: Selected lecture slides and materials are available on the website; full lecture recordings may require CMU internal access.
- Course Materials: Curated research papers and supplementary materials, with the full reading list available on the course site.
- Assignments: Six programming assignments covering data preparation, Transformer implementation, retrieval-augmented generation, model evaluation and debiasing, and training efficiency. Details at <https://cmu-llms.org/assignments/>


@ -0,0 +1,31 @@
# CMU11-667: Large Language Models: Methods and Applications
## Course Overview
- University: Carnegie Mellon University
- Prerequisites: A foundation in machine learning (equivalent to CMU 10-301/10-601) and natural language processing (equivalent to 11-411/11-611); proficiency in Python and familiarity with deep learning frameworks such as PyTorch.
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Study Hours: 100+ hours
This graduate-level course offers a comprehensive introduction to the methods and applications of Large Language Models (LLMs), covering a broad range of topics from core architectures to cutting-edge techniques. Course content includes:
1. **Foundations**: Network architectures for language modeling, plus training, inference, and evaluation methods.
2. **Advanced Topics**: Model interpretability, alignment methods, emergent capabilities, and applications to both language and non-textual tasks.
3. **Scaling Techniques**: Large-scale pretraining, deployment optimization, and efficient training and inference methods.
4. **Ethics & Safety**: Model bias, attack methods, legal issues, and more.
The course combines lectures, readings, quizzes, interactive activities, assignments, and a project, aiming to give students a deep understanding of LLMs and a solid foundation for further research or applications.
**Self-Study Tips**:
- Carefully read the papers and materials assigned before each class.
- Get comfortable with deep learning frameworks such as PyTorch, and implement models and algorithms by hand.
- Complete the course assignments diligently.
## Course Resources
- Course Website: <https://cmu-llms.org/>
- Course Videos: Selected lecture slides and materials are available on the course website; full recordings may require access through CMU internal platforms.
- Course Materials: Curated papers and resources; see the course website for the full reading list.
- Assignments: Six assignments in total, covering pretraining data preparation, Transformer implementation, retrieval-augmented generation, model comparison and bias mitigation, and training-efficiency improvements. Details at <https://cmu-llms.org/assignments/>


@ -0,0 +1,27 @@
# CMU 11-711: Advanced Natural Language Processing (ANLP)
## Course Overview
* University: Carnegie Mellon University
* Prerequisites: No strict prerequisites, but students should have experience with Python programming, as well as a background in probability and linear algebra. Prior experience with neural networks is recommended.
* Programming Language: Python
* Course Difficulty: 🌟🌟🌟🌟
* Estimated Workload: 100 hours
This is a graduate-level course covering both foundational and advanced topics in Natural Language Processing (NLP). The syllabus spans word representations, sequence modeling, attention mechanisms, Transformer architectures, and cutting-edge topics such as large language model pretraining, instruction tuning, complex reasoning, multimodality, and model safety. Compared to similar courses, this course stands out for the following reasons:
1. **Comprehensive and research-driven content**: In addition to classical NLP methods, it offers in-depth discussions of recent trends and state-of-the-art techniques such as LLaMA and GPT-4.
2. **Strong practical component**: Each lecture includes code demonstrations and online quizzes, and the final project requires reproducing and improving upon a recent research paper.
3. **Highly interactive**: Active engagement is encouraged through Piazza discussions, Canvas quizzes, and in-class Q&A, resulting in an immersive and well-paced learning experience.
Self-study tips:
* Read the recommended papers before class and follow the reading sequence step-by-step.
* Set up a Python environment and become familiar with PyTorch and Hugging Face, as many hands-on examples are based on these frameworks.
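The sequence-modeling portion of the course starts from ideas this simple: a count-based bigram language model scored by perplexity. The toy corpus and add-alpha smoothing below are my own illustration, not course material:

```python
import math
from collections import Counter

def train_bigram(tokens, alpha=1.0):
    """Count-based bigram model with add-alpha smoothing."""
    vocab = sorted(set(tokens))
    unigrams = Counter(tokens[:-1])                    # counts of each previous token
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))    # counts of adjacent pairs
    def prob(prev, word):
        # Smoothed conditional probability P(word | prev)
        return (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * len(vocab))
    return prob

def perplexity(prob, tokens):
    # exp of the average negative log-likelihood over bigram transitions
    n = len(tokens) - 1
    nll = -sum(math.log(prob(a, b)) for a, b in zip(tokens[:-1], tokens[1:]))
    return math.exp(nll / n)

corpus = "the cat sat on the mat the cat ate".split()
model = train_bigram(corpus)
print(perplexity(model, corpus))
```

Neural language models replace the count table with a learned function, but the training objective and the perplexity metric carry over unchanged.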
## Course Resources
* Course Website: [https://www.phontron.com/class/anlp-fall2024/](https://www.phontron.com/class/anlp-fall2024/)
* Course Videos: Lecture recordings are available on Canvas (CMU login required)
* Course Texts: Selected classical and cutting-edge research papers + chapters from *A Primer on Neural Network Models for Natural Language Processing* by Yoav Goldberg
* Course Assignments: [https://www.phontron.com/class/anlp-fall2024/assignments/](https://www.phontron.com/class/anlp-fall2024/assignments/)


@ -0,0 +1,28 @@
# CMU 11-711: Advanced Natural Language Processing (ANLP)
## Course Overview
* University: Carnegie Mellon University
* Prerequisites: No strict prerequisites, but Python programming experience and a foundation in probability and linear algebra are needed; prior experience with neural networks is a plus.
* Programming Language: Python
* Course Difficulty: 🌟🌟🌟🌟
* Estimated Workload: 100 hours
This graduate-level course takes NLP from fundamentals to the frontier, covering word representations, sequence modeling, attention mechanisms, and Transformer architectures, up through large-scale language model pretraining, instruction tuning and complex reasoning, multimodality, and safety. Compared with similar courses, this one:
1. **Is comprehensive and tracks the latest research**: Beyond classical algorithms, it digs into recent large-model methods (e.g., LLaMA, GPT-4).
2. **Is highly practical**: Every lecture comes with code demos and online quizzes, and the end-of-term project requires reproducing and improving a recent paper.
3. **Is highly interactive**: Piazza discussions, Canvas quizzes, and live Q&A make for an immersive, well-paced learning experience.
Self-study tips:
* Read the recommended papers before class and follow the reading sequence step by step.
* Set up a Python environment and get familiar with PyTorch and Hugging Face, since most of the hands-on code examples are based on them.
* Complete the course assignments diligently.
## Course Resources
* Course Website: [https://www.phontron.com/class/anlp-fall2024/](https://www.phontron.com/class/anlp-fall2024/)
* Course Videos: Lectures are recorded and uploaded to Canvas (CMU login required)
* Course Texts: Selected classical and cutting-edge papers + chapters from Yoav Goldberg's *A Primer on Neural Network Models for Natural Language Processing*
* Course Assignments: [https://www.phontron.com/class/anlp-fall2024/assignments/](https://www.phontron.com/class/anlp-fall2024/assignments/)


@ -0,0 +1,40 @@
# CMU 11-868: Large Language Model Systems
## Course Overview
- University: Carnegie Mellon University
- Prerequisites: Strongly recommended to have taken Deep Learning (11-785) or Advanced NLP (11-611 or 11-711)
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Workload: 120 hours
This graduate-level course focuses on the full stack of large language model (LLM) systems — from algorithms to engineering. The curriculum covers, but is not limited to:
1. **GPU Programming and Automatic Differentiation**: Master CUDA kernel calls, fundamentals of parallel programming, and deep learning framework design.
2. **Model Training and Distributed Systems**: Learn efficient training algorithms, communication optimizations (e.g., ZeRO, FlashAttention), and distributed training frameworks like DDP, GPipe, and Megatron-LM.
3. **Model Compression and Acceleration**: Study quantization (GPTQ), sparsity (MoE), compiler technologies (JAX, Triton), and inference-time serving systems (vLLM, CacheGen).
4. **Cutting-Edge Topics and Systems Practice**: Includes retrieval-augmented generation (RAG), multimodal LLMs, RLHF systems, and end-to-end deployment, monitoring, and maintenance.
Compared to similar courses, this one stands out for its **tight integration with recent papers and open-source implementations** (hands-on work expanding CUDA support in the miniTorch framework), a **project-driven assignment structure** (five programming assignments + a final project), and **guest lectures from industry experts**, offering students real-world insights into LLM engineering challenges and solutions.
**Self-Study Tips**:
- Set up a CUDA-compatible environment in advance (NVIDIA GPU + CUDA Toolkit + PyTorch).
- Review fundamentals of parallel computing and deep learning (autograd, tensor operations).
- Carefully read the assigned papers and slides before each lecture, and follow the assignments to extend the miniTorch framework from pure Python to real CUDA kernels.
This course assumes a solid understanding of deep learning and is **not suitable for complete beginners**. See the [FAQ](https://llmsystem.github.io/llmsystem2024spring/docs/FAQ) for more on prerequisites.
The assignments are fairly challenging and include:
1. **Assignment 1**: Implement an autograd framework + custom CUDA ops + basic neural networks
2. **Assignment 2**: Build a GPT-2 model from scratch
3. **Assignment 3**: Accelerate training with custom CUDA kernels for Softmax and LayerNorm
4. **Assignment 4**: Implement distributed model training (difficult to configure independently for self-study)
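For a feel of what Assignment 1's autograd framework involves, here is a toy reverse-mode sketch. The `Value` class below is hypothetical, written for illustration; the actual miniTorch API is different and tensor-based:

```python
class Value:
    """A scalar that records the ops applied to it, so gradients can be
    back-propagated through the recorded graph (reverse-mode autodiff)."""
    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents    # upstream Values
        self._grad_fns = grad_fns  # local derivative w.r.t. each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (lambda g, o=other: g * o.data,
                      lambda g, s=self: g * s.data))

    def backward(self):
        # Visit nodes in topological order so each node's gradient is
        # complete before it is propagated to its parents.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    build(p)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            for p, fn in zip(v._parents, v._grad_fns):
                p.grad += fn(v.grad)

x = Value(3.0)
y = Value(4.0)
z = x * y + x       # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # 5.0 3.0
```

The assignment then swaps the pure-Python arithmetic for tensor operations and, later in the course, for hand-written CUDA kernels.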
## Course Resources
- Course Website: <https://llmsystem.github.io/llmsystem2025spring/>
- Syllabus: <https://llmsystem.github.io/llmsystem2025spring/docs/Syllabus/>
- Assignments: <https://llmsystem.github.io/llmsystem2025springhw/>
- Course Texts: Selected research papers + selected chapters from *Programming Massively Parallel Processors (4th Edition)*


@ -0,0 +1,39 @@
# CMU 11-868: Large Language Model Systems
## Course Overview
- University: Carnegie Mellon University
- Prerequisites: Strongly recommended to have taken Deep Learning (11-785) or Advanced NLP (11-611 or 11-711)
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Workload: 120 hours
This graduate-level course covers the entire "algorithms to engineering" process of building large language model systems. Topics include, but are not limited to:
1. **GPU Programming and Automatic Differentiation**: CUDA kernel invocation, the fundamentals of parallel programming, and deep learning framework design.
2. **Model Training and Distributed Systems**: Efficient training algorithms, communication optimizations (ZeRO, FlashAttention), and distributed training frameworks (DDP, GPipe, Megatron-LM).
3. **Model Compression and Acceleration**: Quantization (GPTQ), sparsification (MoE), compiler technologies (JAX, Triton), and inference-time serving systems (vLLM, CacheGen).
4. **Cutting-Edge Topics and Systems Practice**: Retrieval-augmented generation (RAG), multimodal LLMs, RLHF systems, and end-to-end online maintenance and monitoring.
Compared with similar courses, this one stands out for its **tight integration with recent papers and open-source implementations** (hands-on extension of CUDA support in the miniTorch framework), its **project-driven** assignment structure (five programming assignments + a final project), and its **industry guest lectures**, which give students a close look at real-world LLM engineering challenges and solutions.
**Self-Study Tips**:
- Set up a CUDA-capable development environment in advance (NVIDIA GPU + CUDA Toolkit + PyTorch).
- Review the fundamentals of parallel computing and deep learning (automatic differentiation, tensor operations).
- Read the papers and slides assigned before each lecture, and follow the assignments to extend the miniTorch framework from pure Python to real CUDA kernels.
This course assumes some background in deep learning and is not suitable for complete beginners; see the prerequisites in the [FAQ](https://llmsystem.github.io/llmsystem2024spring/docs/FAQ).
The assignments are fairly challenging overall:
1. **Assignment 1**: Autograd framework + hand-written CUDA ops + basic neural network construction
2. **Assignment 2**: Building a GPT-2 model
3. **Assignment 3**: Speeding up training with hand-written CUDA Softmax and LayerNorm kernels
4. **Assignment 4**: Distributed model training (the environment may be hard to set up on your own)
## Course Resources
- Course Website: <https://llmsystem.github.io/llmsystem2025spring/>
- Syllabus: <https://llmsystem.github.io/llmsystem2025spring/docs/Syllabus/>
- Assignments: <https://llmsystem.github.io/llmsystem2025springhw/>
- Course Texts: Selected papers + selected chapters from *Programming Massively Parallel Processors (4th Edition)*


@ -114,6 +114,7 @@ plugins:
"国立台湾大学: 李宏毅机器学习": NTU Machine Learning
深度生成模型: Deep Generative Models
学习路线图: Roadmap
"大语言模型": Large Language Models
机器学习进阶: Advanced Machine Learning
学习路线图: Roadmap
后记: Postscript
@ -282,6 +283,10 @@ nav:
- "UCB CS285: Deep Reinforcement Learning": "深度学习/CS285.md"
- 深度生成模型:
- "学习路线图": "深度生成模型/roadmap.md"
- "大语言模型":
- "CMU 11-868: Large Language Model Systems": "深度生成模型/大语言模型/CMU11-868.md"
- "CMU 11-667: Large Language Models: Methods and Applications": "深度生成模型/大语言模型/CMU11-667.md"
- "CMU 11-711: Advanced Natural Language Processing": "深度生成模型/大语言模型/CMU11-711.md"
- 机器学习进阶:
- "学习路线图": "机器学习进阶/roadmap.md"
- "CMU 10-708: Probabilistic Graphical Models": "机器学习进阶/CMU10-708.md"