[COURSE] Add LLM-related courses (#746)
* add CMU11868 * add cmu11-667 * add cmu11711 * update cmu11-868 * update cmu-11667 * nits
This commit is contained in:
parent
2b4ba63b09
commit
a74ddd98d3
@ -0,0 +1,31 @@
# CMU11-667: Large Language Models: Methods and Applications

## Course Overview

- University: Carnegie Mellon University
- Prerequisites: Solid background in machine learning (equivalent to CMU 10-301/10-601) and natural language processing (equivalent to 11-411/11-611); proficiency in Python and familiarity with PyTorch or similar deep learning frameworks.
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Study Hours: 100+ hours

This graduate-level course provides a comprehensive overview of methods and applications of Large Language Models (LLMs), covering a wide range of topics from core architectures to cutting-edge techniques. Course content includes:

1. **Foundations**: Neural network architectures for language modeling, training procedures, inference, and evaluation metrics.
2. **Advanced Topics**: Model interpretability, alignment methods, emergent capabilities, and applications in both textual and non-textual domains.
3. **System & Optimization Techniques**: Large-scale pretraining strategies, deployment optimization, and efficient training/inference methods.
4. **Ethics & Safety**: Addressing model bias, adversarial attacks, and legal/regulatory concerns.
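Among the evaluation metrics covered in the Foundations unit, perplexity is the standard one for language models; as a warm-up it can be computed from per-token log-probabilities in a few lines (a toy sketch, not course-provided code):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# A model that assigns probability 0.25 to each of 4 tokens is exactly
# as uncertain as a uniform choice among 4 options:
logprobs = [math.log(0.25)] * 4
print(perplexity(logprobs))
```

Lower is better: a perplexity of k means the model is, on average, as uncertain as choosing uniformly among k tokens.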
The course blends lectures, readings, quizzes, interactive exercises, assignments, and a final project to offer students a deep and practical understanding of LLMs, preparing them for both research and real-world system development.

**Self-Study Tips**:

- Thoroughly read all assigned papers and materials before each class.
- Become proficient with PyTorch and implement core models and algorithms by hand.
- Complete the assignments diligently to build practical skills and reinforce theoretical understanding.

## Course Resources

- Course Website: <https://cmu-llms.org/>
- Course Videos: Selected lecture slides and materials are available on the website; full lecture recordings may require CMU internal access.
- Course Materials: Curated research papers and supplementary materials, with the full reading list available on the course site.
- Assignments: Six programming assignments covering data preparation, Transformer implementation, retrieval-augmented generation, model evaluation and debiasing, and training efficiency. Details at <https://cmu-llms.org/assignments/>
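The retrieval-augmented generation assignment pairs a retriever with a generator; the overall data flow can be sketched with a toy word-overlap retriever (a stand-in for the dense retrievers used in practice — all names and documents here are illustrative, not from the assignment):

```python
def retrieve(query, docs, k=1):
    """Rank documents by word overlap with the query -- a toy stand-in
    for a learned dense retriever."""
    q_words = set(query.lower().split())
    return sorted(docs,
                  key=lambda d: len(q_words & set(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(query, retrieved):
    """Prepend the retrieved passages to the question, RAG-style."""
    context = "\n".join(retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "The Transformer architecture relies on self-attention.",
    "Perplexity measures how well a language model predicts text.",
]
query = "what is self-attention in the Transformer"
prompt = build_prompt(query, retrieve(query, docs))
print(prompt)  # the prompt an LLM would then complete
```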
@ -0,0 +1,31 @@
# CMU11-667: Large Language Models: Methods and Applications

## Course Overview

- University: Carnegie Mellon University
- Prerequisites: A foundation in machine learning (equivalent to CMU 10-301/10-601) and natural language processing (equivalent to 11-411/11-611); proficiency in Python and familiarity with deep learning frameworks such as PyTorch.
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Study Hours: 100+ hours

This graduate course offers a comprehensive introduction to the methods and applications of large language models (LLMs), covering a broad range of topics from foundational architectures to cutting-edge techniques. Course content includes:

1. **Foundations**: Network architectures for language modeling, training, inference, and evaluation methods.
2. **Advanced Topics**: Model interpretability, alignment methods, emergent capabilities, and applications to both language and non-text tasks.
3. **Scaling Techniques**: Large-scale pretraining, model deployment optimization, and efficient training and inference methods.
4. **Ethics & Safety**: Model bias, attack methods, legal issues, and more.

The course combines lectures, readings, quizzes, interactive activities, assignments, and a project, aiming to give students a deep understanding of LLMs and a solid foundation for further research or applications.

**Self-Study Tips**:

- Carefully read the papers and materials assigned before each class.
- Get familiar with deep learning frameworks such as PyTorch, and implement models and algorithms by hand.
- Complete the course assignments diligently.

## Course Resources

- Course Website: <https://cmu-llms.org/>
- Course Videos: Selected lecture slides and materials are available on the course website; full videos may require access via CMU internal platforms.
- Course Materials: Curated papers and resources; see the course site for the full reading list.
- Assignments: Six assignments covering pretraining data preparation, Transformer implementation, retrieval-augmented generation, model comparison and bias mitigation, training efficiency, and more. Details at <https://cmu-llms.org/assignments/>
@ -0,0 +1,27 @@
# CMU 11-711: Advanced Natural Language Processing (ANLP)

## Course Overview

* University: Carnegie Mellon University
* Prerequisites: No strict prerequisites, but students should have experience with Python programming, as well as a background in probability and linear algebra. Prior experience with neural networks is recommended.
* Programming Language: Python
* Course Difficulty: 🌟🌟🌟🌟
* Estimated Workload: 100 hours

This is a graduate-level course covering both foundational and advanced topics in Natural Language Processing (NLP). The syllabus spans word representations, sequence modeling, attention mechanisms, Transformer architectures, and cutting-edge topics such as large language model pretraining, instruction tuning, complex reasoning, multimodality, and model safety. Compared to similar courses, this course stands out for the following reasons:

1. **Comprehensive and research-driven content**: In addition to classical NLP methods, it offers in-depth discussions of recent trends and state-of-the-art techniques such as LLaMA and GPT-4.
2. **Strong practical component**: Each lecture includes code demonstrations and online quizzes, and the final project requires reproducing and improving upon a recent research paper.
3. **Highly interactive**: Active engagement is encouraged through Piazza discussions, Canvas quizzes, and in-class Q&A, resulting in an immersive and well-paced learning experience.
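The attention mechanisms at the heart of the syllabus reduce to a short computation; here is a dependency-free sketch of scaled dot-product attention over plain Python lists (toy code for orientation, not course material):

```python
import math

def softmax(xs):
    m = max(xs)                              # subtract max for stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention, softmax(Q K^T / sqrt(d)) V,
    where Q, K, V are lists of row vectors."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)            # one weight per key
        out.append([sum(w * row[j] for w, row in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# One query attending over two key/value pairs:
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
print(attention(Q, K, V))  # a convex combination of the two value rows
```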
Self-study tips:

* Read the recommended papers before class and follow the reading sequence step-by-step.
* Set up a Python environment and become familiar with PyTorch and Hugging Face, as many hands-on examples are based on these frameworks.

## Course Resources

* Course Website: [https://www.phontron.com/class/anlp-fall2024/](https://www.phontron.com/class/anlp-fall2024/)
* Course Videos: Lecture recordings are available on Canvas (CMU login required)
* Course Texts: Selected classical and cutting-edge research papers + chapters from *A Primer on Neural Network Models for Natural Language Processing* by Yoav Goldberg
* Course Assignments: [https://www.phontron.com/class/anlp-fall2024/assignments/](https://www.phontron.com/class/anlp-fall2024/assignments/)
@ -0,0 +1,28 @@
# CMU 11-711: Advanced Natural Language Processing (ANLP)

## Course Overview

* University: Carnegie Mellon University
* Prerequisites: No hard prerequisites, but students need Python programming experience plus a foundation in probability and linear algebra; prior experience with neural networks is a plus.
* Programming Language: Python
* Course Difficulty: 🌟🌟🌟🌟
* Estimated Workload: 100 hours

This graduate-level course covers NLP from the fundamentals through advanced material: word representations and sequence modeling, attention mechanisms and the Transformer architecture, and frontier topics such as large-scale language model pretraining, instruction tuning, complex reasoning, multimodality, and safety. Compared with similar courses, this one:

1. **Is comprehensive and tracks current research**: Beyond classical algorithms, it takes a deep look at recent large-model methods (such as LLaMA and GPT-4).
2. **Is highly practical**: Every lecture comes with code demos and online quizzes, and the end-of-term project requires reproducing and improving a recent paper.
3. **Is highly interactive**: Piazza discussions, Canvas quizzes, and in-class Q&A make for an immersive, well-paced learning experience.

Self-study tips:

* Read the recommended papers before each class and follow the reading sequence step by step.
* Set up a Python environment and get familiar with PyTorch/Hugging Face, since most hands-on code examples build on them.
* Complete the course assignments diligently.

## Course Resources

* Course Website: [https://www.phontron.com/class/anlp-fall2024/](https://www.phontron.com/class/anlp-fall2024/)
* Course Videos: Lecture recordings are uploaded to Canvas (CMU login required)
* Course Texts: Selected classical and cutting-edge papers + chapters from Yoav Goldberg's *A Primer on Neural Network Models for Natural Language Processing*
* Course Assignments: [https://www.phontron.com/class/anlp-fall2024/assignments/](https://www.phontron.com/class/anlp-fall2024/assignments/)
@ -0,0 +1,40 @@
# CMU 11-868: Large Language Model Systems

## Course Overview

- University: Carnegie Mellon University
- Prerequisites: Strongly recommended to have taken Deep Learning (11-785) or Advanced NLP (11-611 or 11-711)
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Workload: 120 hours

This graduate-level course focuses on the full stack of large language model (LLM) systems — from algorithms to engineering. The curriculum covers, but is not limited to:

1. **GPU Programming and Automatic Differentiation**: Master CUDA kernel calls, fundamentals of parallel programming, and deep learning framework design.
2. **Model Training and Distributed Systems**: Learn efficient training algorithms, communication optimizations (e.g., ZeRO, FlashAttention), and distributed training frameworks like DDP, GPipe, and Megatron-LM.
3. **Model Compression and Acceleration**: Study quantization (GPTQ), sparsity (MoE), compiler technologies (JAX, Triton), and inference-time serving systems (vLLM, CacheGen).
4. **Cutting-Edge Topics and Systems Practice**: Includes retrieval-augmented generation (RAG), multimodal LLMs, RLHF systems, and end-to-end deployment, monitoring, and maintenance.
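To give a flavor of the compression unit, round-to-nearest symmetric quantization — the naive baseline that methods like GPTQ improve upon — fits in a few lines (a toy sketch on Python floats, not the course's actual implementation):

```python
def quantize(weights, bits=8):
    """Symmetric round-to-nearest quantization: map floats onto a
    signed integer grid scaled to the largest magnitude."""
    qmax = 2 ** (bits - 1) - 1                 # 127 for 8-bit
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Map the integers back to floats (lossy)."""
    return [qi * scale for qi in q]

w = [0.12, -0.5, 0.33, 0.0]
q, s = quantize(w)
w_hat = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q, err)  # integers in [-127, 127]; error bounded by scale / 2
```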
Compared to similar courses, this one stands out for its **tight integration with recent papers and open-source implementations** (hands-on work expanding CUDA support in the miniTorch framework), a **project-driven assignment structure** (five programming assignments + a final project), and **guest lectures from industry experts**, offering students real-world insights into LLM engineering challenges and solutions.

**Self-Study Tips**:

- Set up a CUDA-compatible environment in advance (NVIDIA GPU + CUDA Toolkit + PyTorch).
- Review fundamentals of parallel computing and deep learning (autograd, tensor operations).
- Carefully read the assigned papers and slides before each lecture, and follow the assignments to extend the miniTorch framework from pure Python to real CUDA kernels.

This course assumes a solid understanding of deep learning and is **not suitable for complete beginners**. See the [FAQ](https://llmsystem.github.io/llmsystem2024spring/docs/FAQ) for more on prerequisites.

The assignments are fairly challenging and include:

1. **Assignment 1**: Implement an autograd framework + custom CUDA ops + basic neural networks
2. **Assignment 2**: Build a GPT2 model from scratch
3. **Assignment 3**: Accelerate training with custom CUDA kernels for Softmax and LayerNorm
4. **Assignment 4**: Implement distributed model training (difficult to configure independently for self-study)
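The autograd framework built in Assignment 1 rests on reverse-mode differentiation over a computation graph; a minimal scalar version looks like this (a toy sketch of the idea — not the actual miniTorch API):

```python
class Value:
    """A scalar with reverse-mode autodiff over the graph of +, *."""
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents        # (parent Value, local gradient) pairs

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data,
                     ((self, other.data), (other, self.data)))

    def backward(self):
        # Propagate each path's gradient contribution from the output
        # to the leaves; summing over paths gives the chain rule on a DAG.
        self.grad = 1.0
        stack = [(self, 1.0)]
        while stack:
            node, g = stack.pop()
            for parent, local in node._parents:
                parent.grad += local * g
                stack.append((parent, local * g))

x = Value(3.0)
y = Value(4.0)
z = x * y + x              # z = x*y + x
z.backward()
print(x.grad, y.grad)      # dz/dx = y + 1 = 5.0, dz/dy = x = 3.0
```

Assignment 1 then extends this idea to tensors and backs the operators with CUDA kernels.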
## Course Resources

- Course Website: <https://llmsystem.github.io/llmsystem2025spring/>
- Syllabus: <https://llmsystem.github.io/llmsystem2025spring/docs/Syllabus/>
- Assignments: <https://llmsystem.github.io/llmsystem2025springhw/>
- Course Texts: Selected research papers + selected chapters from *Programming Massively Parallel Processors (4th Edition)*
@ -0,0 +1,39 @@
# CMU 11-868: Large Language Model Systems

## Course Overview

- University: Carnegie Mellon University
- Prerequisites: Strongly recommended to have taken Deep Learning (11-785) or Advanced NLP (11-611 or 11-711)
- Programming Language: Python
- Course Difficulty: 🌟🌟🌟🌟
- Estimated Workload: 120 hours

This graduate course focuses on the full "algorithms to engineering" process of building large language model systems. The curriculum covers, but is not limited to:

1. **GPU Programming and Automatic Differentiation**: CUDA kernel calls, parallel programming fundamentals, and deep learning framework design.
2. **Model Training and Distributed Systems**: Efficient training algorithms, communication optimizations (ZeRO, FlashAttention), and distributed training frameworks (DDP, GPipe, Megatron-LM).
3. **Model Compression and Acceleration**: Quantization (GPTQ), sparsification (MoE), compiler technologies (JAX, Triton), and serving systems for inference (vLLM, CacheGen).
4. **Cutting-Edge Topics and Systems Practice**: Retrieval-augmented generation (RAG), multimodal LLMs, RLHF systems, and end-to-end online maintenance and monitoring.

Compared with similar courses, this one stands out for its **tight integration of recent papers with open-source implementations** (hands-on extension of CUDA support in the miniTorch framework), a **project-driven** assignment structure (five programming assignments + a final project), and **industry guest lectures** that give students a close look at the challenges and solutions of real-world LLM engineering.

**Self-Study Tips**:

- Set up a CUDA-capable development environment in advance (NVIDIA GPU + CUDA Toolkit + PyTorch).
- Review the fundamentals of parallel computing and deep learning (automatic differentiation, tensor operations).
- Read the papers and slides assigned before each class, and follow the assignments to extend the miniTorch framework from pure Python to real CUDA kernels.

This course assumes some background in deep learning and is not suitable for complete beginners; see the prerequisites in the [FAQ](https://llmsystem.github.io/llmsystem2024spring/docs/FAQ).

The assignments are challenging overall:

1. Assignment 1: autograd framework + hand-written CUDA operators + building basic neural networks
2. Assignment 2: building a GPT2 model
3. Assignment 3: speeding up training with hand-written CUDA Softmax and LayerNorm operators
4. Assignment 4: distributed model training (the environment may be hard to set up on your own)

## Course Resources

- Course Website: <https://llmsystem.github.io/llmsystem2025spring/>
- Syllabus: <https://llmsystem.github.io/llmsystem2025spring/docs/Syllabus/>
- Assignments: <https://llmsystem.github.io/llmsystem2025springhw/>
- Course Texts: Selected papers + selected chapters from *Programming Massively Parallel Processors (4th Edition)*
@ -114,6 +114,7 @@ plugins:
"国立台湾大学: 李宏毅机器学习": NTU Machine Learning
深度生成模型: Deep Generative Models
学习路线图: Roadmap
"大语言模型": Large Language Models
机器学习进阶: Advanced Machine Learning
学习路线图: Roadmap
后记: Postscript
@ -282,6 +283,10 @@ nav:
- "UCB CS285: Deep Reinforcement Learning": "深度学习/CS285.md"
- 深度生成模型:
  - "学习路线图": "深度生成模型/roadmap.md"
  - "大语言模型":
    - "CMU 11-868: Large Language Model Systems": "深度生成模型/大语言模型/CMU11-868.md"
    - "CMU 11-667: Large Language Models: Methods and Applications": "深度生成模型/大语言模型/CMU11-667.md"
    - "CMU 11-711: Advanced Natural Language Processing": "深度生成模型/大语言模型/CMU11-711.md"
- 机器学习进阶:
  - "学习路线图": "机器学习进阶/roadmap.md"
  - "CMU 10-708: Probabilistic Graphical Models": "机器学习进阶/CMU10-708.md"