diff --git a/CS学习规划/index.html b/CS学习规划/index.html
index 59a46834..9e209f88 100644
--- a/CS学习规划/index.html
+++ b/CS学习规划/index.html
@@ -1,4 +1,4 @@
- 一个仅供参考的CS学习规划 - CS自学指南
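A Reference Guide for CS Learning

Computer science is a vast field with a seemingly endless body of knowledge, and any specialized area can be studied without limit if you pursue it deeply enough. A clear study plan is therefore very important. I took quite a few detours in my years of self-study and eventually distilled the content below for your reference.

Before you start, I strongly recommend that beginners watch the popular-science video series Crash Course: Computer Science. In just 8 hours it vividly and comprehensively covers many aspects of computer science: the history of computers, how computers operate, the important modules that make up a computer, the key ideas of computer science, and so on. As its slogan says, Computers are not magic! I hope that after watching it you will have a holistic picture of computer science and approach the far more detailed and in-depth material below with interest.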

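Essential Tools

As the saying goes, sharpening the axe does not delay the cutting of firewood. If you are a complete beginner who has only just started with computers, learning a few tools will let you get twice the result with half the effort.

Learn to ask questions: You might be surprised that asking questions counts as an essential computing skill, let alone that it comes first on this list. I think that in the open-source community, learning to ask questions is a very important ability, and it involves two things. First, it indirectly trains you to solve problems on your own, because the cycle of forming a question, describing and posting it, waiting for others to answer, and finally understanding the answer is very long. If you expect someone to walk you through every trivial issue over remote desktop, the world of computing is probably not for you. Second, if you really cannot solve a problem after trying, you can turn to the open-source community for help, and at that point how concisely you can convey your situation and goal becomes especially important. I recommend reading How To Ask Questions The Smart Way, which not only raises the probability and efficiency of getting your problem solved but also keeps the people who answer for free in a good mood.

MIT-Missing-Semester covers the vast majority of these tools and gives fairly detailed usage instructions, so I strongly recommend it to beginners. One thing to note is that the course occasionally mentions terms related to the development process, so it is best taken after you have finished at least an introductory computer science course.

Bypassing the firewall: For well-known reasons, sites such as Google and GitHub are not accessible in mainland China. Yet much of the time Google and StackOverflow can solve 99% of the problems you run into during development, so learning to get past the firewall is almost an essential skill for a mainland CSer. (For legal reasons, the method provided in this document is only available to users with a Peking University email address.)

Command line: Proficiency with the command line is a skill that is often overlooked or considered hard to master, but in practice it greatly increases your flexibility and productivity as an engineer. The Art of Command Line is a classic tutorial that began as a question on Quora and, through the contributions of many experts, has become a top GitHub project with 100,000 stars, translated into more than ten languages. The tutorial is not long, and I strongly recommend reading it repeatedly and internalizing it through practice. Shell scripting is also a skill that should not be neglected; you can refer to this tutorial.

IDE (Integrated Development Environment): Simply put, this is where you write your code. The importance of an IDE to a programmer goes without saying, but many IDEs are designed for large engineering projects, so they are bulky and have more features than you need. Nowadays some lightweight text editors with rich plugin ecosystems can basically cover everyday lightweight programming. The editors I use most are VS Code and Sublime (plugin configuration is very simple in the former, and slightly more complicated in the latter, which looks great). For large projects I still use somewhat heavier IDEs, such as Pycharm (Python) and IDEA (Java). (Disclaimer: every IDE is the best IDE in the world.)

Vim: A command-line editor. Its learning curve is somewhat steep, but I think learning it is well worth it because it will greatly improve your development efficiency. Most IDEs today also support Vim plugins, so you can enjoy a modern development environment while keeping the coolness of a geek (ugh).

Emacs: A classic editor as famous as Vim. It offers equally high development efficiency and even stronger extensibility: it can be configured as a lightweight editor, extended into a personally customized IDE, or pushed into even more exotic tricks.

Git: A version control tool for code. Git's learning curve may be even steeper, but Git, created by Linus, the father of Linux, is definitely one of the tools every CS student must master.

GitHub: A code hosting platform based on Git. It is the world's largest open-source community and a gathering place for experts.

GNU Make: A build tool. Using GNU Make well will get you into the habit of modularizing your code and familiarize you with the compilation and linking flow of large projects.

CMake: A build tool more powerful than GNU Make. I suggest learning it after you have mastered GNU Make.

LaTeX: A paper typesetting tool that also makes your writing look more impressive.

Docker: A software packaging and environment deployment tool that is lighter-weight than a virtual machine.

Practical toolkit: Besides the tools above, which are used very frequently in development, I have also collected many useful and interesting free tools, such as download tools, design tools, and learning websites.

Thesis: A tutorial on writing a graduation thesis in Word.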

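Book Recommendations

In my view, a good textbook should be centered on the reader rather than being a show-off pile of theory. Telling readers "what it is" certainly matters, but it is better still when the author folds decades of experience in the field into the book and patiently explains "why" and "what to do" going forward.

Link here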

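Environment Setup

What you think development is: coding frantically in an IDE for hours.

What development actually is: spending days setting up the environment without writing a line of code.

PC Environment Setup

If you are a Mac user, you are in luck: this guide will walk you through setting up an entire development environment. If you are a Windows user, thanks to the efforts of the open-source community you can get an experience similar to other platforms with Scoop.

You can also refer to an environment setup guide inspired by 6.NULL MIT-Missing-Semester, which focuses on customizing the look of the terminal. It also covers accelerating and replacing common software sources (such as GitHub, Anaconda, and PyPI) as well as configuration and activation tutorials for some IDEs.

Server-Side Environment Setup

Server-side operations require basic use of Linux (or another Unix-like system) and basic system concepts such as processes, devices, and networking. Beginners can refer to the Linux 101 online notes written by the Linux User Association of the University of Science and Technology of China. If you want to study system operations in more depth, you can refer to the course Aspects of System Administration.

In addition, if you need to learn a specific concept or tool, I recommend a very good GitHub project, DevOps-Guide, which covers a great deal of basic operations knowledge and tutorials, for example Docker, Kubernetes, Linux, CI-CD, and GitHub Actions.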

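Course Map

As mentioned at the beginning of this chapter, this course map is only a reference for course planning. As an undergraduate nearing graduation, I am keenly aware that I have neither the right nor the ability to preach to others about "how one should learn". So if you find the course categories and choices below unreasonable, I fully accept that and sincerely apologize. You can build your own course map in the next section, Customize Your Own Course Map.

Apart from the categories whose names contain the words basic or introductory, the course categories below have no definite order. As long as you meet a course's prerequisites, you are free to choose whatever courses you want according to your needs and interests.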

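Mathematical Foundations

Calculus and Linear Algebra

For a freshman, learning calculus and linear algebra well is at least as important as writing code. Countless people before me have surely said this, but I will say it once more: learning calculus and linear algebra well really matters! You may complain that you forget these things as soon as the exam is over; in that case I think you have not grasped their essence, and your understanding has not yet sunk in deeply. If you find your instructor's lectures obscure, try the course notes of MIT's Calculus Course and 18.06: Linear Algebra. For me at least, they led to a deep understanding of much of the essence of calculus and linear algebra. While I am at it, let me also recommend the YouTube math channel 3Blue1Brown, which has many videos that explain the core of mathematics through vivid animations, with both depth and breadth and very high quality.

Introduction to Information Theory

For a computer science student, I think getting to know some basics of information theory early is very beneficial. But most information theory courses target senior undergraduates or even graduate students and are very unfriendly to newcomers. MIT's 6.050J: Information theory and Entropy is tailored to freshmen, has almost no prerequisites, and covers coding, compression, communication, information entropy, and more. It is a lot of fun.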
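To make "information entropy" a little more concrete, here is a minimal sketch that computes the Shannon entropy of a string's symbol distribution, H = sum over symbols of p * log2(1/p), in bits per symbol. The function name `shannon_entropy` and the sample strings are my own illustration and are not taken from the course materials.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Return the Shannon entropy of the symbols in `message`, in bits per symbol."""
    if not message:
        return 0.0
    counts = Counter(message)
    total = len(message)
    # H = sum over symbols of p * log2(1 / p), where p is the symbol's relative frequency.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    # A message with a single repeated symbol carries no information per symbol;
    # a message spread evenly over more symbols carries more.
    for text in ("aaaa", "abab", "abcd"):
        print(text, shannon_entropy(text))
```

Intuitively, the entropy is a lower bound on the average number of bits per symbol that a lossless code for this symbol distribution needs, which is why it shows up in both the compression and the communication parts of such a course.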

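Intermediate Mathematics

Discrete Mathematics and Probability Theory

Set theory, graph theory, probability theory, and related topics are important tools for deriving and proving algorithms, and they are also the foundation of later advanced mathematics courses. But I feel that courses of this kind easily fall into the rut of excessive theory and formalism, turning the classroom into a pile of theorems and conclusions without helping students grasp the essence of the theory, which leads to the vicious circle of memorizing for the exam and forgetting right after it. If examples of algorithmic applications are woven into the theory, students can broaden their knowledge of algorithms and at the same time glimpse the power and charm of the theory.

UCB CS70 : discrete Math and probability theory and UCB CS126 : Probability theory are UC Berkeley's probability courses. The former covers the basics of discrete mathematics and probability theory, while the latter covers stochastic processes and deeper theoretical content. Both place great emphasis on combining theory with practice and offer plenty of real algorithmic applications, and the latter also has a large number of Python programming assignments in which students apply probability theory to practical problems.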

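Numerical Analysis

For a computer science student, developing computational thinking is very important: modeling and discretizing real problems, then simulating and analyzing them on a computer, is a very valuable ability. Julia, a programming language created at MIT that has become popular over the last couple of years, combines C-like speed with Python-like friendly syntax and seems poised to dominate numerical computing. Many MIT mathematics courses have also started using Julia as a teaching tool, presenting difficult mathematical theory through intuitive, clear code.

ComputationalThinking is an introductory computational thinking course offered by MIT. All course content is open source and can be accessed directly on the course website. Using the Julia programming language, the course leads students through algorithms, mathematical modeling, data analysis, interaction design, and data visualization under three topics: image processing, social science and data science, and climate modeling, so that students can experience the beautiful combination of computation and science. The content is not hard, but my deepest impression is that the charm of science lies not in deliberately mystifying theory or in tongue-twisting jargon, but in using intuitive, vivid examples and concise, profound language so that every ordinary person can understand.

If you still want more after this warm-up course, try MIT's 18.330 : Introduction to numerical analysis. Its programming assignments also use Julia, but the difficulty and depth go up a level. The content covers floating-point representation, root finding, linear systems, differential equations, and more, and the main theme of the whole course is to use the discretized representation inside a computer to estimate and approximate concepts that are mathematically continuous. The professor of the course has also written an accompanying open-source textbook, Fundamentals of Numerical Computation, which includes many Julia code examples and rigorous derivations.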
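As one concrete instance of the root-finding topic mentioned above, here is a minimal sketch of Newton's method, which repeatedly replaces a guess x with x - f(x) / f'(x). The course itself uses Julia; I use Python here only for illustration, and the names `newton`, `tol`, and `max_iter` are my own assumptions rather than anything defined by the course or its textbook.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Approximate a root of f by Newton's method, starting from x0.

    f is the function, df its derivative. Iteration stops once the update
    step is smaller than tol, or fails after max_iter iterations.
    """
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)  # Newton update: x_new = x - f(x) / f'(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")


if __name__ == "__main__":
    # Solve x^2 - 2 = 0, so the root approximates the square root of 2.
    root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
    print(root, root * root)
```

The result is only a floating-point approximation of an irrational number, which is exactly the kind of gap between discrete representation and continuous mathematics that the course is about.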

If that is still not enough for you, there is also MIT's graduate-level numerical analysis course 18.335: Introduction to numerical method for your reference.

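Differential Equations

How cool would it be if the motion and development of everything in the world could be characterized and described by equations! Although almost no university's CS curriculum has a required course on differential equations, I still think mastering them will give you a new perspective from which to look at the world.

Because differential equations often draw on a lot of complex analysis, you can refer to the course notes of MIT18.04: Complex variables functions to fill in the prerequisites.

MIT18.03: differential equations mainly covers solving ordinary differential equations, and building on it, MIT18.152: Partial differential equations goes deeper into modeling with and solving partial differential equations. With differential equations as a powerful tool, I believe both your ability to model real problems and your intuition for picking out the essentials from many noisy variables will benefit greatly.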

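Advanced Mathematics

As a computer science student, I often hear the claim that mathematics is useless. I do not agree, but I have no right to object either, and insisting on sorting everything into useful and useless is rather dull anyway. So the following mathematics courses, aimed at senior undergraduates and even graduate students, are here for you to pick from according to your interests.

Convex Optimization

Stanford EE364A: Convex Optimization

Information Theory

MIT6.441: Information Theory

Applied Statistics

MIT18.650: Statistics for Applications

Elementary Number Theory

MIT18.781: Theory of Numbers

Cryptography

Stanford CS255: Cryptography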

Introductory Programming

Languages are tools, you choose the right tool to do the right thing. Since there's no universally perfect tool, there's no universally perfect language.

General

Java

Python

C++

Rust

OCaml

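Electronics Basics

Circuit Basics

For a computer science student, knowing some basic circuit theory and experiencing the whole pipeline from collecting data with sensors, to analyzing the data, to making predictions with algorithms is quite helpful for later studies and for developing computational thinking. EE16A&B: Designing Information Devices and Systems I&II is the freshman introductory course for EE students at Berkeley. EE16A focuses on using circuits to collect and analyze data from the real environment, while EE16B focuses on analyzing the collected data and making predictions from it.

Signals and Systems

Signals and Systems is a course I think is well worth taking. I first studied it only to satisfy my curiosity about the Fourier transform, but after finishing it I could not help marveling that the Fourier transform gave me a completely new perspective on the world. Just like differential equations, it immerses you in the elegance and wonder of describing the world precisely with mathematics.

MIT 6.003: signal and systems provides all lecture recordings, written assignments, and solutions. You can also watch the ancient version of this course.

UCB EE120: Signal and Systems has very well written notes on the Fourier transform and provides 6 very interesting Python programming assignments in which you apply the theory and algorithms of signals and systems in practice.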
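For readers who have not met the Fourier transform yet, the following minimal sketch computes a discrete Fourier transform (DFT) directly from its definition, X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/n). It is a direct O(n^2) illustration rather than the fast algorithm (FFT) used in practice, and the function name `dft` and the test signal are my own assumptions, not taken from either course.

```python
import cmath
import math


def dft(samples):
    """Return the discrete Fourier transform of a sequence of real or complex samples."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    # Eight samples of a cosine that completes exactly one cycle over the window.
    signal = [math.cos(2 * math.pi * t / 8) for t in range(8)]
    for k, coefficient in enumerate(dft(signal)):
        # The magnitude is concentrated in bins 1 and 7 (the +1 and -1 cycle frequencies).
        print(k, round(abs(coefficient), 6))
```

The point of the example is the change of perspective the text describes: the same eight numbers can be read as values over time or as the strength of each frequency.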

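Data Structures and Algorithms

Algorithms are the core of computer science and the foundation of almost every specialized course. How to turn a real problem into an algorithmic problem through mathematical abstraction, and how to solve it with suitable data structures within limits on time and memory, is the eternal theme of algorithms courses. If you have had enough of instructors reading from the textbook, I strongly recommend Berkeley's UCB CS61B: Data Structures and Algorithms and Princeton's Coursera: Algorithms I & II. Both explain difficult material in an accessible way and have rich, interesting programming labs that tie theory to practice.

Both courses above are based on Java. If you want a C/C++ version, you can refer to Stanford's data structures and basic algorithms course Stanford CS106B/X: Programming Abstractions. Those who prefer Python can take MIT's introductory algorithms course MIT 6.006: Introduction to Algorithms.

Those interested in more advanced algorithms and NP problems can take Berkeley's algorithm design and analysis course UCB CS170: Efficient Algorithms and Intractable Problems or MIT's advanced algorithms course MIT 6.046: Design and Analysis of Algorithms.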

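Software Engineering

Introductory Course

Code that merely "runs" is fundamentally different from high-quality, industrial-grade code. I therefore strongly recommend that lower-year students take MIT 6.031: Software Construction. Based on Java, it uses rich, detailed reading material and carefully designed programming exercises to teach how to write high-quality code that is safe from bugs, easy to understand, and easy to maintain and change. From high-level data structure design down to how to write comments, following these details and lessons distilled by those who came before will greatly benefit the rest of your programming career.

Professional Course

Of course, if you want a systematic software engineering course, I recommend Berkeley's UCB CS169: software engineering. Be aware, though, that unlike the software engineering courses at most universities (quite possibly including yours), this course does not cover the traditional design and document model, which emphasizes class diagrams, flowcharts, and documentation design. Instead it adopts the Agile Development model of fast iteration in small teams, which has become popular in recent years, and the Software as a service model built on cloud platforms.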

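Computer Architecture

Introductory Course

Ever since I was a child I have heard that the world of computers is made of 0s and 1s. I did not understand it, but I was deeply impressed. If you carry the same curiosity, consider spending one or two months on Coursera: Nand2Tetris, a computer course with no prerequisites. Small as it is, the course is complete in every part: starting from 0s and 1s, you build a computer with your own hands and run a Tetris game on it. This single course covers compilation, virtual machines, assembly, architecture, digital circuits, logic gates, and more, from top to bottom and from software to hardware. Its difficulty has been carefully designed: it leaves out many of the complicated details of modern computers and extracts what is most essential, aiming to be understandable to everyone. Building a bird's-eye view of the entire computer system in your early years of study is very beneficial.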
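To give a taste of what "building a computer from 0s and 1s" means, here is a minimal sketch that derives NOT, AND, OR, and XOR from a single NAND primitive and then wires them into a one-bit full adder. The real course has you do this in its own hardware description language; the Python functions below (`nand`, `not_`, `and_`, `or_`, `xor`, `half_adder`, `full_adder`) are my own stand-ins for illustration only.

```python
def nand(a: int, b: int) -> int:
    """The only primitive gate: output is 0 only when both inputs are 1."""
    return 0 if (a == 1 and b == 1) else 1


# Every other gate below is built purely from nand().
def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))


def xor(a: int, b: int) -> int:
    return and_(or_(a, b), nand(a, b))


def half_adder(a: int, b: int):
    """Add two bits, returning (sum_bit, carry_bit)."""
    return xor(a, b), and_(a, b)


def full_adder(a: int, b: int, carry_in: int):
    """Add three bits, returning (sum_bit, carry_out)."""
    partial_sum, carry1 = half_adder(a, b)
    total_sum, carry2 = half_adder(partial_sum, carry_in)
    return total_sum, or_(carry1, carry2)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, "->", full_adder(a, b, c))
```

Chaining such full adders bit by bit gives a multi-bit adder, one of the building blocks of an arithmetic logic unit.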

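Professional Course

Of course, if you want to dig into the complex details of modern computer architecture, you still need a university-level undergraduate course such as UCB CS61C: Great Ideas in Computer Architecture. As the birthplace of the RISC-V architecture, UC Berkeley is among the very best in computer architecture. The course is heavily practice-oriented: in the Projects you will hand-write assembly to build a neural network and construct a CPU from scratch. These exercises will give you a much deeper understanding of computer architecture than dull recitation of "fetch, decode, execute, memory access, write back".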

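Introduction to Systems

Computer systems is a vast and profound subject. Before diving into a specific subfield, having a high-level conceptual understanding of each area and knowing some general design principles will let you keep reinforcing the most central, even philosophical, concepts during later in-depth study instead of getting trapped in complicated internal details and assorted tricks. In my view, the key to studying systems is to grasp these core ideas so that you can design and implement systems of your own.

MIT6.033: System Engineering is MIT's introductory systems course. Its topics include operating systems, networking, distributed systems, and system security. Besides teaching the material, the course also teaches writing and presentation skills, so that you learn how to design your own system and how to present and analyze it for others. The accompanying textbook for this course, Principles of Computer System Design: An Introduction, is also very well written and recommended reading.

CMU 15-213: Introduction to Computer System is CMU's introductory systems course. It covers architecture, operating systems, linking, parallelism, networking, and more, with both breadth and depth. The accompanying textbook, Computer Systems: A Programmer's Perspective, is also of extremely high quality and strongly recommended.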

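Operating Systems

Nothing deepens your understanding of operating systems more than writing a kernel yourself.

An operating system virtualizes all kinds of complicated underlying hardware into a clean and elegant set of abstractions and provides rich functionality to all application software. Understanding the design principles and internals of operating systems is very beneficial for any programmer who is not content with merely calling libraries. Out of love for operating systems, I have taken many operating systems courses from China and abroad. Each has its own emphasis, strengths, and weaknesses, and you can pick according to your interests.

MIT 6.S081: Operating System Engineering comes from MIT's famous PDOS lab. Its 11 Projects have you add various functional modules to xv6, a very elegantly implemented Unix-like operating system. This course also made me deeply realize that systems are not built by reading slides aloud; they are built up bit by bit from tens of thousands of lines of code.

UCB CS162: Operating System is Berkeley's operating systems course. It uses the same Project as Stanford: Pintos, an educational operating system. As the teaching assistant of the operating systems lab class at Peking University in the spring semesters of 2022 and 2023, I introduced and improved this Project. All course resources will also be open source; see the course website for details.

NJU: Operating System Design and Implementation is the operating systems course taught by Professor Yanyan Jiang (蒋炎岩) of Nanjing University. With his distinctive systems perspective and plentiful code examples, Professor Jiang explains many operating systems concepts in an accessible way. In addition, all course content is in Chinese, which is very convenient for Chinese-speaking learners.

HIT OS: Operating System is the Chinese-language operating systems course taught by Professor Zhijun Li (李治军) of Harbin Institute of Technology. The course is based on the Linux 0.11 source code, places great emphasis on hands-on coding, and explains the ins and outs of operating systems from the student's point of view.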

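Parallel and Distributed Systems

The sentence you have probably heard most in CS talks over the last couple of years is "Moore's law is coming to an end". That is true: as single-core performance approaches its ceiling, multi-core and even many-core architectures are in their heyday. Changes in hardware force the programming logic above it to adapt, and writing parallel programs has almost become a required skill for programmers who want to make full use of hardware performance. At the same time, the rise of deep learning has pushed demands on computing power and storage to unprecedented heights, and the deployment and optimization of large-scale clusters has become a hot technical topic.

Parallel Computing

CMU 15-418/Stanford CS149: Parallel Computing

Distributed Systems

MIT 6.824: Distributed System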

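System Security

I do not know whether you chose computer science because of an adolescent dream of becoming a hacker, but the reality is that the road to becoming a hacker is long and hard.

Theory Courses

UCB CS161: Computer Security is Berkeley's system security course. It covers stack attacks, cryptography, website security, network security, and more.

ASU CSE365: Introduction to Cybersecurity is Arizona State University's Web security course, mainly covering injection, assembly, and cryptography.

ASU CSE466: Computer Systems Security is Arizona State University's system security course and covers a comprehensive range of content. The bar is fairly high: you need to be thoroughly familiar with Linux, C, and Python.

SU SEED Labs is Syracuse University's cybersecurity course. Supported by 1.3 million US dollars of NSF funding, it developed hands-on lab exercises (called SEED Labs) for cybersecurity education. The course gives equal weight to theory and hands-on practice and includes detailed open-source lecture notes, video tutorials, a textbook (printed in multiple languages), and ready-to-use attack and defense environments based on virtual machines and docker. The project is currently used by 1,050 institutions worldwide. It covers a wide range of topics in computer and information security, including software security, network security, Web security, operating system security, and mobile app security.

Practice Courses

After mastering this theory, you still need to cultivate and exercise these "hacker skills" in practice. CTF (capture the flag) is a fairly popular kind of system security competition whose challenges test, in an integrated way, your understanding and application of knowledge from every area of computing. Peking University has also successfully held its 0th and 1st editions this year, and I encourage everyone to take part in later ones and improve through practice. Below are some resources I use for study (and for slacking off):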

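Computer Networking

Nothing deepens your understanding of computer networks more than writing a TCP/IP protocol stack yourself.

The renowned Stanford CS144: Computer Network has 8 Projects that guide you through implementing an entire TCP/IP protocol stack.

If you only want a theoretical understanding of computer networks, I recommend the companion learning resources of the famous networking textbook Computer Networking: A Top-Down Approach.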

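Database Systems

Nothing deepens your understanding of database systems more than writing a relational database yourself.

CMU's famous database course CMU 15-445: Introduction to Database System uses 4 Projects to have you add various features to bustub, a relational database built for teaching. The grading framework for the labs is also free and open source, which makes the course very suitable for self-study. In addition, the labs use many of the new features of C++11, so this is also a good chance to sharpen your C++ skills.

Berkeley, the birthplace of the famous open-source database postgres, is no less impressive. UCB CS186: Introduction to Database System has you implement in Java a relational database that supports concurrent SQL queries, B+ tree indexes, and failure recovery.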

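Compilers

Nothing deepens your understanding of compilers more than writing a compiler yourself.

Stanford CS143: Compilers guides you through writing a compiler by hand.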

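Web Development

Front-end and back-end development rarely gets much attention in computer science curricula, but the skill has plenty of benefits, such as building your own personal homepage or making an impressive showcase page for your course projects.

Two-Week Crash Version

MIT web development course

Systematic Version

Stanford CS142: Web Applications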

Computer Graphics

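Data Science

Data science is closely related to machine learning and deep learning, though it probably leans more toward practice. Berkeley's UCB Data100: Principles and Techniques of Data Science uses rich programming exercises to help you master various data analysis tools and algorithms in practice, and guides you through extracting the results you want from massive datasets and making predictions about future data or user behavior. But this is only a foundational course. If you want to learn industrial-grade data mining and analysis techniques, you can try Stanford's big data mining course CS246: Mining Massive Data Sets.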

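Artificial Intelligence

Over the past decade, artificial intelligence has arguably been the hottest field in computing. If you are not satisfied with listening to the media report on AI progress all day and want to see for yourself what it is about, I strongly recommend the AI course in Harvard's famous CS50 series, Harvard CS50: Introduction to AI with Python. The course is short and concise, covers the major branches of traditional artificial intelligence, and comes with rich and interesting Python programming exercises to consolidate your understanding of AI algorithms. Its drawback is that, being aimed at online self-learners, the content is fairly condensed and does not go into particularly deep mathematical theory. For systematic, in-depth study you still need a course at undergraduate difficulty, such as Berkeley's UCB CS188: Introduction to Artificial Intelligence. Its Projects recreate the classic game Pac-Man and have you use AI algorithms to play the game, which is great fun.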

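Machine Learning

The most important progress in machine learning in recent years has been the development of the neural-network-based deep learning branch, but many algorithms based on statistical learning are still widely used in data analysis. If you have never been exposed to machine learning and do not want to sink into difficult and obscure mathematical proofs right away, you may want to start with Coursera: Machine Learning by Andrew Ng (吴恩达). Almost everyone in the machine learning field knows this course. With his solid theoretical grounding and excellent communication skills, Andrew Ng explains many difficult algorithms in an accessible and very practical way. The accompanying assignments are also of very high quality and can help you get started quickly.

But taking this course only gives you a high-level view of the field. If you want to truly understand the mathematics behind those "magical" algorithms, or even do research in a related area, you need a more "mathematical" course, such as Stanford CS229: Machine Learning or UCB CS189: Introduction to Machine Learning.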

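Deep Learning

The popularity of AlphaGo a few years ago brought deep learning into public view, and quite a few universities even set up dedicated majors. Many other areas of computer science also use deep learning techniques in their research, so whatever you do, you will more or less run into needs related to neural networks and deep learning. If you want to get started quickly, I again recommend Coursera: Deep Learning by Andrew Ng (吴恩达). Its quality needs no elaboration: it is one of the rare courses on Coursera with a perfect rating. If you find English-language courses difficult, I recommend Professor Hung-yi Lee's (李宏毅) National Taiwan University: Machine Learning course. Although it carries the name machine learning, it covers almost every direction in deep learning and is very comprehensive, which makes it well suited to getting a high-level overview of the field. The professor himself is also very humorous, and memorable lines come up often in class.

Of course, because deep learning is developing very rapidly and already has many research branches, if you want to go further you can study the representative courses listed below as needed: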

Computer Vision

UMich EECS 498-007 / 598-005: Deep Learning for Computer Vision

Stanford CS231n: CNN for Visual Recognition

Natural Language Processing

Stanford CS224n: Natural Language Processing

Graph Neural Networks

Stanford CS224w: Machine Learning with Graphs

Reinforcement Learning

UCB CS285: Deep Reinforcement Learning

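Customize Your Own Course Map

Better to teach someone to fish than to give them a fish.

The course plan above inevitably carries strong personal preferences and will not suit everyone; it is meant more as a starting point. If you want to pick directions and content that interest you, you can refer to the resources I list below.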

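Parallel and Distributed Systems

The sentence you have probably heard most in CS talks over the last couple of years is "Moore's law is coming to an end". That is true: as single-core performance approaches its ceiling, multi-core and even many-core architectures are in their heyday. Changes in hardware force the programming logic above it to adapt, and writing parallel programs has almost become a required skill for programmers who want to make full use of hardware performance. At the same time, the rise of deep learning has pushed demands on computing power and storage to unprecedented heights, and the deployment and optimization of large-scale clusters has become a hot technical topic.

Parallel Computing

CMU 15-418/Stanford CS149: Parallel Computing gives you a deep understanding of the design principles and necessary trade-offs of modern parallel computing architectures, and teaches you how to make full use of hardware resources and software programming frameworks (such as CUDA, MPI, and OpenMP) to write high-performance parallel programs.

Distributed Systems

MIT 6.824: Distributed System comes, like MIT 6.S081, from MIT's renowned PDOS lab. The instructor, Professor Robert Morris, was once a top hacker: the Morris worm, often cited as the first computer worm to spread across the Internet, was his creation. Each lecture closely reads a classic paper in distributed systems and uses it to teach important principles and key techniques for designing and implementing distributed systems. The course Projects are also famous for their difficulty: 4 programming assignments progressively guide you to implement a KV-store framework based on the Raft consensus algorithm, and let you experience, through painful debugging, the randomness and complexity that parallelism and distribution bring.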

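System Security

I do not know whether you chose computer science because of an adolescent dream of becoming a hacker, but the reality is that the road to becoming a hacker is long and hard.

Theory Courses

UCB CS161: Computer Security is Berkeley's system security course. It covers stack attacks, cryptography, website security, network security, and more.

SU SEED Labs is Syracuse University's cybersecurity course. Supported by 1.3 million US dollars of NSF funding, it developed hands-on lab exercises (called SEED Labs) for cybersecurity education. The course gives equal weight to theory and hands-on practice and includes detailed open-source lecture notes, video tutorials, a textbook (printed in multiple languages), and ready-to-use attack and defense environments based on virtual machines and docker. The project is currently used by 1,050 institutions worldwide. It covers a wide range of topics in computer and information security, including software security, network security, Web security, operating system security, and mobile app security.

CTF Practice

After mastering this theory, you still need to cultivate and exercise these "hacker skills" in practice. CTF (capture the flag) is a fairly popular kind of system security competition whose challenges test, in an integrated way, your understanding and application of knowledge from every area of computing. Peking University holds such a competition every year, and I encourage everyone to take part and improve through practice. Below are some resources I use for study (and for slacking off):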

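Computer Networking

Nothing deepens your understanding of computer networks more than writing a TCP/IP protocol stack yourself.

The renowned Stanford CS144: Computer Network has 8 Projects that guide you through implementing an entire TCP/IP protocol stack.

If you only want a theoretical understanding of computer networks, I recommend reading the textbook that accompanies the UCB CS168 course.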

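Database Systems

Nothing deepens your understanding of database systems more than writing a relational database yourself.

CMU's famous database course CMU 15-445: Introduction to Database System uses 4 Projects to have you add various features to bustub, a relational database built for teaching. The grading framework for the labs is also free and open source, which makes the course very suitable for self-study. In addition, the labs use many of the new features of C++11, so this is also a good chance to sharpen your C++ skills.

Berkeley, the birthplace of the famous open-source database postgres, is no less impressive. UCB CS186: Introduction to Database System has you implement in Java a relational database that supports concurrent SQL queries, B+ tree indexes, and failure recovery.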

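Compilers

Nothing deepens your understanding of compilers more than writing a compiler yourself.

For theory, I recommend reading the famous Dragon Book. Of course, hands-on practice is the best way to master compilers, so I recommend the Peking University compiler principles practice course, whose rich lab materials and step-by-step documentation guide you through implementing a compiler from a C-like language to RISC-V assembly. The compilers course directory also has many other high-quality labs for you to choose from.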

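Web Development

Front-end and back-end development rarely gets much attention in computer science curricula, but the skill has plenty of benefits, such as building your own personal homepage or making an impressive showcase page for your course projects. If you just want a two-week crash course, I recommend the MIT web development course. If you want to study systematically, I recommend Stanford CS142: Web Applications.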

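Computer Graphics

I do not know much about computer graphics myself, so here are some high-quality courses recommended by the community for you to choose from: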

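Data Science

Data science is closely related to machine learning and deep learning, though it probably leans more toward practice. Berkeley's UCB Data100: Principles and Techniques of Data Science uses rich programming exercises to help you master various data analysis tools and algorithms in practice, and guides you through extracting the results you want from massive datasets and making predictions about future data or user behavior. But this is only a foundational course. If you want to learn industrial-grade data mining and analysis techniques, you can try Stanford's big data mining course CS246: Mining Massive Data Sets.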

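Artificial Intelligence

Over the past decade, artificial intelligence has arguably been the hottest field in computing. If you are not satisfied with listening to the media report on AI progress all day and want to see for yourself what it is about, I strongly recommend the AI course in Harvard's famous CS50 series, Harvard CS50: Introduction to AI with Python. The course is short and concise, covers the major branches of traditional artificial intelligence, and comes with rich and interesting Python programming exercises to consolidate your understanding of AI algorithms. Its drawback is that, being aimed at online self-learners, the content is fairly condensed and does not go into particularly deep mathematical theory. For systematic, in-depth study you still need a course at undergraduate difficulty, such as Berkeley's UCB CS188: Introduction to Artificial Intelligence. Its Projects recreate the classic game Pac-Man and have you use AI algorithms to play the game, which is great fun.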

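Machine Learning

The most important progress in machine learning in recent years has been the development of the neural-network-based deep learning branch, but many algorithms based on statistical learning are still widely used in data analysis. If you have never been exposed to machine learning and do not want to sink into difficult and obscure mathematical proofs right away, you may want to start with Coursera: Machine Learning by Andrew Ng (吴恩达). Almost everyone in the machine learning field knows this course. With his solid theoretical grounding and excellent communication skills, Andrew Ng explains many difficult algorithms in an accessible and very practical way. The accompanying assignments are also of very high quality and can help you get started quickly.

But taking this course only gives you a high-level view of the field. If you want to truly understand the mathematics behind those "magical" algorithms, or even do research in a related area, you need a more "mathematical" course, such as Stanford CS229: Machine Learning or UCB CS189: Introduction to Machine Learning.

Of course, if you later plan to do research on machine learning theory, you can follow the advanced learning roadmap shared by Yao Fu and study some deeper, graduate-level courses.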

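Deep Learning

The popularity of AlphaGo a few years ago brought deep learning into public view, and quite a few universities set up dedicated majors. Many other areas of computer science also use deep learning techniques in their research, so whatever you do, you will more or less run into needs related to neural networks and deep learning. If you want to get started quickly, I again recommend Coursera: Deep Learning by Andrew Ng (吴恩达). Its quality needs no elaboration: it is one of the rare courses on Coursera with a perfect rating. If you find English-language courses difficult, I recommend Professor Hung-yi Lee's (李宏毅) National Taiwan University: Machine Learning course. Although it carries the name machine learning, it covers almost every direction in deep learning and is very comprehensive, which makes it well suited to getting a high-level overview of the field. The professor himself is also very humorous, and memorable lines come up often in class.

Of course, because deep learning is developing very rapidly and already has many research branches, if you want to go further you can study the representative courses listed below as needed: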

Computer Vision

UMich EECS 498-007 / 598-005: Deep Learning for Computer Vision

Stanford CS231n: CNN for Visual Recognition

Natural Language Processing

Stanford CS224n: Natural Language Processing

Graph Neural Networks

Stanford CS224w: Machine Learning with Graphs

Reinforcement Learning

UCB CS285: Deep Reinforcement Learning

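Deep Learning Systems

As deep learning models grow in importance and resource demands, low-level system optimization for their training and inference is becoming increasingly important. If you want to get into this area, I recommend CMU 10-414/714: Deep Learning Systems, which covers the "full stack" of deep learning systems: from the top-level design of modern deep learning frameworks, to the principles and implementation of automatic differentiation, to low-level hardware acceleration and real production deployment. To master the theory, students design and implement a complete deep learning library, Needle, from scratch in the course assignments. The library can automatically differentiate computation graphs, supports hardware acceleration on GPUs, and provides various loss functions, data loaders, and optimizers. On top of it, students implement several common kinds of neural networks, including CNN, RNN, LSTM, and Transformer. Once you have some background, you can also take Professor Song Han's MIT6.5940: TinyML and Efficient Deep Learning Computing to learn the key techniques for making neural networks lightweight, such as pruning, quantization, distillation, and neural architecture search. The course also covers system optimization for more cutting-edge deep learning models such as large language models.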
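To illustrate what "automatic differentiation on a computation graph" means, here is a minimal sketch of reverse-mode automatic differentiation for scalar addition and multiplication. It is not the Needle library and does not follow its API; the class `Value`, its fields, and the method `backward` are hypothetical names I chose for illustration.

```python
class Value:
    """A scalar node in a computation graph that can backpropagate gradients."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents    # the nodes this value was computed from
        self._grad_fns = grad_fns  # one function per parent: upstream grad -> parent grad

    @staticmethod
    def _wrap(x):
        return x if isinstance(x, Value) else Value(x)

    def __add__(self, other):
        other = Value._wrap(other)
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = Value._wrap(other)
        # d(a * b)/da = b and d(a * b)/db = a
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        """Accumulate d(self)/d(node) into node.grad for every node in the graph."""
        order, seen = [], set()

        def visit(node):
            # Depth-first post-order traversal yields a topological order.
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node._parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule.
        for node in reversed(order):
            for parent, grad_fn in zip(node._parents, node._grad_fns):
                parent.grad += grad_fn(node.grad)


if __name__ == "__main__":
    x, y = Value(3.0), Value(4.0)
    z = x * y + x * x  # z = xy + x^2, so dz/dx = y + 2x and dz/dy = x
    z.backward()
    print(z.data, x.grad, y.grad)
```

A real framework does the same bookkeeping on tensors instead of scalars and dispatches the per-operation gradient rules to optimized CPU or GPU kernels. Note that this sketch accumulates into `grad`, so calling `backward` twice on the same graph would double the gradients.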

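Deep Generative Models

With large language models exploding in popularity, understanding the principles behind them is the way to keep up with the times. You can follow the learning roadmap recommended by the author.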

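Customize Your Own Course Map

Better to teach someone to fish than to give them a fish.

The course plan above inevitably carries strong personal preferences and will not suit everyone; it is meant more as a starting point. If you want to pick directions and content that interest you, you can refer to the resources I list below.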

A Reference Guide for CS Learning

The field of computer science is vast and complex, with a seemingly endless sea of knowledge. Each specialized area can lead to limitless learning if pursued deeply. Therefore, a clear and definite study plan is very important. I've taken some detours in my years of self-study and finally distilled the following content for your reference.

Before you start learning, I highly recommend a popular science video series for beginners: Crash Course: Computer Science. In just 8 hours, it vividly and comprehensively covers various aspects of computer science: the history of computers, how computers operate, the important modules that make up a computer, key ideas in computer science, and so on. As its slogan says, Computers are not magic! I hope that after watching this video, everyone will have a holistic perception of computer science and embark on the detailed and in-depth learning content below with interest.

Essential Tools

As the saying goes: sharpening your axe will not delay your job of chopping wood. If you are a pure beginner in the world of computers, learning some tools will make you more efficient.

Learn to ask questions: You might be surprised that asking questions counts as an essential computing skill, let alone that it comes first on this list. I think that in the open-source community, learning to ask questions is a very important ability, and it involves two things. First, it indirectly trains you to solve problems on your own, because the cycle of forming a question, describing and posting it, waiting for others to answer, and finally understanding the answer is very long. If you expect others to remotely assist you with every trivial issue, then the world of computers might not suit you. Second, if you still cannot solve a problem after trying, you can seek help from the open-source community, and at that point how concisely you can explain your situation and goal becomes particularly important. I recommend reading the article How To Ask Questions The Smart Way, which not only increases the probability and efficiency of solving your problems but also keeps those who provide free answers in the open-source community in a good mood.

Learn to be a hacker: MIT-Missing-Semester covers many useful tools for a hacker and provides detailed usage instructions. I strongly recommend beginners to study this course. However, one thing to note is that the course occasionally refers to terms related to the development process. Therefore, it is recommended to study it at least after completing an introductory computer science course.

GFW: For well-known reasons, sites like Google and GitHub are not accessible in mainland China. However, in many cases, Google and StackOverflow can solve 99% of the problems encountered during development. Therefore, learning to use a VPN is almost an essential skill for a mainland CSer. (Considering legal issues, the methods provided in this book are only applicable to users with a Peking University email address).

Command Line: Proficiency in using the command line is often overlooked or considered difficult to master, but in reality, it greatly enhances your flexibility and productivity as an engineer. The Art of Command Line is a classic tutorial that started as a question on Quora, but with the contribution of many experts, it has become a top GitHub project with over 100,000 stars, translated into dozens of languages. The tutorial is not long, and I highly recommend everyone to read it repeatedly and internalize it through practice. Also, mastering shell script programming should not be overlooked, and you can refer to this tutorial.

IDE (Integrated Development Environment): Simply put, it's where you write your code. The importance of an IDE for a programmer goes without saying, but many IDEs are designed for large-scale projects and are quite bulky and overly feature-rich. Nowadays, some lightweight text editors with rich plugin ecosystems can basically meet the needs of daily lightweight programming. My personal favorites are VS Code and Sublime (the former has a very simple plugin configuration, while the latter is a bit more complex but aesthetically pleasing). Of course, for large projects, I would still use slightly heavier IDEs, such as Pycharm (Python), IDEA (Java), etc. (Disclaimer: all IDEs are the best in the world).

Vim: A command-line editor. Vim has a somewhat steep learning curve, but mastering it, I think, is very necessary because it will greatly improve your development efficiency. Most modern IDEs also support Vim plugins, allowing you to retain the coolness of a geek while enjoying a modern development environment.

Emacs: A classic editor that stands alongside Vim, with equally high development efficiency and more powerful expandability. It can be configured as a lightweight editor or expanded into a custom IDE, and even more sophisticated tricks.

Git: A version control tool for your project. Git, created by the father of Linux, Linus, is definitely one of the must-have tools for every CS student.

GitHub: A code hosting platform based on Git. The world's largest open-source community and a gathering place for CS experts.

GNU Make: An engineering build tool. Proficiency in GNU Make will help you develop a habit of modularizing your code and familiarize you with the compilation and linking processes of large projects.

CMake: A more powerful build tool than GNU Make, recommended for study after mastering GNU Make.

LaTeX: A paper typesetting tool that also makes your writing look more impressive.

Docker: A lighter-weight software packaging and deployment tool compared to virtual machines.

Practical Toolkit: In addition to the tools mentioned above that are frequently used in development, I have also collected many practical and interesting free tools, such as download tools, design tools, learning websites, etc.

Thesis: Tutorial for writing graduation thesis in Word.

Book Recommendations

I believe a good textbook should be centered on the reader, rather than being a show-off pile of theory. It's certainly important to tell readers "what it is," but a better approach is for the author to fold decades of experience in the field into the book and patiently explain to the reader "why it is" and what should be done in the future.

Link here

Environment Setup

What you think of as development — coding frantically in an IDE for hours.

Actual development — setting up the environment for several days without starting to code.

PC Environment Setup

If you are a Mac user, you're in luck, as this guide will walk you through setting up the entire development environment. If you are a Windows user, thanks to the efforts of the open-source community, you can enjoy a similar experience with Scoop.

Additionally, you can refer to an environment setup guide inspired by 6.NULL MIT-Missing-Semester, focusing on terminal beautification. It also includes common software sources (such as GitHub, Anaconda, PyPI) for acceleration and replacement, as well as some IDE configuration and activation tutorials.

Server-Side Environment Setup

Server-side operation and maintenance require basic use of Linux (or other Unix-like systems) and fundamental concepts like processes, devices, networks, etc. Beginners can refer to the Linux 101 online notes compiled by the Linux User Association of the University of Science and Technology of China. If you want to delve deeper into system operation and maintenance, you can refer to the Aspects of System Administration course.

Additionally, if you need to learn a specific concept or tool, I recommend a great GitHub project, DevOps-Guide, which covers a lot of foundational knowledge and tutorials in the administration field, such as Docker, Kubernetes, Linux, CI-CD, GitHub Actions, and more.

Course Map

As mentioned at the beginning of this chapter, this course map is merely a reference guide for course planning, from my perspective as an undergraduate nearing graduation. I am acutely aware that I neither have the right nor the capability to preach to others about “how one should learn”. Therefore, if you find any issues with the course categorization and selection below, I fully accept and deeply apologize for them. You can tailor your own course map in the next section Customize Your Own Course Map.

Apart from courses labeled as basic or introductory, there is no explicit sequence in the following categories. As long as you meet the prerequisites for a course, you are free to choose any course according to your needs and interests.

Mathematical Foundations

Calculus and Linear Algebra

For a freshman, mastering calculus and linear algebra is as important as learning to code. This point has been reiterated countless times by predecessors, but I feel compelled to emphasize it again: mastering calculus and linear algebra is really important! You might complain that these subjects are forgotten after exams, but I believe that indicates a lack of deep understanding of their essence. If you find the content taught in class obscure, consider referring to MIT’s Calculus Course and 18.06: Linear Algebra course notes. For me, they greatly deepened my understanding of the essence of calculus and linear algebra. I also highly recommend the math YouTuber 3Blue1Brown, whose channel explains the core ideas of mathematics with vivid animations and offers high-quality videos with both depth and breadth.

Introduction to Information Theory

For computer science students, gaining some foundational knowledge in information theory early on is beneficial. However, most information theory courses target senior or even graduate students, making them hard for beginners to approach. MIT’s 6.050J: Information and Entropy is tailored for freshmen and has almost no prerequisites. It covers coding, compression, communication, information entropy, and more, and is very interesting.
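As a small taste of the "information entropy" topic mentioned above, here is a minimal Python sketch that computes the Shannon entropy of a message, that is, the average number of bits per symbol, as the sum of p * log2(1/p) over the symbol frequencies p. The function name `shannon_entropy` and the sample strings are illustrative assumptions and are not taken from the course materials.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Return the Shannon entropy of the symbols in `message`, in bits per symbol."""
    if not message:
        return 0.0
    counts = Counter(message)
    total = len(message)
    # Each symbol with relative frequency p contributes p * log2(1/p) bits.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    # A message with a single repeated symbol carries no information per symbol;
    # the more evenly the symbols are spread, the higher the entropy.
    for text in ("aaaa", "abab", "abcd"):
        print(text, shannon_entropy(text))
```

Intuitively, a highly repetitive message has low entropy and compresses well, while a message whose symbols are all equally likely has the highest entropy for its alphabet size.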

Advanced Mathematics

Discrete Mathematics and Probability Theory

Set theory, graph theory, and probability theory are essential tools for algorithm derivation and proof, as well as foundations for more advanced mathematical courses. However, the teaching of these subjects often falls into a rut of being overly theoretical and formalistic, turning classes into mere recitations of theorems and conclusions without helping students grasp the essence of these theories. If theory teaching can be interspersed with examples of algorithm application, students can expand their algorithm knowledge while appreciating the power and charm of theory.

UCB CS70: Discrete Math and Probability Theory and UCB CS126: Probability Theory are UC Berkeley’s probability courses. The former covers the basics of discrete mathematics and probability theory, while the latter delves into stochastic processes and more advanced theoretical content. Both emphasize the integration of theory and practice and feature abundant examples of algorithm application, with the latter including numerous Python programming assignments to apply probability theory to real-world problems.

Numerical Analysis

For computer science students, developing computational thinking is crucial. Modeling and discretizing real-world problems, and simulating and analyzing them on computers, are vital skills. Recently, the Julia programming language, developed at MIT, has become popular in the field of numerical computation with its C-like speed and Python-like friendly syntax. Many MIT mathematics courses have started using Julia as a teaching tool, presenting complex mathematical theories through clear and intuitive code.

ComputationalThinking is an introductory course in computational thinking offered by MIT. All course materials are open source and accessible on the course website. Using the Julia programming language, the course covers image processing, social science and data science, and climatology modeling, helping students understand algorithms, mathematical modeling, data analysis, interactive design, and graph presentation. The course content, though not difficult, profoundly impressed me with the idea that the allure of science lies not in obscure theories or jargon but in presenting complex concepts through vivid examples and concise, deep language.

After completing this introductory course, if you’re still eager for more, consider MIT’s 18.330: Introduction to Numerical Analysis. This course also uses Julia for programming assignments but is more challenging and in-depth. It covers floating-point representation, root finding, linear systems, differential equations, and more, with the main goal of using discrete computer representations to estimate and approximate continuous mathematical concepts. The course instructor has also written an accompanying open-source textbook, Fundamentals of Numerical Computation, which includes abundant Julia code examples and rigorous formula derivations.
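To make two of these topics concrete, the following minimal sketch (written in Python rather than the Julia used by the course) shows a floating-point rounding effect and a bisection root finder. The function name `bisect`, its parameters, and the example equation x^2 - 2 = 0 are assumptions chosen for illustration, not material from 18.330.

```python
def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Approximate a root of f in [lo, hi] by repeatedly halving the interval.

    Assumes f is continuous and f(lo), f(hi) have opposite signs.
    """
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        # Stop when we hit the root exactly or the interval is small enough.
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid  # the sign change is in the left half
        else:
            lo, f_lo = mid, f_mid  # the sign change is in the right half
    return (lo + hi) / 2


if __name__ == "__main__":
    # Floating-point numbers are finite binary approximations of real numbers,
    # so this comparison is False.
    print(0.1 + 0.2 == 0.3)

    # Approximate sqrt(2) as the positive root of x^2 - 2.
    print(bisect(lambda x: x * x - 2, 0.0, 2.0))
```

Bisection is slow but robust: each step halves the interval, so the error bound shrinks by a factor of two per iteration regardless of how the function behaves inside the interval.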

If you’re still not satisfied, MIT’s graduate course in numerical analysis, 18.335: Introduction to Numerical Methods, is also available for reference.

Differential Equations

Wouldn't it be cool if the motion and development of everything in the world could be described and depicted with equations? Although differential equations are not a mandatory part of any CS curriculum, I believe mastering them provides a new perspective to view the world.

Since differential equations often involve functions of a complex variable, you can refer to the MIT18.04: Complex Variables Functions course notes to fill in the prerequisite knowledge.

MIT18.03: Differential Equations mainly covers the solution of ordinary differential equations, and on this basis, MIT18.152: Partial Differential Equations dives into the modeling and solving of partial differential equations. With the powerful tool of differential equations, you will gain enhanced capabilities in modeling real-world problems and intuitively grasping the essence among various noisy variables.
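For readers who have never solved an ordinary differential equation on a computer, the sketch below applies the forward Euler method to dy/dt = -y with y(0) = 1, whose exact solution is exp(-t). It is only an illustration under these assumptions: the function name `euler` and the test equation are made up for this example, and the courses above focus mainly on analytical techniques rather than this code.

```python
import math


def euler(f, t0, y0, t_end, steps):
    """Integrate dy/dt = f(t, y) from t0 to t_end with the forward Euler method."""
    h = (t_end - t0) / steps  # fixed step size
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)  # follow the tangent line for one step
        t += h
    return y


if __name__ == "__main__":
    exact = math.exp(-1.0)
    for steps in (10, 100, 1000):
        approx = euler(lambda t, y: -y, 0.0, 1.0, 1.0, steps)
        # The error shrinks roughly in proportion to the step size.
        print(steps, approx, abs(approx - exact))
```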

Advanced Mathematical Topics

As a computer science student, I often hear arguments about the uselessness of mathematics. While I neither agree nor have the authority to oppose such views, if everything is forcibly categorized as useful or useless, it indeed becomes quite dull. Therefore, the following advanced mathematics courses, aimed at senior and even graduate students, are available for those interested.

Convex Optimization

Stanford EE364A: Convex Optimization

Information Theory

MIT6.441: Information Theory

Applied Statistics

MIT18.650: Statistics for Applications

Elementary Number Theory

MIT18.781: Theory of Numbers

Cryptography

Stanford CS255: Cryptography

Programming Fundamentals

Languages are tools, and you choose the right tool for the right job. Since there's no universally perfect tool, there's no universally perfect language.

General

Java

Python

C++

Rust

OCaml

Electronics Fundamentals

Basics of Circuits

For computer science students, understanding basic circuit knowledge and experiencing the entire pipeline from sensor data collection to data analysis and algorithm prediction can be very helpful for future learning and developing computational thinking. EE16A&B: Designing Information Devices and Systems I&II at UC Berkeley are introductory courses for freshmen in electrical engineering. EE16A focuses on collecting and analyzing data from the real environment through circuits, while EE16B focuses on analyzing these collected data to make predictive actions.

Signals and Systems

Signals and Systems is a course I find very worthwhile. Initially, I studied it out of curiosity about the Fourier Transform, but after completing it, I was amazed at how the Fourier Transform provides a new perspective on the world. Just like differential equations, it immerses you in the elegance and magic of precisely depicting the world with mathematics.

MIT 6.003: Signals and Systems provides all course recordings, written assignments, and answers. You can also check out an older version of this course.

UCB EE120: Signals and Systems has very well-written notes on the Fourier Transform and provides many interesting Python programming assignments to practically apply the theories and algorithms of signals and systems.
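To give a feel for what the Fourier Transform does, here is a minimal sketch of a naive discrete Fourier transform in pure Python. It samples a sine wave that completes 3 cycles over 32 samples and finds the frequency bin with the largest magnitude. The function name `dft` and the test signal are illustrative assumptions, not taken from either course; real assignments would typically rely on an FFT library instead of this O(n^2) loop.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a sequence of numbers."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    n, cycles = 32, 3
    samples = [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]
    magnitudes = [abs(c) for c in dft(samples)]
    # For a real signal the spectrum is mirrored, so only the first half is inspected.
    peak = max(range(n // 2), key=lambda k: magnitudes[k])
    print("strongest frequency bin:", peak)
```

The time-domain samples look like a wiggly curve, but in the frequency domain the same signal collapses to a single dominant bin, which is the "new perspective" the transform offers.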

Data Structures and Algorithms

Algorithms are the core of computer science and the foundation for almost all professional courses. How to abstract real-world problems into algorithmic problems mathematically and solve them under time and memory constraints using appropriate data structures is the eternal theme of algorithm courses. If you are fed up with your teacher's rote teaching, I highly recommend UC Berkeley's UCB CS61B: Data Structures and Algorithms and Princeton's Coursera: Algorithms I & II. Both courses explain deep ideas in an accessible way and have rich and interesting programming assignments that integrate theory with practice.

Both of these courses are based on Java. If you prefer C/C++, you can refer to Stanford's data structure and basic algorithm course Stanford CS106B/X: Programming Abstractions. For those who prefer Python, you can learn MIT's introductory algorithm course MIT 6.006: Introduction to Algorithms.

For those interested in more advanced algorithms and NP problems, consider UC Berkeley's course on algorithm design and analysis UCB CS170: Efficient Algorithms and Intractable Problems or MIT's advanced algorithms course MIT 6.046: Design and Analysis of Algorithms.

Software Engineering

Introductory Course

There is a fundamental difference between “working” code and high-quality industrial code. Therefore, I highly recommend that senior students take MIT 6.031: Software Construction. Based on Java, this course teaches how to write high-quality code that is bug-resistant, clear, and easy to maintain and modify, with rich and detailed reading materials and well-designed programming exercises. From macro data structure design to minor details like how to write comments, following the practices and experience summarized by predecessors can greatly benefit your future programming career.

Professional Course

Of course, if you want to systematically take a software engineering course, I recommend UC Berkeley’s UCB CS169: Software Engineering. However, unlike most software engineering courses, this course does not involve the traditional design and document model that emphasizes various class diagrams, flowcharts, and document design. Instead, it adopts the Agile Development model, which has become popular in recent years, featuring small team rapid iterations and the Software as a Service model using cloud platforms.

Computer Architecture

Introductory Course

Since childhood, I've always heard that the world of computers is made of 0s and 1s, which I didn't understand but was deeply impressed by. If you share this curiosity, consider spending one to two months on Coursera: Nand2Tetris, a computer course with no prerequisites. This comprehensive course starts from 0s and 1s, allowing you to build a computer by hand and run a Tetris game on it. It covers compilation, virtual machines, assembly, architecture, digital circuits, logic gates, etc., from top to bottom, from software to hardware. Its difficulty is carefully designed: it omits many complex details of modern computers and extracts the core essence, aiming to be understandable to everyone. Establishing a bird's-eye view of the entire computer system in your early undergraduate years is very beneficial.

Professional Course

Of course, if you want to delve into the complex details of modern computer architecture, you still need to take a university-level course UCB CS61C: Great Ideas in Computer Architecture. This course emphasizes practice, and you will hand-write assembly to construct neural networks in projects, build a CPU from scratch, and more, all of which will give you a deeper understanding of computer architecture, beyond the monotony of "fetch, decode, execute, memory access, write back."

Introduction to Computer Systems

Computer systems are a vast and profound topic. Before delving into a specific area, having a macro conceptual understanding of each field and some general design principles will reinforce core and even philosophical concepts in your subsequent in-depth study, rather than being shackled by complex internal details and various tricks. In my opinion, the key to learning systems is to grasp these core concepts to design and implement your own systems.

MIT6.033: System Engineering is MIT's introductory course to systems, covering topics like operating systems, networks, distributed systems, and system security. In addition to the theory, this course also teaches some writing and expression skills, helping you learn how to design, introduce, and analyze your own systems. The accompanying textbook Principles of Computer System Design: An Introduction is also very well written and recommended for reading.

CMU 15-213: Introduction to Computer System is CMU’s introductory systems course, covering architecture, operating systems, linking, parallelism, networks, etc., with both breadth and depth. The accompanying textbook Computer Systems: A Programmer's Perspective is also of very high quality and strongly recommended for reading.

Operating Systems

There’s nothing like writing your own kernel to deepen your understanding of operating systems.

Operating systems provide a set of elegant abstractions to virtualize various complex underlying hardware, providing rich functional support for all application software. Understanding the design principles and internal mechanisms of operating systems is greatly beneficial for a programmer who is not satisfied with just being a coder. Out of love for operating systems, I have taken many operating system courses in different colleges, each with its own focus and merits. You can choose based on your interests.

MIT 6.S081: Operating System Engineering, offered by the famous PDOS lab at MIT, features 11 projects that modify xv6, an elegantly implemented Unix-like operating system. This course made me realize that systems work is not about reading slides; it's about writing tens of thousands of lines of code.

UCB CS162: Operating System, UC Berkeley’s operating system course, uses the same Project as Stanford — an educational operating system, Pintos. As the teaching assistant for Peking University’s 2022 and 2023 Spring Semester Operating Systems Course, I introduced and improved this Project. The course resources are fully open-sourced, with details on the course website.

NJU: Operating System Design and Implementation, offered by Professor Yanyan Jiang at Nanjing University, provides an in-depth and accessible explanation of various operating system concepts, combining a unique system perspective with rich code examples. All course content is in Chinese, making it very convenient for students.

HIT OS: Operating System, taught by Professor Zhijun Li at Harbin Institute of Technology, is a Chinese course on operating systems. Based on the Linux 0.11 source code, the course places great emphasis on code practice, explaining the intricacies of operating systems from the student's perspective.

Parallel and Distributed Systems

In recent years, the most common phrase heard in CS lectures is "Moore's Law is coming to an end." As single-core capabilities reach their limits, multi-core and many-core architectures are becoming increasingly important. The changes in hardware necessitate adaptations and changes in the upper-level programming logic. Writing parallel programs has nearly become a mandatory skill for programmers to fully utilize hardware performance. Meanwhile, the rise of deep learning has brought unprecedented demands on computing power and storage, making the deployment and optimization of large-scale clusters a hot topic.

Parallel Computing

CMU 15-418/Stanford CS149: Parallel Computing

Distributed Systems

MIT 6.824: Distributed System

System Security

Perhaps you chose computer science because of a youthful dream of becoming a hacker, but the reality is that becoming a hacker is a long and difficult journey.

Theoretical Courses

UCB CS161: Computer Security at UC Berkeley covers stack attacks, cryptography, website security, network security, and more.

ASU CSE365: Introduction to Cybersecurity at Arizona State University focuses mainly on injections, assembly, and cryptography.

ASU CSE466: Computer Systems Security at Arizona State University covers a wide range of topics in system security. It has a high barrier to entry, requiring familiarity with Linux, C, and Python.

SU SEED Labs at Syracuse University, supported by a $1.3 million grant from the NSF, has developed hands-on experimental exercises (called SEED Labs) for cybersecurity education. The course emphasizes both theoretical teaching and practical exercises, including detailed open-source lectures, video tutorials, textbooks (printed in multiple languages), and a ready-to-use virtual machine and Docker-based attack-defense environment. This project is currently used by 1,050 institutions worldwide and covers a wide range of topics in computer and information security, including software security, network security, web security, operating system security, and mobile app security.

Practical Courses

After mastering this theoretical knowledge, it's essential to cultivate and hone these "hacker skills" in practice. CTF competitions are a popular way to comprehensively test your understanding and application of computer knowledge in various fields. Peking University has also successfully held the 0th and 1st editions of its own competition, and you are encouraged to participate and improve your skills through practice. Here are some resources I use for learning (and relaxing):

Computer Networks

There’s nothing like writing your own TCP/IP protocol stack to deepen your understanding of computer networks.

The renowned Stanford CS144: Computer Network includes 8 projects that guide you in implementing the entire TCP/IP protocol stack.

If you're just looking to understand computer networks theoretically, I recommend the famous networking textbook "A Top-Down Approach" and its accompanying learning resources Computer Networking: A Top-Down Approach.

Database Systems

There’s nothing like building your own relational database to deepen your understanding of database systems.

CMU's famous database course CMU 15-445: Introduction to Database System guides you through 4 projects to add various functionalities to the educational relational database bustub. The experimental evaluation framework is also open-source, making it very suitable for self-learning. The course experiments also use many new features of C++11, offering a great opportunity to strengthen C++ coding skills.

Berkeley, as the birthplace of the famous open-source database PostgreSQL, has its own course UCB CS186: Introduction to Database System where you will implement a relational database in Java that supports SQL concurrent queries, B+ tree indexing, and fault recovery.

Compiler Theory

There’s nothing like writing your own compiler to deepen your understanding of compilers.

Stanford CS143: Compilers guides you through the process of writing a compiler.

Web Development

Front-end development is often overlooked in computer science curricula, but mastering these skills has many benefits, such as building your personal website or creating an impressive presentation website for your course projects.

Two-Week Crash Course

MIT web development course

Systematic Study Version

Stanford CS142: Web Applications

Computer Graphics

Data Science

Data science, machine learning, and deep learning are closely related, with a focus on practical application. Berkeley's UCB Data100: Principles and Techniques of Data Science lets you master various data analysis tools and algorithms through extensive programming exercises. The course guides you through extracting desired results from massive datasets and making predictions about future data or user behavior. For those looking to learn industrial-level data mining and analysis techniques, Stanford's big data mining course CS246: Mining Massive Data Sets is an option.

Artificial Intelligence

Artificial intelligence has been one of the hottest fields in computer science over the past decade. If you're not content with just hearing about AI advancements in the media and want to delve into the subject, I highly recommend Harvard's renowned CS50 series AI course Harvard CS50: Introduction to AI with Python. The course is concise and covers several major branches of traditional AI, supplemented with rich and interesting Python programming exercises to reinforce your understanding of AI algorithms. However, the content is somewhat simplified for online learners and doesn't delve into deep mathematical theories. For a more systematic and in-depth study, consider an undergraduate-level course like Berkeley's UCB CS188: Introduction to Artificial Intelligence. This course's projects feature the classic game "Pac-Man," allowing you to use AI algorithms to play the game, which is very fun.

Machine Learning

The most significant recent progress in the field of machine learning is the emergence of deep learning, a branch based on deep neural networks. However, many algorithms based on statistical learning are still widely used in data analysis. If you're new to machine learning and don't want to get bogged down in complex mathematical proofs, start with Andrew Ng's Coursera: Machine Learning. This course is well-known in the field of machine learning, and Andrew Ng, with his profound theoretical knowledge and excellent presentation skills, makes many complex algorithms accessible and practical. The accompanying assignments are also of high quality, helping you get started quickly.

However, completing this course will only give you a general understanding of the field of machine learning. To truly understand the mathematical principles behind these "magical" algorithms or to engage in related research, you need a more "mathematical" course, such as Stanford CS229: Machine Learning or UCB CS189: Introduction to Machine Learning.

Deep Learning

The popularity of AlphaGo a few years ago brought deep learning to the public eye, leading many universities to establish related majors. Many other areas of computer science also use deep learning technology for research, so regardless of your field, you will likely encounter some needs related to neural networks and deep learning. For a quick introduction, I again recommend Andrew Ng's Coursera: Deep Learning, a top-rated course on Coursera. Additionally, if you find English-language courses challenging, consider Professor Hung-yi Lee's course National Taiwan University: Machine Learning. Although titled "Machine Learning," this course covers almost all areas of deep learning and is very comprehensive, making it suitable for getting a broad overview of the field. The professor is also very humorous, with frequent witty remarks in class.

Due to the rapid development of deep learning, there are now many research branches. For further in-depth study, consider the following representative courses:

Computer Vision

Natural Language Processing

Graph Neural Networks

Reinforcement Learning

Customize Your Course Map

Better to teach fishing than to give fish.

The course map above inevitably carries strong personal preferences and may not suit everyone. It is more intended to serve as a starting point for exploration. If you want to select your own areas of interest for study, you can refer to the following resources:

A Reference Guide for CS Learning

The field of computer science is vast and complex, with a seemingly endless sea of knowledge. Each specialized area can lead to limitless learning if pursued deeply. Therefore, a clear and definite study plan is very important. I've taken some detours in my years of self-study and finally distilled the following content for your reference.

Before you start learning, I highly recommend a popular science video series for beginners: Crash Course: Computer Science. In just 8 hours, it vividly and comprehensively covers various aspects of computer science: the history of computers, how computers operate, the important modules that make up a computer, key ideas in computer science, and so on. As its slogan says, Computers are not magic! I hope that after watching this video, everyone will have a holistic perception of computer science and embark on the detailed and in-depth learning content below with interest.

Essential Tools

As the saying goes: sharpening your axe will not delay your job of chopping wood. If you are a pure beginner in the world of computers, learning some tools will make you more efficient.

Learn to ask questions: You might be surprised that asking questions counts as an essential skill, let alone the first one listed. I think that in the open-source community, learning to ask questions is a very important ability, and it involves two aspects. First, it indirectly cultivates your ability to solve problems independently, because the cycle of forming a question, describing and posting it, getting answers from others, and then understanding the response is quite long. If you expect others to remotely assist you with every trivial issue, then the world of computers might not suit you. Second, if you still can't solve a problem after trying, you can seek help from the open-source community. At that point, explaining your situation and goal concisely so that others understand them immediately becomes particularly important. I recommend reading the article How To Ask Questions The Smart Way, which not only increases the probability and efficiency of solving your problems but also keeps those who volunteer answers in the open-source community in a good mood.

Learn to be a hacker: MIT-Missing-Semester covers many useful tools for a hacker and provides detailed usage instructions. I strongly recommend beginners to study this course. However, one thing to note is that the course occasionally refers to terms related to the development process. Therefore, it is recommended to study it at least after completing an introductory computer science course.

GFW: For well-known reasons, sites like Google and GitHub are not accessible in mainland China. However, in many cases, Google and StackOverflow can solve 99% of the problems encountered during development. Therefore, learning to use a VPN is almost an essential skill for a mainland CSer. (Considering legal issues, the methods provided in this book are only applicable to users with a Peking University email address).

Command Line: Proficiency in using the command line is often overlooked or considered difficult to master, but in reality, it greatly enhances your flexibility and productivity as an engineer. The Art of Command Line is a classic tutorial that started as a question on Quora, but with the contribution of many experts, it has become a top GitHub project with over 100,000 stars, translated into dozens of languages. The tutorial is not long, and I highly recommend everyone to read it repeatedly and internalize it through practice. Also, mastering shell script programming should not be overlooked, and you can refer to this tutorial.

IDE (Integrated Development Environment): Simply put, it's where you write your code. The importance of an IDE for a programmer goes without saying, but many IDEs are designed for large-scale projects and are quite bulky and overly feature-rich. Nowadays, some lightweight text editors with rich plugin ecosystems can basically meet the needs of daily lightweight programming. My personal favorites are VS Code and Sublime (the former has very simple plugin configuration, while the latter is a bit more complex but aesthetically pleasing). Of course, for large projects, I would still use slightly heavier IDEs, such as PyCharm (Python), IDEA (Java), etc. (Disclaimer: every IDE is the best IDE in the world.)

Vim: A command-line editor. Vim has a somewhat steep learning curve, but I think it is well worth mastering because it will greatly improve your development efficiency. Most modern IDEs also support Vim plugins, letting you keep a geek's coolness while enjoying a modern development environment.

Emacs: A classic editor that stands alongside Vim, with equally high development efficiency and even greater extensibility. It can be configured as a lightweight editor, expanded into a custom IDE, or pushed to even more sophisticated tricks.

Git: A version control tool for your project. Created by Linus Torvalds, the father of Linux, Git is definitely one of the must-have tools for every CS student.

GitHub: A code hosting platform based on Git. The world's largest open-source community and a gathering place for CS experts.

GNU Make: An engineering build tool. Proficiency in GNU Make will help you develop a habit of modularizing your code and familiarize you with the compilation and linking processes of large projects.

CMake: A more powerful build tool than GNU Make, recommended for study after mastering GNU Make.

LaTeX: A pretentious paper typesetting tool.

Docker: A lighter-weight software packaging and deployment tool compared to virtual machines.

Practical Toolkit: In addition to the tools mentioned above that are frequently used in development, I have also collected many practical and interesting free tools, such as download tools, design tools, learning websites, etc.

Thesis: A tutorial for writing a graduation thesis in Word.

I believe a good textbook should be people-oriented, rather than a display of technical jargon. It's certainly important to tell readers "what it is," but a better approach would be for the author to integrate decades of experience in the field into the book and narratively convey to the reader "why it is" and what should be done in the future.

Link here

Environment Setup

What you think of as development — coding frantically in an IDE for hours.

Actual development — setting up the environment for several days without starting to code.

PC Environment Setup

If you are a Mac user, you're in luck, as this guide will walk you through setting up the entire development environment. If you are a Windows user, thanks to the efforts of the open-source community, you can enjoy a similar experience with Scoop.

Additionally, you can refer to an environment setup guide inspired by 6.NULL MIT-Missing-Semester, focusing on terminal beautification. It also includes common software sources (such as GitHub, Anaconda, PyPI) for acceleration and replacement, as well as some IDE configuration and activation tutorials.

Server-Side Environment Setup

Server-side operation and maintenance require basic use of Linux (or other Unix-like systems) and fundamental concepts like processes, devices, networks, etc. Beginners can refer to the Linux 101 online notes compiled by the Linux User Association of the University of Science and Technology of China. If you want to delve deeper into system operation and maintenance, you can refer to the Aspects of System Administration course.

Additionally, if you need to learn a specific concept or tool, I recommend a great GitHub project, DevOps-Guide, which covers a lot of foundational knowledge and tutorials in the administration field, such as Docker, Kubernetes, Linux, CI-CD, GitHub Actions, and more.

Course Map

As mentioned at the beginning of this chapter, this course map is merely a reference guide for course planning, from my perspective as an undergraduate nearing graduation. I am acutely aware that I neither have the right nor the capability to preach to others about “how one should learn”. Therefore, if you find any issues with the course categorization and selection below, I fully accept and deeply apologize for them. You can tailor your own course map in the next section Customize Your Own Course Map.

Apart from courses labeled as basic or introductory, there is no explicit sequence in the following categories. As long as you meet the prerequisites for a course, you are free to choose any course according to your needs and interests.

Mathematical Foundations

Calculus and Linear Algebra

As a freshman, mastering calculus and linear algebra is as important as learning to code. This point has been reiterated countless times by predecessors, but I feel compelled to emphasize it again: mastering calculus and linear algebra is really important! You might complain that these subjects are forgotten after exams, but I believe that indicates a lack of deep understanding of their essence. If you find the content taught in class obscure, consider referring to MIT’s Calculus Course and the 18.06: Linear Algebra course notes. For me, they greatly deepened my understanding of the essence of calculus and linear algebra. I also highly recommend the maths YouTuber 3Blue1Brown, whose channel explains the core ideas of mathematics through vivid animations, with both depth and breadth.

Introduction to Information Theory

For computer science students, gaining some foundational knowledge in information theory early on is beneficial. However, most information theory courses are targeted at senior or even graduate students, making them quite inaccessible to beginners. MIT’s 6.050J: Information and Entropy is tailored for freshmen and has almost no prerequisites. It covers coding, compression, communication, information entropy, and more, and is very interesting.
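As a small taste of the information entropy mentioned above, here is a minimal Python sketch that computes the Shannon entropy of a discrete distribution. The function name `shannon_entropy` and the two coin distributions are illustrative assumptions, not material taken from the course.

```python
import math


def shannon_entropy(probabilities):
    """Return the Shannon entropy, in bits, of a discrete distribution."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p == 0 are skipped: p * log2(p) tends to 0 as p tends to 0.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical example distributions.
fair_coin = shannon_entropy([0.5, 0.5])
biased_coin = shannon_entropy([0.9, 0.1])
```

A fair coin carries exactly one bit per toss, while the biased coin carries less, which is the intuition behind why predictable data compresses well.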

Advanced Mathematics

Discrete Mathematics and Probability Theory

Set theory, graph theory, and probability theory are essential tools for algorithm derivation and proof, as well as foundations for more advanced mathematical courses. However, the teaching of these subjects often falls into a rut of being overly theoretical and formalistic, turning classes into mere recitations of theorems and conclusions without helping students grasp the essence of these theories. If theory teaching can be interspersed with examples of algorithm application, students can expand their algorithm knowledge while appreciating the power and charm of theory.

UCB CS70: Discrete Math and Probability Theory and UCB CS126: Probability Theory are UC Berkeley’s probability courses. The former covers the basics of discrete mathematics and probability theory, while the latter delves into stochastic processes and more advanced theoretical content. Both emphasize the integration of theory and practice and feature abundant examples of algorithm application, with the latter including numerous Python programming assignments to apply probability theory to real-world problems.

Numerical Analysis

For computer science students, developing computational thinking is crucial. Modeling and discretizing real-world problems, and simulating and analyzing them on computers, are vital skills. Recently, the Julia programming language, developed by MIT, has become popular in the field of numerical computation with its C-like speed and Python-friendly syntax. Many MIT mathematics courses have started using Julia as a teaching tool, presenting complex mathematical theories through clear and intuitive code.

ComputationalThinking is an introductory course in computational thinking offered by MIT. All course materials are open source and accessible on the course website. Using the Julia programming language, the course covers image processing, social science and data science, and climatology modeling, helping students understand algorithms, mathematical modeling, data analysis, interactive design, and graph presentation. The course content, though not difficult, profoundly impressed me with the idea that the allure of science lies not in obscure theories or jargon but in presenting complex concepts through vivid examples and concise, deep language.

After completing this introductory course, if you’re still eager for more, consider MIT’s 18.330: Introduction to Numerical Analysis. This course also uses Julia for programming assignments but is more challenging and in-depth. It covers floating-point encoding, root finding, linear systems, differential equations, and more, with the main goal of using discrete computer representations to estimate and approximate continuous mathematical concepts. The course instructor has also written an accompanying open-source textbook, Fundamentals of Numerical Computation, which includes abundant Julia code examples and rigorous formula derivations.

If you’re still not satisfied, MIT’s graduate course in numerical analysis, 18.335: Introduction to Numerical Methods, is also available for reference.
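Root finding, one of the topics listed above, is a good example of approximating a continuous quantity with a discrete iteration. The following is a minimal sketch of Newton's method, written in Python rather than the Julia used by the courses; the function name `newton_root`, the tolerance, and the test function are assumptions made for illustration.

```python
def newton_root(f, df, x0, tol=1e-12, max_iter=50):
    """Approximate a root of f with Newton's method, starting from x0."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        dfx = df(x)
        if dfx == 0:
            raise ZeroDivisionError("derivative vanished; Newton step undefined")
        # Follow the tangent line at x down to where it crosses zero.
        x -= fx / dfx
    raise RuntimeError("Newton's method did not converge")


# Hypothetical example: the positive root of x^2 - 2, i.e. sqrt(2).
sqrt2 = newton_root(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
```

Near a simple root the number of correct digits roughly doubles with each step, which is why only a handful of iterations are needed here.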

Differential Equations

Wouldn't it be cool if the motion and development of everything in the world could be described and depicted with equations? Although differential equations are not a mandatory part of any CS curriculum, I believe mastering them provides a new perspective to view the world.

Since differential equations often involve functions of a complex variable, you can refer to the MIT18.04: Complex Variables Functions course notes to fill in the prerequisite knowledge.

MIT18.03: Differential Equations mainly covers the solution of ordinary differential equations, and on this basis, MIT18.152: Partial Differential Equations dives into the modeling and solving of partial differential equations. With the powerful tool of differential equations, you will gain enhanced capabilities in modeling real-world problems and intuitively grasping the essence among various noisy variables.

Advanced Mathematical Topics

As a computer science student, I often hear arguments about the uselessness of mathematics. While I neither agree nor have the authority to oppose such views, if everything is forcibly categorized as useful or useless, it indeed becomes quite dull. Therefore, the following advanced mathematics courses, aimed at senior and even graduate students, are available for those interested.

Convex Optimization

Stanford EE364A: Convex Optimization

Information Theory

MIT6.441: Information Theory

Applied Statistics

MIT18.650: Statistics for Applications

Elementary Number Theory

MIT18.781: Theory of Numbers

Cryptography

Stanford CS255: Cryptography

Programming Fundamentals

Languages are tools, and you choose the right tool for the right job. Since there's no universally perfect tool, there's no universally perfect language.

General

Java

Python

C++

Rust

OCaml

Electronics Fundamentals

Basics of Circuits

For computer science students, understanding basic circuit knowledge and experiencing the entire pipeline from sensor data collection to data analysis and algorithm prediction can be very helpful for future learning and developing computational thinking. EE16A&B: Designing Information Devices and Systems I&II at UC Berkeley are introductory courses for freshmen in electrical engineering. EE16A focuses on collecting and analyzing data from the real environment through circuits, while EE16B focuses on analyzing these collected data to make predictive actions.

Signals and Systems

Signals and Systems is a course I find very worthwhile. Initially, I studied it out of curiosity about Fourier Transform, but after completing it, I was amazed at how Fourier Transform provided a new perspective to view the world, just like differential equations, immersing you in the elegance and magic of precisely depicting the world with mathematics.

MIT 6.003: Signals and Systems provides all course recordings, written assignments, and answers. You can also check out an older version of this course.

UCB EE120: Signals and Systems has very well-written notes on the Fourier Transform and provides many interesting Python programming assignments that put the theories and algorithms of signals and systems into practice.
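To make the Fourier Transform a little more concrete, here is a minimal sketch of a naive discrete Fourier transform in Python. It is an illustration written for this guide under stated assumptions (the function name `dft` and the eight-sample cosine are hypothetical), not code from either course, and real work would use an FFT library instead.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a sequence of samples."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


# Eight samples of exactly one cosine cycle (hypothetical test signal).
n = 8
signal = [math.cos(2 * math.pi * t / n) for t in range(n)]
magnitudes = [abs(c) for c in dft(signal)]
```

For this signal the energy lands in frequency bins 1 and n - 1 and every other bin is numerically zero: the transform reveals the single frequency hidden in the samples.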

Data Structures and Algorithms

Algorithms are the core of computer science and the foundation for almost all professional courses. How to abstract real-world problems into algorithmic problems mathematically and solve them under time and memory constraints using appropriate data structures is the eternal theme of algorithm courses. If you are fed up with your teacher's rote teaching, I highly recommend UC Berkeley's UCB CS61B: Data Structures and Algorithms and Princeton's Coursera: Algorithms I & II. Both courses explain deep ideas in an accessible way and offer rich, interesting programming assignments that connect theory with practice.

Both of these courses are based on Java. If you prefer C/C++, you can refer to Stanford's data structure and basic algorithm course Stanford CS106B/X: Programming Abstractions. For those who prefer Python, you can learn MIT's introductory algorithm course MIT 6.006: Introduction to Algorithms.

For those interested in more advanced algorithms and NP problems, consider UC Berkeley's course on algorithm design and analysis UCB CS170: Efficient Algorithms and Intractable Problems or MIT's advanced algorithms course MIT 6.046: Design and Analysis of Algorithms.

Software Engineering

Introductory Course

There is a fundamental difference between “working” code and high-quality industrial code. Therefore, I highly recommend senior students to take MIT 6.031: Software Construction. Based on Java, this course teaches how to write high-quality code that is bug-resistant, clear, and easy to maintain and modify with rich and detailed reading materials and well-designed programming exercises. From macro data structure design to minor details like how to write comments, following these details and experiences summarized by predecessors can greatly benefit your future programming career.

Professional Course

Of course, if you want to systematically take a software engineering course, I recommend UC Berkeley’s UCB CS169: Software Engineering. However, unlike most software engineering courses, this course does not involve the traditional design and document model that emphasizes various class diagrams, flowcharts, and document design. Instead, it adopts the Agile Development model, which has become popular in recent years, featuring small team rapid iterations and the Software as a Service model using cloud platforms.

Computer Architecture

Introductory Course

Since childhood, I've always heard that the world of computers is made of 0s and 1s, which I didn't understand but was deeply impressed by. If you share this curiosity, consider spending one to two months on the prerequisite-free course Coursera: Nand2Tetris. This comprehensive course starts from 0s and 1s and lets you build a computer by hand and run a Tetris game on it. It covers compilation, virtual machines, assembly, architecture, digital circuits, logic gates, and more, from top to bottom and from software to hardware. Its difficulty is carefully calibrated: it omits many complex details of modern computers and extracts the core essence, so that everyone can follow it. Building a bird's-eye view of the entire computer system early in your undergraduate years is very beneficial.

Professional Course

Of course, if you want to delve into the complex details of modern computer architecture, you still need to take a university-level course UCB CS61C: Great Ideas in Computer Architecture. This course emphasizes practice, and you will hand-write assembly to construct neural networks in projects, build a CPU from scratch, and more, all of which will give you a deeper understanding of computer architecture, beyond the monotony of "fetch, decode, execute, memory access, write back."

Introduction to Computer Systems

Computer systems are a vast and profound topic. Before delving into a specific area, having a macro conceptual understanding of each field and some general design principles will reinforce core and even philosophical concepts in your subsequent in-depth study, rather than being shackled by complex internal details and various tricks. In my opinion, the key to learning systems is to grasp these core concepts to design and implement your own systems.

MIT6.033: Computer System Engineering is MIT's introductory course to systems, covering topics like operating systems, networks, distributed systems, and system security. In addition to the theory, this course also teaches some writing and expression skills, helping you learn how to design, introduce, and analyze your own systems. The accompanying textbook Principles of Computer System Design: An Introduction is also very well written and recommended for reading.

CMU 15-213: Introduction to Computer Systems is CMU’s introductory systems course, covering architecture, operating systems, linking, parallelism, networks, etc., with both breadth and depth. The accompanying textbook Computer Systems: A Programmer's Perspective is also of very high quality and strongly recommended for reading.

Operating Systems

There’s nothing like writing your own kernel to deepen your understanding of operating systems.

Operating systems provide a set of elegant abstractions to virtualize various complex underlying hardware, providing rich functional support for all application software. Understanding the design principles and internal mechanisms of operating systems is greatly beneficial for a programmer who is not satisfied with just being a coder. Out of love for operating systems, I have taken many operating system courses in different colleges, each with its own focus and merits. You can choose based on your interests.

MIT 6.S081: Operating System Engineering, offered by the famous PDOS lab at MIT, features 11 projects that modify xv6, an elegantly implemented Unix-like operating system. This course made me realize that systems work is not about reading slides; it's about writing tens of thousands of lines of code.

UCB CS162: Operating System, UC Berkeley’s operating system course, uses the same Project as Stanford — an educational operating system, Pintos. As the teaching assistant for Peking University’s 2022 and 2023 Spring Semester Operating Systems Course, I introduced and improved this Project. The course resources are fully open-sourced, with details on the course website.

NJU: Operating System Design and Implementation, offered by Professor Yanyan Jiang at Nanjing University, provides an in-depth and accessible explanation of various operating system concepts, combining a unique system perspective with rich code examples. All course content is in Chinese, making it very convenient for students.

HIT OS: Operating System, taught by Professor Zhijun Li at Harbin Institute of Technology, is a Chinese course on operating systems. Based on the Linux 0.11 source code, the course places great emphasis on code practice, explaining the intricacies of operating systems from the student's perspective.

Parallel and Distributed Systems

In recent years, the most common phrase heard in CS lectures is "Moore's Law is coming to an end." As single-core capabilities reach their limits, multi-core and many-core architectures are becoming increasingly important. The changes in hardware necessitate adaptations and changes in the upper-level programming logic. Writing parallel programs has nearly become a mandatory skill for programmers to fully utilize hardware performance. Meanwhile, the rise of deep learning has brought unprecedented demands on computing power and storage, making the deployment and optimization of large-scale clusters a hot topic.

Parallel Computing

CMU 15-418 / Stanford CS149: Parallel Computing takes you deep into the design principles and trade-offs of modern parallel computing architectures. The course teaches you how to fully leverage hardware resources and software programming frameworks—such as CUDA, MPI, and OpenMP—to write high-performance parallel programs.

Distributed Systems

MIT 6.824: Distributed Systems, like MIT 6.S081, is offered by MIT’s renowned PDOS (Parallel and Distributed Operating Systems) lab. The course is taught by Professor Robert Morris, who was once a legendary hacker—famously known for creating the first computer worm, the Morris Worm.

Each lecture focuses on an in-depth reading of a classic paper in the field of distributed systems, through which the course conveys essential principles and key techniques for designing and implementing distributed systems. The course is also famous for its challenging projects: over the course of four progressively difficult programming assignments, students build a key-value store framework based on the Raft consensus algorithm. These projects offer a firsthand experience of the randomness and complexity brought by concurrency and distribution—often felt most acutely during painful debugging sessions.

System Security

Whether or not you chose computer science because of a youthful dream of becoming a hacker, the reality is that becoming one is a long and difficult journey.

Theoretical Courses

UCB CS161: Computer Security at UC Berkeley covers stack attacks, cryptography, website security, network security, and more.

SU SEED Labs at Syracuse University, supported by a $1.3 million grant from the NSF, has developed hands-on experimental exercises (called SEED Labs) for cybersecurity education. The course emphasizes both theoretical teaching and practical exercises, including detailed open-source lectures, video tutorials, textbooks (printed in multiple languages), and a ready-to-use virtual machine and Docker-based attack-defense environment. This project is currently used by 1,050 institutions worldwide and covers a wide range of topics in computer and information security, including software security, network security, web security, operating system security, and mobile app security.

Practical Courses

After mastering this theoretical knowledge, it's essential to cultivate and hone these "hacker skills" in practice. CTF competitions are a popular way to comprehensively test your understanding and application of computer knowledge in various fields. Peking University has also successfully held the 0th and 1st editions of its own CTF competition, and everyone is encouraged to take part and sharpen their skills through practice. Here are some resources I use for learning (and relaxing):

Computer Networks

There’s nothing like writing your own TCP/IP protocol stack to deepen your understanding of computer networks.

The renowned Stanford CS144: Computer Network includes 8 projects that guide you in implementing the entire TCP/IP protocol stack.

If you're mainly interested in gaining a theoretical understanding of computer networks, it's recommended to read the textbook that accompanies the course UCB CS168.

Database Systems

There’s nothing like building your own relational database to deepen your understanding of database systems.

CMU's famous database course CMU 15-445: Introduction to Database Systems guides you through 4 projects that add various functionalities to the educational relational database bustub. The experimental evaluation framework is also open-source, making it very suitable for self-learning. The course experiments also use many new features of C++11, offering a great opportunity to strengthen C++ coding skills.

Berkeley, as the birthplace of the famous open-source database PostgreSQL, has its own course UCB CS186: Introduction to Database Systems, where you will implement a relational database in Java that supports concurrent SQL queries, B+ tree indexing, and fault recovery.

Compiler Theory

There’s nothing like writing your own compiler to deepen your understanding of compilers.

Stanford CS143: Compilers guides you through the process of writing a compiler.

Web Development

Front-end and back-end development are often overlooked in standard computer science curricula, but in reality, having these skills can be extremely beneficial—for example, creating your own personal website or building a polished demo page for a course project.

If you're looking for a quick, two-week crash course, I recommend the MIT Web Development Course. For a more comprehensive and structured learning experience, check out Stanford CS142: Web Applications.

Computer Graphics

I personally don't have much background in computer graphics, so I've collected a selection of high-quality courses recommended by the community for those interested in exploring the field.

Data Science

Data science, machine learning, and deep learning are closely related, with a focus on practical application. Berkeley's UCB Data100: Principles and Techniques of Data Science lets you master various data analysis tools and algorithms through extensive programming exercises. The course guides you through extracting desired results from massive datasets and making predictions about future data or user behavior. For those looking to learn industrial-level data mining and analysis techniques, Stanford's big data mining course CS246: Mining Massive Data Sets is an option.

Artificial Intelligence

Artificial intelligence has been one of the hottest fields in computer science over the past decade. If you're not content with just hearing about AI advancements in the media and want to delve into the subject, I highly recommend Harvard's renowned CS50 series AI course Harvard CS50: Introduction to AI with Python. The course is concise and covers several major branches of traditional AI, supplemented with rich and interesting Python programming exercises to reinforce your understanding of AI algorithms. However, the content is somewhat simplified for online learners and doesn't delve into deep mathematical theories. For a more systematic and in-depth study, consider an undergraduate-level course like Berkeley's UCB CS188: Introduction to Artificial Intelligence. This course's projects feature the classic game "Pac-Man," allowing you to use AI algorithms to play the game, which is very fun.

Machine Learning

The most significant recent progress in the field of machine learning is the emergence of deep learning, a branch based on deep neural networks. However, many algorithms based on statistical learning are still widely used in data analysis. If you're new to machine learning and don't want to get bogged down in complex mathematical proofs, start with Andrew Ng's Coursera: Machine Learning. This course is well known in the field, and Andrew Ng, with his profound theoretical knowledge and excellent presentation skills, makes many complex algorithms accessible and practical. The accompanying assignments are also of high quality, helping you get started quickly.

However, completing this course will only give you a general understanding of the field of machine learning. To truly understand the mathematical principles behind these "magical" algorithms or to engage in related research, you need a more "mathematical" course, such as Stanford CS229: Machine Learning or UCB CS189: Introduction to Machine Learning.

If you plan to pursue scientific research in machine learning theory, you can refer to the advanced learning roadmap shared by Yao Fu, which includes more in-depth, graduate-level courses.

Deep Learning

The popularity of AlphaGo a few years ago brought deep learning to the public eye, leading many universities to establish related majors. Many other areas of computer science also use deep learning technology for research, so regardless of your field, you will likely encounter some needs related to neural networks and deep learning. For a quick introduction, I again recommend Andrew Ng's Coursera: Deep Learning, a top-rated course on Coursera. Additionally, if you find English-language courses challenging, consider Professor Hung-yi Lee's course National Taiwan University: Machine Learning. Although titled "Machine Learning," this course covers almost all areas of deep learning and is very comprehensive, making it suitable for getting a broad overview of the field. The professor is also very humorous, with frequent witty remarks in class.

Due to the rapid development of deep learning, there are now many research branches. For further in-depth study, consider the following representative courses:

Computer Vision

Natural Language Processing

Graph Neural Networks

Reinforcement Learning

Deep Learning Systems

As deep learning models grow in importance and demand increasing computational resources, optimizing the underlying systems for training and inference has become increasingly critical. For those looking to enter this field, a highly recommended resource is CMU 10-414/714: Deep Learning Systems. This course provides a comprehensive "full-stack" understanding of deep learning systems—from high-level architectural design of modern frameworks, to the principles and implementation of automatic differentiation, down to low-level hardware acceleration and real-world deployment.

To deepen theoretical understanding, students are tasked with building a deep learning library from scratch, called Needle, as part of the coursework. This library supports automatic differentiation on computational graphs, GPU-based acceleration, and includes modules for loss functions, data loaders, and optimizers. On top of this, students will implement several common neural network architectures including CNNs, RNNs, LSTMs, and Transformers.
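To give a feel for what "automatic differentiation on computational graphs" means, here is a minimal scalar reverse-mode sketch in Python. It is an illustration only: the `Value` class and its methods are hypothetical names chosen for this example and are not Needle's actual API, which works on tensors and is far more complete.

```python
class Value:
    """A scalar node in a computational graph (illustrative, not Needle's API)."""

    def __init__(self, data, parents=(), backward_rule=None):
        self.data = data
        self.grad = 0.0
        self.parents = parents
        # backward_rule maps this node's gradient to one gradient per parent.
        self.backward_rule = backward_rule

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), lambda g: (g, g))

    def __mul__(self, other):
        return Value(
            self.data * other.data,
            (self, other),
            lambda g: (g * other.data, g * self.data),
        )

    def backward(self):
        # Build a topological order so that each node is processed only
        # after every node that consumes it has pushed its gradient back.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node.backward_rule is None:
                continue
            for parent, parent_grad in zip(node.parents, node.backward_rule(node.grad)):
                parent.grad += parent_grad


x = Value(3.0)
y = Value(4.0)
z = x * y + x  # dz/dx = y + 1, dz/dy = x
z.backward()
```

Gradients are accumulated with `+=` because `x` feeds into the result along two paths; a framework generalizes the same idea to tensors and to many more operators.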

For those with foundational knowledge, the next step would be to explore MIT 6.5940: TinyML and Efficient Deep Learning Computing, taught by Professor Song Han. This course dives into techniques for making neural networks more efficient, such as pruning, quantization, distillation, and neural architecture search. It also covers cutting-edge system optimizations for advanced models, including large language models.
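Of the efficiency techniques listed above, quantization is the easiest to show in a few lines. The following is a minimal sketch of symmetric linear quantization to 8-bit integers in plain Python; the function names and the sample weights are assumptions made for illustration, and the course itself covers considerably more refined schemes.

```python
def quantize_int8(weights):
    """Symmetric linear quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale maps the largest magnitude onto 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate floating-point weights."""
    return [q * scale for q in quantized]


# Hypothetical weights of a tiny layer.
weights = [0.12, -0.5, 0.33, 1.27, -1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now fits in one byte instead of four, and the reconstruction error per weight is bounded by half the scale, which is the basic trade-off quantization makes.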

Deep Generative Models

With the explosive popularity of large language models, understanding the principles behind them is essential to staying at the forefront of the field. You can refer to my recommended learning roadmap for a guided approach to studying this area.

Customize Your Course Map

Better to teach fishing than to give fish.

The course map above inevitably carries strong personal preferences and may not suit everyone. It is more intended to serve as a starting point for exploration. If you want to select your own areas of interest for study, you can refer to the following resources:


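Preface

Latest update: Release v1.1.0 is out 🎉

This is a self-study guide to computer science, and a memento of my three years of self-learning at university.

It is also a gift to the younger students at Peking University's School of EECS. If this book helps your time there even a little, that will be a great encouragement and comfort to me.

The book currently includes the following parts (if you have other good suggestions, or would like to join the contributors, feel free to email zhongyinmin@pku.edu.cn or ask in an issue):

  • User guide for this book: because the book covers so many resources, I have written usage guides tailored to readers with different amounts of free time and different learning goals.
  • A reference CS study plan: a comprehensive, systematic CS self-study plan based on my own self-learning experience.
  • Must-learn tools: an introduction to productivity tools for CSers, such as IDEs, getting past the firewall, StackOverflow, Git, GitHub, Vim, LaTeX, GNU Make, Docker, workflows, and so on.
  • Recommended classic books: do you struggle with textbooks that are obscure and hard to follow? Stop blaming yourself; the textbook may simply be badly written. Anyone who has read CSAPP will appreciate how much a good book matters. I list must-read books and resource links for each area of computer science.
  • A collection of high-quality CS courses from China and abroad: I organize by category the high-quality courses I have taken, along with those contributed by the open-source community, describe what each course covers, and give advice on self-studying it. Most courses have a dedicated repository that maintains related resources and assignment implementations for reference.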

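Where the dream began: CS61A

When I entered university as a freshman, I was a complete novice who knew nothing about computers. I installed a Visual Studio that took up tens of gigabytes and fought with the OJ (online judge) day in and day out. Thanks to my high school math foundation I did reasonably well in math courses, but in my major courses I could only look up to the competitive-programming veterans. To me, programming meant opening that bulky IDE, creating a command-line project whose purpose I did not really understand, writing cin, cout, and for loops, and then cycling through CE, RE, and WA. I desperately wanted to learn well but did not know how; I listened carefully in class yet still could not solve the problems; and I got through homework purely by throwing time at it. I still keep on my computer the source code of my final project for Introduction to Computing from that first semester: a single 1,200-line C++ file with no header files, no classes, no encapsulation, no unit tests, no Makefile, and no Git. Its only merit was that it did run; its shortcomings were the complement of "it runs". For a while I wondered whether I was simply not cut out for computer science, because every childhood fantasy I had about geeks had been shattered by my first semester.

The turning point came during the winter break of my freshman year, when on a whim I decided to learn Python. I happened to see someone on Zhihu recommend CS61A, UC Berkeley's introductory course for freshmen, which is taught in Python. I will never forget the moment I opened the CS61A course website that day: like Columbus discovering the New World, I had opened the door to a new world.

I finished the course in three weeks straight. For the first time I felt that CS could be this rewarding and fun to learn, and that such a superb course could exist.

To avoid any suspicion of blindly worshipping everything foreign, let me describe the experience of self-studying CS61A purely from a student's point of view:

  • A self-hosted course website: a single site brings all course resources together, with a well-organized schedule, links to all slides, homework, and discussion files, a detailed and explicit grading policy, and past exams with solutions. Aesthetics aside, such a site is convenient for students and keeps resources fair and transparent.

  • A textbook written by the course professor himself: the instructor adapted MIT's classic textbook Structure and Interpretation of Computer Programs (SICP) to Python (the original is based on Scheme), which keeps the lectures consistent with the textbook while adding more detail. It is a sincere effort. The whole book is open source and can be read online.

  • A dazzling abundance of assignments: 14 labs to reinforce in-class material, 10 homework assignments, and 4 projects with over a thousand lines of code each. Unlike the OJ-style and Word-document-style assignments we are used to, every assignment comes with a complete code skeleton and painstakingly detailed instructions. Every project has a thorough handout and a fully automated grading script. CS61A even built a dedicated system for automated submission and grading (reportedly written up in a paper). Of course, some will say, "If most of the few thousand lines in a project were written for you by the TAs, what can you possibly learn?" I disagree. For a beginner who has only just met computers and stumbles even when installing Python, such a skeleton lets you focus on consolidating the core ideas from lecture, gives you the sense of achievement of "I've only been learning for a month and I can already make a small game!", and gives you the chance to read high-quality code written by others and make it your own. In the early undergraduate years, I think this kind of skeleton code does nothing but good. It is just hard on the instructors and TAs, because developing such assignments clearly takes a huge time investment and years of iteration.

  • Weekly discussion sections: TAs go over difficult points and sample exam questions. The exercises are all typeset in LaTeX, are quite polished, and come with detailed solutions, so students can promptly fill gaps and consolidate what they have learned.

A course like this requires no computing background whatsoever; all you need is effort, care, and time. The old feeling of having energy but nowhere to apply it, of putting in ever more time without any reward, vanished. It suited me perfectly, and I have loved self-study ever since.

Imagine someone breaking difficult material down into bite-size pieces and presenting it vividly and plainly, with a wide variety of fancy-sounding projects to consolidate the theory. You feel that they are truly doing everything they can to help you master the course, and that failing to learn it well would be an insult to the people who built it.

If you think I am exaggerating, try starting with CS61A, because that is where my dream began.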

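Why I wrote this book

When I served as a teaching assistant for the Computer Systems: A Programmer's Perspective (CSAPP) course in the fall semester of 2020, I had already been teaching myself for more than a year. Throughout that time I thoroughly enjoyed this way of learning, and to share the joy I put together a repository of CS self-study materials for the students in my seminar section. It was purely a whim at the time, since I did not dare openly encourage people to skip class and study on their own.

After another year of maintenance, though, the repository had grown quite rich, covering the vast majority of courses in the computer science, intelligence science, and software engineering departments. I also created a separate GitHub repository for each course, collecting the self-study materials I used and my assignment implementations.

It was not until senior year, when I started rounding up credits to graduate, that I opened my degree program and found that it had become a subset of my self-study repository, only two and a half years after I began teaching myself. So a bold idea came to mind: perhaps I could build a self-study version of the degree program, recording the pitfalls I hit and the paths I took over these three years, in the hope of making a modest contribution for the students who come after me.

Suppose that in under three years you can build the whole foundation of CS: reasonably solid math and coding skills, the experience of dozens of projects with thousands of lines of code each, command of mainstream languages such as C/C++/Java/JS/Python/Go/Rust, and exposure to algorithms, circuits, architecture, networking, operating systems, compilers, artificial intelligence, machine learning, computer vision, natural language processing, reinforcement learning, cryptography, information theory, game theory, numerical analysis, statistics, distributed systems, databases, graphics, web development, cloud services, supercomputing, and more. Then I think you will have the confidence to choose the direction that interests you, and you will be quite competitive whether you go into industry or research.

I firmly believe that if you have had the patience to listen to me ramble this far, you do not lack the ability to learn CS well; you just have not had a good teacher to teach you a good course. And I will do my best, drawing on my three years of experience, to pick such courses for you.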

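The benefits of self-study

For me, the biggest benefit of self-study is that I can set the pace entirely according to my own progress. For difficult material, I can rewatch the videos, search Google for related content, and ask questions on StackOverflow until I understand it completely. For material I pick up quickly, I can skim through at two or even three times the speed.

Another major benefit is learning from many sources. For the core courses of a computer science department (architecture, networking, operating systems, and compilers), I have taken versions from several different universities for almost every one. Different textbooks, different emphases, and different projects greatly broaden your horizons and also correct things you had misunderstood.

The third benefit is freedom of time. Free time at university is already fairly flexible, and if you do not have to attend class you can arrange your study time and pace however you like. In my sophomore year the pandemic kept me at home for more than half a year, and after returning to campus I hardly ever went to a classroom in person, with no effect on my GPA.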

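The drawbacks of self-study

Of course, as a loyal advocate of CS self-study, I have to admit that it has drawbacks too.

The first is the difficulty of communication. I am someone who loves asking questions, and I like to chase down every point I do not understand. But when you are sitting in front of a screen and the instructor says something you do not follow, you cannot travel down the network cable to the other end and ask. I try to make up for this with independent thinking and good use of Google, but it would be wonderful to have a few like-minded companions to study with. For how to set up study groups, see the tutorial in the repository README.

The second is that these courses are almost all in English. Videos, slides, and assignments are all in English, so there is a certain barrier to entry. When collecting video resources I try to find reposted versions with Chinese subtitles, but most courses have only machine-translated subtitles or none at all, and the slides and assignments are always in English. Still, I think this is a challenge worth overcoming, because, reluctant as I am to admit it, much of the best documentation, and many of the best forums and websites in computing today, are in English. Until the red flag flies over the whole world, building the habit of reading English has its benefits (said with tongue firmly in cheek).

The third, and I think the hardest, is self-discipline. Having no deadlines can be a frightening thing, especially as you go deeper, because many overseas courses are brutally demanding. You need enough drive to make yourself sit down, read dozens of pages of project handout, understand a code skeleton of over a thousand lines, and endure hours of debugging. And for all of this there are no credits, no GPA, no teachers, and no classmates, only one belief: you are getting stronger.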

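Who this book is for

As I said in the preface, anyone who wants to teach themselves computer science can use this book. If you already have some background and are only interested in a particular area, you can pick out what interests you. And if you are a complete novice just entering university, as I once was, I hope this book can be your walkthrough, helping you acquire the knowledge and skills you need in the least time. In a sense, the book is more like a course search engine ranked by my own experience, letting you sample the best computer science courses from the world's top universities without leaving home.

Of course, as an undergraduate who has not yet graduated, I am keenly aware that I have neither the ability nor the right to preach any particular way of learning. I only hope this material helps those with the same will and perseverance to teach themselves avoid some detours and enjoy a richer, more varied, and more satisfying learning experience.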

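Special Thanks

Here I would like to thank, sincerely and with deep respect, all the professors who have made their course materials freely available. These courses distill the experience and effort of decades of teaching, yet they chose to selflessly give everyone access to such high-quality CS education. Without them, my college life would not have been so full and happy. Many of them, after I sent a thank-you email, even wrote back replies of a hundred words or more, which moved me deeply. They also constantly remind me that whatever I do, I should put my heart into doing it well, whether in study, in research, or in how I conduct myself.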

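Want to Become a Contributor?

One person's effort only goes so far, and I wrote this book by staying up late in whatever time I could spare from demanding research work, so it inevitably has shortcomings. In addition, since I work on systems, many of the courses lean toward that area, and there is comparatively less on mathematics, theoretical computer science, and advanced algorithms. If you would like to share your self-study experience and resources in other areas, you can open a Pull Request directly in the project, or contact me by email (zhongyinmin@pku.edu.cn).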

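Setting Up Study Groups

This book supports comments on its pages. If you want to self-study a course, you can create a group chat (QQ or WeChat both work) and then post a comment at the bottom of the corresponding course page, stating your learning goals and how to join the group. Many people have also set up similar groups in the issues in the past, and you are free to join those directly.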

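Buy the Author an Afternoon Tea

The content of this book is completely open source and free. If you find the project genuinely helpful, you can star the repository or buy the author an afternoon tea.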


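Preface

🎉🎉 Release v1.2.0: added a learning roadmap for deep generative models 🎉🎉

This is a self-study guide to computer science, and also a memento of my three years of self-study in college.

It is also a gift to the younger students at Peking University's School of EECS. If this book helps your time at EECS even a tiny bit, that will be a great encouragement and comfort to me.

The book currently includes the following parts (if you have other good suggestions, or would like to become a contributor, feel free to email zhongyinmin@pku.edu.cn or open an issue):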

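  • How to use this book: because the book covers so many resources, I have written usage guides tailored to readers with different amounts of free time and different learning goals.
  • A reference CS study plan: a comprehensive, systematic CS self-study plan based on my own experience.
  • Essential tools: an introduction to productivity tools for CS students, such as IDEs, getting around the firewall, StackOverflow, Git, GitHub, Vim, LaTeX, GNU Make, Docker, workflows, and so on.
  • Recommended classic books: do you struggle with textbooks that are obscure and incomprehensible? Stop blaming yourself; the textbook may simply be badly written. Anyone who has read CSAPP will appreciate how much a good book matters. I list must-read books and resource links for each area of computer science.
  • A collection of high-quality CS courses from China and abroad: I sort the high-quality courses I have taken, along with those contributed by the open-source community, into categories, describe what each course covers and what makes it distinctive, and give corresponding self-study advice. Most courses have a dedicated repository that maintains related resources and assignment implementations for reference.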

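Where the Dream Began: CS61A

When I entered university as a freshman I knew nothing about computers. I installed Visual Studio, all tens of gigabytes of it, and fought the OJ (online judge) to the death every day. Thanks to my high school math background I did fine in math classes, but in my major courses I could only look up to the students from competitive programming. When it came to programming, all I knew was to open that heavyweight IDE, create a console project whose purpose I did not really understand, and then cycle through cin, cout, and for loops, followed by a loop of CE, RE, and WA verdicts (compile error, runtime error, wrong answer). I was desperate to learn well but had no idea how, I listened carefully in class but still could not solve the problems, and I got through homework by sheer brute-force hours. I still keep on my computer the source code of my final project for Introduction to Computing from my first semester: a single 1,200-line C++ file with no header files, no classes, no encapsulation, no unit tests, no Makefile, and no Git. Its only virtue was that it did run; its flaws were the complement of "it runs". For a while I wondered whether I was cut out for computer science at all, because everything I had imagined about geeks as a kid had been thoroughly shattered by that first semester.

The turning point came during the winter break of my freshman year, when on a whim I decided to learn Python. I happened to see someone on Zhihu recommend CS61A, described as UC Berkeley's introductory course for first-year students, taught in Python. I will never forget the moment I opened the CS61A course website that day. Like Columbus discovering the New World, I had opened the door to a new one.

I finished the course in one three-week stretch. For the first time it made me feel that CS could be this fulfilling and this fun to learn, and that courses this good actually existed.

So that I am not accused of blindly worshipping everything foreign, let me describe what self-studying CS61A was like purely from a student's point of view: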

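  • A self-hosted course website: a single site brings together all the course resources, including a well-organized schedule, links to every set of slides, homework, and discussion materials, a clear and detailed grading policy, and past exams with solutions. Looks aside, a site like this is convenient for students and keeps the resources fair and transparent.

  • A textbook written by the instructor: the instructor of CS61A adapted MIT's classic textbook Structure and Interpretation of Computer Programs (SICP) to Python (the original is based on Scheme), which keeps lectures and textbook consistent while adding more detail. It is a real labor of love. The whole book is also open source and can be read online.

  • A dazzling amount of coursework: 14 labs to reinforce what is covered in lecture, 10 homework assignments, and 4 projects, each with over a thousand lines of code. Unlike the OJ and Word-document assignments we are used to, every assignment comes with a complete code framework and step-by-step instructions. Each project has a detailed handout and a fully automated grading script. CS61A even developed its own automated submission and grading system (reportedly with a published paper). Some will say, "If most of the few thousand lines in a project were already written by the TAs, what can you still learn?" I disagree. For a beginner who has just met computers and can barely install Python, such a framework lets you focus on reinforcing the core concepts from lecture, gives you the sense of achievement of "I have only been learning for a month and I can already build a small game!", and offers a chance to read and learn from other people's high-quality code and make it your own. For the early years of study, I think this kind of framework is all benefit and no harm. It is just hard on the instructors and TAs, since developing such assignments obviously takes a huge investment of time and years of iteration.

  • Weekly discussion sections: the TAs go over difficult topics and sample exam questions. The exercises are all typeset in LaTeX, quite polished, and come with detailed solutions, so students can find and fill gaps in their understanding promptly.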

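A course like this requires no computing background whatsoever. All you need is effort, care, and time. The old feeling of having energy with nowhere to put it, of pouring in hours and getting nothing back, vanished from then on. It suited me perfectly, and I have loved self-study ever since.

Imagine that someone breaks difficult material down into small pieces and presents it to you in a vivid, plain way, with a wide variety of fancy-sounding projects to reinforce the theory. You feel that they are truly doing everything they can to help you master the course, and that not learning it well would be an insult to the people who built it.

If you think I am exaggerating, why not start with CS61A, because that is where my dream began.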
