Related Resources:
- Lecture Website: CS336 LLM from Scratch
- Lecture Recordings: YouTube Playlist
- My Solution Repo: GitHub
About this Course:
- This course has 17 lectures and 5 assignments in total.
- It might take around 200 hours to finish all the lectures and assignments.
For those who have limited time, I suggest prioritizing the lectures and assignments in the following order:
- LECTURE 1, 2, 3, 4 & Assignment01: After completing these, you will have a solid understanding of the fundamentals of LLMs, such as the Transformer Language Model architecture, attention mechanism, Mixture of Experts, and the training process of LLMs using autoregressive language modeling.
- LECTURE 15, 16, 17 & Assignment05: These cover advanced topics in LLM alignment, including algorithms such as SFT, RLHF (PPO, DPO), and RLVR (GRPO, Dr. GRPO). After completing these, you will understand how to align LLMs with human preferences and train a reasoning LLM.
- LECTURE 5, 6, 7, 8 & Assignment02: These focus on the hardware and parallelism techniques for training large models. After completing these, you will understand how to train LLMs efficiently on distributed systems using data parallelism, model parallelism, and pipeline parallelism, how to speed up training by making better use of GPUs, and how FlashAttention works and is implemented.
- The remaining lectures and assignments are also important, but they can be studied later based on your interests and needs. These include:
  - Lecture 9 & 11: Scaling Laws
  - Lecture 10: Inference Optimization
  - Lecture 12 & Assignment 03: Evaluation of LLMs
  - Lecture 13, 14 & Assignment 04: Data Collection and Processing
1 Lecture Notes
2 Assignment Solutions




