On this page, I share my learning notes on large language models (LLMs). They are mostly based on the Stanford CS336 course, with some additional notes on topics not covered in the course, and will be updated regularly as I learn more about LLMs. These notes are not meant to be comprehensive, but rather a summary of the key concepts and ideas that I find interesting and useful. Besides LLMs, I will also cover related topics such as multimodal LLMs, as well as some representative and interesting papers and projects in the field. I hope these notes are helpful for anyone interested in learning about LLMs, and for myself as a way to review and consolidate my knowledge.
LLM Learning Notes
Stanford CS336: LLM from Scratch Assignments
| Lecture Number | Related Post |
|---|---|
| Lecture 3&4: Architecture, hyperparameters, attention, MoE | LLM Architecture |
| Lecture 5&6: GPUs, TPUs, Kernels, Triton, XLA | LLM Alignment |
| Lecture 7&8: Parallelism | LLM Data |
| Lecture 9&11: Scaling Laws | LLM Scaling Laws |
| Lecture 10: Inference | LLM Data |
| Lecture 12: Evaluation | LLM Evaluation |
| Lecture 13&14: Data | LLM Data |
| Lecture 15, 16 & 17: Alignment | LLM Alignment |
The Stanford CS336 course introduces the fundamental concepts of LLMs and how to build them from scratch. The course is designed to be hands-on, and the assignments are a key part of the learning process. I share my solutions to those assignments in this section.

