On this page, I share my learning notes on large language models (LLMs). They are mostly based on the Stanford CS336 course, with some additional notes on topics not covered in the course. The page will be updated regularly as I learn more about LLMs. These notes are not meant to be comprehensive, but rather a summary of the key concepts and ideas that I find interesting and useful. Besides LLMs, I will also cover related topics such as multimodal LLMs, as well as some representative and interesting papers and projects in the field. I hope these notes are helpful to anyone interested in learning about LLMs, and also to me as a way to review and consolidate my own knowledge.

LLM Learning Notes

Stanford CS336: LLM from Scratch Assignments

Lecture Number	Related Post
Lecture 3&4: Architecture, hyperparameters, attention, MoE	LLM Architecture
Lecture 5&6: GPUs, TPUs, Kernels, Triton, XLA	GPU Speedup
Lecture 7&8: Parallelism	Parallelism
Lecture 9&11: Scaling Laws	Scaling Laws
Lecture 10: Inference	Inference
Lecture 12: Evaluation	Evaluation
Lecture 13&14: Data	Data
Lecture 15, 16 & 17: Alignment	LLM Alignment, GRPOs

The Stanford CS336 course introduces the fundamental concepts of LLMs and how to build them from scratch. The course is designed to be hands-on, and the assignments are a key part of the learning process. I share my solutions to those assignments in this section.

Related Courses and Resources
