Where LLMs and Cosine Similarity Fit Into the Stack

Large Language Models (LLMs) sit at the top of this stack: enormous neural networks built from layers of tensor operations running on GPUs. At their core, LLMs are prediction systems trained to model relationships between tokens, concepts, and patterns in language. Every word, sentence, or document is converted into high-dimensional numerical representations …

Matrices, Tensors, TensorFlow, and the CUDA Stack: The Mathematics and Infrastructure Behind Modern AI

Modern AI Runs on Mathematics

Modern AI looks magical from the outside. You type a prompt into ChatGPT, an image appears from a diffusion model, or a voice assistant responds naturally in real time. Underneath all of it is something surprisingly fundamental: massive amounts of matrix multiplication. Modern AI is built on layers that stack …