AI Learning Hub

A growing personal map of AI concepts, buzzwords, and practical knowledge — explained simply with analogies.

This page organizes AI concepts by category so I can learn the language of AI step by step.

📋 All (257 concepts)

Recommended Start

🤖

Big Picture

Artificial Intelligence

Explanation: AI is the broad field of making computers perform tasks that normally require human intelligence, such as understanding language, recognizing images, making decisions, or solving problems.

Analogy: Like teaching a computer to act as a helpful assistant that can see patterns, answer questions, and make suggestions.

Beginner · AI · basics · intelligence

Recommended Start

📊

Learning From Data

Machine Learning

Explanation: Machine learning is a type of AI where systems learn patterns from data instead of being explicitly programmed with every rule.

Analogy: Instead of giving a child every rule about dogs, you show many dog pictures until they learn what dogs usually look like.

Beginner · ML · data · patterns

Recommended Start

🧠

Neural Networks

Deep Learning

Explanation: Deep learning is a type of machine learning that uses many layers of artificial neurons to learn complex patterns.

Analogy: Like a team of people passing information through multiple review levels, where each level notices something more advanced.

Beginner · deep learning · neural networks · layers

🛢️

Fuel for AI

Data

Explanation: Data is the information AI learns from, such as text, numbers, images, audio, video, transactions, or user behavior.

Analogy: Data is like the ingredients used to cook an AI model. Bad ingredients lead to a bad meal.

Beginner · data · dataset · training

🗂️

Organized Data

Dataset

Explanation: A dataset is a structured collection of examples used to train, test, or evaluate a model.

Analogy: Like a workbook full of practice problems and answers.

Beginner · dataset · examples · training data

✅

Learning With Answers

Supervised Learning

Explanation: Supervised learning trains a model using examples that already include the correct answers.

Analogy: Like studying with an answer key. You try a problem, compare with the correct answer, and improve.

Beginner · supervised learning · labels · prediction

🔍

Finding Patterns

Unsupervised Learning

Explanation: Unsupervised learning finds patterns or groups in data without being given correct answers.

Analogy: Like sorting a pile of mixed coins by similarity without anyone telling you the coin types.

Beginner · clustering · patterns · unlabeled data

🎮

Learning by Reward

Reinforcement Learning

Explanation: Reinforcement learning trains an agent to make decisions by rewarding good actions and penalizing bad ones.

Analogy: Like training a dog with treats, or a video game player learning which moves earn more points.

Intermediate · reinforcement learning · rewards · agent

🧩

Input Signals

Features

Explanation: Features are the input pieces of information a model uses to make a prediction.

Analogy: For home price prediction, features are things like location, square footage, bedrooms, and school rating.

Beginner · features · input · prediction

🏷️

Correct Answers

Labels

Explanation: Labels are the correct answers used during supervised learning.

Analogy: If the model sees an image, the label tells it whether the image is a cat, dog, car, or house.

Beginner · labels · answers · supervised learning

🧾

Predicting Categories

Classification

Explanation: Classification means predicting a category or class.

Analogy: Sorting emails into spam or not spam.

Beginner · classification · categories · prediction

📈

Predicting Numbers

Regression

Explanation: Regression means predicting a number.

Analogy: Predicting a home price, stock price, temperature, or delivery time.

Beginner · regression · numeric prediction

🏋️

Learning Set

Training Data

Explanation: Training data is the data used to teach the model.

Analogy: Like practice questions before an exam.

Beginner · training · dataset · learning

🧪

Final Check

Test Data

Explanation: Test data is separate data used to check how well the model performs on examples it has not seen.

Analogy: Like a final exam with new questions.

Beginner · test data · evaluation · generalization
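To make the training/test idea concrete, here is a minimal sketch of splitting one dataset into the two sets. The function name `train_test_split` and the 20% default are my own choices for illustration, not a reference to any particular library.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_test = max(1, int(len(shuffled) * test_fraction))
    # The held-out slice is never shown to the model during training.
    return shuffled[n_test:], shuffled[:n_test]

train_set, test_set = train_test_split(range(10), test_fraction=0.2)
print(len(train_set), "training examples,", len(test_set), "test examples")
```

The key property is that no example lands in both sets, so the test score reflects unseen data.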

Recommended Start

🧠

Core Unit

Artificial Neuron

Explanation: The smallest building block of a neural network. It receives numbers as inputs, does some math, and passes one output number to the next layer.

Analogy: Like a dimmer switch that receives signals from multiple wires, adds them up, and decides how brightly to glow.

Beginner · neuron · neural network · deep learning

⚖️

Parameters

Weights & Bias

Explanation: Adjustable numbers inside neurons. Weights control how much each input matters; bias shifts the output up or down. Training finds the best values.

Analogy: Like a recipe. Weights are ingredient amounts; bias is the oven setting. Training finds the best recipe.

Beginner · weights · bias · parameters
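A single artificial neuron with weights and a bias can be sketched in a few lines. The sigmoid activation and the example numbers below are assumptions chosen for illustration; real networks use many neurons and often other activations.

```python
import math

def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Each input is scaled by its weight, the results are added up,
    # and the bias shifts the total before the activation is applied.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)

output = neuron(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(output)
```

Raising the bias pushes the output up for the same inputs, which is the "shift" described above.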

➡️

Computation

Forward Pass

Explanation: Data flows from input through the network to produce a prediction. No learning happens here; it is just calculation.

Analogy: Like answering a quiz using what you currently know.

Input → Hidden Layers → Output

Beginner · forward pass · prediction · computation
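The Input → Hidden Layers → Output flow can be sketched with a tiny hand-wired network. The layer sizes, weights, and helper names (`layer`, `forward`) are made up for illustration; nothing here is learned.

```python
def relu(z):
    """A common activation: negative values become 0."""
    return max(0.0, z)

def layer(inputs, weights, biases, activation):
    # One output per row of weights: weighted sum + bias, then activation.
    return [
        activation(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs):
    # Hidden layer: 2 inputs -> 2 neurons (hypothetical fixed weights).
    hidden = layer(inputs, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu)
    # Output layer: 2 hidden values -> 1 number, no activation.
    output = layer(hidden, [[1.0, 2.0]], [0.5], lambda z: z)
    return output[0]

print(forward([2.0, 1.0]))
```

Nothing is updated while this runs; the weights are only read, which is why a forward pass is "just calculation".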

🚦

Non-linearity

Activation Function

Explanation: A math function applied to a neuron's weighted sum. It decides whether and how strongly the neuron should fire, and because it is non-linear, it lets networks learn patterns that a straight line cannot capture.

Analogy: Like a bouncer at a club deciding who gets in.

Beginner · activation · ReLU · sigmoid · neural network

📏

Error Measurement

Loss Function

Explanation: A formula that measures how wrong the model's prediction is compared to the correct answer.

Analogy: Like GPS measuring how far your current route is from the destination.

Prediction vs Actual = Loss

Beginner · loss · error · training
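One common loss function is mean squared error, sketched below. It is only one of many possible choices, picked here because the arithmetic is easy to follow.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between prediction and actual."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

print(mean_squared_error([2.0, 4.0], [0.0, 0.0]))  # (4 + 16) / 2
```

A perfect prediction gives a loss of 0, and bigger mistakes are punished more than small ones because the error is squared.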

🔄

Learning Algorithm

Backpropagation

Explanation: Backpropagation works backward through the network to calculate how much each weight contributed to the error (its gradient), so the optimizer knows how to adjust it.

Analogy: Like a coach watching a game replay in reverse to see who made each mistake.

Intermediate · backpropagation · gradients · training
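For a model with a single weight, the backward step fits in a few lines. This is a sketch under simplifying assumptions: the model is `prediction = w * x`, the loss is squared error, and the function name is hypothetical. Real backpropagation repeats the same chain rule across every layer.

```python
def loss_and_gradient(w, x, target):
    prediction = w * x           # forward pass
    error = prediction - target
    loss = error ** 2
    # Chain rule: d(loss)/dw = d(loss)/d(prediction) * d(prediction)/dw
    grad_w = (2 * error) * x
    return loss, grad_w

loss, grad = loss_and_gradient(w=1.0, x=2.0, target=6.0)
print(loss, grad)
```

A negative gradient means increasing `w` would reduce the loss, so the next update should push `w` upward.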

🎚️

Hyperparameter

Learning Rate

Explanation: Learning rate controls how big a step the model takes when updating weights.

Analogy: Like adjusting shower temperature. Big turns may overshoot; tiny turns take forever.

Beginner · learning rate · optimization · training
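The effect of step size shows up even on a toy problem. The sketch below minimizes the made-up loss `w ** 2` (whose gradient is `2 * w`) with three different learning rates; the function name `run` and the specific rates are illustrative assumptions.

```python
def run(learning_rate, steps=20, w=1.0):
    """Take repeated gradient steps on loss = w**2 and return the final w."""
    for _ in range(steps):
        gradient = 2 * w
        w -= learning_rate * gradient
    return w

for lr in (0.001, 0.1, 1.1):
    print(f"learning rate {lr}: w ends at {run(lr):.4f}")
```

The best value is `w = 0`. The tiny rate barely moves, the moderate rate gets close, and the large rate overshoots further on every step.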

📉

Model Health

Overfitting

Explanation: Overfitting happens when a model memorizes training data but performs poorly on new data.

Analogy: Like a student memorizing exact exam answers but failing when questions are worded differently.

Beginner · overfitting · generalization · model health

📈

Model Health

Underfitting

Explanation: Underfitting happens when a model is too simple to learn the real pattern.

Analogy: Like a student who barely studied and cannot answer even the practice questions.

Beginner · underfitting · model health · training

🛡️

Generalization

Regularization

Explanation: Regularization helps prevent overfitting by discouraging overly complex models.

Analogy: Like packing only what fits in a carry-on so you bring only what is truly needed.

Intermediate · regularization · overfitting · generalization
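One widely used form of regularization, L2 (weight decay), adds a penalty for large weights to the loss. The sketch below assumes that form; the `strength` parameter name and value are illustrative.

```python
def l2_regularized_loss(data_loss, weights, strength=0.01):
    """Add a penalty proportional to the sum of squared weights."""
    penalty = strength * sum(w * w for w in weights)
    return data_loss + penalty

print(l2_regularized_loss(1.0, [3.0, 4.0], strength=0.1))
```

Because big weights now cost something, training is nudged toward simpler models unless the extra complexity clearly reduces the error.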

⛰️

Optimization

Gradient Descent

Explanation: Gradient descent is an optimization method that updates model parameters step by step to reduce loss.

Analogy: Like walking downhill in fog by feeling the slope under your feet.

Intermediate · gradient descent · optimization · training
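Here is a minimal sketch of gradient descent finding one parameter. It assumes a model `y = w * x` with mean squared error as the loss; the data, learning rate, and function name `fit_slope` are made up for illustration.

```python
def fit_slope(xs, ys, learning_rate=0.01, steps=500):
    """Find w in y = w * x by repeatedly stepping against the gradient."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * gradient  # step downhill
    return w

w = fit_slope([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(round(w, 4))
```

The data follows `y = 2x`, so each step moves `w` a little closer to 2, like feeling the slope and walking downhill.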

✨

Creating Content

Generative AI

Explanation: Generative AI creates new content such as text, images, audio, video, code, or summaries based on patterns it learned.

Analogy: Like a creative assistant that studied many examples and can produce something new in the same style.

Beginner · generative AI · content · AI

🏗️

Base Model

Foundation Model

Explanation: A foundation model is a large model trained on broad data that can be adapted to many tasks.

Analogy: Like a general-purpose engine that can power many different vehicles.

Intermediate · foundation model · base model · LLM

🖼️

Multiple Input Types

Multimodal AI

Explanation: Multimodal AI can work with more than one type of input or output, such as text, images, audio, or video.

Analogy: Like a person who can read, listen, see, and speak.

Intermediate · multimodal · image · audio · video

Recommended Start

📚

Language AI

Large Language Model

Explanation: An LLM is an AI model trained on large amounts of text to understand and generate language.

Analogy: Like a very advanced autocomplete system that can write, explain, summarize, and reason with text.

Beginner · LLM · language model · text

Recommended Start

🧱

Text Unit

Token

Explanation: Tokens are the chunks of text that models process. A token can be a whole word, part of a word, a punctuation mark, or whitespace.

Analogy: Like cutting a sentence into small Lego blocks before the model reads it.

Beginner · tokens · tokenizer · text
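A toy tokenizer shows the basic idea of cutting text into pieces. This is only a sketch: it splits on words and punctuation with a regular expression, whereas the tokenizers used by real LLMs typically split text into learned sub-word pieces. The name `toy_tokenize` is hypothetical.

```python
import re

def toy_tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("Hello, world!")
print(tokens)       # ['Hello', ',', 'world', '!']
print(len(tokens))  # 4
```

Counting tokens this way is only a rough stand-in for how a model measures text length.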

🪟

Model Memory During Request

Context Window

Explanation: The context window is the maximum amount of text, measured in tokens, that the model can consider at one time during a request. It covers both the prompt and the response being generated.

Analogy: Like the number of pages you can keep open on your desk while answering a question.

Beginner · context window · prompt · LLM

🎯

Focus Mechanism

Attention

Explanation: Attention helps a model decide which parts of the input are most important when generating an output.

Analogy: Like highlighting the most important sentences in a document before answering questions.

Intermediate · attention · transformer · LLM

🧭

Meaning as Numbers

Embedding

Explanation: An embedding converts text, images, or other data into a list of numbers that captures meaning.

Analogy: Like giving every idea a GPS coordinate so similar ideas are close together.

Beginner · embedding · vector · semantic meaning
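"Similar ideas are close together" is usually measured with cosine similarity between embedding vectors. The three-number vectors below are invented by hand for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return a score near 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, not produced by any real model.
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.2, 0.9],
}

print(cosine_similarity(embeddings["cat"], embeddings["kitten"]))
print(cosine_similarity(embeddings["cat"], embeddings["car"]))
```

With these made-up numbers, "cat" scores much closer to "kitten" than to "car", which is the behavior semantic search relies on.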

🌀

Model Risk

Hallucination

Explanation: A hallucination is when an AI model gives an answer that sounds confident but is false or unsupported.

Analogy: Like a student guessing confidently instead of admitting they do not know.

Beginner · hallucination · reliability · AI risk

Recommended Start

📝

Instruction

Prompt

Explanation: A prompt is the instruction or question given to an AI model.

Analogy: Like giving directions to an employee. Clear directions usually produce better work.

Beginner · prompt · instruction · LLM

🧭

Behavior Control

System Prompt

Explanation: A system prompt gives high-level instructions about how the model should behave.

Analogy: Like company policy for an assistant before they start doing tasks.

Beginner · system prompt · behavior · instruction

🎓

Examples

Few-Shot Prompting

Explanation: Few-shot prompting gives the model a few examples before asking it to perform a similar task.

Analogy: Like showing a trainee three sample invoices before asking them to process a new one.

Intermediate · few-shot · examples · prompting
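A few-shot prompt is just text with worked examples placed before the new question. The sketch below builds one as a plain string; the `Input:` / `Output:` layout and the function name are assumptions, and no model is called.

```python
def build_few_shot_prompt(instruction, examples, new_input):
    """Place worked examples before the new input so the model can imitate them."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    # The final Output is left blank for the model to fill in.
    lines += [f"Input: {new_input}", "Output:"]
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Classify the sentiment as positive or negative.",
    [("I loved this movie", "positive"), ("The food was cold", "negative")],
    "The staff were friendly",
)
print(prompt)
```

The resulting string would be sent to the model as the prompt.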

🧠

Reasoning Pattern

Chain-of-Thought Style Reasoning

Explanation: This is a prompting approach that encourages the model to reason step by step before answering. In products, that intermediate reasoning is usually summarized for the user rather than shown in full.

Analogy: Like asking someone to solve a math problem carefully instead of blurting out the first answer.

Intermediate · reasoning · prompting · step-by-step

🧾

Reliable Format

Structured Output

Explanation: Structured output makes the model return data in a specific format, such as JSON that follows a schema.

Analogy: Like asking someone to fill a form instead of writing a free-form paragraph.

Intermediate · structured output · JSON · schema
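On the application side, structured output usually means parsing the model's reply and checking it against the expected shape. The sketch below uses a hand-written check with made-up fields (`name`, `age`); a real project would more likely use a JSON Schema validator.

```python
import json

# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"name": str, "age": int}

def parse_structured_reply(reply_text):
    """Parse a JSON reply and verify it has the expected fields and types."""
    data = json.loads(reply_text)  # raises ValueError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"wrong type for field: {field}")
    return data

print(parse_structured_reply('{"name": "Ada", "age": 36}'))
```

If the check fails, the application can reject the reply or ask the model to try again instead of passing bad data downstream.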

Recommended Start

📚

Retrieval-Augmented Generation

RAG

Explanation: RAG helps an LLM answer using retrieved information from trusted documents or databases instead of relying only on memory.

Analogy: Like an open-book exam where the assistant looks up the right page before answering.

Intermediate · RAG · retrieval · knowledge base

🗄️

Semantic Search Storage

Vector Database

Explanation: A vector database stores embeddings and helps find items with similar meaning.

Analogy: Like a library organized by meaning instead of alphabetically.

Intermediate · vector database · embeddings · semantic search

🔎

Meaning-Based Search

Semantic Search

Explanation: Semantic search finds results based on meaning, not just exact keywords.

Analogy: Searching for “car repair” can also find “auto mechanic” because the meaning is similar.

Beginner · semantic search · embeddings · retrieval

✂️

Document Preparation

Chunking

Explanation: Chunking splits large documents into smaller pieces so they can be embedded, searched, and retrieved.

Analogy: Like cutting a big book into useful index cards.

Beginner · chunking · RAG · documents
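A simple way to chunk is to cut by word count with a small overlap so sentences on a boundary are not lost. This is a sketch with arbitrary sizes; real pipelines often chunk by tokens, sentences, or headings instead.

```python
def chunk_words(text, chunk_size=5, overlap=2):
    """Split text into chunks of chunk_size words that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reached the end of the text
    return chunks

print(chunk_words("a b c d e f g h"))  # ['a b c d e', 'd e f g h']
```

Each chunk would then be embedded and stored so it can be retrieved on its own.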

📥

Finding Relevant Context

Retrieval

Explanation: Retrieval is the process of finding relevant documents or chunks before generating an answer.

Analogy: Like pulling the right folder from a filing cabinet before responding.

Beginner · retrieval · RAG · context
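Putting retrieval and RAG together, the flow is: score the chunks against the question, keep the best ones, and paste them into the prompt. The sketch below scores by shared words purely to stay self-contained; a real system would normally compare embeddings. All names and the sample chunks are hypothetical, and no model is called.

```python
import re

def words(text):
    """Lowercased set of words, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, chunks, top_k=1):
    # Rank chunks by how many words they share with the question.
    ranked = sorted(chunks, key=lambda c: len(words(question) & words(c)), reverse=True)
    return ranked[:top_k]

def build_rag_prompt(question, chunks, top_k=1):
    context = "\n".join(retrieve(question, chunks, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

chunks = [
    "Refunds are accepted within 30 days.",
    "Shipping takes 5 business days.",
    "Our office is closed on public holidays.",
]
print(build_rag_prompt("How many days do refunds take?", chunks))
```

The model then answers from the retrieved text rather than from memory alone, which is the open-book part of the analogy.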

🤝

Goal-Oriented AI

AI Agent

Explanation: An AI agent is a system that can use tools, follow goals, make decisions, and take actions across multiple steps.

Analogy: Like an assistant who can not only answer questions but also check your calendar, draft an email, and update a task list.

Intermediate · agent · tools · automation

🛠️

External Actions

Tool Calling

Explanation: Tool calling lets an AI model use external tools such as APIs, databases, calculators, search, or file systems.

Analogy: Like a manager asking specialists to perform tasks instead of doing everything alone.

Intermediate · tool calling · API · agents

🧠

Context Retention

Agent Memory

Explanation: Agent memory stores useful information so an AI system can remember preferences, past work, or ongoing tasks.

Analogy: Like a notebook an assistant keeps so you do not repeat yourself every time.

Intermediate · memory · agents · personalization

🔄

Multi-Step Tasks

Workflow Automation

Explanation: Workflow automation connects multiple steps so AI can help complete repeatable processes.

Analogy: Like an assembly line where each station does one part of the work.

Intermediate · workflow · automation · agents

📖

Broad Learning

Pretraining

Explanation: Pretraining teaches a model broad patterns from large datasets before it is specialized.

Analogy: Like going to school before training for a specific job.

Intermediate · pretraining · foundation model · training

🎯

Specialization

Fine-Tuning

Explanation: Fine-tuning adapts a pretrained model to perform better on a specific task or style of output.

Analogy: Like taking a general doctor and training them further to become a specialist.

Intermediate · fine-tuning · model optimization · specialization

🎛️

Training Settings

Hyperparameters

Explanation: Hyperparameters are settings chosen before training rather than learned from the data, such as learning rate, batch size, and number of epochs.

Analogy: Like oven temperature and baking time before making a cake.

Intermediate · hyperparameters · training settings

🔁

Training Cycle

Epoch

Explanation: An epoch is one full pass through the training dataset.

Analogy: Like reading the entire textbook one time.

Beginner · epoch · training · dataset

📦

Training Chunk

Batch Size

Explanation: Batch size is the number of examples processed before the model updates its weights.

Analogy: Like grading 32 homework papers at a time before adjusting how you teach.

Intermediate · batch size · training · optimization
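Epochs and batch size together decide how many weight updates happen during training. The sketch below only counts updates and does no real training; the function names are made up for illustration.

```python
def iterate_batches(examples, batch_size):
    """Yield the dataset in consecutive slices of batch_size examples."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]

def count_updates(num_examples, batch_size, epochs):
    data = list(range(num_examples))
    updates = 0
    for _ in range(epochs):                         # one epoch = one full pass
        for _batch in iterate_batches(data, batch_size):
            updates += 1                            # one weight update per batch
    return updates

print(count_updates(num_examples=100, batch_size=32, epochs=3))
```

With 100 examples and a batch size of 32, each epoch has 4 batches (the last one smaller), so 3 epochs give 12 updates.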

✅

Basic Metric

Accuracy

Explanation: Accuracy measures the percentage of predictions the model got right.

Analogy: Like a test score showing how many questions were answered correctly.

Beginner · accuracy · evaluation · metric

🎯

Quality of Positive Predictions

Precision

Explanation: Precision measures what fraction of the items the model predicted as positive were actually positive.

Analogy: If a spam filter marks 100 emails as spam, precision asks how many really were spam.

Intermediate · precision · evaluation · classification

📡

Coverage of Actual Positives

Recall

Explanation: Recall measures what fraction of the actual positives the model successfully found.

Analogy: If there were 100 spam emails, recall asks how many the filter caught.

Intermediate · recall · evaluation · classification

⚖️

Balanced Metric

F1 Score

Explanation: F1 score combines precision and recall into one metric by taking their harmonic mean, so it is high only when both are high.

Analogy: Like judging both how careful and how complete someone is.

Intermediate · F1 · precision · recall
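Accuracy, precision, recall, and F1 can all be computed from the same four counts. The sketch below assumes binary labels where 1 means positive (for example, spam) and 0 means negative; the function name and sample labels are illustrative.

```python
def classification_metrics(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

metrics = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 0],
    y_pred=[1, 1, 0, 1, 0, 0, 0, 0],
)
print(metrics)
```

Here the model makes one false alarm and misses one positive, so precision and recall are both 2 out of 3 while accuracy is 6 out of 8.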

🧪

Standard Test

Benchmark

Explanation: A benchmark is a standard test used to compare models.

Analogy: Like giving every runner the same race course to compare performance fairly.

Intermediate · benchmark · model comparison · evaluation

👥

Human Judgment

Human Evaluation

Explanation: Human evaluation uses people to judge model outputs for quality, helpfulness, accuracy, or tone.

Analogy: Like a teacher grading an essay when automatic scoring is not enough.

Beginner · human evaluation · quality · review

⚡

Model Usage

Inference

Explanation: Inference is when a trained model is used to make a prediction or generate an answer.

Analogy: Training is studying; inference is taking the actual test.

Beginner · inference · prediction · deployment

🔌

Access Layer

API

Explanation: An API lets software systems communicate with each other.

Analogy: Like a restaurant menu: you request something from available options, and the kitchen returns it.

Beginner · API · integration · software

⏱️

Response Speed

Latency

Explanation: Latency is how long it takes for a system to respond.

Analogy: Like waiting time after asking a question.

Beginner · latency · performance · deployment

🚦

System Capacity

Throughput

Explanation: Throughput is how many requests a system can handle in a period of time.

Analogy: Like how many cars can pass through a toll booth per minute.

Intermediate · throughput · scaling · performance

📡

Operational Health

Monitoring

Explanation: Monitoring tracks system behavior such as errors, latency, request volume, cost, and uptime.

Analogy: Like a dashboard in a car showing speed, fuel, and warning lights.

Beginner · monitoring · MLOps · operations

🔬

Understanding Internals

Observability

Explanation: Observability helps teams understand why a system behaves a certain way using logs, metrics, and traces.

Analogy: Monitoring says the car is overheating; observability helps find whether it is the fan, coolant, or engine.

Intermediate · observability · logs · metrics · traces

🌊

Performance Decay

Model Drift

Explanation: Model drift happens when model performance gets worse because real-world data changes over time.

Analogy: Like a map becoming outdated after new roads are built.

Intermediate · model drift · monitoring · retraining

⚠️

Fairness Risk

Bias

Explanation: Bias happens when an AI system produces unfair or skewed results because of data, design, or usage problems.

Analogy: Like a judge who unknowingly favors one group because past examples were unbalanced.

Beginner · bias · fairness · responsible AI

💬

Understanding Decisions

Explainability

Explanation: Explainability is the ability to understand why a model made a certain prediction or recommendation.

Analogy: Like asking a loan officer to explain why an application was approved or denied.

Intermediate · explainability · transparency · trust

🔐

Data Protection

Privacy

Explanation: Privacy means protecting sensitive user data from misuse, exposure, or unnecessary collection.

Analogy: Like keeping personal documents in a locked cabinet and only sharing what is needed.

Beginner · privacy · security · data

🧱

Safety Controls

Guardrails

Explanation: Guardrails are rules and controls that keep AI systems from producing unsafe, incorrect, or inappropriate outputs.

Analogy: Like lane markings and barriers on a highway.

Intermediate · guardrails · safety · responsible AI

💼

Business Problem

AI Use Case

Explanation: An AI use case is a specific problem where AI can create value.

Analogy: Like identifying which business task needs a power tool instead of using a hammer everywhere.

Beginner · use case · business · product

💰

Business Value

ROI

Explanation: ROI measures whether the benefit of an AI solution is worth the cost.

Analogy: Like asking whether buying a machine saves enough labor or time to justify the price.

Beginner · ROI · value · cost

👤

Human Oversight

Human-in-the-Loop

Explanation: Human-in-the-loop means people review, guide, or approve AI outputs instead of letting AI act alone.

Analogy: Like autopilot with a pilot still watching and ready to take control.

Beginner · human review · oversight · AI product

🧭

Practical Application

AI Workflow

Explanation: An AI workflow is the end-to-end process where AI helps complete a business task.

Analogy: Like a checklist where AI handles some steps and humans handle others.

Beginner · workflow · product · automation

💻

Development Environment

AI Coding IDE

Explanation: An AI coding IDE is a code editor or development environment with built-in AI help for writing, editing, debugging, and understanding code.

Analogy: Like having a senior developer sitting beside you while you code.

Beginner · IDE · coding · developer tools · AI coding

🤖

Autonomous Coding Help

Coding Agent

Explanation: A coding agent can inspect files, understand a task, edit code, run commands, check errors, and iterate toward a solution.

Analogy: Like assigning a junior developer a task and asking them to make code changes, test them, and report back.

Beginner · coding agent · automation · developer tools

🧑‍💻

Terminal Coding Agent

OpenCode

Explanation: OpenCode is a terminal-based AI coding agent that can help modify files, run commands, inspect errors, and work inside a project.

Analogy: Like a command-line developer assistant that works directly inside your codebase.

Beginner · OpenCode · terminal · coding agent · CLI

🖱️

AI Code Editor

Cursor

Explanation: Cursor is an AI-powered code editor that helps with code generation, refactoring, debugging, and project navigation.

Analogy: Like VS Code with an AI pair programmer built in.

Beginner · Cursor · IDE · AI editor · coding

🧠

Coding Agent

Claude Code

Explanation: Claude Code is an AI coding assistant that can work with codebases, understand files, and help implement changes through natural language instructions.

Analogy: Like asking a thoughtful engineer to inspect your project and carefully make changes.

Intermediate · Claude Code · coding agent · Anthropic

✨

Command Line AI

Gemini CLI

Explanation: Gemini CLI lets you use Gemini from the command line for coding, explanation, generation, and project assistance.

Analogy: Like chatting with Gemini directly from your terminal.

Beginner · Gemini CLI · Google · terminal · coding

🧬

Command Line Coding Agent

Codex CLI

Explanation: Codex CLI is a command-line coding assistant that can help read, modify, and reason about code using OpenAI models.

Analogy: Like giving a coding task to an AI assistant from your terminal.

Intermediate · Codex CLI · OpenAI · coding agent

🚀

Code Completion

GitHub Copilot

Explanation: GitHub Copilot helps developers write code faster by suggesting completions, functions, explanations, and fixes inside the editor.

Analogy: Like autocomplete for programming, but much smarter.

Beginner · Copilot · GitHub · autocomplete · IDE

🌊

AI IDE

Windsurf

Explanation: Windsurf is an AI-powered development environment focused on helping developers navigate, edit, and build code with AI assistance.

Analogy: Like a coding workspace where AI helps guide the flow of development.

Beginner · Windsurf · IDE · AI coding

🦘

Agentic Coding Extension

Roo Code

Explanation: Roo Code is an AI coding agent extension that can help plan, edit, and execute coding tasks inside a development environment.

Analogy: Like a project helper that can switch between planning, coding, and debugging modes.

Intermediate · Roo Code · coding agent · extension

🧩

Editor Intelligence

Language Server Protocol

Explanation: LSP is a protocol that lets code editors talk to language servers, which provide smart language features like autocomplete, go-to-definition, error detection, and rename support.

Analogy: Like the grammar and spell-check engine for programming languages.

Intermediate · LSP · language server · autocomplete · editor

🤝

Development Workflow

Pair Programming with AI

Explanation: Pair programming with AI means using an AI assistant as a coding partner to explain, suggest, debug, and improve code.

Analogy: Like coding with another developer who can quickly give suggestions and second opinions.

Beginner · pair programming · AI coding · workflow

📝

Developer Prompting

Prompting Coding Agents

Explanation: Prompting coding agents means giving clear instructions about the goal, scope, constraints, files to touch, and acceptance criteria.

Analogy: Like giving a contractor a precise work order instead of a vague request.

Beginner · prompt · coding agent · instructions

🛡️

Best Practice

Safe & Scoped Code Changes

Explanation: Safe and scoped code changes mean asking the AI to modify only the needed files and avoid broad refactors or unrelated changes.

Analogy: Like telling a mechanic to fix only the brakes, not rebuild the whole car.

Beginner · safe changes · scoped changes · refactoring

🏗️

Agent Roles

Build Agent vs Chat Agent

Explanation: A chat agent mostly answers questions, while a build agent can actively inspect files, edit code, run commands, and fix issues.

Analogy: A chat agent gives advice; a build agent rolls up its sleeves and works on the project.

Beginner · build agent · chat agent · coding workflow

🧰

Big Picture

AI Application Framework

Explanation: An AI application framework helps developers build apps that use LLMs, tools, memory, retrieval, workflows, and agents without wiring every piece manually.

Analogy: Like a construction toolkit for building AI apps instead of buying every tool separately.

Beginner · AI framework · LLM app · developer tools · orchestration

🔗

LLM Application Framework

LangChain

Explanation: LangChain is a framework for building LLM-powered applications and agents by connecting models, prompts, tools, memory, and external data sources.

Analogy: Like plumbing that connects the AI model to tools, documents, APIs, and workflows.

Intermediate · LangChain · LLM app · agents · tools

🕸️

Agent Orchestration

LangGraph

Explanation: LangGraph is a framework for building stateful, multi-step AI agents and workflows using graph-based control. It is useful when an agent needs durable execution, branching, memory, or human approval.

Analogy: Like a flowchart engine for AI agents where each node is a step and the graph controls what happens next.

Intermediate · LangGraph · agents · orchestration · graph workflow

🔬

Observability & Evaluation

LangSmith

Explanation: LangSmith is used to trace, debug, evaluate, and monitor LLM applications and agents.

Analogy: Like a flight recorder for AI apps that shows what happened at every step.

Intermediate · LangSmith · tracing · evaluation · observability

🦙

Data & RAG Framework

LlamaIndex

Explanation: LlamaIndex helps connect private or external data to LLM applications. It is commonly used for document ingestion, indexing, retrieval, and RAG workflows.

Analogy: Like a librarian that prepares your documents so an AI assistant can search and use them.

Intermediate · LlamaIndex · RAG · retrieval · documents · indexing

🌾

RAG & Search Framework

Haystack

Explanation: Haystack is a framework for building search, question-answering, and RAG pipelines using components such as retrievers, readers, generators, and document stores.

Analogy: Like a pipeline factory for search and question-answering systems.

Intermediate · Haystack · RAG · search · question answering

👥

Multi-Agent Framework

CrewAI

Explanation: CrewAI helps developers build multi-agent systems where different agents have roles, goals, tools, and tasks that work together.

Analogy: Like assigning a team of specialists to solve a project together.

Intermediate · CrewAI · multi-agent · agents · workflows

🤝

Multi-Agent Conversation

AutoGen

Explanation: AutoGen is a framework for building systems where multiple AI agents can collaborate, chat, use tools, and solve tasks through conversation.

Analogy: Like a group chat of AI specialists discussing and completing work.

Intermediate · AutoGen · multi-agent · collaboration · agents

🧠

Enterprise AI Orchestration

Semantic Kernel

Explanation: Semantic Kernel is an SDK for integrating AI models with functions, plugins, memory, and planners, often used in enterprise-style AI applications.

Analogy: Like an orchestration layer that lets AI call business functions in a structured way.

Intermediate · Semantic Kernel · Microsoft · plugins · orchestration

⚙️

Prompt Optimization

DSPy

Explanation: DSPy is a framework for programming and optimizing language model pipelines, reducing manual prompt tweaking by treating prompts and modules more like trainable components.

Analogy: Like moving from hand-writing every instruction to building a system that can tune its own instructions.

Advanced · DSPy · prompt optimization · LLM pipelines

🧪

Agent Framework

Agno

Explanation: Agno is a framework for building AI agents with tools, memory, knowledge, and multi-agent workflows.

Analogy: Like a lightweight workshop for creating practical AI agents.

Intermediate · Agno · agents · tools · workflows

🔌

Model Context Protocol

MCP

Explanation: MCP, or Model Context Protocol, is a standard way for AI apps and agents to connect to external tools, data, and services through a common interface.

Analogy: Like a universal adapter that lets AI assistants plug into many different systems.

Intermediate · MCP · protocol · tools · integrations · agents

🔁

Agent-to-Agent Protocol

A2A

Explanation: A2A, short for Agent2Agent, is an open protocol introduced by Google that lets AI agents communicate, delegate, and coordinate work with other agents, even when they are built on different frameworks.

Analogy: Like giving AI agents a shared language so they can work as a team.

Advanced · A2A · agent protocol · multi-agent · interoperability

🎼

Workflow Control

Agent Orchestration

Explanation: Agent orchestration controls how agents, tools, memory, approvals, and steps are coordinated in a multi-step AI workflow.

Analogy: Like a conductor coordinating musicians so each one plays at the right time.

Intermediate · orchestration · agents · workflows · tools

🧾

Agent Memory & Flow

Agent State

Explanation: Agent state is the information an agent keeps track of while working through a task, such as current step, prior messages, tool results, decisions, and intermediate outputs.

Analogy: Like a project notebook that keeps track of what has already happened and what still needs to happen.

Intermediate · agent state · memory · workflow · LangGraph

🧱

Reliability

Durable Execution

Explanation: Durable execution means a long-running workflow can survive interruptions, retries, failures, or restarts without losing progress.

Analogy: Like saving your game progress so you can continue after the computer restarts.

Advanced · durable execution · reliability · agents · workflows

👤

Safety & Control

Human-in-the-Loop Approval

Explanation: Human-in-the-loop approval means an AI workflow pauses and asks a person to review or approve an important step before continuing.

Analogy: Like a manager approving a payment before it is sent.

Beginner · human in the loop · approval · safety · agents

🧰

Tool Management

Tool Registry

Explanation: A tool registry is a list of tools an AI agent is allowed to use, including their names, descriptions, inputs, and rules.

Analogy: Like a toolbox inventory that tells the agent which tools exist and how to use them.

Intermediate · tool registry · tool calling · agents · integrations
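A tool registry can be as simple as a dictionary that maps tool names to a description and a function. The sketch below assumes the model's tool request arrives as a dict with `name` and `arguments` keys; that shape, the `add` tool, and `call_tool` are all hypothetical and not tied to any specific framework.

```python
def add(a, b):
    return a + b

# Registry: only tools listed here may be called by the agent.
TOOL_REGISTRY = {
    "add": {
        "description": "Add two numbers. Inputs: a, b.",
        "function": add,
    },
}

def call_tool(request):
    """Look up the requested tool and run it with the given arguments."""
    name = request["name"]
    if name not in TOOL_REGISTRY:
        raise KeyError(f"tool not allowed: {name}")
    return TOOL_REGISTRY[name]["function"](**request["arguments"])

print(call_tool({"name": "add", "arguments": {"a": 2, "b": 3}}))
```

Refusing names that are not in the registry is what keeps the agent limited to approved tools.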

📦

Developer Concept

Framework vs Library

Explanation: A library gives you functions you call when needed; a framework provides a structure and often controls the flow of how your app is built.

Analogy: A library is a box of tools; a framework is a partially built house with rules for where things go.

Beginner · framework · library · developer tools

🧩

Workflow Builders

Low-Code AI Workflow

Explanation: Low-code AI workflow tools help users build AI automations visually or with minimal code, often by connecting prompts, tools, APIs, and triggers.

Analogy: Like building an AI process with Lego blocks instead of writing all the code by hand.

Beginner · low-code · workflow · automation · AI apps

✅

Classification

Logistic Regression

Explanation: Logistic regression is a classification algorithm used to predict yes/no or category outcomes, such as spam vs not spam or approved vs denied.

Analogy: Like a decision boundary that says, 'If the score is above this line, choose yes; otherwise choose no.'

Inputs → Probability → Class
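
Example: A small sketch of the Inputs → Probability → Class flow. The weights, bias, and threshold here are made-up values; in practice they would be learned from training data.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(features, weights, bias, threshold=0.5):
    # Weighted sum of the inputs gives a raw score.
    score = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(score)
    # Above the threshold means "yes" (1), otherwise "no" (0).
    label = 1 if probability >= threshold else 0
    return probability, label


print(predict([1.0, 2.0], weights=[2.0, -1.0], bias=0.0))  # (0.5, 1)
print(predict([0.0, 3.0], weights=[2.0, -1.0], bias=0.0))
```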

Beginnerclassificationlogistic regressionsupervised learningprobability

🧮

Classification

Naive Bayes

Explanation: Naive Bayes is a classification algorithm based on Bayes' theorem. It makes the "naive" assumption that features are independent of each other given the class, which is rarely exactly true but makes it fast and useful for text classification.

Analogy: Like guessing the topic of a document by counting clue words.

Beginnerclassificationnaive bayesprobabilitytext classification

👥

Classification

K-Nearest Neighbors

Explanation: KNN classifies a new example by looking at the most similar nearby examples and choosing the majority label.

Analogy: Like guessing a new house's style by looking at the houses closest to it.

New point → nearest neighbors → majority class
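
Example: A minimal sketch of KNN with straight-line (Euclidean) distance. The toy points and the choice of `k=3` are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train is a list of (point, label) pairs; each point is a tuple of numbers."""
    # Sort the training examples from closest to farthest.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    # The majority label among the k nearest neighbors wins.
    return Counter(nearest_labels).most_common(1)[0][0]


train = [((1, 1), "cat"), ((1, 2), "cat"), ((2, 1), "cat"),
         ((8, 8), "dog"), ((9, 8), "dog"), ((8, 9), "dog")]
print(knn_predict(train, (2, 2)))  # cat
print(knn_predict(train, (7, 9)))  # dog
```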

BeginnerKNNclassificationsimilaritysupervised learning

🌳

Classification & Regression

Decision Tree

Explanation: A decision tree makes predictions by asking a sequence of yes/no questions until it reaches an answer.

Analogy: Like a flowchart that guides you to a decision.

Beginnerdecision treeclassificationregressionsupervised learning

🌲

Ensemble Learning

Random Forest

Explanation: Random forest combines many decision trees and averages or votes across them to make a more reliable prediction.

Analogy: Instead of asking one expert, you ask a panel of experts and go with the majority.

Intermediaterandom forestensembleclassificationregression

📏

Classification

Support Vector Machine

Explanation: SVM finds the best boundary that separates classes with the widest possible margin.

Analogy: Like drawing the cleanest fence between two groups of animals while leaving maximum space on both sides.

IntermediateSVMclassificationmarginsupervised learning

📈

Regression

Linear Regression

Explanation: Linear regression predicts a number by fitting a straight-line relationship between inputs and output.

Analogy: Like drawing the best straight line through points on a chart.

x → y = mx + b → predicted number
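
Example: A short sketch of fitting y = mx + b with the standard least-squares formulas for a single input. The data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Return slope m and intercept b of the best-fit straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y move together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


m, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(m, b)        # 2.0 1.0
print(m * 5 + b)   # predicted number for x = 5: 11.0
```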

Beginnerlinear regressionregressionpredictionsupervised learning

📉

Regression

Support Vector Regression

Explanation: Support Vector Regression predicts numeric values using the same margin-based idea as SVM, but for regression problems.

Analogy: Like drawing a prediction road with a tolerance lane around it.

IntermediateSVRregressionSVMprediction

✂️

Regression & Feature Selection

Lasso Regression

Explanation: Lasso regression is linear regression with an L1 penalty that shrinks less useful feature weights toward zero, often to exactly zero, which effectively performs feature selection.

Analogy: Like packing only the most useful tools and leaving unnecessary ones behind.

Intermediatelassoregressionregularizationfeature selection

🛡️

Regression & Regularization

Ridge Regression

Explanation: Ridge regression is linear regression with regularization that keeps weights smaller to reduce overfitting.

Analogy: Like telling the model not to rely too heavily on any one clue.

Intermediateridge regressionregularizationregressionoverfitting

🎯

Clustering

K-Means Clustering

Explanation: K-means groups data into K clusters by repeatedly assigning each example to the nearest center point and then moving each center to the average of its assigned examples.

Analogy: Like sorting people into groups based on who stands closest together in a room.

Data points → cluster centers → groups
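
Example: A minimal sketch of the K-means loop. The starting centers are passed in by hand here, which is an assumption to keep the example deterministic; real implementations usually pick them randomly or with a smarter scheme.

```python
import math


def kmeans(points, centers, iterations=10):
    groups = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        # An empty group keeps its old center.
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers, groups


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, groups = kmeans(points, centers=[(0, 0), (10, 10)])
print(centers)
print(groups)
```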

Beginnerk-meansclusteringunsupervised learning

🧲

Clustering

DBSCAN

Explanation: DBSCAN finds clusters based on dense regions of points and can identify outliers as noise.

Analogy: Like spotting crowded groups at a party and ignoring people standing alone.

IntermediateDBSCANclusteringoutliersunsupervised learning

🔔

Clustering

Gaussian Mixture Model

Explanation: A Gaussian Mixture Model groups data by assuming it comes from a mixture of bell-shaped distributions.

Analogy: Like guessing that a crowd is made of several overlapping groups, each with its own center and spread.

IntermediateGMMclusteringprobabilityunsupervised learning

🧬

Clustering

Agglomerative Hierarchical Clustering

Explanation: This clustering method starts with each item as its own group, then repeatedly merges the closest groups into a hierarchy.

Analogy: Like building a family tree of similar items.

Intermediatehierarchical clusteringclusteringunsupervised learning

🌀

Clustering

Mean Shift Clustering

Explanation: Mean shift finds clusters by moving points toward the densest nearby area.

Analogy: Like people gradually walking toward the busiest spot in a park.

Intermediatemean shiftclusteringdensityunsupervised learning

📐

Dimensionality Reduction

Principal Component Analysis

Explanation: PCA reduces many features into fewer important dimensions while preserving as much variation as possible.

Analogy: Like summarizing a large spreadsheet with a few new columns, each a blend of the originals, that explain most of the pattern.

IntermediatePCAdimensionality reductionunsupervised learning

🎮

Reinforcement Learning

Q-Learning

Explanation: Q-learning teaches an agent which action is best in each situation by learning action values from rewards.

Analogy: Like learning the best moves in a game by trying actions and remembering what earns points.
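
Example: A small sketch of the standard Q-learning update rule. The state names, action names, learning rate `alpha`, and discount `gamma` are made-up values for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Nudge the value of (state, action) toward reward + discounted best future value."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


q = defaultdict(float)          # every action value starts at 0.0
actions = ["left", "right"]

q_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
print(q[("s0", "right")])  # 0.5

q_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
print(q[("s0", "right")])  # 0.75
```

Each update moves the stored value part of the way toward what the latest reward suggests, so repeated good outcomes gradually raise the value of that action.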

IntermediateQ-learningreinforcement learningdecision making

🔁

Reinforcement Learning

Markov Decision Process

Explanation: An MDP is a mathematical way to model decision-making where an agent moves between states, takes actions, and receives rewards.

Analogy: Like a board game where each move changes your position and score.

IntermediateMDPreinforcement learningstatesactionsrewards

🚀

Ensemble Learning

Gradient Boosting

Explanation: Gradient boosting builds many weak models one after another, where each new model focuses on fixing previous mistakes.

Analogy: Like a team of editors where each editor improves the previous draft.

Intermediategradient boostingensembleXGBoostLightGBM

⚡

Ensemble Learning

XGBoost

Explanation: XGBoost is a fast, widely used open-source library that implements gradient boosting with decision trees. It is especially popular for structured/tabular data.

Analogy: Like a highly optimized team of decision trees that correct each other's mistakes.

IntermediateXGBoostboostingtabular datasupervised learning

➡️

Basic Neural Network

Feedforward Neural Network

Explanation: A feedforward neural network sends data in one direction from input to output through hidden layers.

Analogy: Like an assembly line where each station transforms the input before passing it forward.

Beginnerneural networkfeedforwarddeep learning

👁️

Computer Vision

Convolutional Neural Network

Explanation: A CNN is designed to detect patterns in images, such as edges, shapes, textures, and objects.

Analogy: Like scanning a picture with small filters that look for visual clues.

Image → filters → feature maps → prediction

IntermediateCNNcomputer visionimage recognition

🔄

Sequence Modeling

Recurrent Neural Network

Explanation: An RNN processes sequential data by using information from previous steps.

Analogy: Like reading a sentence word by word while remembering what came before.

IntermediateRNNsequencetime seriesNLP

🧠

Sequence Modeling

LSTM

Explanation: LSTM is a type of RNN designed to remember important information for longer periods and forget less useful information.

Analogy: Like a smart notebook that keeps important facts and erases noise.

IntermediateLSTMRNNmemorysequence modeling

🚪

Sequence Modeling

GRU

Explanation: GRU is a simpler alternative to LSTM that also helps neural networks remember useful sequence information.

Analogy: Like a lighter version of LSTM with fewer gates to manage memory.

IntermediateGRURNNsequence modeling

🔁

Attention-Based Model

Transformer

Explanation: A transformer uses attention to understand relationships between tokens, even when they are far apart in the input.

Analogy: Like reading a paragraph and highlighting which words matter most to each other.

IntermediatetransformerattentionLLMdeep learning

📖

Encoder Model

BERT

Explanation: BERT is a transformer-based model designed to understand text by reading context from both directions.

Analogy: Like understanding a missing word by looking at the words before and after it.

IntermediateBERTencoderNLPtransformer

✍️

Decoder Model

GPT

Explanation: GPT is a transformer-based model designed to generate text by predicting the next token.

Analogy: Like a very advanced autocomplete system that continues writing based on context.

IntermediateGPTdecoderLLMtext generation

📦

Representation Learning

Autoencoder

Explanation: An autoencoder learns to compress data into a smaller representation and reconstruct it.

Analogy: Like summarizing a document and then trying to recreate the original from the summary.

Intermediateautoencodercompressionrepresentation learning

🎲

Generative Model

Variational Autoencoder

Explanation: A VAE learns a compressed probability-based representation that can generate new examples similar to the training data.

Analogy: Like learning the recipe pattern behind images so it can create new variations.

AdvancedVAEgenerative modelautoencoder

🎭

Generative Model

GAN

Explanation: A GAN has two neural networks: a generator that creates fake examples and a discriminator that tries to detect fakes.

Analogy: Like an art forger and an art detective improving together.

Generator vs Discriminator

AdvancedGANgenerative AIimage generation

🌫️

Generative Model

Diffusion Model

Explanation: A diffusion model learns to create data by reversing a noise process, gradually turning noise into a clear image or output.

Analogy: Like starting with TV static and slowly cleaning it until a picture appears.

Advanceddiffusionimage generationgenerative AI

🧬

Image Segmentation

U-Net

Explanation: U-Net is a neural network architecture often used for image segmentation, especially in medical imaging.

Analogy: Like tracing the exact outline of objects in an image.

AdvancedU-Netsegmentationcomputer vision

🖼️

Computer Vision

Vision Transformer

Explanation: A Vision Transformer applies transformer ideas to images by splitting an image into patches and using attention.

Analogy: Like cutting an image into puzzle pieces and deciding which pieces matter most.

Advancedvision transformerViTcomputer visiontransformer

🎛️

Multimodal AI

Multimodal Model

Explanation: A multimodal model can work with more than one type of data, such as text, images, audio, or video.

Analogy: Like a person who can read, see, listen, and speak.

Intermediatemultimodaltextimageaudiovideo

🔄

Sequence-to-Sequence

Encoder-Decoder Model

Explanation: An encoder-decoder model reads an input into a representation, then generates an output from it.

Analogy: Like reading a paragraph in one language and then writing it in another.

Intermediateencoder-decoderseq2seqtranslationtransformer

📜

NLP

Sequence-to-Sequence Model

Explanation: A sequence-to-sequence model converts one sequence into another, such as text in one language to text in another.

Analogy: Like converting a recipe from English to Spanish step by step.

Intermediateseq2seqNLPtranslationsummarization

🏗️

Big Picture

Neural Network Architecture

Explanation: A neural network architecture is the design pattern of a model, including its layers, connections, and how data flows.

Analogy: Like the blueprint of a building.

Beginnerarchitectureneural networkmodel design

🎓

Transfer Learning

Pretrained Model

Explanation: A pretrained model has already learned useful patterns from large datasets and can be reused or adapted for new tasks.

Analogy: Like hiring someone who already has general experience before training them for your company.

Beginnerpretrained modeltransfer learningfine-tuning

🔁

Model Reuse

Transfer Learning

Explanation: Transfer learning reuses knowledge from a pretrained model and adapts it to a new task.

Analogy: Like a doctor specializing in cardiology after already completing general medical training.

Beginnertransfer learningpretrained modelfine-tuning

🌡️

Creativity Control

Temperature

Explanation: Temperature controls how random or creative the model's response is. Lower temperature makes answers more predictable and focused. Higher temperature makes answers more varied and creative.

Analogy: Like choosing how adventurous a chef should be. Low temperature follows the recipe exactly; high temperature experiments with new flavors.

Low = focused · High = creative
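
Example: A minimal sketch of how temperature is commonly applied: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The logit values are invented for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
low = softmax_with_temperature(logits, temperature=0.5)
high = softmax_with_temperature(logits, temperature=2.0)

# Low temperature piles probability onto the top choice;
# high temperature spreads it out across the options.
print(low[0] > high[0])  # True
```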

BeginnertemperaturerandomnesscreativityLLM settingsgeneration

📏

Output Length

Max Tokens

Explanation: Max tokens sets an upper limit on how many tokens the model can generate in its response. The model does not plan around the limit: a low value simply cuts the answer off when the limit is reached, while a higher value leaves room for longer responses.

Analogy: Like setting a word limit on an essay. The model must stop once it reaches the allowed length.

Small limit = short answer · Large limit = longer answer

Beginnermax tokensoutput lengthtoken limitLLM settings

🛑

Output Stopping Rule

Stop Sequences

Explanation: Stop sequences are specific words, symbols, or text patterns that tell the model when to stop generating. When the model's output produces one of those sequences, generation ends, and the stop sequence itself is usually left out of the returned text.

Analogy: Like telling someone, "Stop talking when you say 'END'."

Generate text → stop word found → output ends
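
Example: A simplified sketch of the effect of stop sequences. Real systems check for the stop text while generating; this version just trims an already finished string, and the function name is made up for illustration.

```python
def apply_stop_sequences(text, stop_sequences):
    """Cut text at the earliest stop sequence; the stop text itself is dropped."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


print(repr(apply_stop_sequences("Answer: 42\nEND\nextra text", ["END"])))
# 'Answer: 42\n'
```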

Intermediatestop sequencestop wordsgeneration controlLLM settings

🎲

Sampling Control

Top P

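Explanation: Top P, also called nucleus sampling, limits sampling to the smallest set of most likely next tokens whose combined probability reaches P. For example, with Top P = 0.9 the model samples only from the tokens that together cover 90% of the probability. Lower Top P narrows choices to safer/common options. Higher Top P allows more variety.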

Analogy: Like choosing from only the top few most likely menu items versus considering almost the whole menu.

Low Top P = narrow choices · High Top P = wider choices
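
Example: A minimal sketch of the Top P filtering step. The token probabilities are made up; a real model would produce them for its whole vocabulary.

```python
def top_p_filter(probs, p=0.9):
    """Keep the smallest set of most likely tokens whose total probability reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, prob in ranked:
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    # Rescale so the kept probabilities add up to 1 again.
    total = sum(kept.values())
    return {token: prob / total for token, prob in kept.items()}


probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "xylophone": 0.05}
print(list(top_p_filter(probs, p=0.7)))  # ['the', 'a']
print(list(top_p_filter(probs, p=0.9)))  # ['the', 'a', 'cat']
```

The next token would then be sampled from the filtered set, so very unlikely tokens such as "xylophone" never get picked.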

Intermediatetop pnucleus samplingsamplinggenerationLLM settings

🔁

Repetition Control

Frequency Penalty

Explanation: Frequency penalty reduces the chance that the model repeats words or phrases it has already used many times. Higher frequency penalty discourages repetition.

Analogy: Like a teacher saying, "You already said that word too many times. Use different wording."

Higher penalty = less repeated wording

Intermediatefrequency penaltyrepetitiongeneration controlLLM settings

🧭

Topic Exploration

Presence Penalty

Explanation: Presence penalty encourages the model to introduce new topics or ideas instead of staying too close to what it has already mentioned. It penalizes tokens simply for appearing before, not based on how often they appeared.

Analogy: Like encouraging a speaker to bring up new points instead of circling around the same topic.

Higher penalty = more new ideas
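
Example: A sketch contrasting presence penalty with frequency penalty, using the common formulation in which both are subtracted from a token's raw score (logit). Providers may implement the details differently, and the scores below are invented.

```python
from collections import Counter


def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        adjusted[token] = (
            logit
            - frequency_penalty * count                   # grows with every repeat
            - presence_penalty * (1 if count > 0 else 0)  # flat, applied once seen
        )
    return adjusted


logits = {"cat": 2.0, "dog": 2.0, "fish": 2.0}
history = ["cat", "cat", "cat", "dog"]
print(apply_penalties(logits, history, frequency_penalty=0.5, presence_penalty=1.0))
# {'cat': -0.5, 'dog': 0.5, 'fish': 2.0}
```

"cat" is penalized more than "dog" only because of the frequency term; the presence term treats both the same, since each has appeared at least once.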

Intermediatepresence penaltytopic explorationgeneration controlLLM settings

🧑‍⚖️

Human-Like Intelligence

Turing Test

Explanation: The Turing Test is a way to evaluate whether a machine can respond so naturally that a human judge may not be able to tell whether they are talking to a person or a computer.

Analogy: Like texting with two hidden participants — one human and one machine — and trying to guess which one is which based only on their answers.

Human judge → hidden conversation → human or machine?

BeginnerTuring TestAI historyhuman-like intelligenceAI evaluation

🎯

AI Types

Narrow AI

Explanation: Narrow AI is AI designed to perform one specific task or a limited set of tasks. Most AI systems used today are narrow AI, including chatbots, recommendation engines, image recognition systems, and fraud detection models.

Analogy: Like a specialist doctor who is very good in one area, such as heart care, but is not trained to do every type of medical work.

Specific task → specialized AI system

Beginnernarrow AIspecialized AIAI typesapplied AI

🧠

AI Types

General AI

Explanation: General AI, also called AGI, refers to a theoretical AI system that could understand, learn, and perform many different intellectual tasks at a human level. It would not be limited to one narrow task.

Analogy: Like a person who can learn many different jobs — teacher, engineer, writer, analyst — instead of being trained for only one task.

Many tasks → one flexible intelligence

IntermediateAGIgeneral AIAI typeshuman-level AI

🚀

AI Types

Superintelligence

Explanation: Superintelligence is a hypothetical form of AI that would surpass human intelligence across many areas, such as reasoning, creativity, science, strategy, and problem-solving.

Analogy: Like comparing a beginner chess player to a grandmaster — except the AI would be far beyond human experts across many fields, not just one game.

Human-level intelligence → beyond-human intelligence

Advancedsuperintelligencefuture AIAGIAI risk

🧩

Rule-Based AI

Symbolic AI

Explanation: Symbolic AI is an older approach to artificial intelligence that uses explicit rules, logic, and symbols instead of learning patterns from large amounts of data. It tries to represent knowledge in a structured, rule-based way.

Analogy: Like giving a computer a detailed rulebook and asking it to follow the rules step by step instead of learning from examples.

Rules + logic → AI decision

Intermediatesymbolic AIrule-based AIexpert systemsAI history

⚖️

Model Generalization

Bias-Variance Tradeoff

Explanation: The bias-variance tradeoff explains two common ways a model can fail. High bias means the model is too simple and misses important patterns. High variance means the model is too sensitive to training data and may memorize noise instead of learning the real pattern.

Analogy: Like a student preparing for an exam. High bias is a student who oversimplifies everything and misses details. High variance is a student who memorizes exact practice questions but struggles when the exam is worded differently.

High bias = underfitting · High variance = overfitting

Intermediatebiasvarianceoverfittingunderfittinggeneralization

🔁

Model Evaluation

Cross-Validation

Explanation: Cross-validation is a technique for evaluating a model by splitting the data multiple ways. The model is trained and tested on different portions of the data to get a more reliable estimate of performance.

Analogy: Like taking several practice exams instead of trusting one test score. If you perform well across many practice exams, your understanding is probably more reliable.

Split data → train/test multiple times → average performance
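
Example: A minimal sketch of how k-fold cross-validation splits the data. It only produces index lists; training and scoring a model on each split is left out, and shuffling is skipped to keep the output predictable.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k roughly equal folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train, test in k_fold_indices(6, 3):
    print(train, test)
# [2, 3, 4, 5] [0, 1]
# [0, 1, 4, 5] [2, 3]
# [0, 1, 2, 3] [4, 5]
```

Every example is used for testing exactly once, and the final score is usually the average across the folds.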

Intermediatecross-validationevaluationtrainingvalidation

🔲

Classification Evaluation

Confusion Matrix

Explanation: A confusion matrix is a table that shows how well a classification model performed by counting correct and incorrect predictions. It includes true positives, false positives, true negatives, and false negatives.

Analogy: Like a detailed scoreboard. It does not just say the team won or lost; it shows exactly where the mistakes happened.

Actual vs Predicted → TP / FP / TN / FN
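
Example: A small sketch that counts the four cells of a confusion matrix for a yes/no classifier. The labels are made up, with 1 meaning positive and 0 meaning negative.

```python
def confusion_counts(actual, predicted, positive=1):
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct if it really was positive.
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted negative: a miss if it really was positive.
            counts["FN" if a == positive else "TN"] += 1
    return counts


actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
print(confusion_counts(actual, predicted))
# {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
```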

Beginnerconfusion matrixclassificationtrue positivefalse positiveevaluation

📉

Classification Evaluation

ROC Curve & AUC

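Explanation: An ROC curve shows how a classification model performs at different decision thresholds by plotting the true positive rate against the false positive rate. AUC summarizes the curve into one score for how well the model separates classes: 0.5 is no better than random guessing and 1.0 is perfect separation.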

Analogy: Like testing a smoke alarm at different sensitivity levels. You want it to catch real fires without too many false alarms.

Threshold changes → ROC curve → AUC score
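
Example: A sketch that computes AUC through an equivalent shortcut instead of drawing the curve: AUC equals the chance that a randomly chosen positive example gets a higher score than a randomly chosen negative one (ties count as half). The labels and scores are invented.

```python
def roc_auc(labels, scores):
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    # Compare every positive score with every negative score.
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```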

IntermediateROCAUCclassificationthresholdevaluation

👥

Combining Models

Ensemble Methods

Explanation: Ensemble methods combine multiple models to make a stronger prediction than a single model. Common ensemble approaches include bagging, boosting, and stacking.

Analogy: Like asking several experts for their opinions and combining their answers instead of trusting only one person.

Model 1 + Model 2 + Model 3 → stronger prediction

Intermediateensemblebaggingboostingstackingmodel combination

🎲

Regularization

Dropout

Explanation: Dropout is a regularization technique used during neural network training. It randomly turns off some neurons during each training step so the model does not rely too heavily on any single neuron or pathway. Dropout is switched off when the trained model makes predictions.

Analogy: Like making a basketball team practice while randomly benching a few players each time. Everyone learns to contribute instead of depending on one star player.

Training step → randomly disable neurons → stronger generalization
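
Example: A minimal sketch of "inverted" dropout, a common way to implement it: surviving values are scaled up during training so nothing needs to change at prediction time. The function name and inputs are assumptions for illustration.

```python
import random


def dropout(activations, rate=0.5, training=True, rng=random):
    """Randomly zero out values during training; pass them through otherwise."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    # Kept values are divided by the keep probability so the
    # expected total stays the same as without dropout.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so repeated runs match
print(dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, rng=rng))
print(dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, training=False))  # unchanged
```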

Intermediatedropoutregularizationoverfittingneural networksdeep learning

⚙️

Training Stability

Batch Normalization

Explanation: Batch normalization is a technique that normalizes the values flowing between layers of a neural network, using the mean and variance of each mini-batch during training. It helps training become more stable, often faster, and less sensitive to poor starting conditions.

Analogy: Like making sure every ingredient in a recipe is measured consistently before cooking. When the measurements stay stable, the final result becomes easier to control.

Layer values → normalize → more stable training

Intermediatebatch normalizationtraining stabilityneural networksdeep learningoptimization

🫥

Training Problem

Vanishing Gradient

Explanation: Vanishing gradient is a training problem where the error signal becomes very small as it moves backward through many neural network layers. When this happens, earlier layers learn very slowly or almost stop learning.

Analogy: Like passing a message through many people, but each person whispers it more softly. By the time it reaches the first person, the message is almost gone.

Backpropagation → smaller gradients → early layers learn slowly
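
Example: A rough back-of-the-envelope sketch of why the signal fades. It assumes each layer multiplies the error signal by at most 0.25, which is the largest possible slope of the sigmoid activation, and it ignores the effect of the weights.

```python
def gradient_scale(num_layers, per_layer_factor=0.25):
    """Upper bound on how much error signal survives num_layers layers,
    assuming each layer multiplies it by at most per_layer_factor."""
    return per_layer_factor ** num_layers


for depth in (1, 5, 10, 20):
    print(depth, gradient_scale(depth))
```

Under these assumptions, after 10 layers less than one millionth of the original signal is left for the earliest layers.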

Advancedvanishing gradientbackpropagationdeep networkstraining problemdeep learning

💥

Training Problem

Exploding Gradient

Explanation: Exploding gradient is a training problem where the error signal becomes extremely large as it moves backward through neural network layers. This can cause weight updates to become unstable and make training fail.

Analogy: Like trying to slightly adjust a shower temperature, but the knob suddenly jumps too far and makes the water scalding hot or freezing cold.

Backpropagation → huge gradients → unstable updates

Advancedexploding gradientbackpropagationtraining instabilityneural networksdeep learning

🧱

LLM Architecture

Decoder-only Architecture

Explanation: Decoder-only architecture is the model design used by GPT-style language models. It generates text one token at a time by looking at the previous tokens and predicting what should come next.

Analogy: Like writing a sentence from left to right, choosing the next word based on everything already written.

Previous tokens → predict next token → repeat

Intermediatedecoder-onlyGPTLLM architecturenext token prediction

📥

LLM Architecture

Encoder-only Architecture

Explanation: Encoder-only architecture is used by BERT-style models that focus on understanding text rather than generating long responses. These models read the input and create rich representations of its meaning.

Analogy: Like a careful reader who studies a paragraph deeply to understand what it means, but is not mainly trying to write the next paragraph.

Text input → encoded meaning representation

Intermediateencoder-onlyBERTlanguage understandingLLM architecture

🔄

LLM Architecture

Encoder-Decoder Architecture

Explanation: Encoder-decoder architecture uses one part of the model to read and understand the input, and another part to generate the output. It is common in translation, summarization, and text-to-text tasks.

Analogy: Like one person reading a document carefully and another person writing a clean summary from that understanding.

Encoder reads input → decoder writes output

Intermediateencoder-decoderT5BARTseq2seqLLM architecture

🎭

Training Objective

Masked Language Modeling

Explanation: Masked language modeling is a training method where some words or tokens are hidden, and the model learns to predict the missing pieces from the surrounding context.

Analogy: Like a fill-in-the-blank worksheet where the model learns by guessing the missing words.

The cat sat on the [MASK] → model predicts mat

Intermediatemasked language modelingMLMBERTtraining objective

➡️

Training Objective

Causal Language Modeling

Explanation: Causal language modeling trains a model to predict the next token using only the tokens that came before it. This is the core training style for many generative LLMs.

Analogy: Like autocomplete that can only look at what you have already typed, not future words.

Past tokens → next token

Intermediatecausal language modelingCLMautoregressiveGPTtraining objective

🔮

Training Objective

Next Token Prediction

Explanation: Next token prediction is the task of predicting the next small piece of text in a sequence. Many generative language models learn by practicing this task at massive scale.

Analogy: Like a very advanced autocomplete system that keeps guessing what should come next.

I like machine → learning

Beginnernext token predictiontokensLLM trainingautocomplete

🎯

Attention Mechanism

Self-Attention

Explanation: Self-attention lets each token in a sequence look at other tokens in the same sequence to understand which words matter most for meaning.

Analogy: Like reading a sentence and highlighting the other words that help explain the current word.

Each token ↔ related tokens
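
Example: A stripped-down sketch of scaled dot-product self-attention. As a simplification, each token vector is used directly as its own query, key, and value; real transformers first pass the vectors through learned projections. The token vectors are made up.

```python
import math


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """tokens: list of equal-length vectors, one per token."""
    d = len(tokens[0])
    outputs, all_weights = [], []
    for q in tokens:
        # Score this token against every token in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)  # attention weights sum to 1
        # New representation: weighted mix of all token vectors.
        mixed = [sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(tokens)
print(weights[0])  # token 0 attends most to itself and to token 2
```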

Intermediateself-attentionattentiontransformerLLM

👀

Attention Mechanism

Multi-Head Attention

Explanation: Multi-head attention runs multiple attention mechanisms in parallel, allowing the model to focus on different relationships in the text at the same time.

Analogy: Like several reviewers reading the same paragraph, where one focuses on grammar, another on meaning, and another on references.

Multiple attention heads → combined understanding

Advancedmulti-head attentionattentiontransformerLLM architecture

📍

Transformer Input

Positional Encoding

Explanation: Positional encoding adds information about token order so a transformer can understand sequence position. Without it, the model would know the words but not their order.

Analogy: Like numbering each word in a sentence so the model knows what came first, second, and third.

Token + position information → transformer input
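
Example: A sketch of one common scheme, the sinusoidal encoding from the original Transformer design. Many newer models use other approaches (such as learned or rotary position embeddings), so this is just one illustration.

```python
import math


def sinusoidal_position(position, d_model):
    """Return a d_model-length vector that encodes a token's position."""
    encoding = []
    for i in range(d_model):
        # Pairs of dimensions share a wavelength; even index uses sin, odd uses cos.
        angle = position / (10000 ** ((2 * (i // 2)) / d_model))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding


print(sinusoidal_position(0, 4))  # [0.0, 1.0, 0.0, 1.0]
print(sinusoidal_position(1, 4))
```

Each position gets a different vector, which is added to the token's embedding before it enters the transformer.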

Intermediatepositional encodingposition embeddingstransformertokens

🎲

Sampling Control

Top-K Sampling

Explanation: Top-K sampling limits the model to choosing from only the K most likely next tokens. This can reduce strange outputs by ignoring very unlikely choices.

Analogy: Like choosing dinner only from the top 5 most popular menu items instead of considering the entire menu.

All tokens → top K choices → sample next token
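
Example: A minimal sketch of Top-K sampling over a made-up probability table. Only the K most likely tokens stay in the running, and one of them is picked at random in proportion to its probability.

```python
import random


def top_k_sample(probs, k, rng=random):
    # Keep only the k most likely tokens.
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens = [token for token, _ in top]
    weights = [weight for _, weight in top]
    # random.choices treats the weights as relative, so no rescaling is needed.
    return rng.choices(tokens, weights=weights, k=1)[0]


probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "xylophone": 0.05}
rng = random.Random(0)  # fixed seed so repeated runs match
print([top_k_sample(probs, k=2, rng=rng) for _ in range(5)])
```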

Intermediatetop-ktop ksamplinggenerationLLM settings

🔦

Decoding Strategy

Beam Search

Explanation: Beam search is a decoding strategy that keeps several likely output paths at the same time and chooses the strongest sequence. It is often used when the goal is a high-probability, structured output.

Analogy: Like exploring several promising roads at once before choosing the best route to the destination.

Keep best paths → compare sequences → choose strongest output
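
Example: A toy sketch of beam search over a hand-written table of next-token probabilities. The table and the `next_probs` helper are invented; a real model would supply these probabilities.

```python
import math


def beam_search(next_probs, start, steps, beam_width=2):
    """next_probs(token) returns {next_token: probability}."""
    beams = [([start], 0.0)]  # each beam is (sequence, log-probability)
    for _ in range(steps):
        candidates = []
        for seq, logp in beams:
            for token, p in next_probs(seq[-1]).items():
                candidates.append((seq + [token], logp + math.log(p)))
        if not candidates:
            break
        # Keep only the best few paths for the next round.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]


table = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"z": 0.9, "w": 0.1},
}


def next_probs(token):
    return table.get(token, {})


print(beam_search(next_probs, "<s>", steps=2, beam_width=2))  # ['<s>', 'b', 'z']
```

Always taking the single best next token would start with "a" and end at probability 0.30, while keeping two paths finds "b" then "z" at 0.36.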

Advancedbeam searchbeam decodingdecodinggeneration

🧭

Embeddings

Embedding Model

Explanation: An embedding model is a model designed to convert text, images, or other data into numeric vectors that capture meaning. These vectors can then be used for similarity search, recommendations, clustering, and RAG.

Analogy: Like a machine that gives every idea a GPS coordinate so similar ideas end up close together on the map.

Text or data → embedding model → meaning vector

Beginnerembedding modelembeddingsvectorssemantic searchRAG

📐

Vector Comparison

Cosine Similarity

Explanation: Cosine similarity compares the direction of two vectors to estimate how similar their meanings are. It is commonly used to compare embeddings in semantic search and RAG systems.

Analogy: Like checking whether two arrows point in the same direction. The closer their direction, the more similar the ideas.

Vector A ↗ + Vector B ↗ = high similarity
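
Example: A short sketch of the cosine similarity formula: the dot product of two vectors divided by the product of their lengths. The vectors are tiny made-up examples; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(cosine_similarity([1, 0], [0, 1]))   # 0.0 (unrelated directions)
print(cosine_similarity([1, 2], [2, 4]))   # very close to 1 (same direction)
print(cosine_similarity([1, 2], [-1, -2])) # very close to -1 (opposite direction)
```

Only direction matters: [1, 2] and [2, 4] have different lengths but point the same way, so they count as highly similar.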

Intermediatecosine similarityembeddingsvectorssemantic searchRAG

⚖️

Retrieval Quality

Maximum Marginal Relevance

Explanation: Maximum Marginal Relevance, or MMR, is a retrieval method that balances relevance and diversity. It tries to return useful results without giving several nearly identical chunks.

Analogy: Like choosing search results that are all helpful, but not all repeating the same paragraph in different words.

Relevant results + diverse results → better retrieval set
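
Example: A minimal sketch of MMR selection. Each step picks the document with the best score of relevance minus redundancy, weighted by `lam`. The similarity numbers are invented; in practice they would come from comparing embeddings.

```python
def mmr_select(query_sim, doc_sim, k, lam=0.5):
    """query_sim[i]: similarity of doc i to the query.
    doc_sim[i][j]: similarity between docs i and j."""
    selected = []
    candidates = list(range(len(query_sim)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy: how close doc i is to anything already chosen.
            redundancy = max((doc_sim[i][j] for j in selected), default=0.0)
            return lam * query_sim[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query_sim = [0.9, 0.85, 0.6]
doc_sim = [[1.0, 0.95, 0.1],
           [0.95, 1.0, 0.2],
           [0.1, 0.2, 1.0]]
print(mmr_select(query_sim, doc_sim, k=2))  # [0, 2]
```

Document 1 is almost as relevant as document 0 but nearly identical to it, so MMR skips it in favor of the less similar document 2.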

AdvancedMMRmaximum marginal relevanceretrievaldiversityRAG

🔀

Search Strategy

Hybrid Search

Explanation: Hybrid search combines keyword search and semantic search. This helps systems find exact terms when needed while also understanding meaning when users phrase things differently.

Analogy: Like searching a library by both exact book title and topic meaning at the same time.

Keyword search + semantic search → stronger retrieval
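
Example: A sketch of one simple way to combine the two searches: a weighted blend of their scores. It assumes both score sets are already scaled to the range 0 to 1; real systems often use other fusion methods, and the document IDs and scores are made up.

```python
def hybrid_rank(keyword_scores, semantic_scores, alpha=0.5):
    """Blend two score dictionaries (doc id -> score between 0 and 1).
    alpha=1.0 means keyword only; alpha=0.0 means semantic only."""
    doc_ids = set(keyword_scores) | set(semantic_scores)
    combined = {
        doc: alpha * keyword_scores.get(doc, 0.0)
             + (1 - alpha) * semantic_scores.get(doc, 0.0)
        for doc in doc_ids
    }
    # Highest combined score first.
    return sorted(combined, key=combined.get, reverse=True)


keyword = {"doc_a": 0.9, "doc_b": 0.2}
semantic = {"doc_b": 0.8, "doc_c": 0.7}
print(hybrid_rank(keyword, semantic))  # ['doc_b', 'doc_a', 'doc_c']
```

"doc_b" wins because it shows up in both searches, even though it is not the top result in the keyword search.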

Intermediatehybrid searchkeyword searchsemantic searchRAGretrieval

🧮

Retrieval Quality

Reranking

Explanation: Reranking is a second retrieval step that reorders candidate results so the most useful or relevant chunks appear first before the LLM receives context.

Analogy: Like first collecting many possible resumes, then having a recruiter rank the best candidates before sending them to the hiring manager.

Retrieve candidates → rerank → send best context

IntermediatererankingrerankerretrievalRAGsearch quality

🗣️

Structured Retrieval

Self-Query Retrieval

Explanation: Self-query retrieval uses an LLM to convert a natural-language question into a structured search query with filters, such as date, category, source, or metadata constraints.

Analogy: Like asking a librarian to translate your plain-English question into exact database filters.

User question → structured query → filtered retrieval

Advancedself-query retrievalstructured querymetadata filtersRAG

📄

Chunk Retrieval

Parent Document Retriever

Explanation: A parent document retriever searches smaller chunks for accuracy but returns a larger parent document or section for better context. This helps avoid giving the LLM tiny fragments without enough surrounding meaning.

Analogy: Like finding one sentence in an index card, then reading the full page around it to understand the context.

Search small chunk → return larger parent context

Advancedparent document retrieverchunkingretrievalRAGcontext

🪞

Agent Improvement

Reflection

Explanation: Reflection is an agent pattern where the AI reviews its own output, identifies possible mistakes, and uses that feedback to improve the next attempt.

Analogy: Like proofreading your own essay, noticing weak points, and rewriting it before submitting.

Answer → review → improve → final answer

Intermediatereflectionagent reflectionself-improvementAI agents

🗺️

Goal Decomposition

Planning

Explanation: Planning is when an AI agent breaks a larger goal into smaller steps before acting. This helps the agent organize work instead of jumping straight into action.

Analogy: Like making a travel itinerary before starting a trip, so you know where to go first, second, and third.

Goal → steps → actions

Intermediateplanningagent planningtask decompositionAI agents

👥

Agent Collaboration

Multi-Agent System

Explanation: A multi-agent system uses more than one AI agent to solve a problem. Agents may collaborate, divide responsibilities, review each other, or compete to improve results.

Analogy: Like a project team where one person researches, another writes, another reviews, and another manages the plan.

Agent A + Agent B + Agent C → coordinated result

Advancedmulti-agentmulti-agent systemagent collaborationAI agents

🎼

Agent Coordination

Orchestration

Explanation: Orchestration is the coordination of agent steps, tools, memory, and workflows so the right action happens at the right time.

Analogy: Like a music conductor coordinating different instruments so they play together instead of making noise separately.

Plan + tools + memory + steps → coordinated workflow

Advancedorchestrationagent orchestrationworkflowAI agents

👍

Alignment Training

Reinforcement Learning from Human Feedback

Explanation: Reinforcement Learning from Human Feedback, or RLHF, is a training approach where human preferences are used to guide a model toward responses people consider more helpful, safe, or aligned.

Analogy: Like a coach watching practice and saying which responses were better, so the model learns what people prefer.

Model outputs → human preference rankings → reward model → reinforcement learning

AdvancedRLHFhuman feedbackalignmentAI trainingAI agents

📜

Alignment Method

Constitutional AI

Explanation: Constitutional AI is an approach where a model is guided by a written set of principles or rules. The model critiques and revises its own responses against those principles, which helps shape safer and more helpful behavior.

Analogy: Like giving an assistant a constitution to follow when deciding what is acceptable or not.

Principles → model critique → safer response

Advancedconstitutional AIalignmentAI safetyprinciplesAI agents

🧩

Training Method

Self-Supervised Learning

Explanation: Self-supervised learning is a training approach where the model learns from the structure of the data itself instead of relying on human-provided labels. The system creates its own learning signal from the data.

Analogy: Like making your own quiz from a textbook. You hide part of the material, try to guess it, and learn from whether you were right.

Raw data → create training signal → learn patterns

Intermediateself-supervised learningtraininglabelsrepresentation learning

🧲

Representation Learning

Contrastive Learning

Explanation: Contrastive learning teaches a model by pulling similar examples closer together in representation space and pushing different examples farther apart.

Analogy: Like organizing photos by putting similar pictures in the same pile and moving unrelated pictures to different piles.

Similar examples closer · different examples farther
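Code sketch: one common contrastive objective is the triplet loss, shown here in plain Python. The function names, the margin value, and the 2-D points are illustrative assumptions, not a specific library's API.

```python
import math

def distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Loss is zero once the negative is at least `margin` farther away
    from the anchor than the positive is."""
    return max(0.0, distance(anchor, positive) - distance(anchor, negative) + margin)

anchor = (0.0, 0.0)
good = triplet_loss(anchor, positive=(0.0, 1.0), negative=(3.0, 4.0))
bad = triplet_loss(anchor, positive=(3.0, 4.0), negative=(0.0, 1.0))
print(good, bad)
```

Training would adjust the embeddings to push the second, higher loss toward zero.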

Advancedcontrastive learningrepresentation learningembeddingstraining

🎯

Low-Data Learning

Few-Shot Learning

Explanation: Few-shot learning is the ability to learn or adapt from only a small number of examples. In LLMs, it often means giving a few examples in the prompt so the model follows the pattern.

Analogy: Like understanding a new card game after watching only two or three rounds.

Few examples → model adapts pattern
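Code sketch: a small Python helper that builds a few-shot prompt. The task wording and the function name are made up for illustration; the resulting string would be sent to whatever LLM you use.

```python
def build_few_shot_prompt(examples, new_input):
    """Put a few labeled examples before the new input so the model can copy the pattern."""
    lines = ["Classify the sentiment as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {new_input}")
    lines.append("Sentiment:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    [("Loved it, works great.", "positive"), ("Broke after one day.", "negative")],
    "Fast shipping and easy setup.",
)
print(prompt)
```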

Intermediatefew-shot learninglow-data learningexamplesprompting

📚

Training Strategy

Curriculum Learning

Explanation: Curriculum learning is a training strategy where the model starts with easier examples and gradually moves to harder ones.

Analogy: Like learning math by starting with addition, then multiplication, then algebra, instead of starting with calculus on day one.

Easy examples → medium examples → hard examples

Advancedcurriculum learningtraining strategymodel training

📉

Optimization

Learning Rate Scheduling

Explanation: Learning rate scheduling changes the learning rate during training. This can help the model learn quickly early on and make smaller, more careful updates later.

Analogy: Like driving faster on an open highway, then slowing down as you approach a parking spot.

Learning rate changes over training time
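Code sketch: cosine decay is one widely used schedule. This plain-Python version is a simplified illustration; the function name and the default rates are assumptions.

```python
import math

def cosine_lr(step, total_steps, max_lr=1e-3, min_lr=1e-5):
    """Start at max_lr and glide smoothly down to min_lr by the last step."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))

for step in (0, 25, 50, 75, 100):
    print(step, round(cosine_lr(step, total_steps=100), 6))
```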

Intermediatelearning rate schedulingoptimizationtraininglearning rate

🔥

Optimization

Warmup Steps

Explanation: Warmup steps gradually increase the learning rate at the beginning of training so the model does not start with updates that are too aggressive.

Analogy: Like warming up before running at full speed to avoid injury.

Small learning rate → gradually increase → normal training
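Code sketch: a linear warmup in plain Python. The function name and defaults are illustrative assumptions; real training code usually combines warmup with a decay schedule afterward.

```python
def warmup_lr(step, warmup_steps=100, base_lr=1e-3):
    """Ramp the learning rate linearly, then hold it at base_lr."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr

for step in (0, 49, 99, 500):
    print(step, warmup_lr(step))
```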

Intermediatewarmup stepslearning rateoptimizationtraining

➕

Training Efficiency

Gradient Accumulation

Explanation: Gradient accumulation lets training collect gradients over multiple smaller batches before applying one update. This can simulate a larger batch size when memory is limited.

Analogy: Like collecting several small payments before making one larger deposit.

Small batch + small batch + small batch → one update
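Code sketch: a tiny pure-Python check that accumulating scaled gradients from equal-sized micro-batches matches one full-batch update. The model (y = w * x), data, and learning rate are toy assumptions.

```python
def grad_mse(w, batch):
    """Gradient of mean squared error for the model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w, lr = 0.0, 0.01

# One update from the full batch of 4 examples.
w_full = w - lr * grad_mse(w, data)

# The same update built from two micro-batches of 2 examples each.
micro_batches = [data[:2], data[2:]]
accumulated = 0.0
for batch in micro_batches:
    # Scale each micro-batch gradient so the running sum is an average.
    accumulated += grad_mse(w, batch) / len(micro_batches)
w_accum = w - lr * accumulated

print(w_full, w_accum)
```

Only one micro-batch needs to fit in memory at a time, which is the point of the technique.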

Advancedgradient accumulationbatch sizetraining efficiencyoptimization

⚡

Training Efficiency

Mixed Precision Training

Explanation: Mixed precision training uses lower-precision numbers for some calculations to make training faster and use less memory while keeping enough accuracy.

Analogy: Like using shorthand notes to work faster while still preserving the important meaning.

Lower precision math → faster training + less memory

Advancedmixed precision trainingFP16BF16training efficiency

🧪

Model Compression

Distillation

Explanation: Distillation is a technique where a smaller model is trained to imitate a larger, more capable model. The goal is to keep much of the quality while making the model cheaper and faster.

Analogy: Like a student who learns from an expert teacher and can then answer most questions faster and more cheaply than the expert.

Large teacher model → smaller student model

Intermediatedistillationknowledge distillationmodel compressionsmall models

📦

Model Compression

Quantization

Explanation: Quantization reduces the numerical precision of model weights or calculations so the model uses less memory and can run faster.

Analogy: Like rounding numbers to save space while keeping the answer close enough for practical use.

High precision numbers → lower precision numbers → smaller model
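Code sketch: symmetric int8 quantization in plain Python. This is a simplified illustration with hypothetical function names; production schemes often quantize per channel or per block.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q)
print(restored)
```

The small value 0.003 rounds to 0, which shows the precision that is traded away for the smaller size.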

Intermediatequantizationmodel compressioninference optimizationLLM

✂️

Model Compression

Pruning

Explanation: Pruning removes unnecessary weights, neurons, or connections from a model to make it smaller or faster while trying to preserve performance.

Analogy: Like trimming dead branches from a tree so the healthy parts can remain strong.

Large model → remove unnecessary parts → smaller model
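Code sketch: magnitude pruning, one simple pruning rule, in plain Python. The function name and the 50% sparsity are assumptions for illustration; ties at the threshold may prune slightly more than requested.

```python
def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude weights."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

pruned = magnitude_prune([0.9, -0.05, 0.3, 0.01], sparsity=0.5)
print(pruned)
```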

Intermediatepruningmodel compressionneural networksoptimization

❓

Language Model Metric

Perplexity

Explanation: Perplexity is a metric used to evaluate language models by measuring how surprised the model is by a sequence of text. Lower perplexity usually means the model is better at predicting the text.

Analogy: Like reading a sentence and asking, "How unexpected was the next word?" If the words are easy to predict, the model is less confused.

Lower perplexity = less surprise
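Code sketch: perplexity is the exponential of the average negative log-probability the model gave to each actual token. The probabilities below are made-up numbers for illustration.

```python
import math

def perplexity(token_probs):
    """token_probs: the probability the model assigned to each correct token."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

confident = perplexity([0.9, 0.8, 0.95])
unsure = perplexity([0.2, 0.1, 0.3])
print(confident, unsure)
```

A model that always spreads its guess evenly over 4 options has a perplexity of about 4.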

Advancedperplexitylanguage modelevaluationLLM metric

🌐

Text Generation Metric

BLEU Score

Explanation: BLEU Score is a metric often used to evaluate machine translation by measuring how many words and short phrases (n-grams) in the generated text also appear in one or more reference translations.

Analogy: Like comparing a student's translation to an expert translation and checking how many phrases overlap.

Generated translation vs reference translation → BLEU
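Code sketch: the core building block of BLEU is clipped n-gram precision, shown here in plain Python. This is only part of the metric (full BLEU combines several n-gram sizes and adds a brevity penalty), and the function names are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference, with repeats capped."""
    cand, ref = ngrams(candidate.split(), n), ngrams(reference.split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total

reference = "the cat is on the mat"
print(clipped_precision("the cat sat on the mat", reference, 1))
print(clipped_precision("the cat sat on the mat", reference, 2))
print(clipped_precision("the the the the", reference, 1))
```

Clipping is why repeating "the" four times does not earn a perfect score.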

AdvancedBLEUtranslationtext generationevaluation

📄

Text Generation Metric

ROUGE Score

Explanation: ROUGE Score is a metric often used to evaluate summaries by measuring how much of a reference summary's wording (words and short phrases) also appears in the generated summary.

Analogy: Like checking whether a student's summary included the same important points as the teacher's model summary.

Generated summary vs reference summary → ROUGE
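Code sketch: ROUGE-1 (single-word overlap) in plain Python. This is a simplified illustration with a hypothetical function name; real toolkits also handle stemming and longer phrases.

```python
from collections import Counter

def rouge1(candidate, reference):
    """Word-overlap recall, precision, and F1 between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision, "f1": f1}

scores = rouge1("the cat sat", "the cat sat on the mat")
print(scores)
```

Recall is 0.5 here because the generated summary covers half of the reference words.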

AdvancedROUGEsummarizationtext generationevaluation

🧪

Live Evaluation

A/B Test

Explanation: An A/B test compares two versions of a model, prompt, feature, or user experience with real users to see which performs better.

Analogy: Like showing two store layouts to different customers and measuring which one leads to more purchases.

Version A vs Version B → compare results
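Code sketch: one standard way to compare two conversion rates is a two-proportion z-test, written here with only the Python standard library. The visitor and conversion counts are invented for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return the z-score and two-sided p-value for rate B versus rate A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(round(z, 2), round(p_value, 3))
```

A small p-value suggests the difference between A and B is unlikely to be random noise.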

BeginnerA/B testexperimentationevaluationproduct

🔒

Evaluation Data

Holdout Set

Explanation: A holdout set is data kept separate from training and tuning so it can provide a more honest final check of model performance.

Analogy: Like keeping a sealed final exam that students cannot practice on before test day.

Training/tuning data separate from final holdout data

Intermediateholdout setevaluation datamodel testinggeneralization

🧪

Evaluation Data

Validation Set

Explanation: A validation set is data used during model development to tune choices like hyperparameters, thresholds, or model versions without touching the final test set.

Analogy: Like using practice exams to adjust your study strategy before taking the final exam.

Train → validate/tune → final test
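Code sketch: a simple shuffled split into training, validation, and test sets using the standard library. The 70/15/15 fractions and the function name are assumptions; the right split depends on the dataset.

```python
import random

def train_val_test_split(examples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle once with a fixed seed, then carve off test and validation sets."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_frac)
    n_val = round(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

Tune on `val` as often as needed, and touch `test` only for the final check.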

Beginnervalidation setevaluation datahyperparameter tuningmodel development

⚖️

AI Evaluation

LLM-as-Judge

Explanation: LLM-as-Judge means using a language model to evaluate outputs from another model or system. It can help score quality, helpfulness, correctness, tone, or adherence to instructions.

Analogy: Like asking one expert reviewer to grade another assistant's answer using a rubric.

Model output → judge model → score or feedback

IntermediateLLM-as-JudgeAI evaluationrubricquality scoring

🗃️

Model Management

Model Registry

Explanation: A model registry is a centralized place to store, version, track, and manage trained models before and after deployment.

Analogy: Like a library catalog for model versions, showing which model is approved, which is experimental, and which is currently in production.

Train model → register version → deploy selected version

Intermediatemodel registryMLOpsmodel versioningdeployment

🏪

Feature Management

Feature Store

Explanation: A feature store is a system for storing, managing, and serving reusable model features consistently for training and production inference.

Analogy: Like a shared pantry where every team uses the same approved ingredients instead of preparing different versions separately.

Raw data → features → training and inference

Advancedfeature storefeaturesMLOpstraininginference

🐤

Release Strategy

Canary Deployment

Explanation: Canary deployment gradually releases a new model or feature to a small percentage of users before rolling it out to everyone.

Analogy: Like testing a new recipe with a few customers before adding it to the full restaurant menu.

Small traffic → monitor → full rollout
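Code sketch: one simple way to send a fixed share of users to the new version is to hash the user ID into a bucket from 0 to 99. The function name and the user IDs are hypothetical.

```python
import hashlib

def route(user_id, canary_percent):
    """Send a stable slice of users to the canary; the same user always gets the same answer."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "canary" if bucket < canary_percent else "stable"

share = sum(route(f"user-{i}", 5) == "canary" for i in range(10000)) / 10000
print(share)
```

Raising `canary_percent` step by step (for example 5, then 25, then 100) completes the rollout.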

Intermediatecanary deploymentrelease strategyMLOpsdeployment

🔵

Release Strategy

Blue-Green Deployment

Explanation: Blue-green deployment runs two environments: the current version and the new version. Traffic can switch from one to the other when the new version is ready.

Analogy: Like opening a new checkout lane and moving customers over only after confirming it works smoothly.

Blue environment → switch traffic → Green environment

Intermediateblue-green deploymentrelease strategydeploymentMLOps

🔄

Automated Retraining

Continuous Training

Explanation: Continuous training is the process of automatically retraining models as new data becomes available, usually as part of an MLOps pipeline.

Analogy: Like a student who keeps studying new lessons every week instead of relying only on old knowledge.

New data → retrain → evaluate → deploy

Advancedcontinuous trainingretrainingMLOpsautomation

🌊

Monitoring

Data Drift

Explanation: Data drift happens when the input data seen by a model changes over time compared to the data it was trained on.

Analogy: Like a store that trained its sales forecast on old customer habits, but customer behavior changes after a new competitor opens nearby.

Training data distribution ≠ current input data
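Code sketch: the Population Stability Index (PSI) is one common way to put a number on drift for a single numeric feature. This plain-Python version is a simplified illustration; the bin count and the 1e-6 floor are assumptions.

```python
import math

def psi(expected, actual, bins=5):
    """Compare two samples of one feature; 0 means identical bin shares."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[max(0, min(bins - 1, idx))] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

training_values = list(range(100))
shifted_values = [v + 50 for v in training_values]
print(psi(training_values, training_values))
print(psi(training_values, shifted_values))
```

Larger values mean the current inputs look less like the training inputs.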

Intermediatedata driftmonitoringMLOpsmodel performance

🔀

Monitoring

Concept Drift

Explanation: Concept drift happens when the relationship between inputs and outputs changes over time, even if the input data still looks similar.

Analogy: Like credit risk changing during an economic downturn: the same income and debt profile may no longer mean the same level of risk.

Same inputs → changed meaning → changed output relationship

Advancedconcept driftmonitoringMLOpsmodel drift

🔁

System Behavior

Feedback Loop

Explanation: A feedback loop happens when a model's outputs influence future user behavior or future data, which can then affect later model performance.

Analogy: Like a recommendation system showing certain videos, users watching more of them, and the system learning to recommend even more of the same type.

Model output → user behavior → future data → model impact

Intermediatefeedback loopMLOpsmodel behaviorproduct

🪪

Documentation

Model Cards

Explanation: Model cards are documents that describe a model's purpose, intended use, performance, limitations, risks, and responsible-use guidance.

Analogy: Like a nutrition label for an AI model, showing what it is for, what it contains, and what warnings to know before using it.

Model details → performance → limitations → intended use

Beginnermodel cardsdocumentationresponsible AIMLOps

⚖️

Fairness Measurement

Algorithmic Fairness

Explanation: Algorithmic fairness is the practice of measuring and reducing unfair outcomes produced by AI systems. It looks at whether different groups are treated consistently and appropriately by a model or decision process.

Analogy: Like checking whether the rules of a game are fair for every team, not just whether the final score looks reasonable.

Model outcomes → compare groups → fairness check

Intermediatealgorithmic fairnessfairnessbiasresponsible AI

⚠️

Fairness Risk

Disparate Impact

Explanation: Disparate impact happens when a system appears neutral but its outcomes disproportionately harm or exclude a protected or sensitive group.

Analogy: Like a rule that applies to everyone on paper but ends up blocking one group much more often in practice.

Neutral rule → uneven outcome across groups
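Code sketch: a common first check is the ratio between the lower and higher selection rates of two groups. The decisions below are invented, and the 0.8 cutoff mirrors the "four-fifths" rule of thumb from US employment guidance; treat it as an assumption, not a legal test.

```python
def selection_rate(decisions):
    """decisions: list of 1 (selected) and 0 (not selected)."""
    return sum(decisions) / len(decisions)

def disparate_impact_ratio(group_a, group_b):
    """Lower selection rate divided by the higher one (1.0 means equal rates)."""
    rate_a, rate_b = selection_rate(group_a), selection_rate(group_b)
    high = max(rate_a, rate_b)
    return 1.0 if high == 0 else min(rate_a, rate_b) / high

group_a = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]  # 6 of 10 approved
group_b = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # 3 of 10 approved
ratio = disparate_impact_ratio(group_a, group_b)
print(ratio, "needs review" if ratio < 0.8 else "ok")
```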

Advanceddisparate impactfairnessbiasresponsible AI

🧯

Safety Control

Toxicity Filtering

Explanation: Toxicity filtering detects and blocks harmful, abusive, hateful, or unsafe language before it reaches users or systems.

Analogy: Like a safety screen that catches dangerous language before it enters a classroom discussion.

Input/output text → safety filter → allow or block

Beginnertoxicity filteringtoxicity detectionsafetycontent moderation

🛡️

Safety Control

Content Moderation

Explanation: Content moderation is the process of reviewing, filtering, or managing user-generated content to enforce safety rules, platform policies, or community standards.

Analogy: Like a moderator keeping a public meeting respectful and removing harmful comments.

User content → policy check → allow, flag, or remove

Beginnercontent moderationmoderationsafetyresponsible AI

🔍

Transparency

Interpretability vs Explainability

Explanation: Interpretability focuses on understanding how a model works internally, while explainability focuses on giving understandable reasons for a model's decision or output.

Analogy: Interpretability is opening the engine to see how it works. Explainability is telling the driver why the car slowed down.

Interpretability = inside model · Explainability = understandable reason

Advancedinterpretabilityexplainabilitytransparencyresponsible AI

🍋

Explanation Method

LIME

Explanation: LIME is a technique that explains an individual model prediction by approximating the model locally with a simpler, easier-to-understand model.

Analogy: Like zooming into one neighborhood and using a simple local map to explain what is happening there, even if the full city map is complex.

Complex model prediction → local simple explanation

AdvancedLIMEexplainabilityinterpretabilitymodel explanation

📊

Feature Importance

SHAP

Explanation: SHAP is an explanation method based on game theory that estimates how much each feature contributed to a model prediction.

Analogy: Like dividing credit among team members for a win based on how much each person contributed.

Prediction → feature contributions → explanation
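Code sketch: the game-theory idea behind SHAP is the Shapley value, computed exactly here for a tiny toy model by trying every feature ordering. The model, feature names, and baseline are made up; practical SHAP tools rely on faster approximations because exact enumeration grows too quickly.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Average each feature's marginal contribution over all orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # switch this feature on
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_model(x):
    # Hypothetical scoring rule with one interaction term.
    return 2.0 * x["income"] + 1.0 * x["age"] + x["income"] * x["debt"]

instance = {"income": 3.0, "age": 2.0, "debt": 1.0}
baseline = {"income": 0.0, "age": 0.0, "debt": 0.0}
contributions = shapley_values(toy_model, instance, baseline)
print(contributions)
```

The contributions add up to the gap between the prediction (11) and the baseline prediction (0).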

AdvancedSHAPfeature importanceexplainabilitymodel explanation

🧨

Security Risk

Adversarial Attack

Explanation: An adversarial attack uses carefully designed inputs to trick an AI system into making a wrong prediction or unsafe decision.

Analogy: Like creating an optical illusion specifically designed to fool a computer vision model.

Tiny input change → wrong model output

Advancedadversarial attackAI securityrobustnessresponsible AI

🕵️

Privacy Attack

Membership Inference

Explanation: Membership inference is an attack that tries to determine whether a specific person's data or record was included in a model's training data.

Analogy: Like trying to guess whether someone's private file was part of a confidential training folder by observing how the model responds.

Model behavior → infer training membership

Advancedmembership inferenceprivacy attackAI securityresponsible AI

🔄

Privacy Attack

Model Inversion

Explanation: Model inversion is an attack that attempts to reconstruct sensitive training data or private attributes by querying or analyzing a model.

Analogy: Like trying to recreate a secret recipe by tasting the final dish many times.

Model access → infer hidden data

Advancedmodel inversionprivacy attackAI securityresponsible AI

🧬

Data Generation

Synthetic Data

Explanation: Synthetic data is artificially generated data that imitates real data. It can be used for testing, privacy protection, data balancing, or simulation when real data is limited or sensitive.

Analogy: Like using realistic practice patients in medical training instead of real patient records.

Real patterns → generated data → safer testing/training

Intermediatesynthetic dataprivacydata generationtestingresponsible AI

💸

Cost Management

LLM Cost Economics

Explanation: LLM cost economics means understanding how token usage, model choice, request volume, caching, and output length affect the cost of running an AI product.

Analogy: Like tracking electricity usage in a factory. The machines may be powerful, but every minute they run has a cost.

Tokens + model price + traffic volume → AI cost
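Code sketch: a back-of-the-envelope cost calculator. Every number here is hypothetical (including the per-million-token prices); check your provider's current pricing before relying on an estimate.

```python
def monthly_cost(requests_per_day, input_tokens, output_tokens,
                 input_price_per_million, output_price_per_million, days=30):
    """Estimate monthly spend from traffic, token counts, and token prices."""
    per_request = (input_tokens * input_price_per_million
                   + output_tokens * output_price_per_million) / 1_000_000
    return per_request * requests_per_day * days

estimate = monthly_cost(
    requests_per_day=10_000,
    input_tokens=1_500,
    output_tokens=500,
    input_price_per_million=1.0,   # assumed price in dollars
    output_price_per_million=4.0,  # assumed price in dollars
)
print(round(estimate, 2))
```

With these assumed numbers the estimate comes to about 1,050 dollars per month.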

BeginnerLLM costtoken pricingAI productcost management

♻️

Cost Optimization

Prompt Caching

Explanation: Prompt caching reuses repeated parts of prompts so the system can reduce cost and latency when the same context is sent many times.

Analogy: Like preparing common ingredients ahead of time instead of chopping them again for every order.

Repeated prompt context → cache hit → lower cost/latency

Intermediateprompt cachingcost optimizationlatencyLLM

🧮

Cost Control

Token Budgeting

Explanation: Token budgeting is planning how many tokens are used for instructions, user input, retrieved context, and model output so the app stays within cost and context limits.

Analogy: Like budgeting money before shopping so you know how much can go to groceries, gas, and savings.

Prompt + context + answer ≤ token budget
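Code sketch: a simple budget calculation in Python. The token counts and function names are illustrative assumptions; a real app would measure tokens with the model's own tokenizer.

```python
def context_budget(context_window, instructions, user_input, reserved_output):
    """Tokens left for retrieved context after the fixed parts are reserved."""
    remaining = context_window - instructions - user_input - reserved_output
    return max(0, remaining)

def fit_chunks(chunk_token_counts, budget):
    """Keep retrieved chunks in ranked order until the budget would be exceeded."""
    kept, used = [], 0
    for index, count in enumerate(chunk_token_counts):
        if used + count > budget:
            break
        kept.append(index)
        used += count
    return kept, used

budget = context_budget(context_window=8000, instructions=500,
                        user_input=300, reserved_output=1000)
kept, used = fit_chunks([3000, 2500, 1500], budget)
print(budget, kept, used)
```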

Beginnertoken budgetingtokenscost controlcontext window

🚦

Usage Control

Rate Limiting

Explanation: Rate limiting controls how many requests a user, app, or system can make within a certain time period. It protects cost, reliability, and system capacity.

Analogy: Like limiting how many people can enter a store per minute so the store does not become overcrowded.

User requests → limit check → allow or slow down
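Code sketch: a token bucket is one common rate-limiting design. This minimal Python class is an illustration with hypothetical names; the `clock` parameter exists only so the behavior can be demonstrated without waiting.

```python
import time

class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should wait."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

fake_time = [0.0]
bucket = TokenBucket(capacity=2, refill_per_second=1.0, clock=lambda: fake_time[0])
results = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_time[0] = 1.0  # one second later, one token has been refilled
results.append(bucket.allow())
print(results)
```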

Beginnerrate limitingAPIusage controlcost control

⚓

Trust & Verification

Grounding

Explanation: Grounding connects AI outputs to trusted sources, retrieved evidence, or known facts so answers are easier to verify and less likely to be unsupported.

Analogy: Like requiring a student to cite the textbook page instead of giving an answer from memory.

AI answer + supporting source = grounded response

IntermediategroundingRAGtrustverificationAI product

💬

Product Improvement

User Feedback Loop

Explanation: A user feedback loop collects user ratings, corrections, comments, or behavior signals and uses them to improve the AI product over time.

Analogy: Like customers reviewing a service so the business can learn what worked, what failed, and what to improve.

User feedback → analysis → product improvement

Intermediateuser feedback loopproduct improvementAI productfeedback

👻

Safe Rollout

Shadow Mode

Explanation: Shadow mode runs an AI system in the background without affecting real decisions. Teams can compare its outputs against the current process before trusting it in production.

Analogy: Like a trainee making recommendations that are reviewed silently but not used until they prove reliable.

AI runs silently → compare results → decide rollout

Intermediateshadow modesafe rolloutevaluationAI product

🎚️

Decision Safety

Confidence Thresholding

Explanation: Confidence thresholding means allowing an AI system to act automatically only when its confidence is high enough. Low-confidence cases can be sent to a human or handled more cautiously.

Analogy: Like asking for manager approval when an employee is unsure instead of letting them make a risky decision alone.

High confidence → automate · Low confidence → review
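Code sketch: the routing rule as a small Python function. The 0.9 threshold and the action names are assumptions; the right cutoff depends on how costly a wrong automatic decision is.

```python
def route_decision(label, confidence, threshold=0.9):
    """Automate only when the model is confident enough; otherwise ask a human."""
    if confidence >= threshold:
        return {"action": "automate", "label": label}
    return {"action": "human_review", "label": label}

print(route_decision("approve", 0.97))
print(route_decision("approve", 0.62))
```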

Intermediateconfidence thresholddecision safetyhuman reviewAI product

🖥️

AI Hardware

GPU

Explanation: A GPU, or graphics processing unit, is a chip designed to perform many calculations in parallel. GPUs are widely used to speed up AI training and inference.

Analogy: Like a factory with thousands of workers doing similar small tasks at the same time instead of one person doing them one by one.

Many parallel calculations → faster AI workloads

BeginnerGPUhardwareAI infrastructuretraininginference

🔲

AI Hardware

TPU

Explanation: A TPU, or tensor processing unit, is a specialized chip designed to accelerate machine learning workloads, especially tensor operations used in neural networks.

Analogy: Like a custom-built machine designed for one factory job, making that job faster and more efficient than a general-purpose machine.

Tensor operations → TPU acceleration

IntermediateTPUtensor processing unitAI hardwaremachine learning

📱

AI Hardware

NPU

Explanation: An NPU, or neural processing unit, is a chip designed to run AI tasks efficiently on devices such as phones, laptops, and edge hardware.

Analogy: Like a small built-in AI helper inside your device that handles AI tasks without needing a large server.

Device AI task → NPU → efficient local processing

BeginnerNPUneural processing unitedge AIAI hardware

⚙️

GPU Computing

CUDA

Explanation: CUDA is NVIDIA's platform and programming model that lets software use NVIDIA GPUs for parallel computing tasks, including AI training and inference.

Analogy: Like a special language that helps your software talk efficiently to NVIDIA GPU workers.

Software → CUDA → NVIDIA GPU compute

IntermediateCUDANVIDIAGPUparallel computingAI infrastructure

🚀

LLM Serving

vLLM

Explanation: vLLM is a high-throughput inference engine designed to serve large language models efficiently, especially when many users are sending requests.

Analogy: Like a fast restaurant kitchen optimized to serve many orders at once without wasting counter space or staff time.

Many LLM requests → efficient serving engine → faster responses

AdvancedvLLMLLM servinginferenceAI infrastructure

📦

Model Portability

ONNX

Explanation: ONNX is an open model format that helps move machine learning models between different frameworks, tools, and runtimes.

Analogy: Like a universal file format for AI models, similar to how PDF helps documents open across many systems.

Model from framework A → ONNX → runtime B

IntermediateONNXmodel portabilitydeploymentAI infrastructure

⚡

Inference Optimization

TensorRT

Explanation: TensorRT is NVIDIA's toolkit for optimizing trained deep learning models so they run faster and more efficiently during inference on NVIDIA hardware.

Analogy: Like tuning a race car engine so the same car can run faster on the track.

Trained model → TensorRT optimization → faster inference

AdvancedTensorRTNVIDIAinference optimizationAI infrastructure

🧩

Efficient Fine-Tuning

LoRA

Explanation: LoRA, or Low-Rank Adaptation, is a parameter-efficient fine-tuning method that trains small adapter weights instead of updating the entire large model.

Analogy: Like adding a small specialized attachment to a machine instead of rebuilding the whole machine.

Large model frozen + small adapters trained
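Code sketch: LoRA keeps the original weight matrix W frozen and learns two small matrices B and A, so the effective weight is W + (alpha / r) * B @ A. The plain-Python helpers and tiny matrices below are illustrative assumptions, not a library API.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(W, A, B, alpha, r):
    """W stays frozen; B has shape (d_out, r) and A has shape (r, d_in)."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

def trainable_params(d_out, d_in, r):
    """Compare full fine-tuning with LoRA for one weight matrix."""
    return {"full": d_out * d_in, "lora": r * (d_out + d_in)}

W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # shape (2, 1)
A = [[3.0, 4.0]]     # shape (1, 2)
adapted = lora_effective_weight(W, A, B, alpha=2, r=1)
print(adapted)
print(trainable_params(4096, 4096, r=8))
```

For a 4096 by 4096 matrix with rank 8, the adapters hold 65,536 values instead of about 16.8 million.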

IntermediateLoRAlow-rank adaptationfine-tuningefficient training

🗜️

Efficient Fine-Tuning

QLoRA

Explanation: QLoRA is a memory-efficient version of LoRA that keeps the frozen base model in quantized (4-bit) form and trains small LoRA adapters on top of it, so large models can be fine-tuned with much less GPU memory.

Analogy: Like compressing a large instruction manual so it can fit on a smaller desk while still allowing you to add notes.

Quantized model + LoRA adapters → memory-efficient fine-tuning

AdvancedQLoRALoRAquantizationfine-tuningAI infrastructure

✨

Model Scale

Emergent Abilities

Explanation: Emergent abilities are capabilities that show up once AI models reach a large enough scale, even though those abilities were weak or absent in smaller models.

Analogy: Like a team suddenly solving problems that no individual member could solve alone once the team becomes large and coordinated enough.

Larger scale → new capability appears

Advancedemergent abilitiesmodel scaleLLMadvanced AI

📈

Model Scaling

Scaling Laws

Explanation: Scaling laws describe patterns in how model performance changes as model size, data size, and compute increase.

Analogy: Like studying how a car's speed changes when you improve the engine, fuel, and road quality together.

More data + larger model + more compute → performance trend

Advancedscaling lawsmodel sizedatacomputeAI research

✍️

Reasoning Method

Chain-of-Draft

Explanation: Chain-of-Draft is a concise reasoning approach where a model uses short intermediate notes instead of long step-by-step reasoning.

Analogy: Like writing quick scratch notes on the side of a math problem instead of writing a full essay solution.

Short draft notes → final answer

Advancedchain-of-draftreasoningpromptingadvanced AI

🌳

Reasoning Method

Tree-of-Thoughts

Explanation: Tree-of-Thoughts is a reasoning approach where a model explores multiple possible solution paths before choosing the best one.

Analogy: Like exploring several branches of a decision tree before deciding which route is most promising.

Idea branch A / B / C → evaluate → choose path

Advancedtree-of-thoughtsreasoningpromptingadvanced AI

🕸️

Advanced Retrieval

GraphRAG

Explanation: GraphRAG is a retrieval approach that uses knowledge graphs or entity relationships to improve how information is found and connected before generating an answer.

Analogy: Like using a relationship map of people, places, and events instead of searching loose documents one by one.

Documents → entities/relationships → graph-based retrieval

AdvancedGraphRAGknowledge graphRAGadvanced retrieval

⚡

Inference Acceleration

Speculative Decoding

Explanation: Speculative decoding speeds up language model generation by using a smaller model to draft possible tokens and a larger model to verify them.

Analogy: Like a junior assistant drafting ahead while a senior expert quickly checks and approves the correct parts.

Small model drafts → large model verifies → faster output
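Code sketch: a toy greedy version of the draft-then-verify loop in plain Python. The two "models" are hypothetical stand-in functions, and real systems use a probabilistic acceptance rule and verify all drafted tokens in one batched pass of the large model (counted here as one call).

```python
def speculative_decode(target_next, draft_next, prompt, max_new_tokens, k=4):
    """Greedy toy: accept drafted tokens until the first disagreement."""
    tokens = list(prompt)
    target_calls = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. The small draft model proposes k tokens cheaply.
        draft = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))
        # 2. The large target model checks the proposals (one verification pass).
        target_calls += 1
        accepted = []
        for proposed in draft:
            expected = target_next(tokens + accepted)
            if proposed == expected:
                accepted.append(proposed)
            else:
                accepted.append(expected)  # fix the first mismatch, then stop
                break
        tokens.extend(accepted)
    return tokens[:len(prompt) + max_new_tokens], target_calls

def target_next(tokens):
    # Stand-in "large model": always counts up by one (mod 10).
    return (tokens[-1] + 1) % 10

def draft_next(tokens):
    # Stand-in "small model": usually agrees, but is wrong after a 4.
    return 0 if tokens[-1] == 4 else (tokens[-1] + 1) % 10

output, calls = speculative_decode(target_next, draft_next, prompt=[0], max_new_tokens=8)
print(output, calls)
```

The output matches what the large model would have produced alone, but it took 3 verification passes instead of 8 one-token steps.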

Advancedspeculative decodinginferenceLLM optimizationadvanced AI

🧑‍🔬

Model Architecture

Mixture of Experts

Explanation: Mixture of Experts is a model architecture that routes inputs to specialized expert sub-networks instead of using the entire model for every token or request.

Analogy: Like sending each question to the right specialist instead of asking the whole company to work on every question.

Input → router → selected experts → output
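Code sketch: top-k routing in plain Python. The experts are toy functions and the router scores are hard-coded assumptions; in a real model the router is a learned layer that scores each token.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_scores, experts, top_k=2):
    """Run only the top_k experts and blend their outputs by router weight."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](x) for i in chosen)
    return output, chosen

experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x * x, lambda x: -x]
output, chosen = moe_forward(3.0, router_scores=[0.1, 2.0, 2.0, -1.0], experts=experts)
print(output, chosen)
```

Only 2 of the 4 experts run for this input, which is where the compute savings come from.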

Advancedmixture of expertsMoEmodel architectureadvanced AI

🕸️

Efficient Attention

Sparse Attention

Explanation: Sparse attention is an attention method where each token attends to only a selected subset of tokens instead of every other token, reducing computation and memory use.

Analogy: Like reading only the most relevant pages of a book instead of comparing every page to every other page.

Selected token connections → lower attention cost
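Code sketch: a sliding window is one simple sparse pattern, where each token only looks at nearby tokens. The function names and sizes are illustrative assumptions.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when token i may attend to token j."""
    return [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]

def count_connections(mask):
    """Number of token pairs that attention actually computes."""
    return sum(sum(row) for row in mask)

sparse = count_connections(sliding_window_mask(seq_len=8, window=1))
full = 8 * 8
print(sparse, "of", full, "connections")
```

Full attention grows with the square of the sequence length, while this pattern grows roughly in proportion to it.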

Advancedsparse attentionattentionefficiencytransformer

⚡

Efficient Attention

Flash Attention

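Explanation: Flash Attention is an optimized attention implementation that computes the same exact attention result while reducing memory use and speeding up transformer training or inference. It does this by processing attention in blocks to limit slow memory reads and writes.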

Analogy: Like reorganizing a messy desk so you can do the same work faster without needing more space.

Attention computation → memory optimization → faster processing

Advancedflash attentiontransformerattentionoptimization