Question 1

What is Activation function?

Accepted Answer

A small nonlinear mathematical function applied inside neural networks that decides how much signal passes from one layer to the next. ReLU, GELU, and SiLU are common examples.
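
As a rough illustration of the three functions named above, here is a minimal Python sketch using only the standard library. It uses the exact GELU form (the input times the standard normal CDF); some libraries use a tanh approximation instead, so values can differ slightly.

```python
import math


def relu(x: float) -> float:
    # Passes positive signal through unchanged, blocks negative signal.
    return max(0.0, x)


def silu(x: float) -> float:
    # x * sigmoid(x): a smooth curve that lets a little negative signal through.
    return x / (1.0 + math.exp(-x))


def gelu(x: float) -> float:
    # x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


for x in (-2.0, 0.0, 2.0):
    print(x, relu(x), round(silu(x), 4), round(gelu(x), 4))
```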

Question 2

What is Adapter?

Accepted Answer

A small trainable add-on to a frozen base model. Lets you specialise a model without retraining the whole thing. LoRA is the dominant adapter style.

Question 3

What is Agent?

Accepted Answer

A program that takes goals and breaks them into actions, calling tools and revising its plan based on results — instead of just answering one prompt at a time.

Question 4

What is Alignment?

Accepted Answer

The work of making models do what humans actually want, not just what they were trained on. RLHF, constitutional AI, and DPO are alignment techniques.

Question 5

What is ap-south-1?

Accepted Answer

The AWS region in Mumbai. When an AI API is served from this region, latency for users in India drops sharply, typically from 200-300 ms to 30-50 ms.

Question 6

What is API key?

Accepted Answer

A token that authenticates your account with an AI provider. Treat it like a password: if it is ever committed to GitHub, even once, rotate it immediately.
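
One common way to keep a key out of source control is to read it from an environment variable. The sketch below assumes a hypothetical variable name, `AI_PROVIDER_API_KEY`; real providers document their own names.

```python
import os


def load_api_key(var_name: str = "AI_PROVIDER_API_KEY") -> str:
    # var_name is a hypothetical default, not any provider's official name.
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


def redact(key: str) -> str:
    # Show only a short prefix so logs never contain the full key.
    return key[:4] + "..." if len(key) > 4 else "..."
```

Logging `redact(load_api_key())` instead of the raw key keeps the secret out of log files as well.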

Question 7

What is Attention?

Accepted Answer

A mechanism letting a model weigh different parts of its input when producing each output token. The core building block of the transformer architecture.
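
To make the "weighing" concrete, here is a minimal sketch of scaled dot-product attention for a single query, in plain Python. The vectors are made-up toy values, and the learned projection matrices a real transformer applies first are left out.

```python
import math


def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    d = len(query)
    # Similarity of the query to every key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # toy values
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]  # toy values
weights, output = attention([1.0, 0.0], keys, values)
print(weights, output)
```

The weights always sum to 1, and keys that point in the same direction as the query receive the larger share.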

Question 8

What is Autoregressive?

Accepted Answer

Describes a model that generates output one token at a time, conditioned on everything it produced before. Nearly all modern generative LLMs are autoregressive.
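
The generation loop can be sketched as follows. The lookup table is a hypothetical stand-in for a real model, which would score every vocabulary token given the full context rather than just the last token.

```python
# Hypothetical toy "model": maps the last token to the next one.
NEXT_TOKEN = {"<s>": "the", "the": "cat", "cat": "sat", "sat": "</s>"}


def next_token(context):
    return NEXT_TOKEN.get(context[-1], "</s>")


def generate(max_tokens=10):
    tokens = ["<s>"]
    for _ in range(max_tokens):
        tok = next_token(tokens)   # predict from everything so far
        if tok == "</s>":          # stop at the end-of-sequence marker
            break
        tokens.append(tok)         # feed the output back in as input
    return tokens[1:]


print(generate())
```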

Question 9

What is Backpropagation?

Accepted Answer

The training algorithm that tells a neural network how to adjust its weights based on how wrong its output was.
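
A minimal sketch of the idea, assuming a toy two-weight network `y = w2 * relu(w1 * x)` with squared-error loss (both chosen here for illustration). The backward pass applies the chain rule from the loss back to each weight.

```python
def forward(w1, w2, x):
    h_pre = w1 * x
    h = max(0.0, h_pre)   # ReLU
    y = w2 * h
    return h_pre, h, y


def loss(w1, w2, x, target):
    return (forward(w1, w2, x)[2] - target) ** 2


def backward(w1, w2, x, target):
    h_pre, h, y = forward(w1, w2, x)
    dloss_dy = 2.0 * (y - target)          # derivative of squared error
    dloss_dw2 = dloss_dy * h               # y = w2 * h
    dloss_dh = dloss_dy * w2
    dh_dpre = 1.0 if h_pre > 0 else 0.0    # ReLU derivative
    dloss_dw1 = dloss_dh * dh_dpre * x     # h_pre = w1 * x
    return dloss_dw1, dloss_dw2


print(backward(0.5, -1.5, 2.0, 1.0))
```

Each returned value says how the loss changes if that weight is nudged, which is what the optimiser then uses to update the weights.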

Question 10

What is Batch size?

Accepted Answer

How many examples a model processes at once during training or inference. Larger batches usually raise throughput but need more memory.

Question 11

What is Benchmark?

Accepted Answer

A standardised test used to compare models, such as MMLU, MMLU-Pro, and HumanEval. Useful for direction; often gamed in detail.

Question 12

What is BERT?

Accepted Answer

An early transformer (2018) that reads a sentence in both directions at once. Mostly retired for new work but still common in production search.

Question 13

What is Bias?

Accepted Answer

When a model systematically favours certain outputs over others — a property of training data more than design. Hard to remove fully.

Question 14

What is Caching (prompt)?

Accepted Answer

A technique where the provider stores the already-processed prefix of a long prompt across calls and charges you less when it is reused. Can cut the cost of the repeated portion by 50-90%, depending on the provider.
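
A back-of-the-envelope sketch of the saving. The price and the 90% discount are hypothetical numbers, and the function ignores any surcharge a provider might apply when the cache is first written.

```python
def prompt_cost(prefix_tokens, suffix_tokens, price_per_million, cached_discount=0.0):
    # cached_discount is the fraction taken off the prefix price on a cache hit.
    prefix_price = price_per_million * (1.0 - cached_discount)
    total = prefix_tokens * prefix_price + suffix_tokens * price_per_million
    return total / 1_000_000


# Hypothetical: 100k-token shared prefix, 1k-token new question, 3.00 per million.
first_call = prompt_cost(100_000, 1_000, 3.0)
repeat_call = prompt_cost(100_000, 1_000, 3.0, cached_discount=0.9)
print(first_call, repeat_call)
```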

Question 15

What is Chain of thought?

Accepted Answer

Telling a model to "think step by step" before answering. Improves reasoning on multi-step problems at the cost of more tokens.

Question 16

What is Classifier?

Accepted Answer

A model that puts an input in one of N labels. Cheaper and more reliable than free-form generation for routing or tagging.

Question 17

What is Claude?

Accepted Answer

Anthropic's family of large language models. Strong at coding, agents, and following instructions.

Question 18

What is Claude Code?

Accepted Answer

Anthropic's CLI for coding tasks. Gives Claude direct filesystem access, command execution, and an agentic loop in which it acts, checks the result, and tries again.

Question 19

What is Code interpreter?

Accepted Answer

A sandboxed Python environment a model can call to actually run code instead of just describing it. Better for math, parsing, plotting.

Question 20

What is Cold start?

Accepted Answer

When a self-hosted model has to load weights into VRAM before serving the first request. Adds latency. Avoidable with always-on workers.

Question 21

What is Completion?

Accepted Answer

The text a model produces in response to a prompt. The original term, before "chat" took over.

Question 22

What is Constitutional AI?

Accepted Answer

Anthropic's training method where the model critiques its own outputs against a set of principles, then trains on those critiques.

Question 23

What is Context rot?

Accepted Answer

When a model's attention degrades as the conversation gets longer, losing earlier details. Real but underdiscussed.

Question 24

What is Context window?

Accepted Answer

The maximum number of tokens (sub-word pieces; very roughly three-quarters of an English word each) a model can read in a single request, counting both the prompt and the generated output. Bigger context = more documents fit in.

Question 25

What is Cursor?

Accepted Answer

An AI code editor: a fork of VS Code with multi-file context and inline edits.

Question 26

What is Data card?

Accepted Answer

A short document describing what a dataset contains, how it was collected, and what biases it carries. Should be standard; often missing.

Question 27

What is DeepSeek?

Accepted Answer

A Chinese AI lab that released competitive open-weight models (DeepSeek-V3, R1) much cheaper than Western incumbents.

Question 28

What is Deployment?

Accepted Answer

The work of getting a trained model into a place where users can actually call it. Often harder than the training.

Question 29

What is Distillation?

Accepted Answer

Training a smaller model to mimic a larger one. Used to make cheap models that match expensive models on specific tasks.

Question 30

What is DPDP?

Accepted Answer

India's Digital Personal Data Protection Act. Affects how Indian companies handle personal data sent to AI models, especially overseas.

Question 31

What is DPO?

Accepted Answer

Direct Preference Optimization. A simpler alternative to RLHF — trains directly on preference pairs without a separate reward model.
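
A sketch of the per-pair loss, assuming the summed log-probabilities of the chosen and rejected answers have already been computed under both the model being trained (the policy) and a frozen reference model. The `beta` default here is an illustrative value.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # How much more the policy likes each answer than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), written as log(1 + exp(-logits)).
    return math.log1p(math.exp(-logits))


# Lower loss when the policy favours the chosen answer more than the reference does.
print(dpo_loss(-1.0, -5.0, -2.0, -2.0), dpo_loss(-5.0, -1.0, -2.0, -2.0))
```

No reward model appears anywhere: the preference pair and the reference log-probabilities are enough to compute the loss.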

Question 32

What is Embedding?

Accepted Answer

A vector representation of text (or image, audio) that lets a computer compare meaning, not just words. The math underneath search and RAG.
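
Comparing meaning usually comes down to cosine similarity between vectors. The three-dimensional vectors below are hand-made toy values; real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration only.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten), cosine_similarity(cat, invoice))
```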

Question 33

What is Embedding model?

Accepted Answer

A specialised model whose job is to produce embedding vectors for text. Different from the LLM you're probably calling.

Question 34

What is Encoder?

Accepted Answer

The half of a transformer that reads an input and turns it into vectors. BERT is encoder-only. GPT is decoder-only.

Question 35

What is Eval?

Accepted Answer

A test that scores model output against expected behaviour. Production-grade AI needs them; vibe-checking does not scale.
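
The simplest possible eval is an exact-match score over a list of cases. In this sketch `fake_model` is a hypothetical stand-in for a real model call, with one deliberately wrong answer.

```python
def run_eval(model_fn, cases):
    # Fraction of cases where the model's output matches the expected answer.
    passed = sum(1 for prompt, expected in cases if model_fn(prompt).strip() == expected)
    return passed / len(cases)


def fake_model(prompt):
    # Hypothetical canned answers; "3*3" is wrong on purpose.
    canned = {"2+2": "4", "capital of France": "Paris ", "3*3": "6"}
    return canned.get(prompt, "")


cases = [("2+2", "4"), ("capital of France", "Paris"), ("3*3", "9")]
score = run_eval(fake_model, cases)
print(score)
```

Running the same cases after every prompt or model change turns a vibe check into a number that can be tracked.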

Question 36

What is Few-shot?

Accepted Answer

Including a few worked examples in the prompt to teach the model the output format. No training needed.
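
A minimal sketch of assembling such a prompt. The `Input:` / `Output:` labels are one arbitrary layout choice, not a required format.

```python
def build_few_shot_prompt(examples, query):
    # Each worked example shows the model the expected output format.
    parts = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    # The final block is left open for the model to complete.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


examples = [("great phone, love it", "positive"), ("battery died in a day", "negative")]
prompt = build_few_shot_prompt(examples, "works fine, nothing special")
print(prompt)
```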

Question 37

What is Fine-tuning?

Accepted Answer

Further training a pre-trained model on a smaller, specific dataset to specialise it for a task or voice.

Question 38

What is Foundation model?

Accepted Answer

A large pre-trained model that can be adapted to many tasks. GPT-4, Claude, Gemini, Llama are foundation models.

Question 39

What is FP4?

Accepted Answer

4-bit floating-point precision for model weights. One quarter the memory of FP16. Used to fit big models onto small GPUs, usually with only modest quality loss.
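
The memory arithmetic, as a rough sketch. It counts weights only and ignores the scale factors quantised formats store alongside them, as well as activations and the KV cache; the 70-billion-parameter model is a hypothetical example.

```python
def weight_memory_gb(num_params, bits_per_weight):
    # bits -> bytes -> gigabytes (1 GB taken as 1e9 bytes)
    return num_params * bits_per_weight / 8 / 1e9


params = 70e9  # hypothetical 70B-parameter model
print(weight_memory_gb(params, 16), weight_memory_gb(params, 4))
```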

Question 40

What is Function calling?

Accepted Answer

A structured way of getting a model to output a function name and arguments instead of free text. Reliable tool use.
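
On the application side, the model's structured output still has to be parsed and routed to real code. The sketch below assumes the model returns JSON shaped like `{"name": ..., "arguments": {...}}`; the exact shape varies by provider, and `get_weather` is a hypothetical stub.

```python
import json


def get_weather(city: str) -> str:
    # Hypothetical stub; a real tool would call a weather service.
    return f"(stub) weather for {city}"


TOOLS = {"get_weather": get_weather}


def dispatch(model_output: str):
    call = json.loads(model_output)
    fn = TOOLS.get(call["name"])
    if fn is None:
        # Never execute a function the application did not register.
        raise ValueError(f"Unknown tool: {call['name']}")
    return fn(**call["arguments"])


print(dispatch('{"name": "get_weather", "arguments": {"city": "Mumbai"}}'))
```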

Question 41

What is Gemini?

Accepted Answer

Google DeepMind's family of multimodal models. Strong on long context and image/video understanding.

Question 42

What is Generative AI?

Accepted Answer

Umbrella term for AI that produces new content (text, images, audio, code). Most current AI hype is generative AI.

Question 43

What is GPT?

Accepted Answer

OpenAI's family of large language models. GPT-3, GPT-4, and GPT-5 mark its major generations.

Question 44

What is GPU?

Accepted Answer

A chip originally for graphics, now the main workhorse for AI training and inference. NVIDIA dominates; AMD and others trail.

Question 45

What is Gradient?

Accepted Answer

How much each weight in a model should change to reduce loss. The thing backpropagation computes.
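
A one-weight sketch of how a gradient is used, assuming the toy model `w * x` with squared-error loss. Each step moves the weight a small distance against the gradient.

```python
def loss_gradient(w, x, target):
    # d/dw of (w * x - target) ** 2
    return 2.0 * (w * x - target) * x


def fit(x, target, learning_rate=0.1, steps=50):
    w = 0.0
    for _ in range(steps):
        w -= learning_rate * loss_gradient(w, x, target)  # step downhill
    return w


# With x = 2 and target = 6, the loss is minimised at w = 3.
print(fit(2.0, 6.0))
```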

Question 46

What is GRPO?

Accepted Answer

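Group Relative Policy Optimization. A reinforcement-learning method for post-training that drops PPO's separate value model and instead scores each sampled answer relative to a group of answers to the same prompt. Used in DeepSeek-R1.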

Question 47

What is Guardrail?

Accepted Answer

A check layered around a model that blocks unsafe inputs or outputs. The seatbelt on top of the safety training.

Question 48

What is Hallucination?

Accepted Answer

When a model produces output that sounds confident but is factually wrong or invented. The defining failure mode of 2024-26 LLMs.

Question 49

What is Hosted?

Accepted Answer

A model you call via an API run by someone else. The opposite of self-hosted.

Question 50

What is Hugging Face?

Accepted Answer

The dominant model hub for open-source AI. Hosts models, datasets, and demos. Also maintains the Transformers and PEFT libraries.

Question 51

What is IndiaAI Mission?

Accepted Answer

India's national AI policy initiative. Funds compute, datasets, applications, and skilling for Indian AI builders.

Question 52

What is Indic-NLP?

Accepted Answer

Natural-language processing tools and models specifically designed for Indian languages — handling Devanagari, transliteration, code-mixing.

Question 53

What is Inference?

Accepted Answer

Running a trained model on new inputs to get outputs. Most of what builders care about — training is upstream.

Question 54

What is Inference cost?

Accepted Answer

The per-token cost of running a model. Quoted as ₹/M-tokens or $/M-tokens, input and output separately.
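
The arithmetic for a single request, as a small sketch. The prices are hypothetical and are expressed per million tokens in whatever currency the provider quotes.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    # Input and output tokens are billed at separate rates.
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Hypothetical: 2,000 input tokens at 3.00/M, 500 output tokens at 15.00/M.
cost = request_cost(2_000, 500, 3.0, 15.0)
print(cost)
```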

Question 55

What is Instruction-tuning?

Accepted Answer

Fine-tuning a model on input/output pairs framed as instructions. The bridge from raw pre-trained models to usable chatbots.

Question 56

What is Jailbreak?

Accepted Answer

A prompt crafted to bypass a model's safety training and get it to generate restricted content.

Question 57

What is JSON mode?

Accepted Answer

A setting that constrains the model to output valid JSON. Useful for structured extraction and pipeline integration.

Question 58

What is Knowledge distillation?

Accepted Answer

Training a small model to mimic a large one. Cheaper to run, surprisingly close in quality on many tasks.

Question 59

What is Knowledge graph?

Accepted Answer

A structured network of entities and relationships. Sometimes paired with LLMs for grounding and reasoning.

Question 60

What is Latency?

Accepted Answer

Time between sending a request and getting a response. For LLM APIs, it is usually broken down into time-to-first-token and generation speed (tokens per second).
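
A rough way to estimate how long a full response takes, assuming a steady generation rate after the first token. The numbers are hypothetical.

```python
def response_time_s(ttft_s, output_tokens, tokens_per_second):
    # Wait for the first token, then stream the rest at a steady rate.
    return ttft_s + output_tokens / tokens_per_second


# Hypothetical: 0.4 s to first token, 500 output tokens at 50 tokens/s.
total = response_time_s(0.4, 500, 50.0)
print(total)
```

For long outputs the generation rate dominates; for short ones, time-to-first-token does.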

Question 61

What is Llama?

Accepted Answer

Meta's family of open-weight LLMs. Llama 4 was released in 2025. Among the most widely deployed open-weight model families in production.

Question 62

What is LLM?

Accepted Answer

A neural network trained on massive text data to predict the next token. Powers most modern AI assistants.

Question 63

What is LMArena?

Accepted Answer

A community benchmark that ranks models by head-to-head human votes on real prompts.

Question 64

What is LoRA?

Accepted Answer

Low-Rank Adaptation. A fine-tuning technique that freezes the base weights and trains small low-rank matrices added alongside them, instead of updating the full model.
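
A sketch of the core idea in plain Python, assuming the usual formulation in which a frozen weight matrix `W` is adjusted by the product of two small matrices `B` and `A`, scaled by `alpha / rank`. The matrix sizes and values are toy examples.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def lora_effective_weight(w, b, a, alpha, rank):
    # W stays frozen; only B (d_out x rank) and A (rank x d_in) are trained.
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def lora_param_counts(d_out, d_in, rank):
    # (parameters in the full matrix, parameters in the adapter)
    return d_out * d_in, rank * (d_out + d_in)


w = [[1.0, 0.0], [0.0, 1.0]]
print(lora_effective_weight(w, [[1.0], [2.0]], [[0.5, 0.0]], alpha=2.0, rank=1))
print(lora_param_counts(4096, 4096, 8))
```

The parameter count is the point: for a 4096 by 4096 layer at rank 8, the adapter trains a small fraction of the weights the full matrix holds.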

AI Glossary