2026-06-14 · On-site rewrite · 3 min read · Updated: 2026-06-14

Show HN: Landmark AI and ML research explained, redrawn, animated

Rudrite Research offers interactive, animated explainers of landmark AI and ML papers, making complex systems and ideas legible. The site is free and open, with over 100 explainers and guided reading tracks.

Source: Hacker News AI · Author: mridul_sahu

Rudrite Research — the frontier, made legible

Interactive, animated, visual explainers of landmark AI & ML papers — the systems and ideas behind the models you use, redrawn and made legible. Free and open.

Browse all 100 explainers · Guided reading tracks

Attention Is All You Need

FlashAttention

PagedAttention (vLLM)

Megatron-LM

DeepSeek-R1

GPT-3: Language Models are Few-Shot Learners

ZeRO: Zero Redundancy Optimizer

Mixtral of Experts

Training Compute-Optimal Large Language Models

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

BERT: Pre-training of Deep Bidirectional Transformers

DeepSeek-V3

Qwen3

OLMo 2

MiniMax-01

Gemma 4

Scaling Laws for Neural Language Models

Adam: A Method for Stochastic Optimization

Deep Residual Learning for Image Recognition

Denoising Diffusion Probabilistic Models

Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity

LoRA: Low-Rank Adaptation of Large Language Models

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

GSPMD: General and Scalable Parallelization for ML Computation Graphs

Pathways: Asynchronous Distributed Dataflow for ML

Ring Attention with Blockwise Transformers for Near-Infinite Context

Efficiently Scaling Transformer Inference

Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving

Fast Inference from Transformers via Speculative Decoding

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Training language models to follow instructions with human feedback

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Constitutional AI: Harmlessness from AI Feedback

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

ReAct: Synergizing Reasoning and Acting in Language Models

FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

RoFormer: Enhanced Transformer with Rotary Position Embedding

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Learning Transferable Visual Models From Natural Language Supervision

High-Resolution Image Synthesis with Latent Diffusion Models

Scalable Diffusion Models with Transformers

Robust Speech Recognition via Large-Scale Weak Supervision

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Group Sequence Policy Optimization

DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving

CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding

GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

YaRN: Efficient Context Window Extension of Large Language Models

Efficient Streaming Language Models with Attention Sinks

Generative Adversarial Networks

Segment Anything

Visual Instruction Tuning

s1: Simple test-time scaling

Tülu 3: Pushing Frontiers in Open Language Model Post-Training

Let's Verify Step by Step

Self-Consistency Improves Chain of Thought Reasoning in Language Models

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

KAN: Kolmogorov–Arnold Networks

Differential Transformer

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

RWKV: Reinventing RNNs for the Transformer Era

Titans: Learning to Memorize at Test Time

Byte Latent Transformer: Patches Scale Better Than Tokens

The Llama 3 Herd of Models

Mistral 7B

Phi-4 Technical Report

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Flow Matching for Generative Modeling

Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty

Rewarding Doubt: Calibrated Confidence Expression of LLMs

Why Language Models Hallucinate

τ-bench: Tool-Agent-User Interaction in Real-World Domains

ToolRL: Reward is All Tool Learning Needs

Group-in-Group Policy Optimization for LLM Agent Training

MiniMax-M1: Scaling Test-Time Compute with Lightning Attention

ProRL: Prolonged RL Expands Reasoning Boundaries

The Entropy Mechanism of RL for Reasoning Language Models

Spurious Rewards: Rethinking Training Signals in RLVR

GenPRM: Generative Process Reward Models

From Hard Refusals to Safe-Completions

Proximal Policy Optimization Algorithms

Efficiently Modeling Long Sequences with Structured State Spaces

Auto-Encoding Variational Bayes

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Toolformer: Language Models Can Teach Themselves to Use Tools

GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

Muon is Scalable for LLM Training

Consistency Models