Show HN: Better Graphs – Teach agents to stop making plain Matplotlib slop
This course teaches AI agents, interns, or developers how to create beautiful and informative graphs by following a set of rules and best practices, moving beyond Matplotlib's default bland appearance. Core principles include using the object-oriented API, applying a consistent theme, trimming chart elements, and choosing chart types wisely.
Preface — what this course is
Most matplotlib output looks like matplotlib: boxed-in spines, a primary blue, a title that just repeats the y-axis label, and ticks at whatever round numbers the library guessed. Fixing that is not a question of competence but of taste, and taste can be written down as rules. This curriculum is the rulebook.
Note
Although this blog's curriculum and choice of content were designed entirely by me, as a way to teach AI agents, interns, or myself what kind of graphs look and feel right, almost all the actual code snippets and text here are the product of a very long conversation with one AI agent (as of June 2026, that is Claude Code running Opus 4.8). I would like to revisit this idea whenever agent performance takes a step-function jump, to see what the new kids on the block can come up with!
Edward Tufte’s The Visual Display of Quantitative Information is the north star: maximize the share of ink that carries data, cut the rest, and never let the chart mislead. Each module states one principle, builds one thing, and extracts one durable rule that gets folded back into the project’s reusable artifacts (CLAUDE.md, VISUALIZATION_GUIDE.md, house_style.py) — so a future agent can produce the same quality with zero re-explanation.
Tip: How to read the code
Every figure obeys the house rules: the object-oriented (OO) API, apply_theme() first, a title that states the takeaway, trimmed spines, and unit-aware ticks. The only exception is a counterexample that is explicitly labelled “the look we’re escaping” — those keep matplotlib’s raw defaults on purpose.
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.ticker import MultipleLocator, MaxNLocator
    import house_style

    # Rule 0, applied once: the theme is the first plotting line.
    house_style.apply_theme("detailed")

    # Grey-for-context + one accent is our DEFAULT for single-message charts, not a
    # mandate. Where several series genuinely need telling apart, reach for a fuller
    # (principled, non-rainbow) palette. The house accent is #6400FF.
    GREY = "#9e9e9e"
    ACCENT = "#6400FF"

    # Data is numpy, not pandas. load() returns a dict of arrays (a dataset's columns);
    # the small helpers below cover the few table operations the figures need. See ndata.py.
    from ndata import load, select, group, pivot, rolling_mean, corr, std, finite, MONTHS
1 M0 — Environment & the mental model
Principle. Almost all “ugly default” pain is really fighting the pyplot state machine — the plt.plot, plt.title, plt.xlabel style, where “the current axes” is invisible global state you can only nudge, never hold. The cure is one line, and it reorganizes everything that follows.
1.1 The pyplot state machine vs. holding handles
The plt.* interface always draws on a hidden “current figure / current axes”. That is convenient for a one-off in a REPL and a trap for anything real: you cannot point at the second of two axes, you cannot pass the line to a helper, and every tweak is a fresh global command hoping the right object is current.
Hold the handles instead:
    fig, ax = plt.subplots()  # fig = the whole canvas; ax = one plotting region
    ax.plot(x, y)             # operate on the OBJECT, not on hidden global state
fig and ax are real objects. You can store them, pass them around, ask them questions (ax.get_xlim()), and hand ax to a styling function. Everything below is a consequence of this.
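As a minimal sketch of what holding handles buys you, the snippet below passes an Axes and a line to a helper and targets the second of two panels directly. The helper name `label_last_point` and the sample data are illustrative assumptions, not part of the course's `house_style` module.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def label_last_point(ax, line, text):
    """Hypothetical helper: annotate the final point of a line we hold a handle to."""
    x_data, y_data = line.get_data()
    return ax.annotate(
        text,
        xy=(x_data[-1], y_data[-1]),   # data coordinates of the last point
        xytext=(6, 0),                 # nudge the label 6 points to the right
        textcoords="offset points",
        va="center",
        color=line.get_color(),
    )


x = np.linspace(0, 10, 50)
fig, (left_ax, right_ax) = plt.subplots(1, 2, constrained_layout=True)

(left_line,) = left_ax.plot(x, np.sin(x))
(right_line,) = right_ax.plot(x, np.cos(x))

# Target the SECOND axes explicitly; no "current axes" guessing involved.
right_label = label_last_point(right_ax, right_line, "cos(x)")

# Handles can be queried and changed long after they were created.
x_limits = left_ax.get_xlim()
left_line.set_linewidth(2.5)
```

None of this is expressible cleanly through `plt.*` calls alone, because the helper would have no way to know which axes or line the caller meant.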
Important: Extract (the OO-API rule)
Always fig, ax = plt.subplots(constrained_layout=True). After that, no plt.* plotting calls — operate on ax/fig. The only plt.* you keep are plt.subplots itself and plt.style.* (both wrapped by house_style.apply_theme).
1.2 The look we’re escaping
Here is a perfectly ordinary chart drawn the perfectly ordinary way. Read it, then notice everything you have to squint past: the box of four spines, a default-blue line that means nothing, ticks at 0/2/4/…, a y-axis in bare thousands, and a title that just names the variable.
    months = np.arange(1, 13)
    revenue = np.array([41, 38, 46, 52, 55, 61, 68, 64, 59, 50, 47, 44]) * 1000

    # A counterexample: defaults ON PURPOSE (note: no apply_theme, raw style context).
    with plt.style.context("default"):
        fig, ax = plt.subplots()
        ax.plot(months, revenue, marker="o")
        ax.set_title("revenue")
        ax.set_xlabel("month")
        ax.set_ylabel("revenue")
Matplotlib’s raw defaults — competent, anonymous, and a little hard to read.
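The counterexample leans on `plt.style.context("default")` being temporary: library defaults apply inside the `with` block and the previous settings return afterwards. Here is a small sketch of that behaviour, using a single rcParams change as an assumed stand-in for whatever `house_style.apply_theme` actually sets.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Stand-in for a house theme (assumption): any global rcParams change behaves the same way.
plt.rcParams["axes.spines.top"] = False

with plt.style.context("default"):
    # Inside the block the library defaults are back in force.
    top_spine_inside = plt.rcParams["axes.spines.top"]
    fig, ax = plt.subplots()
    raw_top_visible = ax.spines["top"].get_visible()

# On exit the previous (themed) settings are restored.
top_spine_after = plt.rcParams["axes.spines.top"]
```

An Axes picks up spine visibility from rcParams when it is created, so the figure built inside the block keeps its raw look even after the theme comes back.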
1.3 The same data, rebuilt on the OO API
Same numbers, same five lines of plotting — but now every Artist is something we reached for on purpose. Grey carries the series; one accent point carries the message; the title states the takeaway; the spines are trimmed and offset; the y-axis reads in dollars; the peak is labelled directly so the eye never detours to a legend.
    month_names = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
    peak_month_index = int(revenue.argmax())

    fig, ax = plt.subplots(constrained_layout=True)
    ax.plot(months, revenue, color=GREY, lw=2)  # the monthly series = context, in grey
    ax.plot(months[peak_month_index], revenue[peak_month_index], "o", color=ACCENT, ms=9, zorder=5)

    house_style.takeaway_title(ax, f"Revenue peaked in {month_names[peak_month_index]}, then cooled into year-end")
    house_style.despine(ax)          # drop top/right, offset the rest
    house_style.thousands(ax, "y")   # 68000 -> 68,000

    ax.set_xticks(months)
    ax.set_xticklabels(month_names)
    ax.margins(x=0.02)

    # Direct label on the accent point, in DATA coordinates, offset a few points up.
    # (Keeping the handle both suppresses Jupyter's repr echo and proves the point:
    # the annotation is just another Artist you can hold and re-.set_*() later.)
    peak_label = ax.annotate(
        f"${revenue[peak_month_index]:,.0f}",
        xy=(months[peak_month_index], revenue[peak_month_index]),
        xytext=(0, 12),
        textcoords="offset points",
        ha="center",
        color=ACCENT,
        fontweight="bold",
    )
Same data on the OO API: grey for context, one accent for the point, a title that says something.
The difference is entirely in which Artists we grabbed and what we set on them — which is the whole game.
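The source of `house_style` is not shown here, so the following is only a guess at what helpers like `despine` and `thousands` might do, written with plain Matplotlib calls. The names `despine_sketch` and `thousands_sketch`, and the 8-point offset, are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter


def despine_sketch(ax, offset_points=8):
    """Hypothetical stand-in for house_style.despine: hide top/right, offset the rest."""
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    for side in ("left", "bottom"):
        ax.spines[side].set_position(("outward", offset_points))


def thousands_sketch(ax, axis="y"):
    """Hypothetical stand-in for house_style.thousands: 68000 -> 68,000."""
    formatter = FuncFormatter(lambda value, position: f"{value:,.0f}")
    target_axis = ax.yaxis if axis == "y" else ax.xaxis
    target_axis.set_major_formatter(formatter)
    return formatter


fig, ax = plt.subplots(constrained_layout=True)
ax.plot([1, 2, 3], [41000, 68000, 44000], color="#9e9e9e", lw=2)
despine_sketch(ax)
formatter = thousands_sketch(ax, "y")
```

Both helpers take `ax` as their first argument, which is only possible because the plotting code holds the handle instead of relying on the current axes.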
1.4 Figure → Axes → Artist: the hierarchy
Three nested ideas explain the whole library:
Figure — the canvas. It owns the size, the DPI, and one or more Axes. Saving is a Figure operation (fig.savefig).
Axes — one plotting region: its own data limits, ticks, labels, and spines. “Subplot” = one Axes.
Artist — everything drawn is one. Every line, marker, tick, label, spine, and patch is an Artist you can grab and .set_*(). There is no styling you cannot reach this way.
The figure below labels its own parts — and does the labelling in axes-fraction coordinates, which sets up the next idea.
    x_values = np.linspace(0, 10, 200)

    fig, ax = plt.subplots(figsize=(8, 5), constrained_layout=True)
    (sine_line,) = ax.plot(x_values, np.sin(x_values), color=ACCENT, lw=2)  # a Line2D Artist we keep a handle to
    ax.set_title("Everything you see is an Artist you can grab and .set_*()", loc="left")
    ax.set_xlabel("x")
    ax.set_ylabel("sin(x)")

    def callout(text, point, label_at):
        ax.annotate(
            text,
            xy=point,
            xycoords="axes fraction",
            xytext=label_at,
            textcoords="axes fraction",
            fontsize=9,
            color="#444444",
            ha="left",
            arrowprops=dict(arrowstyle="->", color="#444444", lw=1),
        )

    callout("Line2D — the data", point=(0.32, 0.80), label_at=(0.06, 0.95))
    callout("Axes — the plotting region", point=(0.55, 0.50), label_at=(0.46, 0.22))
    callout("Spine", point=(0.00, 0.55), label_at=(0.10, 0.42))
    callout("Tick label", point=(0.00, 0.00), label_at=(0.10, 0.10))

    # fig.text places relative to the whole CANVAS (figure fraction), not this Axes.
    source_note = fig.text(
        0.99, 0.01, "fig.text() → figure-fraction coords", ha="right", va="bottom", fontsize=8, color=GREY
    )
Every visible thing is an Artist. Callouts are placed in axes-fraction coordinates.
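The Figure → Axes → Artist hierarchy can also be walked in code. This short sketch (with made-up data) collects a few parts of a figure, checks that each one is an `Artist`, and restyles three of them through their handles.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.artist import Artist

fig, ax = plt.subplots(constrained_layout=True)
(line,) = ax.plot([0, 1, 2], [0, 1, 4])
ax.set_title("demo")

# Figure -> Axes: the figure keeps a list of the Axes it owns.
axes_in_figure = fig.axes

# Axes -> Artists: lines, spines, the title, and tick labels are all reachable.
lines_in_axes = list(ax.lines)
left_spine = ax.spines["left"]
title_text = ax.title
first_tick_label = ax.get_xticklabels()[0]

# The Figure and the Axes are themselves Artists too.
parts = [fig, ax, line, left_spine, title_text, first_tick_label]
all_artists = all(isinstance(part, Artist) for part in parts)

# Restyle by grabbing the object and calling .set_*() on it.
line.set_color("#6400FF")
left_spine.set_linewidth(1.5)
title_text.set_fontweight("bold")
```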
1.5 Coordinate systems & transforms
When you place text or an arrow, you choose which coordinate system the numbers mean. matplotlib gives you four, and switching between them is the difference between “annotation glued to a data point” and “annotation glued to a corner of the panel”:
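| System | What (0.5, 0.5) means | Reach it with |
|---|---|---|
| data | the point x = 0.5, y = 0.5 in data units (its place on screen moves when the axis limits change) | default for xy= in ax.annotate / transform=ax.transData |
| axes fraction | the centre of this Axes, always | xycoords="axes fraction" / transform=ax.transAxes |
| figure fraction | the centre of the whole canvas | fig.text / transform=fig.transFigure |
| display | a position in screen pixels, measured from the bottom-left of the canvas | rarely by hand |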
The proof that axes-fraction is data-independent: the same (0.5, 0.5) lands in the same visual spot in both panels below, even though their y-ranges differ by 1000×.
    fig, axes = plt.subplots(1, 2, figsize=(9, 3.4), constrained_layout=True)

    for ax, y_scale in zip(axes, [1, 1000]):
        ax.plot(x_values, y_scale * np.sin(x_values), color=GREY)
        ax.set_title(f"y range ≈ ±{y_scale}", loc="left", fontsize=11)

        # transform=ax.transAxes makes these numbers mean "fraction of THIS axes".
        ax.plot(0.5, 0.5, "o", color=ACCENT, ms=11, transform=ax.transAxes)
        ax.text(
            0.5, 0.5, "  (0.5, 0.5) axes-fraction",
            transform=ax.transAxes, va="center", color=ACCENT, fontsize=9,
        )
Same axes-fraction point (0.5, 0.5) in both panels — identical position despite a 1000x data-range gap.
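The relationship between the systems can be checked numerically by converting a data point to axes-fraction coordinates: transform to display pixels with `ax.transData`, then back through the inverse of `ax.transAxes`. The helper name `data_to_axes_fraction` and the axis limits are assumptions chosen for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def data_to_axes_fraction(ax, point):
    """Hypothetical helper: express a data-space point as a fraction of the Axes."""
    display_point = ax.transData.transform(point)              # data -> pixels
    return ax.transAxes.inverted().transform(display_point)    # pixels -> axes fraction


fig, ax = plt.subplots()
ax.set_xlim(0, 10)
ax.set_ylim(-1000, 1000)

# With these limits the data point (5, 0) sits at the centre of the panel.
centred = data_to_axes_fraction(ax, (5, 0))

# Change the limits and the SAME data point lands somewhere else in the panel,
# while an axes-fraction coordinate such as (0.5, 0.5) would not move at all.
ax.set_ylim(0, 1000)
after_rescale = data_to_axes_fraction(ax, (5, 0))
```

This is why a label meant to follow a value belongs in data coordinates, and a label meant to stay in a corner belongs in axes-fraction coordinates.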
1.6 Before & after — the same series, raw vs. house
Everything in M0 in one comparison, on real data: the classic airline-passengers series (1949–1960), drawn first with matplotlib’s untouched defaults, then rebuilt on the OO API under apply_theme. The defaults and the house theme are different rcParams, so this has to be two figures — you can’t put a truly raw axes beside a themed one in a single canvas, because the theme is global.
    flights = load("flights")
    month_index = np.array([list(MONTHS).index(month_name) for month_name in flights["month"]])  # name → 0..11
    decimal_year = flights["year"] + month_index / 12
    chronological = np.argsort(decimal_year)
    decimal_year, passengers = decimal_year[chronological], flights["passengers"][chronological]

    with plt.style.context("default"):  # the only honest way to show raw defaults post-apply_theme
        fig, ax = plt.subplots()
        ax.plot(decimal_year, passengers)
        ax.set_title("Passengers")
        ax.set_xlabel("date")
        y_axis_label = ax.set_ylabel("passengers")
Before — matplotlib’s untouched defaults: boxed in, primary blue, a title that just renames the y-axis.
    fig, ax = plt.subplots(constrained_layout=True)
    ax.plot(decimal_year, passengers, color=GREY, lw=1.1)  # the monthly series = context

    twelve_month_trend = rolling_mean(passengers, 12)  # the point: the 12-month trend
    ax.plot(decimal_year, twelve_month_trend, color=ACCENT, lw=2.6)

    growth_multiple = passengers.max() / passengers.min()  # compute the multiple, don't guess it
    house_style.despine(ax)
    house_style.takeaway_title(
        ax, f"US air travel roughly {growth_multiple:.0f}×'d in a decade — and the summer peaks grew with it"
    )
    ax.set_xlabel("Year")
    trend_label = ax.annotate(
        "12-month average",
        (decimal_year[-30], twelve_month_trend[-30]),
        color=ACCENT,
        fontsize=9,
        xytext=(8, -2),
        textcoords="offset points",
    )
    source = fig.text(
        0.0, -0.02, "Data: classic airline-passenger counts, monthly 1949–1960.", fontsize=8, color="#8a8a8a"
    )
After — the OO API under apply_theme: grey for context, the accent on the trend, a takeaway title, trimmed spines.
The data didn’t change — the decisions did: trimmed spines, grey for context with the accent on the trend, and a title that says what happened instead of renaming the axis.
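The accent line above comes from `rolling_mean` in `ndata.py`, whose source is not shown. One plausible numpy-only version is sketched below, assuming a trailing window and NaN padding so the result stays aligned with the input; the real helper may well differ (for example, by centring the window).

```python
import numpy as np


def rolling_mean_sketch(values, window):
    """Hypothetical trailing rolling mean: NaN until a full window is available."""
    values = np.asarray(values, dtype=float)
    result = np.full(values.shape, np.nan)
    if window <= len(values):
        kernel = np.ones(window) / window
        # mode="valid" yields one mean per fully covered window.
        result[window - 1:] = np.convolve(values, kernel, mode="valid")
    return result


series = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
trend = rolling_mean_sketch(series, 3)  # two NaNs, then 2, 3, 4 (up to float rounding)
```

Keeping the output the same length as the input means it can be plotted against the original x-values unchanged; Matplotlib leaves gaps where a line's data is NaN.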
Important: Extract (M0 rules)
Hold handles. fig, ax = plt.subplots(constrained_layout=True); never plt.* plotting afterward.
Everything is an Artist — to change anything, grab it and .set_*().
Pick the coordinate system deliberately — data for things tied to values, axes-fraction for things tied to the panel (titles, source notes, callouts that should not move with the data).
2 M1 — Chart choice: ask before you plot
M0 gave you the tools t
[truncated for AI cost control]