Scorigami is football's catalog of new events: final scores that have never happened before, prized for the joy of finding more of them. NFL DRAFTIGAMI applies the idea to the draft: every "no team has ever taken position X at pick #Y" cell is a draftigami, and the tracker on this site enumerates them. This page contributes the other half of the project: a probability model that assigns a likelihood to every empty cell. Which ones could plausibly happen next year? Which would be legendary pulls? Here's how the model works.
Every spring, 32 NFL teams take turns picking the players they like best on national TV. No single pick is predictable. The pattern is. Quarterbacks tend to go early. Special teamers tend to go late. Running backs used to dominate the top of round one; now they barely show up there. Tight ends were almost never picked early in the 80s. Kyle Pitts went 4th overall in 2021. 12 personnel — one back, two tight ends, two receivers on the field at once — became the league's most efficient passing formation in the 2010s, and the draft absorbed it.
The model on this page is a 2-D tensor-product P-spline GAM that learned that pattern from 60 years of draft data. Rather than explain it, this page builds it. Six rungs, each one a visualization that climbs one step higher than the last. Pick a position, then scroll.
First · a guarantee
The model isn't allowed to look at the future.
Every cell page on this site (for example, /c/te-96, the cell for tight ends at pick #96) shows a sparkline of P(this position | this pick) going back to 1970. None of those values gets to see the future. The 2010 value was computed by a model trained on every draft from 1970 through 2009 only: not 2011, not 2024, not 2027. The 1995 value used 1970–1994. The 1971 value used 1967–1970. And so on.

Forecasters call this forward chaining. For each year Y in the training range, the spline gets refit on the rows where year < Y, and we read year Y off that fit's surface. The basis stays fixed; only the coefficients β change. Each year has its own β, its own posterior covariance, its own credible interval, which is why the CI band on every sparkline is wide in 1970 (only a couple of years of prior data) and narrow by 2025 (55 years of it). Eval and production go through the identical fit_forward_chained() path.
Wet streets don't cause the rain. Future drafts don't get to shape past predictions. The cost is real (58 separate spline fits per build instead of one), but the payoff is that every value you see on every cell page is an honest out-of-sample prediction at that point in history. A spline GAM is great at borrowing strength across neighboring picks — but only the picks that already exist when you're predicting. Looking sideways: fine. Looking forward: cheating.
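The forward-chaining loop can be sketched in a few lines. This is a toy, not the production fitter: a running mean stands in for the spline refit, purely to show the no-leakage structure (the real pipeline refits the spline at the marked line).

```python
def forward_chained(values_by_year):
    """Toy forward chaining: the value reported for year Y comes only
    from years strictly before Y.  A running mean stands in for the
    spline refit in this sketch."""
    preds, past = {}, []
    for y in sorted(values_by_year):
        if past:                          # need at least one prior year
            preds[y] = sum(past) / len(past)  # <- real code refits the spline here
        past.append(values_by_year[y])    # year Y joins history only afterwards
    return preds
```

Note that the earliest year gets no prediction at all: there is nothing before it to train on.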
Rung 1 · the concrete
One pick.
One dot. Two coordinates: the year of the draft (x) and the pick number (y). That's all a draft pick is, geometrically.
Rung 2 · every observation
Now do that for every pick of this position.
Each orange dot is one actual NFL draft pick of the selected position. The whole career of the position, on one plane.
— picks since the merger.
You can already see the structure with your eyes — clusters at the top of round 1, bands across the late rounds. The model has to learn this from only the dots.
Rung 3 · the first abstraction
Forget the year. Just count.
Stack all the dots from Rung 2 onto the pick-number axis. How often does this position go at each pick number, ignoring when? That's the gray histogram. The blue curve is a 1-D cubic spline fit to that histogram — the smooth version.
The histogram is data. The curve is a model. A 1-D spline is just a smooth function with a finite number of bumps — fit by minimizing (distance to data) + (penalty for too much wiggle). Same machinery as the 2-D model, one axis short.
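That "(distance to data) + (penalty for too much wiggle)" fit has a closed form. Here is a minimal numpy-only sketch; Gaussian bumps stand in for the cubic B-splines the real model uses (an assumption made to avoid the scipy dependency), but the penalized least-squares machinery is the same.

```python
import numpy as np

def penalized_fit(B, y, lam):
    """Minimize ||y - B @ beta||^2 + lam * ||D2 @ beta||^2 in closed
    form.  D2 takes second differences of beta, so the penalty charges
    for wiggle, not for magnitude."""
    D2 = np.diff(np.eye(B.shape[1]), n=2, axis=0)   # (n-2, n) difference matrix
    return np.linalg.solve(B.T @ B + lam * (D2.T @ D2), B.T @ y)

# Toy basis: 15 Gaussian bumps stand in for cubic B-splines.
x = np.linspace(0.0, 1.0, 200)
centers = np.linspace(0.0, 1.0, 15)
B = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / 0.08) ** 2)

rng = np.random.default_rng(0)
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=x.size)  # noisy "histogram"
smooth = B @ penalized_fit(B, y, lam=1.0)                  # the smooth curve
```

Turn `lam` up and the curve flattens; turn it down and it chases the noise.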
Rung 4 · time enters
But the league drifts. So fit one curve per decade.
Same idea as Rung 3, but split the data into six decades and fit six separate 1-D splines. The bold black line is the year you've selected — drag the slider to see where it sits relative to its decade and to history.
Decades are still a coarse axis — they jump in 10-year steps. What we really want: a curve that smoothly evolves year by year. That requires a 2-D spline.
Rung 5 · two dimensions of smoothing
Smooth jointly over year and pick.
Now we fit a single 2-D smooth surface across both axes at once — the tensor product of a year-spline and a pick-spline, regularized so the surface prefers gentle drift to abrupt jumps. The blue heatmap is that surface. The orange dots are the same picks from Rung 2, drawn on top so you can see how the smooth fits the data. The vertical red line is your selected year.
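The tensor product itself is a one-liner: each row of the combined design matrix multiplies every year-basis value by every pick-basis value. A sketch with a hypothetical helper name:

```python
import numpy as np

def tensor_rows(By, Bp):
    """By: (n, A) year basis, Bp: (n, B) pick basis -> (n, A*B) rows of
    the tensor-product design matrix.  With cubic B-splines only 4
    entries per row of each factor are nonzero, so at most 16 survive."""
    return (By[:, :, None] * Bp[:, None, :]).reshape(By.shape[0], -1)
```

For one row with year basis [1, 2] and pick basis [3, 4, 5], the tensor row is [3, 4, 5, 6, 8, 10]: every pairwise product, in row-major order.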
Rung 6 · lift it
The probability surface, in 3-D.
Same surface as Rung 5, except now height is probability. Year runs along one ground edge, pick along the other, P points up. The orange spheres are real picks, anchored to the surface at their cell's height. Drag to rotate, scroll to zoom, hit Play to sweep the year cursor through time.
Toggles let you peel the rungs back: hide the surface and you're back at Rung 2 (just dots in 3-D space). Show the threshold plane and any region where the model says "≥ 10% chance of this position" lights up as land above sea level. Show the decade ribbons and you can see how the per-decade curves (Rung 4) live inside the 2-D surface (Rung 5). Our Bills anchors stay labeled in 3-D — drag the camera and watch them follow.
Coda · what's the model for, anyway?
The most-likely "first ever" picks of 2027.
Once you've fit a smooth surface, you can ask it about cells that haven't happened yet. The same model that drew the landscape above can rank every position-by-pick combination that has never existed in NFL draft history by its probability of filling for the first time in the next draft.
Some of those empty cells are likely but just haven't happened yet (a Center at pick #93 could happen any year). Others are legendary pulls (a Specialist at pick #5, a Quarterback at pick #257) that the model says are so improbable they've never happened in 60 years and almost certainly won't this year. When one of those does fire, that's a draftigami.
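In code, that ranking is just a filter-and-sort over the model's surface. A sketch with hypothetical names (the real cell keys and probabilities come from the fitted model):

```python
def rank_draftigamis(next_year_prob, cells_seen):
    """next_year_prob: {(position, pick): model probability for the
    next draft}; cells_seen: set of (position, pick) cells that have
    already occurred.  Returns never-seen cells, most likely first."""
    empty = {c: p for c, p in next_year_prob.items() if c not in cells_seen}
    return sorted(empty.items(), key=lambda kv: kv[1], reverse=True)
```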
→ See the live ranked list of remaining draftigamis
Appendix — how the model works
Model class
For pick i with features (year y, pick p) and class k among 13 positions, the log-odds against a reference class r (we use WR: common in every era, a stable softmax baseline) are:
η_k(y, p) = Σ_{a,b} β_{k,a,b} · B_a(y) · B_b(p)
P(class = k | y, p) = exp(η_k) / Σ_j exp(η_j)
B_a(y) are 25 cubic B-splines on year (clamped on 1967–2027); B_b(p) are 30 cubic B-splines on pick (clamped on 1–262). The tensor product gives 25 × 30 = 750 basis functions per active class. With K−1 = 12 active classes the model has 9,000 coefficients: small enough to fit by L-BFGS in a few seconds.
Penalty
A 2nd-order difference matrix on each axis penalizes curvature, not magnitude. A flat coefficient grid pays zero penalty (no wiggle anywhere); a randomly squiggly grid pays a lot. This is the right inductive bias for a smoothly-evolving draft surface — slow structural shifts (the WR explosion of the 2010s, the death of the FB) are cheap to encode, year-to-year reshuffling is expensive. Production runs at λ_y = λ_p = 10. The loss landscape is shallow within an order of magnitude, so the exact choice doesn't matter much.
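A sketch of that penalty evaluated on a coefficient grid, with λ_y = λ_p = 10 as in production (the function name is hypothetical):

```python
import numpy as np

def curvature_penalty(beta_grid, lam_y=10.0, lam_p=10.0):
    """Second-order difference penalty along each axis of the
    (year-basis x pick-basis) coefficient grid."""
    dy = np.diff(beta_grid, n=2, axis=0)   # curvature along the year axis
    dp = np.diff(beta_grid, n=2, axis=1)   # curvature along the pick axis
    return lam_y * (dy ** 2).sum() + lam_p * (dp ** 2).sum()
```

A flat grid pays zero; so does a linear trend, since second differences kill anything up to a straight line. Only curvature costs, which is exactly the "slow drift is cheap, reshuffling is expensive" bias described above.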
Fit
Sparse cubic B-spline bases via scipy.interpolate.BSpline.design_matrix; the tensor product is computed row by row (each row has at most 16 nonzeros). The optimizer is scipy.optimize.minimize with method L-BFGS-B and an analytic gradient; the full Hessian is never formed during the fit. Forward chaining: for each year Y, refit on all rows where year < Y, using the previous year's β as an L-BFGS warm start. That drops iteration counts by 5–10× on consecutive years.
Confidence intervals
Posterior covariance is approximated by inverting a block-diagonal Hessian: one (M × M) block per active class. Per-(year, pick) CI bands come from the delta method on the log-odds variance. Per-year inversions cost about a second per active class at M = 750; across all 57 years × 12 classes that's roughly a minute, paid once per build, amortized into time_series.values_lo / values_hi in data.json.
How well does it do?
Forward-chained NLL and Brier score on every draft from 1980 through 2027:
| Variant | NLL | Brier |
|---|---|---|
| Naive frequency (rolling, all-prior) | ~4.13 | 0.93 |
| GBM + σ=16 + EMA hl=3 (prior winner) | 2.4553 | 0.9085 |
| 2-D tensor P-spline (this model) | 2.4519 | 0.9080 |
The headline is the 0.14 percent NLL improvement. The bigger story is what the spline gives you for free: joint smoothing across (year, pick) instead of two separate post-hoc smoothers stapled together; deterministic refits; naturally normalized outputs (softmax, no row-renormalization step); and the Laplace-approximation CI bands above.
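For reference, the two metrics in the table, computed from predicted class probabilities (a sketch; the actual evaluation code is not shown here):

```python
import numpy as np

def nll_and_brier(probs, labels):
    """probs: (n, K) predicted class probabilities; labels: (n,) true
    class indices.  NLL averages -log p(true class); multiclass Brier
    averages squared distance to the one-hot truth (range 0..2)."""
    n, K = probs.shape
    nll = -np.mean(np.log(probs[np.arange(n), labels]))
    brier = np.mean(((probs - np.eye(K)[labels]) ** 2).sum(axis=1))
    return nll, brier
```

Both are proper scoring rules, so a model can't improve them by hedging dishonestly; lower is better for each.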
Code lives in scripts/spline_model.py. The full writeup with ten diagnostic figures (calibration, residuals, GCV grid, effective DoF over time) is at docs/model.md.