Work-in-Progress Limits
A work-in-progress limit is a hard cap on how many items are allowed in a given stage of a workflow at the same time. A “Ready for QA” column with a WIP limit of 3 cannot hold a fourth ticket. To put one in, someone has to first pull one out. The rule sounds like a brake. It usually acts as an accelerator, and the reason is arithmetic, not motivation.
Little’s Law: why a cap speeds things up
Little’s Law relates three quantities for any stable queue:
average cycle time = average WIP / average throughput
Rearranged, WIP = throughput × cycle time. The practical move hides in the first form. If a team finishes 5 items per week (throughput) and is carrying 25 items in progress (WIP), the average item takes 25 / 5 = 5 weeks to cross the board. Cut WIP to 10 and, at the same throughput, cycle time drops to 2 weeks. Nothing about how fast anyone works changed. The items just stopped waiting in line behind each other.
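The arithmetic above can be checked in a few lines. This is a minimal sketch: the function name `average_cycle_time` and its parameters are made up for illustration, and the inputs are the example figures from the paragraph, not real team data.

```python
def average_cycle_time(avg_wip: float, avg_throughput: float) -> float:
    """Little's Law: average cycle time = average WIP / average throughput.

    The result is in whatever time unit the throughput uses
    (items per week in, weeks out).
    """
    if avg_throughput <= 0:
        raise ValueError("throughput must be positive")
    return avg_wip / avg_throughput


# Example figures from the text: the team finishes 5 items per week.
before = average_cycle_time(avg_wip=25, avg_throughput=5)
after = average_cycle_time(avg_wip=10, avg_throughput=5)

print(f"25 items in progress: {before} weeks per item")
print(f"10 items in progress: {after} weeks per item")
```

Throughput is the same in both calls. Only the WIP figure changes, which is the whole argument in miniature.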
This is the mechanism behind the Kanban slogan “stop starting, start finishing.” Throughput is bounded by your slowest stage. Starting more work does not raise it; it only inflates WIP, and by Little’s Law, inflated WIP at fixed throughput means longer cycle times and more work sitting idle.
One caveat the formula carries: it only holds in steady state, where arrival rate and departure rate are roughly equal over the window you measure. Apply it to a system where work is piling up faster than it clears and the average it reports is a snapshot of a moving target, not a stable property. The point of a WIP limit is partly to force that steady state by capping arrivals.
Where to put the cap
A WIP limit is most useful placed just before the bottleneck. This connects directly to the theory of constraints: the throughput of the whole system is set by its single most constrained stage, so protecting that stage from a flood of arrivals protects the whole flow. In the QA Handoff Workflows case, a single QA analyst behind a fast dev team is the constraint. A low cap on the “Ready for QA” column means that when it fills, developers cannot hand off more finished work, so instead of opening new tickets they swarm on reviewing and testing the work already in flight. The cap converts an invisible, silently growing backlog into a visible stop signal.
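The stop signal can be sketched as a column that refuses a ticket once it is full. This is an illustrative model only: `Column`, `WipLimitExceeded`, and the ticket ids are hypothetical names, not the API of any Kanban tool.

```python
class WipLimitExceeded(Exception):
    """Raised when adding a ticket would push a column past its cap."""


class Column:
    """A board column with a hard WIP limit (hypothetical model)."""

    def __init__(self, name, wip_limit):
        self.name = name
        self.wip_limit = wip_limit
        self.tickets = []  # ticket ids currently in this column

    def add(self, ticket):
        # The cap is hard: a full column rejects the ticket outright.
        if len(self.tickets) >= self.wip_limit:
            raise WipLimitExceeded(
                f"{self.name} is full ({self.wip_limit}); pull one out first"
            )
        self.tickets.append(ticket)

    def pull(self, ticket):
        # Pulling a ticket out is what frees a slot for the next one.
        self.tickets.remove(ticket)
        return ticket


ready_for_qa = Column("Ready for QA", wip_limit=3)
for ticket in ("T-1", "T-2", "T-3"):
    ready_for_qa.add(ticket)

try:
    ready_for_qa.add("T-4")
except WipLimitExceeded as stop_signal:
    print(stop_signal)  # the visible signal to go help clear the column

ready_for_qa.pull("T-1")  # someone finishes testing T-1
ready_for_qa.add("T-4")   # now there is room
```

Raising an exception rather than silently queueing the ticket is the design choice that matters here: the refusal is the information.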
Without that signal the queue does not just grow; it amplifies upstream variation. A modest rise in upstream throughput produces a disproportionate swing in the downstream queue, the same feedback-and-delay dynamic as the Bullwhip Effect. WIP limits are a structural intervention in the Systems Thinking sense: they change the system’s behavior by changing its constraints, not by exhorting people to work differently.
Setting the number
There is no universal right value. A common starting heuristic: allow one or two items per person in a stage, then tighten until the limit starts to pinch and surface bottlenecks. The pinch is the point. A limit that never blocks anyone is not doing anything. When the cap is regularly hit at one stage, that stage is your constraint, and the block is information about where to add capacity or unblock flow. A board that is comfortable everywhere is usually a board carrying too much WIP to see its own bottleneck.
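One way to act on this is to derive starting limits from head count and then look at which stage sits at its cap most often. The sketch below assumes you can export a daily item count per column. The function names, stage names, and sample counts are all invented for illustration.

```python
def starting_limit(people_in_stage, items_per_person=2):
    """Heuristic starting cap: one or two items per person in the stage."""
    return people_in_stage * items_per_person


def share_of_days_at_cap(daily_counts, limit):
    """Fraction of sampled days on which the column was at (or over) its cap."""
    if not daily_counts:
        return 0.0
    return sum(1 for count in daily_counts if count >= limit) / len(daily_counts)


def likely_constraint(daily_counts_by_stage, limits):
    """Return the stage that hits its cap most often in the sample."""
    return max(
        daily_counts_by_stage,
        key=lambda stage: share_of_days_at_cap(
            daily_counts_by_stage[stage], limits[stage]
        ),
    )


# Hypothetical one-week sample: items in each column at end of day.
snapshots = {
    "Dev": [3, 4, 2, 3, 4],
    "Ready for QA": [2, 2, 2, 1, 2],
    "Deploy": [0, 1, 0, 1, 0],
}
# Hypothetical staffing: three developers, one QA analyst, one release owner.
limits = {
    "Dev": starting_limit(3),
    "Ready for QA": starting_limit(1),
    "Deploy": starting_limit(1),
}

print(likely_constraint(snapshots, limits))
```

If no stage ever reaches its cap in the sample, the limits are probably too loose to tell you anything, which matches the point above about a board that is comfortable everywhere.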
Blocked work does not count
Four or five items “in progress” under one person’s name almost always indicates a measurement error, not multitasking. “In progress” should mean actively moving, and a person can really only move one or two things at once. The slots beyond that are usually work that started and then stalled, sitting under the wrong label.
The fix is to give blocked work its own state rather than counting it against the In Progress cap. The moment a ticket is waiting on a code review, an external team, or an answer, it moves to a Blocked state (or carries a blocked flag, which Linear supports through issue relations) and the person’s active slot frees up to pull the next real thing. This is not a loophole in the cap; it is what makes the cap work. A hard per-person limit of one or two, with blocked work pulled out, is precisely what surfaces blockers: if someone has five things going and three are secretly stuck, the team has a blocker problem nobody can see. Enforce the cap and those three pop out into the Blocked column where standup can act on them.
A pile-up in Blocked is its own distinct signal, separate from a pile-up in active work. Active work piling up means the team is starting faster than it finishes. Blocked work piling up means dependency hell, slow reviews, or external waits. Conflating the two by letting blocked tickets sit in In Progress hides both diagnoses at once.
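The two signals can be kept apart with a per-person count that ignores blocked tickets. This is a sketch over plain dictionaries: the field names (`owner`, `state`, `blocked_on`), the state labels, and the ticket data are hypothetical and do not reflect any tracker’s schema.

```python
from collections import Counter

ACTIVE = "in_progress"
BLOCKED = "blocked"


def active_load(tickets):
    """Count actively moving tickets per owner; blocked work is excluded."""
    return Counter(t["owner"] for t in tickets if t["state"] == ACTIVE)


def owners_over_cap(tickets, per_person_limit=2):
    """Owners whose active count exceeds the per-person cap."""
    load = active_load(tickets)
    return sorted(owner for owner, n in load.items() if n > per_person_limit)


def mark_blocked(ticket, reason):
    """Return a copy of the ticket moved to Blocked, with what it waits on."""
    return {**ticket, "state": BLOCKED, "blocked_on": reason}


# Hypothetical board: one person appears to have four things in progress.
tickets = [
    {"id": "T-1", "owner": "ana", "state": ACTIVE},
    {"id": "T-2", "owner": "ana", "state": ACTIVE},
    {"id": "T-3", "owner": "ana", "state": ACTIVE},
    {"id": "T-4", "owner": "ana", "state": ACTIVE},
    {"id": "T-5", "owner": "ben", "state": ACTIVE},
]
print(owners_over_cap(tickets))  # the cap flags the overloaded owner

# Two of those tickets are really waiting on review, so relabel them.
waiting = {"T-3", "T-4"}
relabelled = [
    mark_blocked(t, "code review") if t["id"] in waiting else t for t in tickets
]
blocked_ids = [t["id"] for t in relabelled if t["state"] == BLOCKED]

print(owners_over_cap(relabelled))  # nobody is over the active cap now
print(blocked_ids)                  # the stuck work is visible on its own
```

After relabelling, the active count describes real work in motion, and the Blocked list is a separate queue that can be read as its own diagnosis.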
Try it
Measure the Little’s Law effect on a real board (1-2 hours, any Kanban tool with cycle-time reporting). Pull cycle time and throughput for your team over the last month, then compute implied WIP and compare it to the actual average WIP on the board. They should roughly match if the system is near steady state. Then set a WIP limit on your most crowded column at about half its current occupancy and watch cycle time over the next two weeks. What you are looking for: throughput holds roughly flat while average cycle time falls. If throughput drops instead, the cap is below the real bottleneck capacity and you cut too far. That divergence is the law telling you where the constraint actually sits.
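The comparison in this exercise is simple enough to script. The sketch below assumes you have already read average cycle time, throughput, and average WIP off your tool’s reports. The function names, the 20% tolerance, and the sample figures are assumptions chosen for illustration, not recommended thresholds.

```python
def implied_wip(avg_cycle_time_weeks, throughput_per_week):
    """Little's Law rearranged: WIP = throughput x cycle time."""
    return throughput_per_week * avg_cycle_time_weeks


def near_steady_state(implied, actual, tolerance=0.2):
    """True if implied WIP is within `tolerance` (a fraction) of actual WIP.

    The 20% default is an arbitrary illustrative threshold.
    """
    return abs(implied - actual) <= tolerance * actual


def trial_limit(current_occupancy):
    """About half the column's current occupancy, but never below 1."""
    return max(1, current_occupancy // 2)


# Hypothetical month of data read off a board's reports.
cycle_time_weeks = 2.0     # average cycle time
throughput_per_week = 5.0  # average items finished per week
actual_avg_wip = 11.0      # average items on the board

wip_estimate = implied_wip(cycle_time_weeks, throughput_per_week)
print(f"implied WIP: {wip_estimate}, actual WIP: {actual_avg_wip}")
print("near steady state:", near_steady_state(wip_estimate, actual_avg_wip))
print("trial limit for a column holding 8 items:", trial_limit(8))
```

A large gap between implied and actual WIP suggests the board was filling or draining during the window, in which case the averages describe a moving target and the before-and-after comparison will be harder to read.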
See also
- QA Handoff Workflows — the QA column as a bottleneck and the WIP cap that protects it.
- Bullwhip Effect — how an uncapped downstream queue amplifies small upstream changes.
- Systems Thinking — WIP limits as a structural lever rather than an appeal to effort.
- [private link] — the formula underneath the whole argument.
- Theory of Constraints — why you cap before the bottleneck.
Sources
- What Is Little’s Law — Businessmap — the WIP / throughput / cycle-time relationship and its steady-state assumption.
- Stable Systems: Little’s Law and Kanban — Nave — why the law only holds when arrival and departure rates match.
- WIP Limits in Kanban — TeachingAgile — capping WIP to reduce context-switching and surface bottlenecks.