Latency is a Design Decision

Latency is usually treated as an engineering problem. Something is slow, so you profile it, and you make it faster, and eventually you hit a wall where making it faster is either impossible or not worth the cost. At that point, most teams give up and ship it slow.

This is a mistake, because the moment latency becomes unavoidable, it becomes a design problem. And design has a rich vocabulary for dealing with it — a set of patterns that can make a 200ms delay feel instant, a 2-second delay feel responsive, and a 20-second delay feel productive. The raw latency number is only one variable in the user's perception. How you stage it is the other.

This post is about that second variable. It is the part that engineers often hand off to designers and that designers often defer to engineers, which is how it ends up being nobody's job.

Perceived Latency is the Real Number

The first principle is that users do not experience latency in milliseconds. They experience it in feelings — responsive, sluggish, broken, stuck. Those feelings correlate with raw timing, but not linearly, and the correlation is strongly mediated by whatever happens on screen during the wait.

A 500ms operation that shows a skeleton of the final layout feels fast. A 200ms operation that shows a spinner feels slow. By the stopwatch the second one is quicker, but perceived latency is what users remember, and perceived latency is almost entirely a function of design.

Same 500ms delay. The version with a skeleton feels instant; the version with a spinner feels interrupted.

Once you accept that perceived latency is the number that matters, your optimisation target shifts. You stop asking "how do we make this faster?" and start asking "how do we make this feel already in progress?" Those are different questions, and the second one usually has cheaper answers.

Hiding vs. Revealing

Every latency problem forces a choice: do you hide the work, or do you reveal it?

Hiding the work is appropriate when the user does not need to know what is happening, when the wait is short, and when the outcome is certain. A form submission that takes 300ms should not interrupt the user with a spinner — it should just complete, and if the completion is visually obvious (a success toast, a page transition, an updated value), that is enough.

Revealing the work is appropriate when the wait is long, when the outcome is uncertain, or when the user is trying to decide whether something is still happening. A file upload over a slow connection, an LLM response, a long database query — these need visible progress, not because the progress bar makes them faster, but because the visibility turns dead time into confirmed time.

Hidden work on a short operation; revealed work on a long one. Match the pattern to the duration.

The mistake most teams make is being inconsistent — hiding long operations so users think the system is broken, or revealing short operations so users feel nagged. Pick a threshold for your product (I default to 1 second, but your product's context may shift this) and apply it.
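A minimal sketch of applying one threshold consistently, in TypeScript. Every name here (`chooseTreatment`, `withDelayedIndicator`, `OperationProfile`) is hypothetical, and the 1000ms default is just the threshold suggested above, not a measured value. The second helper only shows an indicator once the wait has actually crossed the threshold, so short operations complete without interruption.

```typescript
type WaitTreatment = "hide" | "reveal";

interface OperationProfile {
  expectedMs: number;        // typical duration, taken from your own measurements
  outcomeUncertain: boolean; // can this plausibly fail or stall?
}

// Assumed product-wide threshold; tune it for your own context.
const DEFAULT_THRESHOLD_MS = 1000;

// Decide up front whether an operation should hide or reveal its work.
function chooseTreatment(
  op: OperationProfile,
  thresholdMs: number = DEFAULT_THRESHOLD_MS,
): WaitTreatment {
  if (op.outcomeUncertain) return "reveal";
  return op.expectedMs < thresholdMs ? "hide" : "reveal";
}

// Run the work, showing an indicator only if it outlasts the threshold.
async function withDelayedIndicator<T>(
  work: Promise<T>,
  showIndicator: () => void,
  hideIndicator: () => void,
  thresholdMs: number = DEFAULT_THRESHOLD_MS,
): Promise<T> {
  let shown = false;
  const timer = setTimeout(() => {
    shown = true;
    showIndicator();
  }, thresholdMs);
  try {
    return await work;
  } finally {
    clearTimeout(timer);
    if (shown) hideIndicator();
  }
}
```

With these defaults a 300ms form submission with a certain outcome is hidden, while a 4-second upload, or anything whose outcome is uncertain, is revealed.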

Optimistic UI

For any operation where the success path is overwhelmingly common, there is a third option between hiding and revealing: pretending it already happened.

Optimistic UI updates the interface as though the operation succeeded the moment the user initiated it. The actual network request goes out in parallel, and if it fails — which is rare — you reconcile with an error state. The user sees zero latency, because from their perspective, the system responded instantly.

This sounds risky, and it has a real cost in failure-path complexity, but the perceptual benefit is huge. Likes, favourites, "mark as read," "add to cart," optimistic sends in chat — all of these are domains where the success rate is high enough to justify optimism, and the UX win is the difference between a product that feels alive and one that feels like you are always waiting on it.

A like animation that fires on tap, not on server response. The network catches up quietly behind it.

The thing to avoid is optimistic UI on operations where failure is common or where the stakes of a false success are high. Optimistically showing a payment as completed is a lie; optimistically showing a message as sent, when you can reliably retry, is a reasonable promise. Choose the pattern based on the trust you can honour, not the pattern's availability.
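The like case can be sketched roughly as follows. This is an illustration under stated assumptions, not a reference implementation: `LikeState`, `likeOptimistically`, and the `sendLike` callback are hypothetical names, and the state accessors stand in for whatever store your UI framework provides.

```typescript
interface LikeState {
  liked: boolean;
  count: number;
}

// Pure state transition: what the UI should look like once the like lands.
function applyLike(state: LikeState): LikeState {
  return state.liked ? state : { liked: true, count: state.count + 1 };
}

async function likeOptimistically(
  getState: () => LikeState,
  setState: (next: LikeState) => void,
  sendLike: () => Promise<void>,      // hypothetical network call
  onError: (error: unknown) => void,  // e.g. show a toast
): Promise<void> {
  const previous = getState();
  // Update the interface immediately, before the request is even sent.
  setState(applyLike(previous));
  try {
    await sendLike();
  } catch (error) {
    // Rare failure path: restore the earlier state and tell the user.
    setState(previous);
    onError(error);
  }
}
```

Restoring a snapshot is the simplest reconciliation, but it can overwrite changes made while the request was in flight. That is part of the failure-path complexity mentioned above, and a real product may need something more careful.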

Skeletons and Shape Preservation

When you do need to show a loading state, the single most effective pattern I know is the skeleton — a placeholder layout that matches the shape of the content that is about to appear.

The reason skeletons work is that they eliminate the most disorienting moment of loading: the layout shift when content arrives. A spinner leaves the page blank, then content appears, and the user's eyes jump to find the new information. A skeleton tells the user where to look before the information gets there, so when it does, the jump is smaller and the perception of "load" becomes "fill."

Skeletons also communicate intent. They say: we know what is about to be here, and we are confident enough to reserve space for it. This is a completely different signal than a spinner, which says only that something, somewhere, is happening.

The implementation trap is to over-engineer skeletons into faithful previews of the final content. That is not the point. A skeleton is a silhouette, not a mockup. It should be simple enough to generate automatically, precise enough to reserve correct space, and nothing more. Shimmer is optional; I tend to leave it out unless the load is long enough that the user needs reassurance that the placeholder is intentional.
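To make "silhouette, not mockup" concrete, here is one possible way to generate a skeleton from nothing but a line count. The names (`SkeletonBlock`, `textSkeleton`, `reservedHeight`) and the pixel values are assumptions chosen for illustration; the point is only that the placeholder is cheap to produce and that its total height can be computed up front, so the real content fills reserved space instead of shifting the layout.

```typescript
interface SkeletonBlock {
  widthPct: number; // width as a percentage of the container
  heightPx: number; // height in pixels
}

// Build a paragraph-shaped silhouette: full-width bars with a shorter last line.
function textSkeleton(lines: number, lineHeightPx: number = 16): SkeletonBlock[] {
  return Array.from({ length: lines }, (_, i): SkeletonBlock => ({
    widthPct: i === lines - 1 ? 60 : 100,
    heightPx: lineHeightPx,
  }));
}

// Total vertical space the skeleton occupies, including gaps between bars.
function reservedHeight(blocks: SkeletonBlock[], gapPx: number = 8): number {
  if (blocks.length === 0) return 0;
  const barsPx = blocks.reduce((sum, block) => sum + block.heightPx, 0);
  return barsPx + gapPx * (blocks.length - 1);
}

const paragraph = textSkeleton(3);
const heightPx = reservedHeight(paragraph); // 3 * 16 + 2 * 8 = 64
```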

Giving Time a Shape

The longest category of latency — uploads, generations, renders, anything that takes more than five seconds — benefits from a different pattern entirely. At that length, skeletons stop working, because the user no longer believes the page is loading. They start wondering if something is wrong.

The fix is to give the time a shape. A progress bar is the simplest version: it turns "still going" into "this far along." More sophisticated versions break the work into named steps ("downloading," "processing," "saving"), each with its own indicator. The user's question is no longer "is this happening?" but "where in the process are we?", which is a much more tolerable question to sit with.

The critical detail is that progress must be honest. A progress bar that fills to 95% and sits there for thirty seconds is worse than no progress bar at all, because it destroys trust for every future progress bar you show. If you cannot estimate honestly, prefer step-based progress over percentage-based, because steps don't require you to predict duration, only to announce transitions.
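Step-based progress can be modelled with very little state. The sketch below is one way to do it, with hypothetical names (`Step`, `createSteps`, `advance`, `describeProgress`). It never estimates duration: it only records which named step is active and moves forward when the caller announces a transition.

```typescript
type StepStatus = "pending" | "active" | "done";

interface Step {
  name: string;
  status: StepStatus;
}

// Start with the first step active and the rest pending.
function createSteps(names: string[]): Step[] {
  return names.map((name, i): Step => ({
    name,
    status: i === 0 ? "active" : "pending",
  }));
}

// Announce a transition: finish the active step and activate the next one.
function advance(steps: Step[]): Step[] {
  const active = steps.findIndex((step) => step.status === "active");
  if (active === -1) return steps; // nothing left to advance
  return steps.map((step, i): Step => {
    if (i === active) return { ...step, status: "done" };
    if (i === active + 1) return { ...step, status: "active" };
    return step;
  });
}

// Human-readable label answering "where in the process are we?"
function describeProgress(steps: Step[]): string {
  const done = steps.filter((step) => step.status === "done").length;
  const active = steps.find((step) => step.status === "active");
  return active
    ? `Step ${done + 1} of ${steps.length}: ${active.name}`
    : `All ${steps.length} steps complete`;
}

let steps = createSteps(["downloading", "processing", "saving"]);
describeProgress(steps); // "Step 1 of 3: downloading"
steps = advance(steps);
describeProgress(steps); // "Step 2 of 3: processing"
```

Because the label changes only when the work itself reports a transition, it cannot stall at a misleading percentage. The worst it can do is sit on one step for a while.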

Footnotes

The response-time thresholds Jakob Nielsen popularised (100ms, 1s, 10s) build on research going back to Robert Miller's work in 1968. They are decades old and still almost entirely correct.

The Chrome Web Vitals team has published extensively on perceived latency metrics — LCP, INP, and CLS are design constraints as much as engineering ones.

Luke Wroblewski's talk on mobile design covers the tension between optimistic UI and trust in a way I have not seen improved on.

The Linear engineering blog has some of the most thoughtful writing on local-first and optimistic patterns in a real product. Worth reading the whole archive.

For skeletons specifically, Lucas Bebber's CodePen collection is the rabbit hole I fell into while writing this post.