When the Numbers Don't Matter: Why Teams Miss Deadlines Despite Perfect Estimates

TL;DR

Agile estimation challenges are rarely about the number. Planning poker is useful when teams treat vote spreads as signals about scope, risk, and dependencies. Software team alignment during estimation improves sprint predictability more than chasing velocity. Velocity alone cannot forecast capacity because context changes across sprints. Coordination work must live in Jira as first-class items, or retro actions will not get done. Easy Agile TeamRhythm keeps planning and estimation in one place: story mapping for sequence, planning poker on the issue for shared context, and retrospective actions that turn into tracked work. The new ebook, Guide to Building Predictable Delivery with Jira in 2026, explains how to plan with clarity, align on effort, and turn problems into progress. The outcome is fewer rollovers, clearer handoffs, and more reliable delivery.

---

Estimation in software teams has become a performance ritual. Planning poker sessions run smoothly, story points get assigned, and even velocity charts trend upward.

Yet research analysing 82 studies found five recurring reasons why estimates fail: information quality, team dynamics, estimation practices, project management, and business influences.

The problem with estimation runs deeper than accuracy - it's about what teams miss whilst focusing on the number.

When a team estimates a story and three people say it's a 3 whilst two say it's an 8, that spread contains more value than whatever final number they settle on. The disagreement signals different assumptions about scope, reveals hidden dependencies, exposes risks before code gets written. But most teams treat the spread as a problem to resolve rather than intelligence to extract. They discuss just long enough to reach consensus, record the number, move on.

The estimation ritual runs perfectly, while the coordination that estimation should enable never happens.

Why the number misses the point

Communication, team expertise, and team composition decide how reliable an estimate will be, far more than the technique itself.

A team of senior engineers who've worked together for years will generate different estimates than a newly formed team with mixed experience, even when looking at identical work. Neither set is wrong - they reflect different realities.

Problems emerge when organisations ignore this context and treat estimates as objective measurements.

Story points get summed across teams. Velocity gets compared across squads. Estimates meant to help one group coordinate become data points in dashboards, stripped of the shared understanding that gave them meaning.

What planning poker actually reveals

Planning poker works when teams use it to promote collaboration, uncover risks, and address uncertainties proactively. Those benefits vanish when teams rush past disagreement to reach a number.

Someone who estimates low might:

  • Know a shortcut from previous work
  • Have solved something similar before
  • Understand a technical approach others haven't considered

Someone who estimates high might:

  • Have spotted an integration challenge
  • Questioned an assumption in the requirements
  • Remembered something that broke last time
  • Identified a missing dependency

Both pieces of knowledge matter more than splitting the difference.

Teams that skip this interrogation lose their only reliable way to discover what they don't yet know. Later, when work takes longer than estimated, they blame the estimation technique. They should blame the coordination failure that happened during estimation.
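
To make this concrete, here's a minimal sketch (invented names and numbers, not any tool's implementation) of treating the spread as the output worth extracting: rather than averaging votes away, surface the lowest and highest voters so they explain their assumptions first.

```python
# Invented example: the spread, not the average, is the useful signal.
def discussion_starters(votes: dict[str, int]) -> tuple[str, str]:
    """Return the lowest and highest voters - they speak first."""
    low = min(votes, key=votes.get)
    high = max(votes, key=votes.get)
    return low, high

votes = {"Ana": 3, "Ben": 3, "Caz": 3, "Dev": 8, "Eli": 8}
low, high = discussion_starters(votes)
print(f"{low} explains the shortcut, {high} explains the risk")
# Only after both have spoken does the team settle on a number.
```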

How team members add context

Research indicates that even team members whose skills aren't needed for an item contribute valuable questions during planning poker.

The database person asks about data volume. The designer notices a missing edge case. The platform maintainer flags a version incompatibility.

None of them are trying to make the estimate bigger to protect themselves. They're raising legitimate technical considerations that affect how much work is actually involved. These questions reveal real complexity the team needs to account for. That's different from someone saying "Let's call it an 8 instead of a 5, just to be safe" without any specific reason (that's called "padding", and a team must avoid that at all costs).

When estimation becomes a solo activity or a quick management decision, all that context disappears. Work looks simpler than it is, not because anyone lied, but because key voices never got heard.

Why team coordination matters more at scale

Coordination ranks as one of the biggest challenges in large-scale software projects, particularly with complex codebases and multiple teams working simultaneously. As organisations scale, coordination problems increase:

  • Dependencies multiply across teams. What used to be a conversation between two people in the same room now requires checking another team's roadmap, finding out who owns a component, and waiting for their sprint to align with yours.
  • Handoffs increase between specialists. A feature that one full-stack developer could build now needs a frontend specialist, a backend specialist, a platform engineer, and a data analyst - each working on different schedules with different priorities.
  • The margin for error shrinks. When you're coordinating work across five teams instead of one, a single miscommunication or missed dependency can block multiple teams at once, not just delay one work item.
  • Assumptions travel further from their source. The product manager who spoke to the customer isn't in the room when the backend developer makes a technical decision. The context that shaped the original requirement gets lost through layers of handoffs and documentation.

Estimation sessions are often the only moment when everyone involved in delivering work actually talks about it before starting. Treating that time as a box-checking exercise to produce velocity data is one of the biggest mistakes you could make as a team.

When estimates protect plans instead of revealing risks

Let's look at a familiar scenario. A team commits to eight features based on historical velocity. Three sprints in, two features ship and the rest need another month. Marketing campaigns get delayed. Customer pilots get postponed. Executive presentations need rewriting.

The team "missed commitment," so stakeholders lose confidence.

Next quarter, product managers add extra weeks to every timeline just to be safe. Engineering leaders give longer forecasts they know they can beat rather than their honest best estimate. Everyone protects themselves, the system slows down, trust erodes further.

Now, step back - why did the team commit to eight features in the first place? Usually because velocity suggested they could complete them.

The velocity number reflected what they'd delivered in the past. What it didn't reflect was whether those sprints had taught them anything, whether dependencies were worsening, whether technical debt was compounding, or whether the work ahead resembled the work behind.

Why velocity numbers mislead

Velocity treats delivery capacity as stable when it's dynamic.

Teams get faster as they remove friction and slower as complexity increases. Last sprint's number is already outdated, yet sprint planning sessions treat it as a reliable forecast - the sketch after the list below puts numbers on the problem.

Consider what velocity cannot tell you:

  • Whether the team learned from previous mistakes
  • Whether technical debt is slowing new work
  • Whether dependencies have become more complex
  • Whether the upcoming work requires unfamiliar technology
  • Whether key team members will be available
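
Here's a toy illustration (velocities invented) of how much a single "average" hides:

```python
# Six invented sprint velocities - same team, same average.
from statistics import mean, stdev

velocities = [34, 41, 28, 38, 25, 44]  # story points completed per sprint

avg, spread = mean(velocities), stdev(velocities)
print(f"average {avg:.0f} pts, typical range {avg - spread:.0f}-{avg + spread:.0f}")
# average 35 pts, typical range 28-42: planning to the average commits
# the team to the good sprints and quietly ignores the bad ones.
```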

Better agile estimation techniques won't fix coordination problems. Teams can fix them, though - by treating coordination itself as work that deserves attention.

Why fixing coordination problems matters for estimation

Teams that have messy estimation sessions usually know exactly what's wrong.

Someone will say it in standup: "We should really refine stories before planning."

Another person mentions it in Slack: "We need to check dependencies earlier."

It even comes up in the retrospective: "Our estimates are all over the place because we don't understand the work."

The insights are there. What's missing is follow-through.

Look at your last three retrospectives. How many action items got completed before the next one?

The average completion rate for retro action items sits around 0.33% - teams basically complete almost none of them. And that's mainly because the improvement actions exist outside the system that schedules work.

During retrospectives, teams identify real problems:

  • "Our estimation sessions run too long because we don't refine stories beforehand"
  • "We keep getting surprised by infrastructure work that's not in our backlog"
  • "We never follow up on the spikes we create"
  • "Dependencies block us mid-sprint because we didn't check capacity with other teams"

All true. All important. Then the retrospective ends, everyone returns to their sprint board, and those insights have nowhere to go.

The backlog contains features. The sprint contains stories. And improvement work lives in meeting notes, if anywhere. So it doesn't compete for capacity, doesn't get assigned an owner, doesn't get tracked, and doesn't get done.

How to make improvement work visible

Usage data from Easy Agile TeamRhythm showed teams completing only 40-50% of their retrospective action items. After releasing features to surface and track incomplete actions, completion rates jumped to 65%.

The mechanism was simple:

  1. Convert retrospective insights into Jira work items (sketched below)
  2. Give each one an owner
  3. Slot them into upcoming sprints like any other work
  4. Make incomplete action items from the previous retros more visible
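
As a hedged sketch of step 1, here's what the conversion might look like against Jira's documented REST endpoint for creating issues (the site URL, credentials, and project key are placeholders; Easy Agile TeamRhythm does this from the retrospective board itself):

```python
import requests

JIRA = "https://your-site.atlassian.net"   # placeholder site
AUTH = ("you@example.com", "<api-token>")  # placeholder credentials

# A retrospective insight phrased as schedulable work.
payload = {
    "fields": {
        "project": {"key": "TEAM"},        # hypothetical project key
        "summary": "Refine stories two days before sprint planning",
        "issuetype": {"name": "Task"},
        "labels": ["retro-action"],
    }
}

resp = requests.post(f"{JIRA}/rest/api/2/issue", json=payload, auth=AUTH)
resp.raise_for_status()
print("Created", resp.json()["key"])       # now it competes for capacity
```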

When teams actually complete their retrospective actions, estimation gets better without changing the estimation technique. Stories get refined before planning, so there's less guesswork. Dependencies get checked earlier, so there are fewer surprises. The team builds shared understanding over time, so the spreads in planning poker get narrower and the discussions get shorter.

This creates a ripple effect.

Coordination problems compound when ignored, but they also improve when treated as real work.

A team that never fixes its planning problems will have messy estimation sessions next quarter, and the quarter after that. A team that deliberately improves coordination will gradually need less time to reach better software team alignment, and their estimates will become more reliable as a byproduct.

What this means for tools and practice

Most planning tools were built to manage tasks, not support the conversations that help teams coordinate around those tasks. The challenge now is making coordination visible and trackable in the same place where delivery work happens.

Easy Agile TeamRhythm keeps coordination work alongside delivery work.

✅ User Story Map shows how work connects to goals, making sequencing a shared discussion instead of someone's private spreadsheet.

✅ Planning poker happens inside the Jira issue, where everyone sees the same acceptance criteria and attachments - so the spread of estimates becomes a trigger for conversation, not a number to agree on.

✅ Retrospective actions convert directly into backlog items that get added into upcoming sprints, so improvement work gets the attention it deserves.

The stakes for getting this right are rising. Atlassian's push towards AI assistance means your planning documents will be read by more people and processed by more systems. When Rovo searches your Jira instance, it surfaces whatever you've written - clear goals and explicit dependencies, or vague titles and hidden assumptions.

A fuzzy sprint goal doesn't just confuse your team. It confuses the dozen people across three time zones reading your board asynchronously, the automation trying to identify relevant work, the AI assistant trying to generate status updates, and the stakeholders trying to understand progress.

Where this leads

The teams that succeed over the next few years won't be the ones with the most sophisticated agile estimation techniques or the highest velocity scores.

They'll be teams where planning poker uncovers risks instead of producing commitments. Where retrospectives generate work items instead of wishful thinking. Where coordination shows up in Jira instead of getting discussed in conversations that disappear.

They'll be teams that figured out estimation was never about the numbers. It was always about getting everyone pointed in the same direction, with their eyes open, before the work starts.

The number is just what gets written down. The conversation that takes the team to the number is where the real work lies.

Easy Agile TeamRhythm
Improve team collaboration and delivery

Related Articles

  • Workflow

    Why Team Planning Feels Harder Than It Should (and What To Do About It)

    TL;DR

    Sprint planning feels harder because distributed/async work, tool consolidation, and Atlassian AI expose priority noise, software estimation problems, and hidden dependencies. This guide shares three team planning best practices - set a clear order of work, estimate together to surface risk, and close the loop with action-driven retrospectives, so team alignment and team collaboration improve and plans hold up.

    ---

    Sprint planning should not feel like a shouting match where the loudest voice wins. Yet for many teams it has become a long meeting that drains energy, creates a weak plan, and falls apart by Wednesday.

    The problem is not that your team forgot how to plan. The world around the plan changed. People work across time zones. More decisions happen in comments and tickets than in rooms. And the plan lives in tools that other people (and AI‑powered search) will read later. When the audience shifts from "who was in the meeting" to "everyone who touches the work", the plan must be clearer and better organised.

    On one hand, AI and asynchronous collaboration are on the rise. On the other, we have a direct link between strong delivery capabilities and better performance. And those two lines meet in planning - if inputs are unclear, you simply do the wrong things faster.

    We believe planning feels harder because three foundational elements have broken down:

    • How priorities turn into an order of work,
    • How estimates show risk (not just numbers), and
    • How suggestions for improvements turn into real work.

    When these are weak, planning becomes an exercise in managing mental overload rather than building a clear path forward.

    This piece examines why sprint planning has become so difficult, what changed in 2025 to make it worse, and most importantly, how teams can fix those three foundations so your plan still makes sense when someone who missed the meeting reads it two days later.

    The mental load of modern planning

    "Cognitive load" is simply the amount of mental effort a person can handle at once. Every team has a limit on how much information they can hold at any given moment, and planning meetings now push far past it.

    At the same time, teams are asked to:

    • Rank features against unclear goals,
    • Estimate work they have not fully explored,
    • Line up with platform teams whose time is uncertain,
    • Balance different stakeholder requests,
    • Figure out dependencies across systems they do not control, and
    • Promise dates that others will track closely.

    Plans are often made as if everyone is fully available and not also working on existing projects and tasks. And as we all know, that is never the case. When the mental load is too high, teams cannot safely own or change the software they're working on.

    Planning then becomes a bottleneck where teams spend more time managing the complexity of coordination than actually coordinating. Decisions slip, assumptions go unchecked, and the plan that comes out is not a shared understanding but a weak compromise that satisfies no one.

    Where priorities fail

    In most planning sessions, "priority" has lost its meaning. Everything is P0. Everything is urgent. Everything needs to happen this sprint. If everything is top priority, nothing is.

    Teams struggle to prioritise because there are too many moving parts at once: lots of stakeholders, long backlogs, and priorities that change week to week. Most prioritisation methods, like MoSCoW, WSJF, and RICE, work for a small list and a small group, but they stop working when the list and the audience get big. Meetings get longer, scores become inconsistent, and re-ranking everything every time something changes isn’t practical. People also learn to “game” the numbers to push their item up.
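
    To see how easily numeric scores get gamed, here's a small worked example using RICE's published formula (reach × impact × confidence ÷ effort); the items and numbers are invented:

    ```python
    # RICE = (reach * impact * confidence) / effort
    def rice(reach, impact, confidence, effort):
        return reach * impact * confidence / effort

    honest = rice(reach=2000, impact=2, confidence=0.5, effort=4)  # 500.0
    rival  = rice(reach=1500, impact=3, confidence=0.6, effort=4)  # 675.0
    nudged = rice(reach=2000, impact=2, confidence=0.8, effort=4)  # 800.0

    # One "gut feel" confidence tweak lifts the first item above its rival.
    # The score looks objective; the input that moved it wasn't.
    print(honest, rival, nudged)
    ```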

    There’s a second problem: these methods assume everyone agrees on what “value” means. In reality, sales, compliance, platform, design, and support often want different things. Numeric models (like simple scoring) miss those differences, because some trade-offs (like brand risk, customer trust, regulatory deadlines) are easier to discuss than to put into a single number.

    A flat product backlog makes this worse. As Jeff Patton says, reading a backlog as one long list is like trying to understand a book by reading a list of sentences in random order - all the content is there, but the story is gone. Without the story, “priority” becomes a label people use to win arguments rather than a clear order of work.

    Put simply, when the work and the number of voices grow, the usual techniques can’t keep up. If you don’t have a way to surface different perspectives and settle the trade-offs, decisions drift from strategy, plans keep shifting because they were never tied to outcomes, and engineers stop seeing why their work matters.

    The estimation show

    Teams are generally too optimistic about effort and too confident in their estimation numbers. On average, work takes about 30% longer than expected. Even when people say they’re 90% sure a range will cover the actual effort, it only does so 60–70% of the time.

    Translation: confidence feels good, but it does not mean accuracy.

    The deeper issue is how we estimate. It often becomes a solo guess instead of a shared check on risk.

    Here's what actually happens - someone sees a work item called “Update API,” give it 5 points based on the title alone, and the team moves on. No one tests the assumptions behind the number.

    Nobody talks about the auth layer changes implied by "update." Nobody brings up the database migration hiding in plain sight. Nobody checks whether the frontend team knows this change is coming.

    And when those show up mid-sprint, the plan slips and trust drops.

    After a few misses, behaviour shifts for the worse. People start padding their estimates - quietly rounding numbers up to feel safe. Product then starts pushing back. And estimation turns into negotiation, not learning.

    A healthier signal to watch is the spread of estimates. A wide spread of estimates isn't a problem to smooth over, but rather a signal to discuss assumptions. Most likely, there will be some difference in the initial estimates, giving each team member a great opportunity to talk about why their estimates were either higher or lower than the others.

    The coordination cost of dependencies

    When different teams own connected parts of the system, one team’s work often can’t move until another team makes a change. If those teams aren’t lined up, the change gets stuck.

    This is common when how the software is wired doesn’t match how the teams are organised.

    For example, the code says “Service A must change before Service B,” but A and B live in different teams with different priorities, sprint dates, and intake rules. The code requires coordination, but the org chart doesn’t provide it.

    In large organisations, these small, work item‑level links are a constant source of delay. And the problem has only worsened in recent years.

    Platform engineering has grown, with most features now touching shared services - auth, data platforms, CI/CD, component libraries. But planning often happens without the platform team in the room. No one checks their capacity, no one tests the order of work, and there's no agreement on intake windows or what to do when something is blocked.

    So the plan looks ready on paper. Stories are sized. The sprint is committed. Then, three days in, someone says at stand‑up: “That API won’t be ready until next Tuesday,” or “The platform team is tied up,” or “Friday’s deployment window isn’t available.” Work waits, people wait, and money burns while nothing moves forward.

    Dependencies increase complexity in any system. More complexity means more idle time as work passes between people or teams. Developers, then, often end up waiting for other work to wrap up before they can begin, creating inefficiencies that add no value but still cost payroll and budget.

    What changed in 2025

    Three shifts in 2025 made planning even harder:

    1. Distributed work reduced live coordination.

    Hybrid days replaced being in the same room every day. More work happens asynchronously. That means your written plans and notes - sprint goals, story maps, dependency notes - must explain what real-time conversations used to cover: why this work matters, what comes first, what won’t fit, who’s involved, and what could block you. Vague goals that could be fixed in person now fall apart across time zones.

    2. Fewer tools, one system.

    Teams cut vendors to reduce spend. Planning, estimation, and retrospectives moved into one solution, whether they fit well or not. While that reduces context‑switching, it also means teams lose specialised tools and custom workflows. And it means stakeholders can see the full line from strategy to stories in one place, so your sprint goals, estimation notes, and retro/improvement actions will be read more closely.

    3. Atlassian AI raised expectations.

    Atlassian expanded Rovo across Jira and Confluence. Search, governance, and automation now connect conversations to issues and speed discovery. And the thing about AI is that it accelerates whatever direction you're already pointed. If goals are fuzzy, if estimates are guesses, and if dependencies are hidden - the automation will just help you go faster in the wrong direction.

    The combination of all these changes is brutal - more coordination required, less overlap to coordinate in real-time, and higher penalties when plans fail because stakeholders can now see more of the picture.

    Fixing the foundations: sprint planning best practices

    The teams that make planning work have rebuilt their foundations with three planning best practices. They’re simple, written rules the whole team follows. They live on the planning board and the work items, so they still make sense after hand-offs and across time zones.

    1. Turn priorities into a clear order of work

    “Priority” breaks when people don’t share the same idea of value. The fix is to agree on a single, visible order - why this first, then that, and what won’t fit this sprint.

    Teams that get this right:

    • Turn goals and outcomes into backlog work on a steady rhythm, not ad-hoc.

    Once a month, product and delivery confirm objectives and break them into epics and small slices for the next 6–8 weeks. This keeps meaningful work in the pipeline without crowding the backlog with assumptions and wishes that will change anyway.

    • Write testable sprint goals using a consistent template.

    Objective, test, constraint. What is the definition of done? How will the team know they've succeeded? You should leave sprint planning with a clear idea of what needs to get done and what success looks like. For example, "Users can complete checkout using saved payment methods without re-entering card details" beats "improve checkout UX" every time. If you can't verify whether you hit it, it's a wish, not a goal.

    • Run a 30-minute review before scheduling anything.

    Agree the order of work before fixing the dates. In 30 minutes with engineering, design, and platform teams: walk through the dependency list, check capacity, and identify the top risks. Output: an ordered path to done, a clear boundary of what won't fit this sprint, and a simple rule for handling blocked items. This surfaces cross-team friction before it becomes a mid-sprint crisis.

    • Make dependencies visible where people actually plan.

    Use one standard dependency field and two to three link types in Jira. Review the highest-risk links during planning - not when they block work on day three.
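
    One way to run that review (a hedged sketch against Jira's documented search endpoint; the site, credentials, and project key are placeholders, and support for the issueLinkType JQL field varies by Jira version):

    ```python
    import requests

    JIRA = "https://your-site.atlassian.net"   # placeholder site
    AUTH = ("you@example.com", "<api-token>")  # placeholder credentials
    jql = ('project = TEAM AND issueLinkType = "is blocked by" '
           'AND sprint in openSprints()')      # hypothetical project key

    resp = requests.get(f"{JIRA}/rest/api/2/search",
                        params={"jql": jql, "fields": "summary"}, auth=AUTH)
    resp.raise_for_status()
    for issue in resp.json()["issues"]:
        print(issue["key"], issue["fields"]["summary"])  # review these first
    ```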

    Easy Agile TeamRhythm's User Story Map makes this process concrete - build goals and outcomes at the top, connect work items to those goals, and group them clearly with sprint or version swimlanes. Turn on dependency lines so blockers and cross-team links show up in planning. When epics ladder to outcomes, when the order of work explains itself, and when dependencies are visible on the same board, you stop rethinking priorities and start shipping.

    2. Estimate together to find risk early

    Estimation is not about hitting a perfect number. It is a fast way to identify risk while you can still change scope or order. Treat it as a short, focused conversation that makes hidden assumptions visible and records what you learned.

    • Estimate together, live.

    Run Planning Poker from the Jira issue view so everyone sees the same title, description, acceptance criteria, designs, and attachments. Keep first votes hidden and reveal them all together (so the first number doesn’t influence the rest).

    Easy Agile TeamRhythm helps you run the estimation process in Jira, right where work lives. Capture the key points from the discussion as comments in the issue view. Sync the final estimate back to Jira so your plan is based on current reality, not old guesses from two weeks ago.

    • Record the reasoning, not just the number.

    If a story moves from a 3 to an 8 after discussion, add two short notes on the work item:

    - What changed in the conversation (the assumption you uncovered), and

    - What’s still unknown (the risk you’re facing)

    This helps the next person who picks it up and stops you repeating the same debate later.

    • Stick to clear, simple scales.

    Time estimates vary a lot by person; story points help teams agree on effort instead. Ask each team member to estimate the hours involved in a task and chances are you'll get five different answers, because timing depends on experience and understanding. But most team members will agree on the effort required to complete a work item, which means you can reach consensus and move on with your story mapping or sprint planning much more quickly.

    Maintain a small set of recent work items (easy/medium/hard) with estimates and actuals. When debates don't seem to end, point to this known work: "This looks like that auth refactor from June - that was an 8 and took six days including the migration script."

    • Limit committed work to roughly 80-85% of average capacity.

    The slack makes room for one improvement item, for unavoidable interrupts, and for reality. Setting unachievable goals sets your whole team up for failure, and missing your sprint goals sprint after sprint damages motivation and morale. Use estimates to set reasonable goals, considering team capacity, your past knowledge of how long tasks take, how the team works, and the roadblocks that could arise along the way.
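
    With invented velocities, the arithmetic of the 80% end of that rule looks like this:

    ```python
    from statistics import mean

    recent_velocity = [34, 41, 28, 38, 25, 44]    # story points per sprint
    commit_ceiling = 0.8 * mean(recent_velocity)  # 80% of the average

    print(f"commit to ~{commit_ceiling:.0f} pts, not the best-ever {max(recent_velocity)}")
    # ~28 pts: the gap is the improvement item, the interrupts, and reality.
    ```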

    Protect honesty in the estimate numbers. An estimate is a shared view of scope and risk, not a target to enforce. If it needs to change, change it after a team conversation - don’t override it. Remember what we said earlier - when estimation turns into negotiation, people start “padding” (quietly inflating a number to feel safe), and accuracy gets worse.

    3. Close the loop on improvements

    Teams often fall into the trap of writing retrospective items as good intentions and not clear actions. Broad notes like “improve communication” or “fix our process” sound fine on paper, but they don’t tell anyone what to do next.

    Action items are often ignored during a retrospective. Everyone focuses on engaging people in getting their ideas out, and not as much time is spent on the action items and what's going to be done or changed as a result.

    • Start every retro by checking last sprint’s actions.

    Ten minutes. Did we close them? If closed, did they help? What did we learn? This way, the team acts on what they learned straight away and everyone can see what changed.

    • Turn insights into Jira work items immediately.

    Each action needs an owner, a due date, a link to the related work, and a clear definition of done. If you can’t assign it immediately, it’s not actionable, it’s a complaint.

    Easy Agile TeamRhythm's Retrospective lives right inside Jira, attached to your board. Turn retrospective items into Jira issues with owners and dates, link them to an epic or backlog item, and slot at least one into the next sprint. Track incomplete action items and repeating themes on the same page as delivery work.

    • Make space for one improvement item every sprint.

    Pick one action item for a sprint and finish it before starting another, so it doesn't get pushed aside by feature work. Treat action items like a feature: estimate it, track it, and close it with a note about the result and impact.

    • Write a short impact note when you close an action.

    A retro should give people energy. It should help them see that we're improving, that their voice matters, that something got better because of something they said.

    Write a one-line impact note for every closed action item, ideally with a small metric - for example, "Batched PR reviews into two daily slots - median review time dropped from six hours to 90 minutes." This teaches the next team, justifies the investment, and shows that retro action items increase the team's capacity to deliver rather than adding admin overhead.

    What changes when you follow the planning best practices

    Teams with solid sprint planning foundations rarely talk about them. They’re not on stage explaining their process. They’re just shipping steady work while everyone else wonders what their secret is.

    There is no secret. The best teams have simply stopped treating planning as a meeting to survive and started treating it as a system of best practices that compound over time.

    The mental load of planning sessions does not go away. But it shifts. Instead of processing all of it live in a room under time pressure, these teams have recorded their answers into written plans and notes that do the thinking for them.

    The user story map explains what matters and why. The order of work shows dependencies before they block work. The estimation scores and notes capture not just numbers but the assumptions behind them. Retro action items sit next to product work, so the next sprint benefits from what the last one learned.

    Fix the foundations and your sprint planning challenges will shrink. Not perfectly. Not forever. But enough that planning stops feeling like a crisis and starts feeling like what it should be: a calm view of what is possible now, with simple ways to adjust as you learn.

    The cost of unclear plans is rising. AI speeds up what you feed it. Stakeholders can see more details easily. Work happens across time zones. In that world, clarity isn't a nice-to-have - it's a basic requirement.

    The teams that will do well in 2026 won't be the ones with better tools or smarter frameworks. They'll be the ones with a few simple best practices written into the plan and the work items, so handovers are easy, others can check and understand them, and the plan still makes sense after the meeting.

  • Workflow

    Agile Estimation Techniques: A Deep Dive Into T-Shirt Sizing

    TL;DR: What T‑shirt sizing is, where it shines, how to run it with a real team, how it relates to story points, and how to avoid common traps.

    A quick scene

    Friday afternoon. You’ve inherited a backlog that sprawls for metres. Someone asks, “Roughly how long to ship the payments revamp?” Your team glances at the ceiling. You don’t need a perfect answer, you need a safe first pass that helps you plan sensibly. That’s where T‑shirt sizing earns its keep.

    What is T‑shirt sizing in agile?

    T‑shirt sizing is a lightweight way to estimate relative effort using XS, S, M, L, XL. It’s great for roadmaps, release planning, and early discovery, moments when detail is thin and the goal is direction, not exact dates.

    Think of it as a sketch: enough shape to discuss options and make trade‑offs. As work moves closer to delivery, translate sizes into more precise estimates (for most teams, that’s story points).

    New to story points or need a refresher? Read How to use story points for agile estimation and 10 reasons why you should use story points.

    When to use T‑shirt sizes vs story points

    Use T‑shirt sizes when:

    • You’re scanning a large backlog to spot big items and cut noise
    • You’re sequencing epics on a roadmap or release plan
    • You’re aligning many teams for a Program Increment and need a first pass on effort

    Switch to story points when:

    • You’re shaping a commitment for a sprint or release
    • The team understands the work well enough to discuss risk, complexity, and unknowns

    Simple rule of thumb - story point estimates are best for sprint planning. Affinity mapping, bucket systems, dot planning, and T-shirt sizing are better for roadmap and release planning.

    How to run a T‑shirt sizing session (two practical patterns)

    The main thing to keep in mind is you don’t need ceremony to get this right. What you need is speed and shared understanding.

    1) Small set of items (do this in 20–30 minutes)

    1. Pick the work: 10–20 epics or features you want to compare.
    2. Calibrate quickly: Agree on one example for S, M, L from your history.
    3. Silent first pass: Each person suggests a size. Keep it to 30 seconds per item.
    4. Discuss only the outliers: If your spread is XS to XL, talk. If it’s S/M, move on (see the sketch after this list).
    5. Capture the decision: Write the size on the card/issue and one sentence on why (risk, dependency, unknown). Future‑you will thank you.
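
    Here's the step-4 rule as a tiny sketch (a hypothetical helper, not part of any tool): votes more than one bucket apart are outliers worth a conversation.

    ```python
    SIZES = ["XS", "S", "M", "L", "XL"]

    def worth_discussing(votes: list[str]) -> bool:
        """True when the vote spread exceeds one adjacent bucket."""
        idx = [SIZES.index(v) for v in votes]
        return max(idx) - min(idx) > 1

    print(worth_discussing(["S", "M", "M"]))    # False - move on
    print(worth_discussing(["XS", "M", "XL"]))  # True - talk
    ```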

    2) Huge backlog (affinity + buckets)

    1. Affinity wall: Lay items left‑to‑right from smaller to larger.
    2. Light buckets: Draw soft bands for XS/S/M/L/XL and nudge items into them.
    3. One pass of challenges: People can move cards they strongly disagree with, but they must explain what information would change the estimate.

    If you prefer a card‑based approach, swap in Planning Poker and use T‑shirt cards instead of numbers.

    Here's an example of how T-shirt sizing would play out at a fashion retailer (we know, it's a bit on the nose). The team had a quarter‑long goal to reduce checkout drop‑off. In their first hour, they T‑shirt sized five ideas:

    • New payment provider (XL) - Big integration, contract, risk
    • Guest checkout (M) - Some UX and auth changes
    • Auto‑fill postcode (S) - Low risk, measurable uplift
    • Order status emails (M) - Copy, events, templates
    • Retry logic on payments (L) - Engineering heavy, few dependencies

    They sequenced S → M → L and left the XL until discovery removed the scariest unknowns. Two sprints later, they pointed the M/L items and committed. The XL became a spike with clear questions.

    Where sizing goes sideways and how to recover

    Converting sizes to points then leaving them untouched

    Why it bites: People treat the conversion as a promise, plans harden, trust erodes when reality changes.

    Try this: If you convert for prioritisation, mark those items as provisional, replace the size with true points during refinement, and keep a short note on what changed. For more on timing and trade-offs, see 5 agile estimation tips.

    Treating sizes as dates

    Why it bites: A neat row of S and M turns into calendar commitments, and the team inherits a deadline they never made.

    Try this: Share ranges based on throughput, update as you learn, and keep the conversation focused on outcomes.

    One scale across many teams

    Why it bites: S in one team is M in another, so cross-team comparisons become arguments, not insight.

    Try this: Keep scales local, and during PI Planning compare sizes only to surface risk and dependencies. Use a shared program board instead of chasing numeric parity.

    Endless debate on edge cases

    Why it bites: The time you spend arguing dwarfs the cost of being slightly wrong.

    Try this: Timebox each item, discuss only the outliers, capture the uncertainty in one sentence, and move on. If a decision is still sticky, schedule a small spike with a clear question.

    Skipping calibration examples

    Why it bites: What counted as M last quarter slowly drifts, new joiners anchor on guesses.

    Try this: Keep a living set of examples for S, M, and L in Jira, refresh them when your tech or team changes, and link those issues in your session notes.

    Loud voices steer the room

    Why it bites: Anchoring replaces thinking, quieter people disengage.

    Try this: Start with a silent first pass, reveal together, then invite two or three different voices to speak before the most senior person. A little psychological safety goes a long way.

    Jumping from XL epics to sprint commitments

    Why it bites: The team commits to fog, you get churn and rework.

    Try this: Slice the work first, use story mapping to find thinner slices, and refine before you point.

    Mixing size and value

    Why it bites: Small items with real impact wait behind large but low value work, momentum stalls.

    Try this: Keep a separate value signal, a one line impact hypothesis is enough, then weigh size against value when you sequence. The planning guide above has a simple pattern you can copy.

    No breadcrumb of why you chose a size

    Why it bites: You cannot learn from your estimates later and the next session restarts from scratch.

    Try this: Add one sentence on risk, dependency, or unknown to each decision, then check a sample in your next retro. Use action-driven retrospectives to close the loop.

    Recording sizes and keeping plans honest in Jira

    • For shaping and tracking epics, keep sizes and notes on a shared board the team actually uses every day. A user story map gives context and helps when you later point stories. Easy Agile TeamRhythm supports mapping, lightweight estimation and retros all inside Jira.
    • When several teams are involved, use a program board to visualise objectives, dependencies and dates. Easy Agile Programs keeps this view in Jira so you can plan PI events without spreadsheets.
    • For public roadmaps, keep it simple and visual. Easy Agile Roadmaps helps you share a plan stakeholders actually read.

    Regardless of the type of agile project you're working on or the estimation process you choose, the more you practice, the quicker your team will become master estimators. We recommend trying a couple of different methods to see which one feels most comfortable for your team.

    FAQ (for searchers and skimmers)

    • What’s the point of T‑shirt sizing if we’ll use story points later?

    It lets you compare big pieces quickly so you can decide what to pursue, sequence sensibly, and avoid wasting refinement time on the wrong items.

    • Can we convert sizes to story points?

    You can - locally, for prioritisation - just mark them clearly and replace them with real points before a sprint commitment. Don’t reuse another team’s scale.

    • Stakeholders want dates. How do we answer without over‑promising?

    Share ranges based on today’s sizes and the team’s recent throughput, and update as items shrink and you learn more. For a practical way to connect goals, value and delivery, see How to make plans that actually ship and We simplified our OKRs.
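
    One common way to produce that range (a sketch with invented throughput numbers; this is a simple Monte Carlo forecast, not a prescription):

    ```python
    import random

    weekly_throughput = [3, 5, 2, 4, 4, 6, 3]  # items finished per recent week
    remaining_items = 20

    def weeks_to_finish() -> int:
        done = weeks = 0
        while done < remaining_items:
            done += random.choice(weekly_throughput)  # resample a past week
            weeks += 1
        return weeks

    runs = sorted(weeks_to_finish() for _ in range(1000))
    print(f"likely {runs[149]}-{runs[849]} weeks")  # ~15th-85th percentile
    ```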

    • How do we run T‑shirt sizing across many teams?

    Keep it team‑local first. During PI planning, compare sizes only to surface risk and dependencies, not to rank teams. Use a program board to keep the conversation grounded. Start with our PI Planning guide.

  • Agile Best Practice

    The Problem with Agile Estimation

    The seventh principle of the Manifesto for Agile Software Development is:
    Working software is the primary measure of progress.
    Not story points, not velocity, not estimates: working software.

    Jason Godesky, Better Programming

    Estimation is a common challenge for agile software development teams. The anticipated size and complexity of a task is anything but objective; what is simple for one person may not be for another. Story points have become the go-to measure to estimate the effort involved in completing a task, and are often used to gauge performance. But is there real value in that, and what are the risks of relying too heavily on velocity as a guide?

    Agile estimation

    As humans, we are generally terrible at accurately measuring big things in units like time, distance, or in this case, complexity. However, we are great at making relative comparisons - we can tell if something is bigger, smaller, or the same size as something else. This is where story points come in. Story points are a way to estimate relative effort for a task. They are not objective and can fluctuate depending on the team's experience and shared reference points. However, the longer a team works together, the more effective they become at relative sizing.

    The teams that I coach have all experienced challenges with user story estimation. The historical data tells us that once a story exceeds 5 story-points, the variability in delivery expands. Typically, the more the estimate exceeds 5 points, the more the delivery varies from the estimate.

    Robin D Bailey, Agile Coach, GoSourcing

    Scale of reference

    While story points are useful as an abstraction for planning and estimating, they should not be over-analyzed. In a newly formed team, story points are likely to fluctuate significantly, but there can be more confidence in the reliability of estimations in a long-running team who have completed many releases together. Two different teams, however, will have different scales of reference.

    At a company level, the main value I used to seek with story points was to understand any systemic problems. For example, back when Atlassian released to Server quarterly, the sprints before a release would blow out and fail to meet the usual level of story point completion. The root cause turned out to be a massive spike in critical bugs uncovered by quality blitz testing. By performing better testing earlier and more regularly we spread the load and also helped to de-risk the releases. It sounds simple looking back but it was new knowledge for our teams at the time that needed to be uncovered.

    Mat Lawrence, COO, Easy Agile

    Even with well-established teams, velocity can be affected by factors like heightened complexity when dependencies are scheduled together, or even just the average number of story points per ticket. If a team has scheduled a lot of low-complexity tickets, their process might not handle the throughput required. Alternatively, a few high-complexity tickets could drastically increase the effort required from other team members to review the work. Either situation could affect velocity, and both represent bottlenecks.

    Any measured change in velocity could be due to a number of other factors, like capacity shifting through changes in headcount, or team members being absent due to illness or planned leave. The reality is that the environment is rarely sterile and controlled.

    Relative velocity

    Many organizations may feel tempted to report on story points, and velocity reports are readily available in Jira. Still, they should be viewed with caution if they’re being used in a ‘team of teams’ context such as across an Agile Release Train. The different scales of reference across teams can make story points meaningless; what one team considers to be an 8-point task may be a 3-point task for another.

    To many managers, the existence of an estimate implies the existence of an “actual”, and means that you should compare estimates to actuals, and make sure that estimates and actuals match up. When they don’t, that means people should learn to estimate better.

    So if the existence of an estimate causes management to take their eye off the ball of value and instead focus on improving estimates, it takes attention from the central purpose, which is to deliver real value quickly.

    Ron Jeffries
    Co-Author of the Manifesto for Agile Software Development
    Story Points Revisited

    Seeking value

    However, story points are still a valuable tool when used appropriately. Reporting story points to the team using them and providing insights into their unique trends could help them gain more self-awareness and avoid common pitfalls. Teams who are seeking to improve how they’re working may wish to monitor their velocity over time as they implement new strategies.

    Certainly, teams working together over an extended period will come to a shared understanding of what a 3 story point task feels like to them. And there is value in the discussion and exploration that is needed to get to that point of shared understanding. The case for 8 story points as opposed to 3 may reveal a complexity that had not been considered, or it may reveal a new perspective that helps the work be broken down more effectively. It could also question whether the work is worth pursuing at all, and highlight that a new approach is needed.

    The value of story points for me (as a Developer and a Founder) is the conversations where the issue is discussed by people with diverse perspectives. Velocity is only relatively accurate in long-run teams with high retention.

    Dave Elkan, Co-CEO, Easy Agile

    At a company level, story points can be used to understand systemic problems by monitoring trends over time. While this reporting might not provide an objective measure, it can provide insights into progress across an Agile Release Train. However, using story point completion as a measure of individual or team performance should be viewed with considerable caution.

    Story points are a useful estimation tool for comparing relative effort, but they depend on shared points of reference, and different teams will have different scales. Even established teams may notice velocity changes over time. For this reason, and while velocity reporting can provide insights into a team's progress, remember that story points were designed to estimate effort, not to measure performance. And at the end of the day, we’re in the business of producing great software, not great estimates.

    Looking to focus your team on improvement? Easy Agile TeamRhythm helps you turn insights into action with team retrospectives linked to your agile board in Jira, to improve your ways of working and make your next release better than the last. Turn an action item into a Jira issue in just a few clicks, then schedule the work on the user story map to ensure your ideas aren’t lost at the end of the retrospective.

    Many thanks to Satvik Sharma, John Folder, Mat Lawrence, Dave Elkan, Henri Seymour, and Robin D Bailey for contributing their expertise and experience to this article.