Series: QA Leadership · Article 4 of 9

It was a Monday standup. The QA Lead had shadows under their eyes - the previous week had gone mostly to logging tickets. Fifty-four issues in a single release. Fifty-four.

Monday · 09:15 · Sprint Planning
EM: "How's testing going? Will we make the Friday release?"
QA: "I have 54 open issues from this sprint. I'm working 10 hours a day. I don't know if I'll make it."
EM: "54? That's a lot. What happened?"
QA: "I don't know what happened - I only know the last release had 18, and this one already has 54 and I'm not done yet."
Dev: "This release was big, lots of new features..."
QA: "The previous one was big too. 18 issues."
EM: (silence)

That conversation ended well - because the QA Lead had data. Previous release: 18. Current: 54+. Without that comparison it would have been just: “there are a lot of bugs, we’re working.”

Issues per Release is the metric that turns “there’s a lot of work” into a concrete signal. And - more importantly - it points to where the problem lies. Not always in testing.

What Issues per Release is

Issues per Release is the number of all problems QA finds while testing a single release - from the moment code is accepted for testing to the deployment decision.

Formula
IPR = the count of all issues found while testing a release
Release v2.3 → testing took 8 days → found: 12 bugs + 4 UX notes + 3 performance issues + 2 requirement mismatches = 21 issues
Release v2.4 → testing took 6 days → found: 6 bugs + 1 UX note + 1 performance issue = 8 issues
Key point: we count all issues found, not only Critical or High priority ones. Every deviation from expected behavior carries informational weight.
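
As a quick check of the arithmetic, here is a minimal Python sketch that sums the per-category counts of the two example releases. The dictionary layout and the `issues_per_release` name are illustrative assumptions, not part of any tool.

```python
# Per-category counts copied from the two example releases above.
# The dictionary layout and key names are illustrative assumptions.
RELEASES = {
    "v2.3": {"bugs": 12, "ux_notes": 4, "performance": 3, "requirement_mismatches": 2},
    "v2.4": {"bugs": 6, "ux_notes": 1, "performance": 1},
}


def issues_per_release(counts: dict[str, int]) -> int:
    """IPR is the plain sum of every issue found, regardless of priority."""
    return sum(counts.values())


for name, counts in RELEASES.items():
    print(name, issues_per_release(counts))
```

Running it prints 21 for v2.3 and 8 for v2.4, matching the totals in the formula example.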

What counts as an “issue”

This is one of the most important questions when rolling out this metric. Too narrow a definition - and you lose half the signal. Too broad - and the number becomes hard to interpret.

  • 🐛 Functional defect (always count): The application behaves differently than it should according to the spec or common sense.
  • 🎨 UX / UI problem (count with a label): Elements work technically, but are unreadable, unintuitive, or inconsistent with the rest of the product.
  • Performance problem (count with a label): Response time, resource usage, behavior under load - beyond accepted thresholds.
  • 📋 Requirement mismatch (always count): Something other than the spec was implemented - deliberately or through a misunderstanding.
  • 🌍 Environment problem (count with a label): The application behaves differently across browsers, devices, or test environments.

Recommendation: count every type, but tag each one. That way you get a global IPR number plus the ability to drill down - e.g. “20 issues, of which 14 functional defects, 4 UX, and 2 environment.”
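
A minimal sketch of that drill-down, assuming each issue carries exactly one `type:<kind>` label. The label names, including `type:env`, and the sample list are assumptions for illustration; the counts reproduce the "20 issues" example from the recommendation.

```python
from collections import Counter

# Hypothetical tagged issues for one release. The "type:<kind>" label
# names are assumptions; use whatever convention your team agrees on.
issue_labels = ["type:functional"] * 14 + ["type:ux"] * 4 + ["type:env"] * 2


def summarize(labels: list[str]) -> tuple[int, dict[str, int]]:
    """Return the global IPR and the per-type breakdown."""
    by_type = Counter(label.removeprefix("type:") for label in labels)
    return len(labels), dict(by_type)


total, breakdown = summarize(issue_labels)
print(total, breakdown)
```

The global number stays a simple count, while the breakdown is what lets you say which category dominates.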

The key perspective

Issues per Release is NOT a QA metric

It's a quality metric for the entire development process. QA only measures it - but the whole team owns the result. And that is exactly what makes this metric so valuable in a conversation with the Engineering Manager.

  • ~50% Developers: Code quality, unit tests, code review, self-testing before handoff → Definition of Done
  • ~20% Product / Design: Requirement completeness, spec consistency, designer availability for questions → Backlog quality
  • ~20% Team process: Requirement review before the sprint, Three Amigos, refinement, acceptance criteria → Process maturity
  • ~10% QA: Test case quality, scenario coverage, the test environment → Testing effectiveness
When IPR rises - the first conversation shouldn't be "QA needs to test better." It should be: "what changed in the development process since the last release?"

A trend that says more than any status update

One release means nothing. Six releases with a clear direction - that’s a story. And it’s the story that convinces the Engineering Manager to act.

[Chart: Issues per Release - the trend across 6 releases (v2.1 → v2.6), broken down into functional defects, UX / UI, performance, and requirement mismatches. The breakdown by type reveals where the problem lies and what needs intervention.]
[Chart: IPR vs DDR - the process correlation, plotting Issues per Release (count) against DDR (%). As IPR falls, DDR rises: cleaner code entering testing = fewer problems = more caught before production.]

How to read the result

There’s no single “good” IPR - it depends on release size, system complexity, and team maturity. But trends are always telling. Below are reference thresholds for a typical release of medium complexity.

  • 20+ (Alarm signal): Worth investigating the causes before continuing the sprint. What changed in the process?
  • 12-20 (Needs work): Requires attention. Check which categories dominate and propose one corrective action.
  • 6-12 (Solid level): Good work. Monitor the trend - is it steadily falling, or oscillating?
  • <6 (Mature process): Excellent result. Check that tests cover the critical paths deeply enough.

Important caveat: a low IPR with a low number of tests is not a success - it may mean QA is testing too shallowly. Always pair IPR with test scope and DDR.
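
The reference thresholds above can be sketched as a small lookup. One assumption is made explicit here: the bands share their edges (12 and 20), so this sketch assigns a boundary value to the worse band. Treat the function as a starting point to adapt, not a fixed rule.

```python
def rate_ipr(ipr: int) -> str:
    """Map an IPR value to the reference bands for a medium-complexity release.

    Assumption: the bands share their edges (12 and 20), so a boundary
    value is assigned to the worse band.
    """
    if ipr >= 20:
        return "Alarm signal"
    if ipr >= 12:
        return "Needs work"
    if ipr >= 6:
        return "Solid level"
    return "Mature process"


for value in (4, 8, 12, 21):
    print(value, rate_ipr(value))
```

The rating only makes sense next to test scope and DDR, for the reason given in the caveat.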

How to start measuring - four steps

Good news: you don’t need new tools. The data is already in your tracker - you just need to gather and label it properly.

1. Define "issue" and write it down in one place
Before you count anything - agree with the team: what goes into the counter? Functional defects, for sure. UX notes? Performance issues? Requirement mismatches?
Write it in Confluence, a wiki, or as a comment on the Jira filter. Consistency of the definition matters more than its perfection.
2. Set up the "Fix version" field or a release tag in Jira
Every issue created during testing should be assigned to the release it concerns. In Jira that's the "Fix Version/s" field or a custom release-v2.x label.
project = MYAPP AND issuetype in (Bug, Task, Improvement)
AND fixVersion = "v2.3"
AND created >= "YYYY-MM-DD"
ORDER BY created ASC
This filter gives you every issue found while testing a specific release. Replace YYYY-MM-DD with the date the release entered testing.
3. Add type labels - from day one
Just "how many" is enough to start. But "how many and of what kind" gives you a far stronger argument in conversations with the EM and PM. Introduce a simple labeling system: type:functional, type:ux, type:perf, type:requirement.
Tagging takes 30 seconds per issue. It pays back many times over at every retrospective and business conversation.
4. Reconstruct the history - at least the last 4 releases
As with DDR - one data point is too few. Counting IPR retroactively for the last 4-6 releases takes 1-2 hours and gives you a trend right away.
If you don't have type labels for previous releases - that's harder, but the global numbers are still worth it. An IPR trend without the type breakdown is still very telling.
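
Once the issues are assigned to releases and labeled, the history can be rebuilt from a tracker export. The following is a minimal sketch, assuming a CSV export with one label per issue. The column names `fix_version` and `labels`, the issue keys, and the sample rows are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tracker export. The column names "fix_version" and "labels"
# are assumptions; adjust them to whatever your tracker actually exports.
EXPORT = """key,fix_version,labels
MYAPP-101,v2.3,type:functional
MYAPP-102,v2.3,type:ux
MYAPP-103,v2.3,
MYAPP-110,v2.4,type:functional
MYAPP-111,v2.4,type:perf
"""


def ipr_history(export_text: str) -> dict[str, dict[str, int]]:
    """Count issues per release, split by type label ("unlabeled" if missing)."""
    history: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(io.StringIO(export_text)):
        kind = row["labels"].removeprefix("type:") or "unlabeled"
        history[row["fix_version"]][kind] += 1
    return {release: dict(kinds) for release, kinds in history.items()}


for release, kinds in ipr_history(EXPORT).items():
    print(release, sum(kinds.values()), kinds)
```

Issues without a type label still count toward the global number, which matches the point that an unlabeled history is still worth reconstructing.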

Interactive IPR tracker

Enter the data from your recent releases and you’ll immediately see the trend and a rating for each one.

Issues per Release tracker
Enter the issue count for each release - the tracker generates a rating and a trend summary
Release · Issues (enter) · IPR · Rating
v2.1 -
v2.2 -
v2.3 -
v2.4 -
v2.5 -
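
If the interactive tracker is not available, a trend summary can be approximated in a few lines. This is a rough sketch, not the tracker's actual logic: the sample counts are hypothetical values for the five releases listed, and the wording of the summary is an assumption.

```python
# Sample values only: hypothetical counts for the five releases in the tracker.
ISSUES = {"v2.1": 8, "v2.2": 11, "v2.3": 9, "v2.4": 21, "v2.5": 28}


def trend_summary(issues: dict[str, int]) -> str:
    """Describe first-to-last direction and the largest release-over-release jump."""
    names = list(issues)
    values = list(issues.values())
    if len(values) < 2:
        return "Not enough releases for a trend."
    # Change versus the previous release, paired with the release it landed in.
    steps = [(values[i] - values[i - 1], names[i]) for i in range(1, len(values))]
    biggest_step, at_release = max(steps)
    change = values[-1] - values[0]
    direction = "rising" if change > 0 else "falling" if change < 0 else "flat"
    summary = f"IPR is {direction}: {values[0]} -> {values[-1]} across {len(values)} releases."
    if biggest_step > 0:
        summary += f" Largest jump: +{biggest_step} at {at_release}."
    return summary


print(trend_summary(ISSUES))
```

Reporting the largest jump alongside the overall direction points at the release where something changed, which is usually the more useful question to bring to the team.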

How this metric shifts the dynamic

Without data, the conversation about the quality of code entering testing is hard: QA sounds like it is complaining, and the developers sound defensive. With IPR on the table, it becomes a conversation about numbers, not emotions.

✗ Without data - the conversation ends where it started
QA: "We keep getting code full of bugs. We can't work like this."
EM: "Every release is different, this one was especially big..."
Dev: "We were working under pressure, the deadline was tight..."
QA: "But this isn't the first time..."
EM: "Alright, let's see how the next one goes."
✓ With data - the conversation leads to a concrete action
QA: "I have data from the last 6 releases. IPR was: 8, 11, 9, 21, 28, 32. Something changed after v2.3 - and the trend has been clearly rising ever since."
EM: "v2.3... that was the sprint when we changed the team composition and dropped code review for faster delivery."
QA: "Exactly. 80% of the IPR increase is functional defects. I propose one action: reinstating mandatory code review with a checklist for testing."
EM: "That makes sense. When can we roll it out?"

The difference isn’t that the QA Lead is more persuasive. It’s that they come with a fact, not a feeling. An IPR trend from 8 to 32 across 6 releases is hard to dispute. The opinion “we’re getting worse and worse code” is debatable.

Three pitfalls when using IPR

You compare releases of different sizes
A release with 3 features and a release with 12 features aren't comparable without normalization. The fix: also track IPR per story point or per feature, or at least mark each release in the historical data as small, medium, or large.
Low IPR because QA tests too shallowly
An IPR of 4 may mean excellent code - or tests that don't go deep enough. Always pair IPR with DDR: if IPR falls and DDR also falls - something's wrong with test coverage. If IPR falls and DDR rises - you have real progress.
You use IPR to judge devs, not the process
This is the most dangerous pitfall - and the fastest route to developers who stop reporting problems themselves, start hiding them, and treat QA as the enemy. IPR measures process maturity, not people's competence. Communicate that clearly and consistently every time you present this metric.
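
The first two pitfalls lend themselves to a small sketch: normalizing IPR by release size, and reading an IPR change together with the DDR change. The function names and the example numbers are hypothetical, and only the two IPR/DDR combinations described above are classified.

```python
def ipr_per_feature(issues: int, features: int) -> float:
    """Normalize IPR by release size so small and large releases compare."""
    if features <= 0:
        raise ValueError("features must be positive")
    return issues / features


def read_ipr_with_ddr(ipr_change: float, ddr_change: float) -> str:
    """Interpret a falling IPR together with the DDR direction.

    Only the two combinations described above are classified; anything
    else is reported as inconclusive rather than guessed at.
    """
    if ipr_change < 0 and ddr_change < 0:
        return "check test coverage"
    if ipr_change < 0 and ddr_change > 0:
        return "real progress"
    return "inconclusive"


# Hypothetical numbers: a 3-feature release with 9 issues
# and a 12-feature release with 24 issues.
print(ipr_per_feature(9, 3), ipr_per_feature(24, 12))
print(read_ipr_with_ddr(ipr_change=-4, ddr_change=-5))
print(read_ipr_with_ddr(ipr_change=-4, ddr_change=3))
```

In the hypothetical pair, the larger release has the higher raw count but the lower count per feature, which is exactly the distortion normalization is meant to remove.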

IPR in a business conversation

Sprint Review: "Issues per Release came in at 8, which is 40% lower than in the previous sprint. Code is entering testing cleaner and cleaner. That's a good signal for the entire development process."
1:1 with EM: "I have the IPR trend from the last 6 releases - a clear spike after v2.3. It coincided with dropping code review. I'm proposing a concrete action and I want to check whether IPR returns to its previous level within two releases."
Board: "Over the last four quarters Issues per Release dropped from 24 to 8 - that's roughly 67%. Each issue is on average 1.5 hours of QA work, so 16 fewer issues means 24 hours saved per release - time we now put into exploratory testing and automation."
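
The numbers in the board statement are worth double-checking before they go on a slide. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def board_numbers(ipr_before: int, ipr_after: int, hours_per_issue: float) -> tuple[float, float]:
    """Return (percent drop in IPR, QA hours saved per release)."""
    fewer_issues = ipr_before - ipr_after
    percent_drop = 100 * fewer_issues / ipr_before
    hours_saved = fewer_issues * hours_per_issue
    return percent_drop, hours_saved


drop, hours = board_numbers(ipr_before=24, ipr_after=8, hours_per_issue=1.5)
print(f"{drop:.1f}% fewer issues, {hours:.0f} QA hours saved per release")
```

With the figures from the board example this gives a drop of about 66.7% and 24 hours saved per release. The 1.5 hours per issue is an average, so the hours figure is an estimate.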

What this metric, missing from the QA handbook, gives you

✓ IPR gives you
  • An objective gauge of the quality of code entering testing
  • An early signal - before escapes reach production
  • An argument for the EM conversation based on facts, not opinion
  • An indicator of the maturity of the entire development process
  • A correlation with DDR - a fuller picture of process health
✗ IPR doesn't tell you
  • Whether bugs are escaping to production (that's Escaped per Release)
  • How effective the testing is (that's DDR)
  • Whether you can release (that's Confidence Score)
  • Who specifically makes mistakes - and it shouldn't
QA is not a repair factory. Issues per Release is the metric that proves it - and moves the quality conversation to where it belongs: the level of the entire development process.

In the next article

Article five covers Escaped Bugs per Release - a metric that doesn’t ask how many bugs you have in total, but which specific releases were risky. And how that view lets you diagnose causes, not just observe effects.

Spoiler: a spike in a single release is always a signal to investigate. And we have a method for how to run that investigation.

Series: QA metrics the business wants to hear
  • 01: Diagnosis, three pillars, five metrics, the QA → KPI mapping model
  • 02: Formula, thresholds, historical data, seasonality, pitfalls
  • 03: Taxonomy, data collection, the cost of each type, how to report
  • 04: Issues per Release (you are here). Rollout from scratch, the link to the development process, the EM conversation
  • 05: Pinpointing problems, not just watching trends
  • 06: Number of Releases - the context metric. Why 3 bugs with 2 releases is a disaster, and with 15 - a success
  • 07: Release Confidence Score step by step. Three calculation models, rollout, concrete examples from practice
  • 08: Storytelling with metrics - building a narrative. How to turn a table of numbers into a business argument
  • 09: 3 anti-patterns that destroy QA credibility. Too many metrics, no context, jargon - and how to avoid each