Number of Releases - the context metric

Series: QA Leadership · Article 6 of 9

Picture two teams. Both report: "This quarter we had 3 bugs in production." One deserves a bonus for it, the other should stop shipping immediately and run a deep retrospective. What is the difference? The number of releases - the one metric neither of them put in the report.

3 production bugs · 2 releases

On average 1.5 bugs per release: more than one customer-facing problem per deployment.

Crisis

3 production bugs · 15 releases

Just 0.2 bugs per release. The vast majority of releases ship without the slightest issue.

Success

The same absolute number can describe two completely different realities. That is exactly why Number of Releases is not a metric you simply present - it is the foundation through which you judge every other data point.

It is the most overlooked of the five metrics in this series, because on the surface it seems trivial (“Just count how many releases we shipped”). Its role, however, is critical. Without it, indicators like DDR, Escaped Bugs or Issues per Release are merely dry numbers, stripped of any real scale.

The metric that does not shine on its own - but lights up the rest

Picture the other four metrics in the series as satellites. Each one orbits a single reference point - the number of releases. Without that center, each one drifts without context.

Common denominator: Number of Releases (gives scale to each of the metrics below)

Escaped Bugs: escaped ÷ releases = escaped per release → real customer impact

Issues per Release: issues ÷ releases = maturity per release → comparability over time

DDR: context for how many chances there were to detect or escape → scale of the process

Confidence Score: trend across N releases = prediction stability → repeatability

An absolute number tells you how much happened. A normalized number tells you whether that is a lot. And "is that a lot" is the only question the business truly cares about.

How to normalize every metric in the series

Normalization is simply dividing the absolute number by the number of releases. But the effect is transformative - it turns a number that swings with your pace of work into a quality indicator independent of that pace.

Metric	Absolute	Normalized	What you gain
Escaped Bugs	12 / quarter	÷ releases	Comparability across quarters with a different cadence
Issues found	96 / quarter	÷ releases	Code-maturity trend independent of deployment count
QA time	320h / quarter	÷ releases	Cost of quality per release - an argument for budget
Hotfixes	8 / quarter	÷ releases	Stability of the release process, not a raw failure count
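The division in the table can be sketched in a few lines of Python. The quarterly totals come from the table above; the release count of 10 is an assumption made purely for illustration, and the function name `per_release` is hypothetical.

```python
def per_release(total: float, releases: int) -> float:
    """Divide an absolute quarterly total by the number of releases."""
    if releases <= 0:
        raise ValueError("releases must be a positive integer")
    return total / releases


# Quarterly totals from the table; the release count is an assumed value.
RELEASES = 10
totals = {
    "escaped_bugs": 12,
    "issues_found": 96,
    "qa_hours": 320,
    "hotfixes": 8,
}

normalized = {
    name: round(per_release(value, RELEASES), 2)
    for name, value in totals.items()
}

for name, value in normalized.items():
    print(f"{name}: {totals[name]} total, {value} per release")
```

Rejecting a release count of zero is a deliberate choice here: a quarter with no releases has no per-release rate, and reporting 0 would be misleading.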

A concrete example - the same team, two quarters

Watch how normalization completely flips the conclusion. Without it, Q4 looks worse than Q3. With it - you see a clear improvement.

Figure: escaped bugs, Q3 vs Q4 - total count against escaped per release. Absolute numbers rise (more releases). Per release - they fall. Which conclusion is true?

Reading the absolute numbers: “The number of production bugs rose from 7 to 12 - our quality is dropping.” Reading them after normalization: “Bugs per release fell from 1.4 to 0.8. We more than doubled our delivery pace and the quality of our software clearly improved.” The second conclusion is the true one. The first is a business trap.
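The flip can be checked with a short sketch. The bug totals (7 and 12) and the per-release rates (1.4 and 0.8) are from the example; the release counts of 5 and 15 are not stated in the text and are inferred here by dividing each total by its rate, so treat them as an assumption. The names `Quarter` and `trend` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quarter:
    name: str
    escaped: int   # total escaped bugs in the quarter
    releases: int  # inferred: total divided by the stated per-release rate

    @property
    def escaped_per_release(self) -> float:
        return round(self.escaped / self.releases, 2)


def trend(before: float, after: float) -> str:
    """Describe the direction of change between two values."""
    if after > before:
        return "up"
    if after < before:
        return "down"
    return "flat"


q3 = Quarter("Q3", escaped=7, releases=5)
q4 = Quarter("Q4", escaped=12, releases=15)

print("absolute:", q3.escaped, "->", q4.escaped,
      trend(q3.escaped, q4.escaped))
print("per release:", q3.escaped_per_release, "->", q4.escaped_per_release,
      trend(q3.escaped_per_release, q4.escaped_per_release))
```

The two printed lines point in opposite directions, which is the whole argument for showing the normalized view next to the absolute one.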

Number of Releases vs Deployment Frequency

The number of releases is a very close cousin of the industry’s most famous delivery-pace metric - Deployment Frequency from the DORA research. It is one of the four key DORA metrics and a direct indicator of how often your organization actually delivers value to users.

Keep in mind that the latest DORA report moved away from the classic four-tier split in favor of seven archetypes, but the data in the classic layout still makes an excellent reference point. Only about 16% of teams can deploy changes “on demand”, while 24% do so less than once a month. The maturity gap is enormous.

Tier	Deployment frequency	What it looks like
Top tier	On demand	Multiple times a day, small code batches · ~16% of teams
High	Daily - weekly	At least once a week, often more
Medium	Weekly - monthly	Sprint cycles, end of sprint every 1-2 weeks
Low	Less than monthly	Large batches, high risk on every deployment · ~24% of teams
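To place your own cadence on this scale, a rough classifier is enough. The sketch below is an assumption-heavy illustration: the numeric boundaries (1, 7 and 30 days between releases) are one reading of the tier labels above, not official DORA thresholds, and `classify_cadence` is a hypothetical name.

```python
def classify_cadence(releases: int, period_days: int) -> str:
    """Map a release count over a period to one of the four classic tiers.

    The day boundaries are this sketch's own reading of the tier labels,
    not thresholds published by DORA.
    """
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    if releases < 0:
        raise ValueError("releases cannot be negative")
    if releases == 0:
        return "Low (less than monthly)"

    days_between = period_days / releases
    if days_between < 1:
        return "Top tier (on demand)"
    if days_between <= 7:
        return "High (daily - weekly)"
    if days_between <= 30:
        return "Medium (weekly - monthly)"
    return "Low (less than monthly)"


# Example: release counts over a 90-day quarter.
for count in (200, 15, 6, 2):
    print(count, "releases ->", classify_cadence(count, period_days=90))
```

Averaging over the period hides bursts (ten releases in one week, then silence), so the result is a first approximation of the tier, not a verdict.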

Why does this matter for QA? Because delivery pace and quality are not opposites - that is one of the most important findings from years of DORA research. The fastest teams are also the most stable. More frequent, smaller releases mean a smaller blast radius for every change, easier diagnosis and faster rollback. The number of releases is not just the denominator for your metrics - it is a signal of how mature the whole process is.

A rising number of releases alongside a falling escaped-per-release rate is the strongest evidence QA can present: we ship faster and more safely at the same time. That is precisely what DORA calls an elite-performance trait.

Deployment ≠ Release - and why it changes the counting

Before you start collecting data, you have to be clear: are you counting deployments or releases? They are not the same thing, and it trips up even experienced teams - especially those working with feature flags.

Deployment

A technical act. Code lands on the production environment, but it can stay hidden from the user - e.g. behind a disabled flag.

Example: a new feature's code is deployed, but the flag is off

Release

The business moment of making a new feature available to users. It can happen weeks after the deployment - e.g. by simply turning the flag on for 100% of the user base.

Example: turning the flag on for 100% of users

For this series’ QA metrics, we count what reaches the user, that is, releases in the business sense. An escaped bug is a problem the customer felt - so the denominator has to be the number of moments at which anything could have reached the customer. If your team separates deployment from release with feature flags, decide clearly: is an escaped bug counted from the moment the code is deployed, or from the moment the flag is turned on? Consistency in this definition is critical - just as with escaped bugs in article 3.
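The difference between the two counts is easy to see on a small event log. The sketch below assumes a hypothetical log format: the `Event` type, the `"deploy"` and `"flag_on"` kinds, the `user_visible` field and the dates are all made up for illustration, not taken from any particular tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    day: date
    kind: str           # "deploy" or "flag_on" (hypothetical event kinds)
    user_visible: bool  # did anything change for users at this moment?


def count_deployments(events: list[Event]) -> int:
    """Count technical deployments, visible to users or not."""
    return sum(1 for e in events if e.kind == "deploy")


def count_releases(events: list[Event]) -> int:
    """Count moments at which something could have reached the customer."""
    return sum(1 for e in events if e.user_visible)


# Illustrative data only.
events = [
    Event(date(2024, 10, 1), "deploy", user_visible=False),   # code behind a disabled flag
    Event(date(2024, 10, 3), "deploy", user_visible=False),   # more hidden work
    Event(date(2024, 10, 8), "deploy", user_visible=True),    # ordinary deploy, live at once
    Event(date(2024, 10, 22), "flag_on", user_visible=True),  # flag on for 100% of users
]

print("deployments:", count_deployments(events))
print("releases:", count_releases(events))
```

With this log the denominator for escaped bugs would be 2, not 3: the two hidden deployments could not have reached a customer.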

Normalization calculator

QA metrics normalizer: see how the number of releases changes the interpretation of your data. Given the absolute numbers and the number of releases, it shows the normalized values with a verdict for each metric.

Inputs: releases in the period, escaped bugs (total), issues found (total), QA time in hours (total).

Sample output:

Metric	Value	Verdict
Escaped / release	0.60	Good
Issues / release	8.0	Needs work
QA time / release	24h	Cost of quality
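The same calculation can be sketched offline in Python. Everything beyond the division itself is an assumption: the verdict thresholds (1.0 escaped and 5.0 issues per release) are hypothetical defaults you should replace with your own, and the inputs in the usage line were chosen only because they reproduce the sample values 0.60, 8.0 and 24h shown above.

```python
def verdict(rate: float, good_max: float) -> str:
    """Label a rate against a threshold (thresholds are assumptions)."""
    return "Good" if rate <= good_max else "Needs work"


def normalize_qa_metrics(
    releases: int,
    escaped: int,
    issues: int,
    qa_hours: float,
    escaped_good_max: float = 1.0,  # hypothetical threshold
    issues_good_max: float = 5.0,   # hypothetical threshold
) -> dict:
    """Return each per-release value together with a verdict label."""
    if releases <= 0:
        raise ValueError("releases must be positive")

    escaped_rate = round(escaped / releases, 2)
    issues_rate = round(issues / releases, 1)
    qa_rate = round(qa_hours / releases, 1)

    return {
        "escaped_per_release": (escaped_rate, verdict(escaped_rate, escaped_good_max)),
        "issues_per_release": (issues_rate, verdict(issues_rate, issues_good_max)),
        # QA time has no good/bad verdict; it is reported as a cost.
        "qa_hours_per_release": (qa_rate, "Cost of quality"),
    }


# Assumed inputs, picked to match the sample values above.
result = normalize_qa_metrics(releases=10, escaped=6, issues=80, qa_hours=240)
for metric, (value, label) in result.items():
    print(f"{metric}: {value} ({label})")
```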

How to start counting - and do it right

It is the easiest metric in the whole series to collect - but it has a few definitional traps worth settling from the start.

Decide what counts as a "release"

Deployment or availability to the user? A hotfix - does it count as a separate release? A rollback and re-deploy - one release or two? Write the definition down and stick to it. For quality metrics I recommend counting what reaches the user.

Pull the data from what you already have

Git tags, CI/CD history (Jenkins, GitHub Actions, GitLab), the changelog, the version list in Jira. The number of releases is one of the most readily available data points in the whole series - usually counting the production tags is enough.
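Counting production tags per quarter might look like the sketch below. It assumes each input line has the form "tag date", which is what `git for-each-ref --format="%(refname:short) %(creatordate:short)" refs/tags` prints, and it assumes production tags start with `v`; both the prefix and the sample tags are illustrative, so adapt them to your own naming convention.

```python
from collections import Counter
from datetime import date


def quarter_of(day: date) -> str:
    """Return a label such as '2024-Q3' for a given date."""
    return f"{day.year}-Q{(day.month - 1) // 3 + 1}"


def count_releases_per_quarter(tag_lines: list[str], prefix: str = "v") -> Counter:
    """Count production tags per quarter.

    Each line is expected to hold a tag name and an ISO date separated by
    whitespace. The "v" prefix for production tags is an assumed convention.
    """
    counts: Counter = Counter()
    for line in tag_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip lines that do not match the expected shape
        tag, day = parts
        if not tag.startswith(prefix):
            continue  # not a production tag under the assumed convention
        counts[quarter_of(date.fromisoformat(day))] += 1
    return counts


# Illustrative input; real lines would come from the git command above.
sample = [
    "v1.0.0 2024-07-15",
    "v1.1.0 2024-08-02",
    "staging-12 2024-08-03",
    "v1.2.0 2024-10-09",
]
print(dict(count_releases_per_quarter(sample)))
```

Tags count deployments of a version, so if your definition of a release is "flag turned on", this number is only a starting point.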

Add the number of releases as context to EVERY report

This is the heart of it. Never report escaped bugs, issues or DDR without the number of releases beside them. One sentence - "across 10 releases this quarter" - changes the interpretation of every other number.

Normalize all the other metrics - and show both views

Show both the absolute number and the normalized one. The absolute one speaks to the scale of work, the normalized one to quality. Together they give the full picture - and protect you from wrong conclusions in either direction.
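One way to make "both views" a habit is to generate every report line from the same template. This is a minimal sketch; `report_line` is a hypothetical helper, and the example uses 12 escaped bugs with an assumed 15 releases.

```python
def report_line(metric: str, total: int, releases: int) -> str:
    """Format one metric with its absolute and per-release value side by side."""
    if releases <= 0:
        raise ValueError("releases must be positive")
    rate = total / releases
    return f"{metric}: {total} total across {releases} releases ({rate:.2f} per release)"


print(report_line("Escaped bugs", 12, 15))
```

Because the release count is a required argument, a number cannot reach the report without its context.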

Three pitfalls when using the number of releases

The number of releases as a goal in itself

More releases is not the goal - it is the means. If a team starts artificially splitting one release into five to "improve" the normalized metrics, that is gaming the system. The number of releases should reflect the real rhythm of delivering value, not be optimized for prettier charts.

Comparing teams with different delivery models

A team deploying on demand and a team releasing once a sprint are two different worlds. Normalized metrics help, but they do not erase contextual differences - regulation, product type, architecture. Use normalization to compare a single team over time, not to rank teams against one another.

Ignoring release size

10 small releases are not the same as 10 big ones. The number alone does not account for size. For more precise normalization, consider weighting by story points or the number of changes - especially when releases differ wildly in scale. The number of releases is a good default denominator, but not a perfect one for every case.
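Weighting by size can be sketched as follows. The unit "escaped bugs per 100 changes" and the function names are assumptions chosen for illustration; story points or any other size measure would slot into the same place. The numbers are made up.

```python
def escaped_per_release(escaped: int, release_sizes: list[int]) -> float:
    """Plain normalization: every release counts the same."""
    if not release_sizes:
        raise ValueError("at least one release is required")
    return escaped / len(release_sizes)


def escaped_per_100_changes(escaped: int, release_sizes: list[int]) -> float:
    """Size-weighted normalization: divide by the total number of changes shipped."""
    total_changes = sum(release_sizes)
    if total_changes <= 0:
        raise ValueError("total number of changes must be positive")
    return 100 * escaped / total_changes


# Illustrative data: the same team in two quarters, 5 escaped bugs in each.
small_releases = [5] * 10   # ten releases of 5 changes each
big_releases = [50] * 10    # ten releases of 50 changes each

for label, sizes in (("small", small_releases), ("big", big_releases)):
    print(
        label,
        escaped_per_release(5, sizes),
        escaped_per_100_changes(5, sizes),
    )
```

Both quarters show 0.5 escaped bugs per release, yet the size-weighted rate differs tenfold, which is exactly the distortion the release count alone cannot reveal.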

The number of releases in a conversation with the business

Sprint Review: "This quarter we shipped 12 releases - 4 more than the previous one. Despite such a big jump in pace, the bug-per-release rate dropped from 1.0 to 0.5. We are shipping faster and more safely."

1:1 with EM: "The absolute number of production bugs went up because we doubled the number of releases. But if we look at the per-release rate, our quality improved significantly. We are achieving exactly what the DORA research calls elite performance: pace and stability rising at the same time."

Leadership: "We increased deployment frequency from 3 to 12 releases per quarter, moving up a tier on the DORA benchmarks. More importantly, the defect rate per release fell by half. We deliver business value faster while drastically lowering risk."

Why this “trivial” metric is the foundation

Number of Releases gives you

A common denominator for all the other metrics in the series
Protection against wrong conclusions drawn from absolute numbers
Comparability across quarters with a different cadence
A bridge to Deployment Frequency - the language of DORA and the boardroom
Proof that pace and quality can rise at the same time

Number of Releases is not

A goal in itself - more does not always mean better
A measure of release size (the count alone ignores scale)
A tool for ranking different teams
Sufficient on its own - it only shines together with the other metrics

Number of Releases is not a metric you present on its own - it is the metric through which you present all the others. The quietest hero of the entire series.

In the next article

Five metrics behind us. Article seven ties them all into a single decision indicator - the Release Confidence Score. Three calculation models, from a simple traffic light to a weighted model, a step-by-step rollout and examples from practice. This is the moment the whole series starts working as a system - a single number that answers the business’s most important question: can we release?

Series: QA metrics the business wants to hear

01. The complete guide: Diagnosis, three pillars, five metrics, the QA → KPI mapping model
02. Defect Detection Ratio: Formula, thresholds, historical data, seasonality, pitfalls
03. Escaped Bugs & Problems: Taxonomy, data collection, the cost of each type, how to report
04. Issues per Release: Rollout from scratch, the link to the development process, the EM conversation
05. Escaped Bugs per Release: Pinpointing problems, not just watching trends
06. Number of Releases (this article): The context metric, normalization, the link to Deployment Frequency
07. Release Confidence Score step by step: Three calculation models, rollout, examples from practice
08. Storytelling with metrics - building a narrative: How to turn a table of numbers into a business argument
09. 3 anti-patterns that destroy QA credibility: Too many metrics, no context, jargon - and how to avoid each