Acceptance Criteria vs. Acceptance Tests

The Vocabulary Problem

Many teams use the terms acceptance criteria and acceptance tests interchangeably, and that habit causes a great deal of confusion. The two terms describe different artifacts, written by different people, for different purposes. Conflating them tends to produce documents that satisfy neither role.

The cleanest distinction, drawn from work by Gojko Adzic and the BDD community [1]:

  • Acceptance criteria are statements of intent. They describe what success looks like from the business or user perspective. They are written before development begins, in plain language, and they help everyone agree on what the story is trying to achieve.
  • Acceptance tests are specific, concrete scenarios — usually expressed as examples — that exercise the intended behavior. They are derived from the criteria, written collaboratively at refinement, and they can often be made executable.

Acceptance Criteria

Criteria are the high-level promises. They answer the question “how will we know this story has met its intent?” They are usually short, bullet-form, and written in natural language:

  • A user can reset their password using their registered email.
  • Password reset links expire after one hour.
  • Reset links can only be used once.

These are not tests. They cannot, as written, be executed. They tell you what needs to be true, but not how you would prove it in any specific case. Their job is to align understanding before the team commits, and to anchor the conversation when edge cases arise.

Acceptance Tests

Tests are the concrete realizations. They take each criterion and make it specific with examples:

  • Given a user has requested a password reset, when they click the link within one hour, then they can set a new password.
  • Given a user has requested a password reset, when they click the link after 61 minutes, then they see an "expired" message.
  • Given a user has successfully reset their password, when they try the same link again, then the link is invalid.

Each test is executable — by hand or by automation. Each illustrates one criterion with one specific example. The shift from criteria to tests is the shift from "what is true" to "prove it with this example."
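
As a rough illustration of what "executable" can mean, the three scenarios above could be automated along the following lines. This is a minimal Python sketch under stated assumptions: `ResetLink`, `redeem`, and the return values "ok", "expired", and "invalid" are hypothetical names invented for the example, and the in-memory rule stands in for a real system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed from the criterion that resets expire after one hour.
RESET_TTL = timedelta(hours=1)


@dataclass
class ResetLink:
    """Hypothetical stand-in for a stored password reset link."""
    issued_at: datetime
    used: bool = False


def redeem(link: ResetLink, now: datetime) -> str:
    """Hypothetical rule under test: returns 'ok', 'expired', or 'invalid'."""
    if link.used:
        return "invalid"
    if now - link.issued_at > RESET_TTL:
        return "expired"
    link.used = True
    return "ok"


ISSUED = datetime(2024, 1, 1, 12, 0)  # arbitrary fixed time for repeatable tests


def test_link_works_within_one_hour():
    # Given a reset was requested, when the link is used within one hour
    link = ResetLink(issued_at=ISSUED)
    # Then the user can set a new password
    assert redeem(link, ISSUED + timedelta(minutes=59)) == "ok"


def test_link_expired_after_61_minutes():
    # Given a reset was requested, when the link is used after 61 minutes
    link = ResetLink(issued_at=ISSUED)
    # Then the user sees an "expired" outcome
    assert redeem(link, ISSUED + timedelta(minutes=61)) == "expired"


def test_link_cannot_be_reused():
    # Given the user has successfully reset their password
    link = ResetLink(issued_at=ISSUED)
    assert redeem(link, ISSUED + timedelta(minutes=5)) == "ok"
    # When they try the same link again, then the link is invalid
    assert redeem(link, ISSUED + timedelta(minutes=6)) == "invalid"
```

Each function maps to exactly one scenario, so a failing test name points straight back to the example, and through it to the criterion, that is no longer satisfied.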

Why the Distinction Matters

Three problems show up reliably in teams that conflate the two:

  • Criteria become test scripts. The product owner (PO) writes "system displays 'Invalid email' when an unregistered address is entered" as a criterion. This is really a test scenario. The criterion behind it, "unregistered emails are rejected with a clear message," never gets written, and the team misses related cases.
  • Tests become checklists. Acceptance tests written as terse bullets lose the example-based clarity that makes BDD-style tests useful. The team ends up checking off ambiguous statements rather than running unambiguous scenarios.
  • The conversation collapses. When both layers live in the same field of the same ticket, refinement focuses on the wording instead of the intent. The conversation that should have happened (what does success actually look like?) never takes place.

How They Work Together

In a healthy team, criteria and tests reinforce each other through a short workflow:

  • The PO drafts criteria alongside the story card. Three to five bullet points, written before refinement.
  • The team uses refinement to translate each criterion into two or three concrete tests, expressed as examples. Edge cases surface here that the criteria did not anticipate.
  • The criteria stay on the story as the intent record. The tests live alongside the code, often in Gherkin syntax, and become the executable specification.

This deliberate separation keeps the criteria readable and the tests specific. Each artifact does its own job.
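
One way to picture the two layers is as a small data model in which each criterion carries the examples derived from it. The Python sketch below is only an illustration: `Criterion` and `criteria_without_examples` are hypothetical names, not part of any tracker or tool.

```python
from dataclasses import dataclass, field


@dataclass
class Criterion:
    """One statement of intent, plus the examples derived from it at refinement."""
    text: str
    examples: list[str] = field(default_factory=list)


def criteria_without_examples(criteria: list[Criterion]) -> list[str]:
    """Return the text of every criterion that no example illustrates yet."""
    return [c.text for c in criteria if not c.examples]


story = [
    Criterion(
        "Reset links can only be used once.",
        [
            "Given a user has successfully reset their password, "
            "when they try the same link again, then the link is invalid."
        ],
    ),
    # Drafted by the PO but not yet refined into examples.
    Criterion("A user can reset their password using their registered email."),
]

print(criteria_without_examples(story))
# prints ['A user can reset their password using their registered email.']
```

A check like this makes it visible, before the team commits, which criteria have not yet been through the refinement conversation.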

Relationship to BDD and ATDD

This distinction is central to Behavior-Driven Development (BDD) and Acceptance Test-Driven Development (ATDD). Both practices treat acceptance tests as the primary specification of the system's behavior, written before the code and used to guide implementation. The criteria provide the intent that anchors the tests; the tests provide the precision that shows the intent has been met.

Teams do not need to adopt full BDD/ATDD to benefit from the distinction. Even without executable specifications, separating intent from example sharpens the conversation at refinement.

Coaching Tips

Use two fields on the story.

One for criteria (intent), one for tests (examples). The structure forces the distinction without nagging.

Generate tests collaboratively.

Tests written in isolation tend to miss edge cases. Three people brainstorming examples will find what one writing alone won't.

Use Given-When-Then for tests.

The form forces a setup, an action, and an observable outcome. Vague tests collapse under it.

Don't fear redundancy.

A criterion and its tests will say similar things. That's fine. The repetition is what keeps the two layers in sync.

Watch for tests masquerading as criteria.

Anything that names a specific value or input is probably a test. Pull the intent out of it — that's the real criterion.
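
This tip can be roughly approximated in code. The Python sketch below is a heuristic for illustration, not an established tool: the function name `looks_like_test` and both patterns are assumptions. It flags any statement that contains a double-quoted literal or a digit.

```python
import re

# Assumed signals of a concrete example: a double-quoted literal or a digit.
_QUOTED = re.compile(r'"[^"]+"|“[^”]+”')
_DIGIT = re.compile(r"\d")


def looks_like_test(statement: str) -> bool:
    """Return True when the statement names a specific value or input."""
    return bool(_QUOTED.search(statement) or _DIGIT.search(statement))


statements = [
    'System displays "Invalid email" when an unregistered address is entered.',
    "Unregistered emails are rejected with a clear message.",
    "When they click the link after 61 minutes, they see an expired message.",
    "Reset links can only be used once.",
]

for s in statements:
    label = "test?    " if looks_like_test(s) else "criterion"
    print(label, "|", s)
```

Spelled-out values such as "one hour" pass unflagged, so a check like this is a prompt for conversation at refinement, not a verdict.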

Tie tests to code where you can.

Even partial automation of acceptance tests prevents drift. Cucumber, SpecFlow, or even simple test names mapped to scenarios are enough to start.

Summary

Acceptance criteria and acceptance tests are two distinct artifacts that the agile vocabulary regularly collapses into one. Pulling them apart, with intent in one field and examples in the other, sharpens refinement, improves test quality, and gives the team a clearer record of what each story was actually trying to do.

Footnotes
  1. Adzic, Gojko. Specification by Example. Manning, 2011.
  2. North, Dan. “Introducing BDD.” Better Software Magazine, 2006.
  3. Cohn, Mike. User Stories Applied. Addison-Wesley, 2004.