What Is Concept Testing?
Concept testing is a market research method used to evaluate consumer response to a product, service, campaign, or idea before it reaches the market. Companies present a concept, typically as a description, mockup, or storyboard, to a target audience and measure reactions, purchase intent, and perceived value. The goal is to filter out weak ideas before committing development budgets and launch resources.
Unlike A/B testing, which compares live variations, concept testing happens upstream, before anything is built or published. It answers one foundational question: does this idea land with the people we want to reach?
Why Concept Testing Matters
Many product failures are not manufacturing or distribution failures. They are idea failures. A figure widely attributed to Nielsen holds that roughly 80% of new product launches fail within the first year. Concept testing exists to reduce that rate by surfacing misalignment early, when corrections cost a fraction of a post-launch fix.
For marketing teams specifically, concept testing applies to campaign ideas, taglines, creative directions, and brand extensions, not only physical products. A retailer testing five campaign concepts before production can eliminate three underperformers before spending a dollar on creative execution.
Core Methods of Concept Testing
Monadic Testing
Each respondent evaluates a single concept. This produces cleaner scores with less comparison bias, because participants are not weighing options side by side. Monadic testing is the standard for purchase intent and overall appeal measurement. Because every concept needs its own group of respondents, it requires a larger total sample, typically 150 to 300 respondents per concept, to achieve statistical reliability.
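To make the 150-to-300 range concrete, the sketch below estimates the margin of error on a percentage score at each sample size. It assumes a simple random sample, a 95% confidence level (z = 1.96), and the normal approximation for a proportion; the function name `margin_of_error` is illustrative, not a standard from any research vendor.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion.

    p: observed proportion (0.5 is the worst case, giving the widest interval)
    n: number of respondents who saw the concept
    z: critical value (1.96 corresponds to roughly 95% confidence)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return z * math.sqrt(p * (1 - p) / n)


for n in (150, 300):
    moe = margin_of_error(0.5, n)
    print(f"n={n}: +/- {moe * 100:.1f} percentage points")
```

Under these assumptions, 150 respondents give a worst-case margin of about 8.0 percentage points and 300 give about 5.7, which is why smaller cells struggle to separate concepts whose scores are close.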
Sequential Monadic Testing
Respondents evaluate multiple concepts one at a time, in a randomized order. This is more cost-efficient than pure monadic testing but introduces some order bias, since earlier concepts can color reactions to later ones. Rotating the sequence across respondents spreads that effect evenly across concepts.
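One simple way to rotate the sequence is a cyclic shift: each successive respondent starts one concept later in the list. The sketch below is an illustration under that assumption, with hypothetical function names; survey platforms typically offer their own rotation or randomization settings.

```python
from collections import Counter


def rotated_orders(concepts: list[str], n_respondents: int) -> list[list[str]]:
    """Assign each respondent a cyclically shifted presentation order."""
    if not concepts:
        raise ValueError("concepts must not be empty")
    k = len(concepts)
    orders = []
    for i in range(n_respondents):
        shift = i % k
        orders.append(concepts[shift:] + concepts[:shift])
    return orders


def position_counts(orders: list[list[str]]) -> Counter:
    """Count how often each concept appears in each position (0-based)."""
    counts = Counter()
    for order in orders:
        for position, concept in enumerate(order):
            counts[(concept, position)] += 1
    return counts


orders = rotated_orders(["A", "B", "C"], 6)
counts = position_counts(orders)
# Each concept is shown first, second, and third equally often.
print(counts[("A", 0)], counts[("A", 1)], counts[("A", 2)])  # 2 2 2
```

A cyclic shift balances position only when the number of respondents is a multiple of the number of concepts, and it does not vary which concept precedes which. Fully randomizing the order per respondent is a common alternative when that matters.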
Comparative Testing
Respondents see two or more concepts simultaneously and rank or choose between them. This method identifies a relative winner quickly but does not reveal whether any concept clears an acceptable threshold on its own.
Prototype or Storyboard Testing
Higher-fidelity versions of the concept, such as animated storyboards for TV spots or clickable prototypes for apps, give respondents a more realistic experience. This method reduces the gap between tested concept and final execution but costs more to prepare.
Key Metrics to Track
| Metric | What It Measures | Typical Scale |
|---|---|---|
| Purchase Intent | Likelihood to buy | 5-point or 11-point |
| Overall Appeal | General concept favorability | 5-point or 7-point |
| Uniqueness | Perceived differentiation | 5-point |
| Relevance | Fit to respondent’s needs | 5-point |
| Believability | Credibility of claims made | 5-point |
| Net Promoter Score | Likelihood to recommend | 0-10 |
The top-two-box score, the percentage of respondents selecting the top two ratings on a 5-point scale, is the standard benchmark for purchase intent. Scores above 60% generally signal a strong new product concept; scores below 40% suggest significant reformulation is needed. These cutoffs vary by category, so category norms are a more reliable yardstick where they are available.
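The top-two-box calculation can be sketched in a few lines. The example assumes ratings are integers where a higher number means a more favorable answer; the `interpret` helper applies the 60% and 40% cutoffs from the text, and its "in between" label is an assumption added for illustration.

```python
def top_two_box(ratings: list[int], scale_max: int = 5) -> float:
    """Percentage of respondents choosing one of the top two scale points."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    top = sum(1 for r in ratings if r >= scale_max - 1)
    return 100.0 * top / len(ratings)


def interpret(score: float) -> str:
    """Apply the general purchase-intent cutoffs described above."""
    if score > 60:
        return "strong"
    if score < 40:
        return "needs rework"
    return "in between"  # label is an assumption; the text names no middle band


# Hypothetical purchase-intent ratings from ten respondents (5 = definitely would buy)
ratings = [5, 4, 4, 3, 2, 5, 1, 4, 4, 5]
score = top_two_box(ratings)
print(f"{score:.1f}% -> {interpret(score)}")  # 70.0% -> strong
```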
The Concept Score Formula
Many research firms calculate a composite concept score to produce a single, comparable number across test concepts:
Concept Score = (Purchase Intent Score × 0.40) + (Overall Appeal Score × 0.30) + (Uniqueness Score × 0.15) + (Relevance Score × 0.15)
Weights vary by category and company. Consumer packaged goods teams tend to weight purchase intent more heavily. Brand campaigns may weight appeal and uniqueness on par with intent. The formula’s value is not its precision but its consistency: using the same weights across every concept test makes scores directly comparable over time.
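As a worked illustration of the formula, the sketch below scores two hypothetical concepts with the weights shown above. It assumes every input metric is expressed on the same scale (here, a top-two-box percentage from 0 to 100); the metric keys and the concept data are invented for the example.

```python
# Weights taken from the formula above; they must sum to 1.0.
WEIGHTS = {
    "purchase_intent": 0.40,
    "overall_appeal": 0.30,
    "uniqueness": 0.15,
    "relevance": 0.15,
}


def concept_score(scores: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted composite of metric scores that share a common scale."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing metrics: {sorted(missing)}")
    return sum(scores[metric] * weight for metric, weight in weights.items())


# Hypothetical top-two-box percentages for two test concepts
concept_a = {"purchase_intent": 62, "overall_appeal": 70, "uniqueness": 45, "relevance": 58}
concept_b = {"purchase_intent": 48, "overall_appeal": 81, "uniqueness": 66, "relevance": 52}

for name, scores in (("A", concept_a), ("B", concept_b)):
    print(f"Concept {name}: {concept_score(scores):.2f}")
```

Concept A comes out at about 61.25 and Concept B at about 61.20. B leads on appeal and uniqueness, but the heavier weight on purchase intent puts A narrowly ahead, which shows how much the ranking depends on the weights chosen.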
Real-World Applications
Frito-Lay’s Flavor Lab
PepsiCo’s snack division, Frito-Lay, concept-tests hundreds of flavor ideas annually before committing to limited-edition runs. The company has stated publicly that most flavors never survive internal concept screening. Lay’s “Do Us a Flavor” campaigns, which ran in the United States from 2012 to 2017, used crowd-sourced submissions but applied proprietary concept scoring before selecting finalists. The 2013 winning flavor, Cheesy Garlic Bread, reportedly generated $1 billion in global sales, a figure Frito-Lay attributed in part to validated consumer demand signals gathered before production scaled.
Airbnb’s “Belong Anywhere” Positioning
Before launching the “Belong Anywhere” brand platform in 2014, Airbnb tested multiple positioning concepts with travelers and hosts across several markets. Creative agency DesignStudio, which led the rebrand, has described an extensive concept validation phase covering everything from the logo to the emotional framing of the brand story. The platform has remained in use for over a decade, suggesting the concept held up durably, not just at launch.
McDonald’s McPlant Rollout
McDonald’s concept-tested plant-based burger formats in multiple markets before committing to broader distribution. The McPlant, developed with Beyond Meat, launched in limited U.S. markets in 2021 after concept and product tests in Sweden and Denmark. U.S. sales underperformed projections, and McDonald’s scaled back. The episode illustrates both the value and the limits of concept testing: it identifies demand signals but cannot fully simulate competitive dynamics or category maturity in a new market.
Concept Testing vs. Related Research Methods
Concept testing sits within the broader discipline of market research but occupies a specific stage in the innovation funnel. It follows exploratory consumer insights work and precedes product development or campaign production. Focus groups are often used in early concept exploration to generate qualitative depth, while concept testing typically adds quantitative rigor to score and rank options.
The method also informs brand positioning decisions by revealing which benefits and messages create the strongest response among a target segment. A positioning concept that scores well on uniqueness and purchase intent is more likely to support durable differentiation than one that scores high on appeal alone.
Common Mistakes
- Testing too broadly. Concept tests should recruit from the defined target audience, not the general population. A financial product tested with all adults will produce noise if the actual buyer is a 35-to-55-year-old household decision-maker.
- Relying on appeal alone. A high appeal score with low purchase intent often signals that respondents find the concept interesting but would not buy it. Both metrics matter.
- Testing fully finished concepts only. Rough concepts can be tested early and cheaply. Waiting for polished executions wastes time if the core idea is flawed.
- Ignoring open-ended responses. Quantitative scores tell you what respondents prefer. Open-ended responses tell you why, which is where the actionable insight lives.
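The second mistake above, relying on appeal alone, can be caught with a simple screen that flags concepts scoring high on appeal but low on purchase intent. This is a minimal sketch: the cutoffs of 60 and 40 are illustrative assumptions borrowed from the purchase-intent benchmarks, and the concept names and scores are hypothetical.

```python
def appeal_intent_gap(
    concepts: dict[str, dict[str, float]],
    appeal_min: float = 60.0,
    intent_max: float = 40.0,
) -> list[str]:
    """Return concepts that respondents like but say they would not buy."""
    return [
        name
        for name, metrics in concepts.items()
        if metrics["appeal"] >= appeal_min and metrics["purchase_intent"] < intent_max
    ]


# Hypothetical top-two-box percentages
results = {
    "Concept A": {"appeal": 74.0, "purchase_intent": 33.0},
    "Concept B": {"appeal": 68.0, "purchase_intent": 61.0},
    "Concept C": {"appeal": 41.0, "purchase_intent": 29.0},
}

print(appeal_intent_gap(results))  # ['Concept A']
```

Flagged concepts are good candidates for a closer read of the open-ended responses, which usually explain why interest is not converting into intent.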
When to Use Concept Testing
Concept testing adds the most value when the cost of being wrong is high and the cost of research is low relative to that risk. New product launches, major campaign investments, brand extensions, and value proposition revisions are all strong candidates. For small creative decisions or iterations on proven platforms, the research overhead may not justify the timeline delay.
As a general threshold: if a wrong decision would cost more than $100,000 to correct, concept testing is almost certainly worth running before committing.
Frequently Asked Questions
What is concept testing in marketing?
Concept testing is a research method that exposes a product, campaign, or idea to a sample of target consumers before launch to measure purchase intent, appeal, and perceived value. It is used to identify weak ideas early, when changes are still cheap to make.
How is concept testing different from A/B testing?
Concept testing happens before anything is built or live, evaluating ideas in the form of descriptions, mockups, or storyboards. A/B testing compares live variations with real users in a real environment. The two methods serve different stages: concept testing filters ideas upstream; A/B testing optimizes execution downstream.
What is a good purchase intent score in concept testing?
A top-two-box purchase intent score above 60% is generally considered strong for a new product concept. Scores below 40% signal that the concept needs significant reworking before moving forward. These benchmarks vary by category, so comparing against category norms is always more reliable than using universal cutoffs.
What is the top-two-box score?
The top-two-box score is the percentage of respondents who select the top two options on a 5-point rating scale, typically “definitely would buy” and “probably would buy” for purchase intent questions. It collapses the scale into a single, comparable percentage and is the most widely used benchmark in concept testing research.
When should a company run concept testing?
Concept testing is worth running whenever the cost of a wrong decision significantly exceeds the cost of the research. A practical rule: if correcting a bad launch decision would cost more than $100,000, concept testing is almost certainly justified before committing resources.
