
The Benchmark Brand Trap: When One Competitor Owns the Yardstick
In every category there's a brand that respondents reach for before you finish the question. Ask about cola and you'll hear Coke within thirty seconds. Ask about EVs and Tesla. Ask about search and Google. That reflexive name is the benchmark brand trap: when one competitor becomes the unspoken yardstick that quietly structures every answer respondents give about every other brand in the category. If you're running brand perception research without naming this dynamic, you're measuring distance from a fixed point rather than the territory itself.
How the benchmark brand trap shows up in transcripts
It hides in the comparative phrasing. "It's like Coke but less sweet." "Cleaner taste than Coke." "Doesn't have that Coke bite." Three respondents, three different brands being discussed, and Coke is the silent third party in each sentence. The respondent isn't telling you what they think of Pepsi, or RC, or Dr Pepper. They're telling you where each one sits relative to the brand they've already decided is the reference point.
This is the cost: if 70% of your respondents anchor on the benchmark before they think about you, your differentiation work is happening on the benchmark's home field. You can win on a specific attribute and still lose the framing.
Benchmarks in product tests: useful and dangerous
Product testing has used benchmark brands for decades, and for good reason. Nielsen BASES, which has been forecasting CPG launches since the late 1970s, relies on category benchmarks to translate raw concept scores into volumetric forecasts. A 6.2 on purchase intent means nothing in isolation. A 6.2 against the category's top quartile is a signal. P&G's stage-gate process has long required new SKUs to beat an in-market control in blind paired tests before earning shelf space. The benchmark is the only way to know whether "consumers like it" actually predicts anything.
Here's the case for and against running product tests against a named benchmark:
The pro side. Benchmarks calibrate. They turn a 7-point scale into a decision rule. A blind taste test of a new cola against Coke Classic tells you something a monadic test never will, because preference under direct comparison maps to switching behavior in a way absolute liking scores don't. The 1985 New Coke debacle, retold by Coca-Cola's own archivist, is the cautionary tale on the other side: blind tests against Pepsi showed New Coke winning on sweetness, so Coca-Cola reformulated, and discovered that respondents had been telling them about a sip, not about loyalty. The benchmark worked for the question asked. The question was wrong.
The con side. A benchmark in a product test imports the benchmark's frame into your evaluation. If you sip-test against Coke, you optimize for sip-test attributes (sweetness, fizz, initial impression) and miss the attributes that drive repeat (finish, fatigue, occasion fit). The benchmark tells you how to win the test it's built for. It can't tell you whether that test predicts the market.
The synthesis most experienced product researchers land on: use benchmarks for calibration in quant, but never let them set the qualitative agenda. The qual is where you find out whether the benchmark's vocabulary is even the right vocabulary.
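To make "decision rule" concrete, here's a minimal sketch of what calibration buys you in quant. Everything in it is an illustrative assumption: the 55% preference hurdle, the sample numbers, and the choice of a one-sided binomial test are stand-ins for whatever gate your organization actually uses, not P&G's or BASES's method.

```python
from scipy.stats import binomtest

def beats_benchmark(wins: int, n: int, hurdle: float = 0.55, alpha: float = 0.05):
    """Ship/no-ship rule for a blind paired test against an in-market control.

    `wins` = respondents preferring the new product over the benchmark.
    `hurdle` is the preference share the new product must credibly clear
    (0.55 here is an illustrative bar, not any company's actual gate).
    """
    result = binomtest(wins, n, p=hurdle, alternative="greater")
    return wins / n, result.pvalue, result.pvalue < alpha

share, p, ship = beats_benchmark(wins=140, n=220)
print(f"preference share {share:.1%}, p = {p:.3f}, ship = {ship}")
# A 6.2 on a monadic liking scale gives you no such rule; the benchmark does.
```

The point isn't the statistics. It's that the benchmark converts a liking score into a pass/fail gate, which is exactly the power, and exactly the danger, described above: the gate only tests what the paired comparison can see.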
Escaping benchmark gravity in your next study
The fix in qual is mostly methodological, and it starts with the discussion guide. Three moves work:
- Open with the category, not the brand. Ask respondents to describe what a great product in the category does before any brand is named. This surfaces the criteria they care about, not the criteria the benchmark has trained them to recite.
- Probe the comparative clauses. When a respondent says "like X but cheaper," don't accept it. Ask what "like X" means concretely, then ask what they'd want if X didn't exist. The counterfactual is where the real perception lives.
- Run at least one segment of interviews that never names the benchmark. Hard to do with a human moderator who'll slip out of habit. Easier with an AI moderator running a consistent guide across the full sample, where the probing logic holds from interview one to interview one hundred.
Enumerate's AI-moderated interviews are useful here because the probe on "like Coke but less sweet" gets asked every single time, not just when the moderator catches it. Across a medium-to-large sample, that consistency is what surfaces the benchmark's gravitational pull as a pattern rather than an anecdote. The automated thematic coding then makes the benchmark dependency visible as its own theme, separate from the brand attributes you thought you were measuring.
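As a back-of-the-envelope illustration of what that pattern looks like once it's countable, here's a minimal sketch. The data shape and the bare mention-count heuristic are assumptions for this example only; real thematic coding does far more than string matching, but even the crude version makes the point.

```python
import re

# Hypothetical coded excerpts: (brand the question was about, verbatim answer).
responses = [
    ("Pepsi", "It's like Coke but less sweet."),
    ("RC Cola", "Cleaner taste than Coke."),
    ("Dr Pepper", "Doesn't have that Coke bite."),
    ("Pepsi", "Good with food, but a bit flat by the end of the glass."),
]

def benchmark_anchor_rate(responses, benchmark="Coke"):
    """Share of answers about *other* brands that still invoke the benchmark."""
    mention = re.compile(rf"\b{re.escape(benchmark)}\b", re.IGNORECASE)
    others = [(brand, text) for brand, text in responses if benchmark not in brand]
    anchored = [text for _, text in others if mention.search(text)]
    return len(anchored) / len(others), anchored

rate, examples = benchmark_anchor_rate(responses)
print(f"{rate:.0%} of answers about other brands anchor on the benchmark")
for verbatim in examples:
    print(" -", verbatim)
```

Even that rough count turns "everyone compares to Coke" from an anecdote into a number you can track across segments and over time, which is what it means for benchmark dependency to become its own theme.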
The implication for brand strategy is harder than the methodology. If the benchmark owns the category's vocabulary, communications that play in that vocabulary reinforce the trap. The brands that escape do it by changing the question respondents ask in the aisle, not by answering the benchmark's question better.
Want to design a brand perception study that names the benchmark instead of being shaped by it? Book a demo with Enumerate.
Related Reading

Online vs Offline Appliance Buying: A Research Field Note
A field note on researching online vs offline appliance buying: where the channel split actually lives, and what most studies miss.
Depth vs Breadth in Research: You Can Finally Have Both
For decades, researchers chose depth or breadth. AI-moderated qual at scale ends that tradeoff. Here's how the economics changed and what it means for your next study.
How AI Makes Diary Studies Viable for Commercial Research
AI transforms diary study economics by handling massive data volumes and cross-respondent synthesis that historically limited commercial use.