New Coke Taste Test: The Research That Was Right and the Launch That Was Wrong

The New Coke taste test is among the most cited examples of research that was technically correct and strategically catastrophic. In the years before the 1985 launch, the Coca-Cola Company ran blind taste panels with close to 200,000 consumers across the United States.
The reformulated product won, consistently. Then the company pulled a 99-year-old formula and replaced it with the winner. Within three months, it reversed course under public pressure that bordered on civic outrage. The data was right. The decision it informed was wrong.
Key Takeaways
- A very large blind taste test produced genuine preference data (the reformulation tasted better in blind single-sip conditions) but the methodology couldn't measure brand equity.
- Hedonic preference in a controlled test doesn't predict purchase behavior when identity, nostalgia, and habit are load-bearing parts of the product experience.
- Single-sip tests systematically overweight sweetness; full-serving consumption preferences often diverge from first-impression scores.
- The failure wasn't bad research execution; it was a mismatch between the question the test could answer and the question the decision required.
What the Taste Test Measured, and What It Couldn't
The Project Kansas research program, which the company ran in the early 1980s in response to a rival's growing market share, was methodologically conventional and professionally executed. Respondents tasted unmarked cups of cola: no brand cues, no packaging, no context. In those conditions, the sweeter reformulation outperformed both the original and the leading competitor by a meaningful margin. The result was consistent across markets and demographic groups. The problem was the construct. Hedonic preference ("which of these tastes better right now") is a real thing worth measuring. It is not the same as purchase intent, loyalty, or category satisfaction over time. The taste test asked a sensory question and received a sensory answer. Then the organization used that sensory answer to make a brand architecture decision.
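To make concrete what "a sensory answer" looks like, here is a minimal sketch of how a blind paired-preference result is typically checked for significance: an exact two-sided binomial test against a 50/50 split. The counts are hypothetical and are not the Project Kansas figures, and the function name is ours. Note what the output does and does not say. A small p-value says the taste preference is real. It says nothing about purchase intent, loyalty, or brand equity.

```python
from math import comb


def paired_preference_p_value(prefer_new: int, n: int) -> float:
    """Exact two-sided binomial test against a 50/50 preference split.

    prefer_new: respondents who chose the new formula in a blind pairing.
    n: total respondents who expressed a preference.
    """
    if n <= 0 or not 0 <= prefer_new <= n:
        raise ValueError("need n > 0 and 0 <= prefer_new <= n")

    # Under the null (no real preference) every outcome k has
    # probability comb(n, k) / 2**n, so we can compare raw counts.
    weights = [comb(n, k) for k in range(n + 1)]
    observed = weights[prefer_new]

    # Two-sided p-value: total probability of all outcomes that are
    # at least as unlikely as the one observed.
    tail = sum(w for w in weights if w <= observed)
    return tail / 2 ** n


# Hypothetical panel: 550 of 1,000 respondents pick the reformulation.
p = paired_preference_p_value(prefer_new=550, n=1000)
print(f"p-value for 550/1000: {p:.4f}")
```

The test answers exactly one question: is the split different from a coin flip? Every other question the launch decision depended on sits outside this function's inputs.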
Single-sip methodology compounds this. A sip test overweights first-impression flavor characteristics: sweetness, carbonation hit, immediate finish. Characteristics that drive long-arc satisfaction (how a full serving feels after 12 ounces, how the taste holds across a meal) don't register in a protocol built on a few sips. The competing brand had known this for years; its sweeter formula performed better in blind sip conditions than in in-home usage studies. The research team was aware of the dynamic, but competitive pressure to win in the format the rival had turned into a marketing weapon overrode the methodological caution. The divergence between single-sip scores and sustained-use preference is well documented in sensory methodology, including standards like ISO 6658 on sensory analysis and its associated hedonic-testing guidance. The category understood that divergence but didn't apply it here.
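One way to check for this divergence before trusting a sip-test win is to run both protocols on the same respondents and count how many change their choice. The sketch below assumes paired records of (sip-test choice, in-home choice). The data, labels, and function name are illustrative only and do not come from any real study.

```python
from collections import Counter

# Hypothetical paired records: (choice in sip test, choice after in-home use).
# "sweet" = sweeter formula, "orig" = original formula. Illustrative values only.
records = [
    ("sweet", "sweet"), ("sweet", "orig"), ("sweet", "orig"),
    ("orig", "orig"), ("sweet", "sweet"), ("orig", "orig"),
    ("sweet", "orig"), ("orig", "sweet"), ("sweet", "sweet"),
    ("orig", "orig"),
]


def protocol_divergence(pairs):
    """Compare the sweeter formula's share under each protocol and count flips."""
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one paired record")

    # Share choosing the sweeter formula under each protocol.
    sip_share = sum(1 for sip, _ in pairs if sip == "sweet") / n
    home_share = sum(1 for _, home in pairs if home == "sweet") / n

    # Respondents whose choice changed between protocols, by direction.
    flips = Counter((sip, home) for sip, home in pairs if sip != home)

    return {
        "sip_share": sip_share,
        "home_share": home_share,
        "flipped_away": flips[("sweet", "orig")],
        "flipped_toward": flips[("orig", "sweet")],
    }


print(protocol_divergence(records))
```

If the sweeter formula's share drops between protocols and most flips run away from it, the sip-test win is at least partly an artifact of the format rather than a durable preference.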
The Equity Signal and What the Reversal Confirmed
What a very large taste test cannot tell you is how many people have a 40-year relationship with a specific flavor that functions as a memory anchor for their childhood, their family, their cultural identity. Qualitative work during the same research program surfaced strong emotional attachment to the original formula. Those findings were noted and largely set aside, partly because the quantitative data was so consistent and partly because the company expected consumers to update their preference once they tasted the superior product. That assumption proved catastrophic.
Harvard Business School's case study on the episode, by Susan Fournier, frames the gap precisely: the research answered whether consumers would prefer the taste; it was never designed to answer whether consumers would accept the loss of the original. Estimated hotline volume in the first three months after launch ran to hundreds of thousands of calls, per contemporaneous press accounts. When the company re-released the original formula 79 days after launch, the response was immediate: the grassroots protest movement that had organized around restoring the original formula dissolved almost overnight. Consumers hadn't just preferred the original taste; they needed the original product to exist. That distinction, preference versus necessity, is what the sensory test had no mechanism to find.
The Methodology Gap Still Running in Teams Today
The New Coke story is 40 years old. The methodology error it illustrates is still being made in product and brand teams right now. Concept tests, flavor screenings, reformulation studies, packaging upgrades: all of them routinely use controlled preference testing to answer questions that controlled preference testing can't fully address. What gets skipped is the equity audit: what does this product mean to the people who buy it, beyond the sensory or functional experience? For a commodity product, preference data does most of the work. For a brand with deep cultural presence, hedonic preference data is necessary but not sufficient. It tells you whether the new version would win a taste contest. It doesn't tell you whether the old version is doing work in people's lives that has nothing to do with taste.
This is where qualitative research methods earn their place in the methodology sequence: not as a follow-on focus group to validate a quant result, but as a parallel track probing the meaning layer. What role does this product play in your routine, your household, your identity? What would you lose if it changed? Enumerate's AI-moderated interview platform can run that meaning-layer probe across a medium-to-large sample early enough to influence the product decision, not just explain it afterward. Research programs that want to catch this gap before launch need to ask one question alongside the preference test: "What would change for you if this product were no longer available?" For that 1985 cohort of loyal drinkers, the answer would have been loud. A research design that surfaces that answer before launch, rather than 79 days after, is the whole point.
Want to see how this works in practice? Book a demo with Enumerate.