The Wine Buyer’s Dilemma, The Wine Economists’ Blunder

2011 September 28
by Mike

The American Association of Wine Economists—yes, there is such an entity—released a working paper the other week that caused a stir among grape nuts. “The Buyer’s Dilemma—Whose Rating Should a Wine Drinker Pay Attention To?”, co-authored by Omer Gokcekus of Seton Hall University and Dennis Nottebaum of the University of Münster, compared “expert” scores for 120 wines from the 2005 Bordeaux vintage with the ratings given those wines by CellarTracker users. The experts were Robert Parker, Stephen Tanzer, and the Wine Spectator, and the results were eye-catching: Parker and the Spectator had significantly higher average scores than the CT community, while Tanzer’s average score very nearly matched the CT average, coming in just a smidge under it. Based on these findings, the answer to the question is clear: Tanzer is the critic to heed. Buyer’s dilemma solved!

Bizarrely, Gokcekus and Nottebaum misreport their findings in the paper’s Abstract: they claim that the average scores for all three expert sources were higher than the CT ratings. Only when you get to the section devoted to Tanzer do you discover that his average score was actually lower. Also, Tanzer’s first name is spelled “Stephan” several times in the text. I understand that this is a “working” paper, but before it was released, it should have been scrubbed of these mistakes, which naturally cause the reader to wonder if the same carelessness pervaded the entire project. As it is, the study was compromised by a more fundamental mistake, which I’ll get to in a minute.

First, let’s go inside the numbers (someone has clearly been watching too much SportsCenter). Parker rated 107 of the 120 wines that were used in the study; his average score for those wines was 93.2, versus 91.7 on CellarTracker. The Spectator published ratings for 104 of the wines, and its average score was 93.24 versus 91.73 on CT. Tanzer graded 61 of the 120 wines, and his average score was slightly lower than the community’s—92.08 versus 92.16. You will note that the sample size for Tanzer was markedly smaller than those for Parker and the Spectator. Via email, Gokcekus told me that this wasn’t a problem: “When you conduct statistical tests the data size is taken into account. Thus, statistically speaking, the results are not skewed. [Tanzer’s] smaller data set is not giving him any advantage or disadvantage.” The average price of the wines that Tanzer reviewed was $145.60, versus $109.97 for Parker and $119.81 for the Spectator.
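
To make Gokcekus’s point concrete, here is a minimal sketch with made-up numbers standing in for the paper’s raw scores (which aren’t reproduced here). A paired test divides the mean difference by its standard error, and the sample size sits inside that standard error, so Tanzer’s smaller n simply widens the margin of error rather than tilting the result one way or the other.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    # Hypothetical stand-ins for the 61 wines rated by both Tanzer and
    # the CT community; the real scores live in the working paper.
    n = 61
    tanzer = rng.normal(92.08, 1.8, size=n)
    ct = tanzer + rng.normal(0.08, 1.2, size=n)  # CT ran ~0.08 points higher

    # Paired t-test: t = mean(diff) / (sd(diff) / sqrt(n)), so n is
    # "taken into account" automatically; a smaller sample just makes
    # significance harder to reach, it does not bias the comparison.
    t, p = stats.ttest_rel(ct, tanzer)
    print(f"t = {t:.2f}, p = {p:.3f}")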

Although Tanzer is the study’s big winner, the paper focuses mainly on Parker (more of this Parker obsession!), and it’s his numbers that are most closely scrutinized. Wines that Parker rated 89 or 90 points received slightly higher scores from the CT crowd, but above 90 points, CT users were notably tougher graders. Wines that got 93 points from Parker averaged 91.5 points on CT; wines that he awarded 95 points garnered just 92 points on CT; and wines that earned 98 Parker points averaged 95 CT points. Gokcekus and Nottebaum offer two possible explanations for these discrepancies. They suggest that Parker might be more adept at sniffing out nuances than regular wine drinkers. Or it could be that CT reviewers are engaging in “iconoclastic” behavior—they resent Parker’s influence (his “hegemony over the wine community,” as Gokcekus and Nottebaum put it) and are using their own ratings as a form of rebellion against him.

You are not buying that last part? Me neither, but let’s set that aside for a moment. While “The Buyer’s Dilemma” is an interesting paper, I have issues with how the study was done and how Gokcekus and Nottebaum interpreted the data. The major flaw is that they chose a less-than-ideal set of wines as the basis for their research: the 2005 vintage in Bordeaux produced hugely structured clarets that are still years from reaching maturity. Gokcekus told me that they picked 2005 because “at the time we started this research project it was the youngest vintage that had already been scored by a large number of users on CT.” But the 05s are too young, and this may have skewed the results. Critics have graded the 05s based on the potential that they see in the wines. Whether you believe that the pros have the ability to accurately predict how a wine is likely to evolve over the course of decades is immaterial; till now, the experts have rated the 05s mostly with an eye to future performance.

By contrast, I’d guess that the majority of CellarTracker 05 Bordeaux scores were derived on the basis of immediate gratification, and because many of these wines are still painfully tight, that may have held down the CT ratings. It is more than likely that the two groups, the pros and the amateurs, were using different yardsticks to judge the wines, which could account for the split between the CT community and Parker and the Spectator. This is why the choice of wines was so key. Gokcekus and Nottebaum would have been much better off using wines that offer enough upfront pleasure that this gap in criteria, if you will, wouldn’t have been such an issue. A recent vintage of Napa cabernets would have been a more sensible option. Given where they are in their development, 05 Bordeaux were almost uniquely ill-suited to this kind of study.

Gokcekus and Nottebaum suggest that the difference between Parker’s average score and the CT average might be rooted in Parker’s ability to discern nuances that would likely elude everyday drinkers. I think they flatter Parker. For one thing, the 05s are still very primary and aren’t yet offering much in the way of subtleties. Beyond that, there is the issue of methodology. Parker and other critics taste dozens of wines a day and probably spend no more than 30 or 40 seconds evaluating most of them, so how much nuance can they really perceive? We know from the research into sensory perception that repeated, rapid exposure to the same basic aromas and flavors dulls the senses. Anyone who has ever tasted, oh, 40 young Barolos in a single sitting can attest to that (once they regain the ability to move their tannin-battered mouths). So the nuance argument gives Parker and other critics more credit than they probably deserve.

The “iconoclastic” explanation strikes me as completely farfetched. While there has always been resentment of Parker’s influence, it has generally been confined to fellow critics and winemakers; I don’t think consumers have felt much resentment—quite the opposite, actually. And while a small subset of wine enthusiasts seem to now define themselves in opposition to Parker—basically, if he’s for it, they are against it—I doubt that large numbers of CellarTracker users are scoring wines differently from Parker simply for the purpose of sticking it to The Man. If that were the case, the discrepancies would surely be greater. If you are going to make a statement, Make A Statement. A 2-3 point margin is a difference of opinion; a 20-point margin is an act of defiance, a raised middle finger.

In conjuring such an outlandish scenario, Gokcekus and Nottebaum overlooked what I think is the most intriguing possibility of all: that the differences between expert ratings and CT scores may be indicative of grade inflation on the part of critics. I’m repeating myself here, so forgive me, but there are powerful incentives now for critics to bump up their scores. High scores are catnip for retailers, who use them to flog wines via shelf talkers and email offers. In turn, those citations are excellent free publicity for critics. In a tough economy and a crowded marketplace for wine information, big numbers can help a critic to stand out, and I don’t think there is any doubt that score inflation has become rampant. Tanzer is one of the few critics who has held the line on scores, and the fact that his ratings most closely track the CT ratings strikes me as a highly suggestive data point. Parker and the Spectator are not as restrained as Tanzer, and this may account for the gap between their grades and those of the CT community. CT users might be amateurs, but many of them are very experienced and knowledgeable tasters, and it could be that they are giving these wines scores that more accurately reflect their quality. But, again, because Gokcekus and Nottebaum based their investigation on a less-than-ideal group of wines, this is purely speculation.

Gokcekus told me that he and Nottebaum plan to conduct more research on this topic, using a different region/vintage combination. Let’s hope they choose more wisely next time. Picking the wrong wines can spoil a dinner; it can also needlessly complicate a study.

36 Responses
  1. April 26, 2013

    Most sufferers buy Dvd videos, because we need to enjoy all of them.

  2. Daniel Hennessy permalink
    January 11, 2013

    The biggest problem with that study, and your article, is that both ignore the massive Anchoring Bias involved with CellarTracker scores. CellarTracker automatically and prominently shows Tanzer’s rating for every single wine that IWC covers, and that influences how CT users then go on to score the same wine. Scores issued by other major critics like Allen Meadows and Jancis Robinson are hidden from view unless the CT user has paid for a subscription. And Parker doesn’t allow even that basic functionality – his paid readership must go to the Parker website or page through the Parker journal to find his scores.

    CT scores are closest to Tanzer/IWC scores because those are the ones that they observe. It biases their response. I’m very surprised that the authors of the working paper didn’t notice something so obvious.

  3. José permalink
    October 3, 2011

    Even though the sample size would have been reduced, the study could have been more useful if ALL wines used for the study were rated by the three critics and CT. The way it was done was almost like comparing apples and oranges.

  4. October 1, 2011

    Mr Klapp,

    It seems we are in agreement on every point and your views are spot-on. I think the only possible source of contention regarding this post lies with the question: do we even care about this so-called study or not?

    As for my off-the-cuff comment about being ‘a bit tangential’, it was merely an observation that we were moving off subject, not an admonition; my apologies if it seemed otherwise. No question about the very high caliber of people posting here – I think some of the most intelligent and informed of any wine blog out there, which is certainly to Mike’s credit in attracting such readers.

    David Boyer
    Classof1855.com

  5. Bill Klapp permalink
    October 1, 2011

    David, you are right about the sorry state of wine science, considering the worldwide resources available, and it does seem as though much of it, especially of the UC-Davis variety, ends up being used to formulate Megapurple. I suspect that the aggregate knowledge is far greater than we can know, but also existing in pockets rather than being shared globally. Also, I think that a little healthy skepticism about “science” is in order anyway (I don’t want any wine “science” funded by Monsanto or DuPont personally), and there can be no question that trial and error (the Burgundian monks, etc.) has proven more valuable than any modern-day science is likely to be. That said, science does do a miserable job of explaining many readily observable customs and phenomena in the world of wine. Frankly, we have no clue whether storing wine at 55 degrees is necessary. We just do it, based upon speculation about average temperatures of passive cellars in Europe.

    And of course, you are right about us, the consumers and collectors, being responsible for the phenomenon known as Robert M. Parker, Jr. I would add the caveat that, once our mistake was made, Parker has done little but capitalize upon and encourage our collective stupidity and insecurity, and, of late, has taken to unsupportable ranting in defense of his self-righteousness. But we may well be responsible for that as well. Parker-baiting has been in vogue for some time now.

    As to the tangential nature of my post, indeed so, but the beauty of this forum is that it attracts many top-notch students of fine wine, and does not require a “wicketkeeper” to shape and censor the dialogue…

  6. Wilfred permalink
    September 30, 2011

    I don’t really place much stock in this study except the finding that the mean scores might have been statistically different from each other (but on a practical level, is there REALLY any difference between something like a 91.3 and a 92.6?).

    My gripe with the study is that no real implications or conclusions can be drawn except for the conclusion that different review groups (TWA, Tanzer, CT) rate wines slightly differently.

    But so what?

    The reason I think the implications are flawed is that the observations are not “independent.” That is, many CT users know what Tanzer or Parker or whoever rated the wines, and might (to a greater or lesser degree) be influenced by those ratings. Second, the reviewers (except for Wine Spectator, which uses blind ratings) are influenced by such factors as where the wine ranked in the 1855 classification (as the Editor of Wine Spectator once pointed out, it’s amazing how orderly reviewers’ rankings are, isn’t it?).

    Interesting article but not worth thinking about too much.

  7. September 30, 2011

    Mr Seysses,

    My apologies if I offended you in any way. Your tongue-in-cheek gist was entirely lost on me – my bad.

    Best Regards,

    David Boyer
    classof1855.com

  8. September 30, 2011

    Mr Klapp,

    Being the savant oenophile you are, I thank you for imparting your wisdom and knowledge, if a bit tangential. I have no doubt that you and I would enjoy a long discussion over a glass of wine about the many maligned notions too often found in the wine world.

    As you know, in Monsieur Jayer’s time there really was little wine science to speak of; farming expertise was the order of the day, and if you’re a farmer, high yields are paramount. His contribution to quality wine cannot be argued, whether it was scientific in nature or just blind luck. The same can be said about Monsieur Peynaud, who was certainly influential with his work regarding tannin management and malolactic fermentation.

    Even today I’m amazed at how little we know about wine chemistry and viticulture. Science has not kept pace when it comes to wine but maybe that’s a good thing. Sadly, in this age of instant gratification our wines are turning into Coca-Cola type products and I don’t look forward to drinking the wines of tomorrow. With all the technology being used (abused?) in winemaking today anybody can “Parkerize” their wine and garner high scores. I don’t blame Parker – everyone has his own personal sense of taste – but no one knew at the time that he would have the influence he had (and still has). In the end we can only blame ourselves for giving anyone that kind of god-like power.

    Best Regards,

    David Boyer
    classof1855.com

  9. Jeremy Seysses permalink
    September 30, 2011

    David,
    I was indeed being tongue in cheek and completely agree with what you have written. Kudos for being reasonable (and I write this not at all tongue in cheek).

    Mike,
    There are some vintages of Burgundy (and other regions) that have never come around, but I think we agree that 1993 is not one of them. As David aptly remarked, it is about balance.

    Bill,
    Accad gets too much credit when people credit him with the sulfitic pre-fermentation maceration (which then led to cold soaking). His methods were merely recycled research out of Montpellier that had been dismissed (as he was) for its catastrophic results. His vineyard work is getting talked about again, but I must say that while I know plenty of Accad followers, I have never heard what his vineyard work specifically advocated. I rather suspect that he advocated good common sense, rather like Jayer. As you point out, Jayer’s biggest and most lasting contribution was that quality came from the vineyard. He was not alone in making that point, but he made it vocally and influentially, and for that all Burgundians can thank him.

  10. Bill Klapp permalink
    September 30, 2011

    David, Jeremy has not taken it out of context. He is, I believe, making a tongue-in-cheek poke at Parker, whose ignorance in that regard is manifest and appalling, for the very reason that you cite: we now know that tannins dissipate in various ways. In my view, neither Peynaud nor Jayer ultimately proved to be scientific geniuses, though each in his time was to some degree hailed as such for hurling some useful fundamental winemaking truths into what was pretty much a scientific void.

    Parker must have done what can only be described as a fly-by read of Peynaud, and may well have gotten what he thinks he knows from Peynaud’s prize student, that globe-trotting alchemist and Frankenwiner, Michel Rolland. Jayer took an oenology degree, but the evidence suggests that his contributions were insistence upon low yields, destemming and cold-soaking for greater concentration and color, the latter two of which appear to be the result of his personal preferences rather than his education. (He destemmed to avoid any possibility of green notes and an obvious tannic character, both of which he loathed, and which loathing resulted in the now-demonstrably false assertion regarding tannic wines that Parker still clings to.) I note that Guy Accad later ended up being demonized for his advocacy of cold soaking, as most knowledgeable Burgundy critics (obviously never including Parker) believed that deep purple, concentrated, often higher-alcohol Burgundies lost all sense of the region’s hallmark, terroir. From that I take away that cold maceration in the hands of Henri Jayer worked, because Jayer was Jayer, but in the hands of a Guy Accad and others, became controversial, and in any analysis, provided the foundation elsewhere for the over-extracted, high-alcohol “fruit bomb” made so famous by Parker.

    Jayer’s true contribution lies in his often-quoted “A great wine is crafted in the vineyard, not in the cellar,” which is certainly true of the greatest Burgundies of Jayer’s time, but also before and after, independent of Jayer’s one-liner. (Interestingly, Accad’s positive vineyard contributions were thrown out with his cold-soak bathwater.) If Parker thought about it, he would realize that that Jayer quote underlies biodynamics (which Parker tolerates if the wines are big and fruity enough to suit him) AND the natural wine movement (which he does not). I think that low yields from old vines, advocated by Jayer and virtually all winemakers of excellence, are considered to be a good thing, and yet, the use of green harvests and certain canopy management techniques remain controversial, and suggest that using the term “low yields” as a shibboleth, as Parker has done consistently, is facile and reflective of inadequate study. (I throw Parker this bone: under peer-group pressure, and at the last possible minute, Parker finally suggested that Accad was “controversial”, in precisely the same way that Hardy Rodenstock is merely “controversial” today, in Parker’s myopic view.)

    Parker also read Peynaud’s assertion that alcohol carries the sweetness in wine, and did the classic American “more is better” schtick, deciding that lower yields, overextraction and longer hang time could result in greater concentration and sugar levels, and thus, higher alcohol and greater sweetness. And it seems that he perverted the work of Peynaud and Jayer in making up the notions that greater sweetness from high alcohol can and should be ratcheted up even more by promoting the usually commensurate lower acidity level, and either managing tannins by full or partial destemming (as in Jayer Burgundy and the relatively few other Burgundies that Parker approved of) or by having them disappear under layer upon layer of extracted fruit. The concepts of bracing acidity and balance are lost on Parker. What listening to Parker has given the world is Helen Turley, not a new generation of Henri Jayers…

  11. September 29, 2011

    I’ll submit that the data as observed has nothing to do with palates per se, and only a little to do with the normal distribution of points around a mean (this pertaining only to the CT data) being truncated on one side by the 100-point limit. Rather, imo, it has everything to do with 96-100 being viewed as the territory of the Gods, and mortal men fearing to go there. And Mr Tanzer realizing that if you open that gate too often it ultimately comes unhinged.

  12. September 29, 2011

    Hi Jeremy,

    It seems that the statement made about ‘once too tannic, always too tannic’ may have been taken out of context, or even made at an earlier time in winemaking when things were not as well understood as they are today. There is still a lot of research being done in many areas, including the effects of tannins and aging. With bottle age it is thought that either the tannins attach to phenolic compounds and become so large that they fall out of solution, or they break up due to acidity in the wine and become smaller. Either way, we know tannins soften with age.

    If you compare a barrel tasting of Bordeaux to a 20-year-old Bordeaux you would, no doubt, clearly understand how different the two wines would be (for many reasons). Many so-called ‘old world’ wines can be tannic when young but become much smoother with age. I have heard numerous winemakers state that if the wine isn’t balanced from the beginning, it will never come into balance, which may more accurately reflect your statement. Tannins in young wine are intended to be present, which gives the wine enough structure to age, which in turn creates depth and complexity. A red wine made to age should absolutely be tannic when young (or acidic if it’s something like a Grand Cru Burgundy white wine) and the winemaker is counting on the fact that with bottle age, the wine will soften and become more pleasant to drink.

    David Boyer
    classof1855.com

  13. September 29, 2011

    Methinks this is much ado about nada…

  14. Kathy permalink
    September 29, 2011

    Is it me or does the delta between all the scores seem really small? I mean, we’re talking about 1.5 points. BFD. Seems to me that everyone kinda rates them all the same. For Spectator, the range (key word here is “range”) of 90-94 is considered “outstanding”. Didn’t the average scores all rate in this range??? It might be interesting or even salacious if the Parker/Spectator averages fell into the higher category. Now THAT’s a story.

  15. September 29, 2011

    Excellent point, Jeremy, but as you know, Parker, Turley, and Wetlaufer only apply that wisdom to Burgundy. Speaking of which, I’m thinking now that perhaps a better set of wines for a test like this would be 1993 red Burgs :)

    Steve, that’s true, but I found it quite interesting that there was such a large gap between Parker’s higher-scoring wines and what those wines averaged on CellarTracker–a three-point difference strikes me as reasonably significant.

  16. steve permalink
    September 29, 2011

    If you just round the numbers (who carries out subjective ratings to two decimal points?!), all groups are pretty much in the same ballpark (one-point differences basically). Seems like the study shows that the pros and CT folks are pretty much in agreement. And when you get to wines being rated in the mid-to-upper 90s, line up 10 pros in a brown-bag tasting and see how far apart they end up!

  17. Jeremy Seysses permalink
    September 29, 2011

    Wait a moment, those arguments about the wines being too young, shut down, etc. are completely untenable! After all, people like Parker and John Wetlaufer regularly debunk that type of laughable suggestion by variously quoting Henri Jayer, Emile Peynaud and Jean-Bernard Delmas as all having said, more or less, “once too tannic, always too tannic”.

    There, that should set that part of the argument to rest.

  18. September 28, 2011

    Man, I was going to jump in and skewer you but you nailed it in the 2nd-to-last paragraph. Tanzer is more conservative than WS and WA, which gives him better correlation with CT ratings.

    A better test to determine correlation would be to assess in which percentile vs. its peer group each wine was rated and compare that across wines rated by each publication. It would help dampen how recklessly each reviewer throws around the big numbers and give better insight into whose ratings correlate with CT users better.
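
    A minimal sketch of that percentile idea, using invented scores rather than anything from the study: convert each reviewer’s scores to percentile ranks within their own peer group, so only the ordering survives and a reviewer’s generosity with big numbers drops out. Correlating the ranks this way is essentially Spearman’s rho, which scipy can also compute directly.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(1)

        # Hypothetical scores for the same 50 wines from one critic and
        # from the CT community (loosely related, different generosity).
        critic = rng.normal(93, 2, size=50)
        ct = 0.6 * critic + rng.normal(36, 1.5, size=50)

        # Percentile rank of each wine within its own peer group: the
        # ordering survives, the point-scale generosity does not.
        critic_pct = stats.rankdata(critic) / len(critic)
        ct_pct = stats.rankdata(ct) / len(ct)

        # Correlating the ranks is (ties aside) Spearman's rank correlation.
        rho_manual = np.corrcoef(critic_pct, ct_pct)[0, 1]
        rho, p = stats.spearmanr(critic, ct)
        print(f"rank correlation: {rho_manual:.2f} (spearmanr: {rho:.2f}, p = {p:.4f})")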

  19. September 28, 2011

    Pseudoscience; laughable trash, except that it is a sad reflection on so many things.

  20. Donn Rutkoff permalink
    September 28, 2011

    Hmm. ATWOT or ARWOT. Seems a few social scientists in Scotland or England did a study, published about 6 or 8 months ago, that “proved”(?) that cheap wine is just as tasty as expensive wine. Really. The best thing is at least this study, co-authored by one from Seton Hall in NJ and one from what I assume is in Germany, Munster with umlaut, was not done with U.S. taxpayer dough. What can you prove trying to research items on which people voluntarily spend their money? On a luxury item. Is that $5K diamond worth more than the $5.2 K emerald? Maybe. Maybe not. Depends on who is looking at it. But it is comforting to know that there is an Amer. Assoc. of Wine Economists. How come they never get interviewed on CNBC, Bloomberg, or hey, I have a great idea, on Rachel Maddow??? Now that would be ADTTWOT (a double triple total waste of time)

  21. September 28, 2011

    Grade inflation is highly unlikely if you look at the data. Your argument basically is that Parker and WS rate everything more highly, including the less expensive wines, and that the lower correlation between them and CT is not because some expensive wines are being rated lower and some less expensive wines are being rated higher; it’s because, in general, less expensive wines are being rated higher. In this scenario Tanzer and CT are rating these less expensive wines lower while staying in the same neighborhood on the more expensive wines, because the scale has a cap of 100 and Parker and WS cannot go any higher. In other words, they are using more of the scale. This seems unlikely to me, at least when it comes to Parker, because it was established that on the lower end of the rating scale the community was actually rating wines slightly higher than Parker was, which is difficult to do if we’re trying to say that Parker is rating the less expensive (lower-rated) wines too highly. It is also not supported by the variance data, which shows Parker and WS with larger variance (the spread between the lowest score and the highest score). So on the whole, this theory seems unlikely to me (and must have to the writers as well, as they don’t mention it).

    You’re also ignoring the elephant in the room, although your posting has a great explanation for it: price. We would expect to find strong positive correlation between price and ratings because, in general, more expensive wines are better than less expensive wines (there are obvious exceptions, I know about the studies that show Joe Schmo off the street can’t distinguish in a lab setting, and there is no doubt that you can find really good wine at lower prices). But the CT crowd takes this to an extreme; unfortunately the link to the study is down, but I believe it was over .7. It’s also a blow to the argument that Parker is overly influenced by prestige by not tasting blind (if you take price to be an indicator of prestige), since he has the lowest correlation by far (down around .55 if I recall correctly). Tanzer isn’t helped by this either, since he comes close to matching CT.

    But your explanation for it makes the most sense and really, the argument here is for very young wines that need to age you shouldn’t look to CT. Like you, I’d want to see the results for Napa Cabs or perhaps an older vintage of Bordeaux.

  22. rick permalink
    September 28, 2011

    Hmm… I skimmed the comments so if I missed this my apologies, but there’s another issue here that isn’t addressed. The absolute difference is tiny. Parker’s scores are 1.5 points higher than the CT scores. Tanzer’s are 0.08 lower. Now, I’m sorry, but there’s no way in hell that people can control things to 0.08 of a point. Even 1.5 points is a very tight tolerance.

    Finally, CT users also don’t all define the score ranges the same way. I’ve seen CT reviewers who will savage a wine as barely drinkable and give it an 86. For them, 90 is the lowest score for a drinkable wine. When you cannot control for the variety of definitions of what a given score means, I’m not at all convinced that aggregating them is useful.

  23. September 28, 2011

    Interesting: when wine friends ask me whether I prefer Parker, WS, Tanzer, etc., I always tell them my favourite wine critic is CT.

    I call this “swarm intelligence”.

    Cheers,
    Martin Zwick
    http://www.berlinkitchen.com

  24. Bret Rooks permalink
    September 28, 2011

    I think it’s also worth mentioning that a community scoring method like that which comes out of CT will inevitably provide a bell curve distribution of scores (or something like it). As you run up against the top end of a scale, you lose part of that distribution and the average scores will skew lower.

    A wine can only have a 100 score on CT if everyone scores it that way, regardless of palate preference…which means there will never be one.
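
    A quick, purely hypothetical simulation of that cap effect (no data from the study involved): give a wine a “consensus” quality near the top of the scale, scatter individual palates around it, and force every score above 100 back down to 100. The community average necessarily comes out below the consensus value, because the upper tail of the bell curve has nowhere to go.

        import numpy as np

        rng = np.random.default_rng(7)

        # 10,000 hypothetical user scores for a wine whose consensus
        # quality is 97, with +/- 3 points of palate-to-palate noise.
        scores = rng.normal(97.0, 3.0, size=10_000)

        # The 100-point scale folds the upper tail back onto 100.
        capped = np.minimum(scores, 100.0)

        print(f"uncapped mean: {scores.mean():.2f}")  # close to 97.0
        print(f"capped mean:   {capped.mean():.2f}")  # lower: the cap shaves the tail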

  25. Randy Caparoso permalink
    September 28, 2011

    My thoughts? ARWOT (a ridiculous waste of time)… the less said about these grotesque distortions of quality, the better for poor, unsuspecting consumers…

  26. Bill Klapp permalink
    September 28, 2011

    Corey, you have a great point, and I am even more cynical about CT users and scores, having spent too many years reading silly amateur assessments of wines. There are many motives that go into posting tasting notes, and the ego’s need to express itself (so often driven by insecurity about one’s taste and judgment) is high on that list. Few will post to, in essence, say that they made a purchasing mistake. Also, haven’t we all been to a tasting event where somebody opens a corked or shot old, expensive bottle and then starts with the line of BS about how it is a great bottle, but great bottles like this one exhibit blah, blah, blah that might make you THINK that it is corked or shot, yaddah, yaddah. Why would one not expect that to occur in spades when there are no live witnesses?

  27. Cris Whetstone permalink
    September 28, 2011

    There is a huge factor that gets overlooked repeatedly in these discussions. Tanzer’s scores are right there for everyone who is a subscriber to CellarTracker. You have to manually enter those for Parker, et al. Never underestimate the power of suggestion. My anecdotal experience leads me to believe the correlation to Tanzer’s scores runs much, much further than 2005 Bordeaux.

  28. September 28, 2011

    Hi Mike,

    Thanks for the great service you’ve done with this article. I agree completely that the ’05 Bordeaux vintage was a very poor choice for pitting critics against CT users, for the same reason you mentioned: the vintage is far too young to drink. There are numerous CT notes about Bordeaux that were consumed way too early and then received lower scores. I do think the three major critics, Parker, Tanzer, and Suckling, scored those wines with the future in mind, because all three of them have enough experience to know pretty much how those bottles will develop with age.

    Parker has always been generous, Tanzer conservative, and Suckling falls somewhere in the middle. I do what everyone else does, which is to calibrate my palate accordingly. I know generally to knock off a few points if I’m considering a Parker score and add a couple points if I’m considering a Tanzer score; Suckling aligns well with my palate about 90% of the time. And since he left Spectator, the publication doesn’t have the same luster or credibility it once had, at least for me.

    I wrote about the issue of rising scores in early 2010 because I was seeing the same situation as you. While I admit that wine quality may have moved up a bit, far too many average wines were being hoisted over that dreaded 89-point threshold that usually means a slow death on the shelves. One of the most egregious examples was the 2005 Columbia Crest Cab named Wine of the Year in Wine Spectator’s Top 100 of 2009. Was it a terrible wine? Not entirely. Was it a 95-point wine? Not even close!

    As for tasting, I recently scored and wrote tasting notes (2000 wines in 65 flights) for Better Wine Guide over the course of a few months. My palate and olfactory senses were borderline abused in this setting, but I still managed to distinguish aromas and flavors. I didn’t know it at the time, but it prepared me for tasting through 120 ’10 Bordeaux barrel samples in an afternoon. I was worried about getting through that volume of wine and maintaining any sense at all, but it was not too difficult, possibly because the vintage was so great. I don’t know how other critics taste, but for me it’s taxing, though not impossible. It’s important to me to come back weeks later to check myself, and I have almost always had consistent notes, so I believe volume tasting is possible in controlled circumstances. Oh, and I also used a palate cleanser throughout called SanTasti; I don’t believe I could otherwise have gotten through 30 wines per flight for days on end.

    Also, the entire culture of CT comes into question because of the diversity of its users, from wannabe critics to very sophisticated oenophiles and connoisseurs. I would guess the average lies closer to the less knowledgeable, based partially on what I have read on the site. As with many cultures, I get the impression that there may be some type of peer pressure to ‘fit in’ on CT or take the risk of being ostracized. There are so many amateur mistakes in knowledge and practice posted on CT (for example, decanting a 30-year-old wine for three days) that it calls into question how one finds value there unless there are specific people who have proven themselves over time. I even question the relevancy of this study based on my observations about CT.

    And the statement about people resenting Parker is completely untenable. If people loathed him so much, he surely would not have the massive influence to move markets, which he has done on numerous occasions and continues to do. I would normally expect more intelligence from economists but perhaps they should just stick to numbers and leave wine and people out of their studies.

    David Boyer
    classof1855.com

  29. September 28, 2011

    correction — my analysis is at masterofwinejourney.blogspot.com

  30. September 28, 2011

    See my analysis on masteofwinejourney.blogspot.com. The authors of the study used the wrong statistical test to arrive at their conclusions. Their results are, in fact, statistically meaningless.

    Wine scores are not interval variables!
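
    To make that objection concrete, here is a minimal sketch with invented paired scores (the study’s data isn’t reproduced here). Treating 100-point scores as interval data licenses a paired t-test; treating them as ordinal, where a 94-to-95 step need not mean the same thing as an 89-to-90 step, points instead to a rank-based test such as the Wilcoxon signed-rank.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(3)

        # Hypothetical paired scores for 60 wines: one critic vs. CT.
        critic = np.round(rng.normal(93, 2, size=60))
        ct = critic - np.round(rng.normal(1.5, 1.0, size=60))

        # The t-test assumes equal spacing between score values; the
        # Wilcoxon signed-rank test uses only the ranks of the differences.
        t_stat, p_t = stats.ttest_rel(critic, ct)
        w_stat, p_w = stats.wilcoxon(critic, ct)
        print(f"paired t-test p = {p_t:.4f}")
        print(f"Wilcoxon signed-rank p = {p_w:.4f}")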

  31. Corey permalink
    September 28, 2011

    CT scores are subject to other biases that should tend to make them inflated. There’s a self-selection bias: most people are scoring wines they purchased, and they tend to purchase only wines they are likely to like. There’s a confirmation bias: if you bought the wine, you’re subconsciously going to want to like it to confirm your good taste in picking it out (esp. true if you spent a lot of money on it). Critics should be immune from both of these biases. So it’s surprising that the CT scores tend to be lower.

    Another factor is that a lot of the critics’ reviews are done very soon after bottling when the wines are first on the market, which is often before the wines lose their baby fat and shut down behind the structure for who knows how long. Tasting a top, heavily structured wine 6 months after bottling can be quite different from tasting it 3 years after bottling. If the critics are smarter about whether/how much to decant/aerate the wines before tasting them, that could make a big difference too.

  32. September 28, 2011

    The data set is big enough to do what the study set out to do: compare the scores of CT vs professional reviewers. You only need 30 wines to be statistically relevant.
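
    One footnote on the “30 wines” rule of thumb: 30 is roughly where the normal approximation starts behaving, not a guarantee of a meaningful comparison; the number you actually need depends on how big a difference you are trying to detect. A minimal sketch with hypothetical effect sizes (nothing here comes from the study):

        from statsmodels.stats.power import TTestPower

        # Paired-comparison sample sizes needed at 80% power and
        # alpha = 0.05, for two hypothetical effect sizes
        # (mean score difference divided by the sd of the differences).
        analysis = TTestPower()
        for effect in (0.5, 1.0):  # a "medium" and a "large" effect
            n = analysis.solve_power(effect_size=effect, alpha=0.05, power=0.8)
            print(f"effect size {effect}: about {n:.0f} wines needed")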

  33. September 28, 2011

    “that the differences between expert ratings and CT scores may be indicative of grade inflation on the part of critics.” – You’re right!

    I also think the data set was way too small. 1000 wines should have been done, across a whole range, not just 2005 Bordeaux. And yes, young wines that don’t require much ageing should have been the focus.

Trackbacks and Pingbacks

  1. Looking For Logic In All The Wrong Places | Mike Steinberger's Wine Diarist
  2. The Wine Buyer’s Dilemma, The Wine Economists’ Blunder « rowensalemy
  3. Terroirist » Daily Wine News: Adam Smith
