Robert Parker’s recent dramatic downgrade (and the subsequent discussion), which was a seeming acknowledgment (pointed out by some here) that there is some error involved in the scoring process, got me thinking about how barrel samples are scored by many: with a range, because the scorer is building some margin of error into the score.
Does anyone here think that there isn’t any error inherent in putting a number on a bottle of wine based on tasting it one day in one particular lineup in one particular set of circumstances (blind, not blind, in an office, in a winery, etc.)? I think Robert Parker just showed us that there is, and there are many much less dramatic examples of a point or two one way or the other. I believe that changing any of these variables has the potential to change the outcome, so we have a system that has error in it. Yet we still persist in saying not 91, not 89, but 90. If we have two wines, one that gets a 91 and another that gets a 90, we like being able to point to them and say one is one point better than the other. But aren’t we really within the margin of error here? Aren’t these two wines in the same range? Maybe tweaking just one of the tasting variables would result in the two wines flipping places.
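To put a rough number on the "flipping places" worry, here is a minimal Python sketch. It rests on assumptions that are mine, not anything a critic has published: each score can shift on retasting by any whole number of points within ±2, every shift is equally likely, and the two wines shift independently. The function name `flip_odds` is made up for the illustration.

```python
from fractions import Fraction
from itertools import product


def flip_odds(score_a, score_b, margin):
    """Exact odds that a retasting keeps, ties, or flips the order of two wines.

    Assumption (illustrative only): each score moves by a whole number of
    points in [-margin, +margin], uniformly and independently.
    """
    errors = range(-margin, margin + 1)
    keep = tie = flip = 0
    # Enumerate every combination of the two retasting errors.
    for err_a, err_b in product(errors, repeat=2):
        new_a, new_b = score_a + err_a, score_b + err_b
        if new_a > new_b:
            keep += 1
        elif new_a == new_b:
            tie += 1
        else:
            flip += 1
    total = keep + tie + flip
    return Fraction(keep, total), Fraction(tie, total), Fraction(flip, total)


keep, tie, flip = flip_odds(91, 90, margin=2)
print(f"same order: {keep}, tie: {tie}, flipped: {flip}")
# prints "same order: 3/5, tie: 4/25, flipped: 6/25"
```

Under those assumptions the 91 stays ahead of the 90 only 60% of the time, with a 16% chance of a tie and a 24% chance that the order reverses outright. A different error model would give different figures, so treat this as an order-of-magnitude illustration.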
But, you might say, I know that the first wine was better than the second one. To which I would respond, “do you really?” Haven’t enough studies been done to show that tasters, even extremely experienced tasters, have difficulty in replicating their results given the exact same lineup? And what about the other 1,000 wines a critic has tasted and rated? The scoring system would have you believe that each of these can be neatly ranked according to their exact score. But what if each one is really “91 ±2”? Spread that error across 1,000 different wines and your neat ranking means much less: since the errors are random and not systematic, there is no way to know what the “true” ranking should be (perhaps because there really isn’t a true ranking …).
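The same idea can be scaled up to a whole list of wines with a small simulation. This is only a sketch under loudly stated assumptions: it pretends each wine has a "true" score between 80 and 100 (which may not exist at all), that the published score is that value plus a random whole-number error within ±2, and that the errors are independent. `rank_agreement` and its parameters are hypothetical names chosen for the example.

```python
import random


def rank_agreement(n_wines=1000, margin=2, n_pairs=20000, seed=0):
    """Fraction of sampled wine pairs whose published order matches the 'true' order.

    Assumptions (illustrative only): a true score exists for every wine,
    and published score = true score + uniform integer error in [-margin, margin].
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    true_scores = [rng.randint(80, 100) for _ in range(n_wines)]
    published = [s + rng.randint(-margin, margin) for s in true_scores]

    agree = compared = 0
    for _ in range(n_pairs):
        i, j = rng.randrange(n_wines), rng.randrange(n_wines)
        if true_scores[i] == true_scores[j]:
            continue  # no true ordering to preserve for this pair
        compared += 1
        # Same sign on both differences means the published order is right;
        # a published tie counts as a miss.
        if (true_scores[i] - true_scores[j]) * (published[i] - published[j]) > 0:
            agree += 1
    return agree / compared


print(f"pairs ranked correctly with +/-2 error: {rank_agreement(margin=2):.1%}")
print(f"pairs ranked correctly with no error:   {rank_agreement(margin=0):.1%}")
```

With no error every pair is ordered correctly by construction; with a ±2 error some fraction of pairs is not, and the damage is concentrated among wines whose true scores sit within a point or two of each other, which is exactly where people like to split hairs.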
There are definite problems with a range system, the biggest being the lack of clarity. Instead of definitely pointing to a wine and saying “91”, you say “88-92” and the consumer says “huh?” People like clarity, even if it is false clarity (look at how the ± in polls is ignored, even when the results are within the margin of error). And of course with the wine world wired for points, a range system wouldn’t help those in and out of the biz who chase high scores. And there’s too much money involved for any major critic to buck the system (probably). So I’m not going to hold my breath, but I think the wine world would benefit from some acknowledgment that all of these exact numbers they see really ought to have a margin of error disclaimer.
I kind of like Ray’s scoring system of staggering minus and so forth. It conveys what he was tasting at that moment without being bogged down by a numerical score.
…or you can just go with “nice” for the bulk of your notes.
I only assigned scores because I have several people that follow my notes that want me to do so. I dropped them for a while and was pestered to start using them again so I did. As of late though I’ve been moving a little more toward 2-4 point ranges when scoring a wine. fwiw.
I’ve used all kinds of scales: 10 pt, 20, 100, A-F. Mostly lately I’ve avoided scores. But I’ve been buying some stuff on WineBid and I like to see a CellarTracker consensus. Tasting notes are interesting but if I’m buying a 20 year old Cal Cab I want to know if the wine is hanging on and a consistent batch of notes all in the 87-92 range, for example, tells me something. So since I’ve rejoined CellarTracker I’m going to hold my nose and use the 100 pt scale.
Of course my very first CT note was for a 2005 Bordeaux that I rated 88 even though I’ll bet that it’ll be 90-93 before things are over. In other words YMMV. For me I rate on the here and now – others rate on potential.
Since CT only works if you use a score rather than a range I’ll stick to one number.
I’d like to use letter grades (A+, A, etc.) more, but CT won’t track them. That makes it hard for me to look back on my notes and sort by what I liked best, so I have to put in a numerical score.
When I see a 90 pt. rating, I know the 90 isn’t absolute. All I get from that rating is that the critic/taster thought it was a pretty good wine and I need to see the note to know why they thought that. If I saw the same 90 with a ±, or a range of 88-90, I would still have the same general impression that someone thought the wine was pretty good.
To me, the scores are a way of quantifying “I hate it”, “I like it”, “I love it”, “Wow!”, etc. It is obviously an inexact endeavor, and ALL scores should be taken with a grain of salt. Even those by the great man. In this context, a range is no better than an exact score and just adds another layer of ridiculousness.
I think the idea that scores can be quantified in anything more than broad categories (e.g.: A to F, 1 to 5, 3 Stars, Poor to Excellent, etc), or “100 point wines” exist outside the context of individual bottles in your own personal context, is pretty much an exercise in self-deception. The more granular the system, the more inherently absurd the implied precision becomes IMO.
Added layer because it’s more vague? Or because it makes the score even more pseudo-scientific than it already is? If the latter, I actually agree that a range shouldn’t make people think there is more rigor in the process than there actually is, and that is an unintended side-effect I hadn’t considered. But I just think we are kidding ourselves if we think the general public doesn’t see these scores as absolutes that can be compared to each other, and therefore ranges are more honest, if we are going to use scores at all. Lesser of two evils in my book.
A score range is mathematically identical to simply using fewer points. The standard “100 point” scale is really 50 points, and in practice only about 15-20 of those points get much use (80-100 gives a 21-point range).
World of Fine Wine gives a calibration scale for a five-star scoring system, a twenty point scoring system, and the standard “100 point” scoring system, which I have pasted below. Using a coarser-grained scoring system is the same as using a range in a finer-grained system.
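To make the coarser-grain point concrete, here is a minimal Python sketch that collapses the commonly used 80-100 span into fixed three-point bands. The band width, the floor of 80, and the name `to_band` are assumptions picked for illustration; they are not taken from the World of Fine Wine chart.

```python
def to_band(score, width=3, floor=80):
    """Return the (low, high) band a 100-point score falls into.

    Assumption (illustrative only): bands are fixed, `width` points wide,
    and start at `floor`, so 80-100 splits into 80-82, 83-85, ..., 98-100.
    """
    if score < floor:
        raise ValueError(f"score {score} is below the floor of {floor}")
    index = (score - floor) // width
    low = floor + index * width
    return low, low + width - 1


for score in (89, 91, 92, 100):
    low, high = to_band(score)
    print(f"{score} -> {low}-{high}")

# The 21 scores from 80 to 100 collapse into 7 bands.
print(len({to_band(s) for s in range(80, 101)}))
```

One thing the sketch makes visible: fixed bands are not centered on the score. Here 89 and 91 share a band while 91 and 92 do not, even though they are only one point apart, which is not quite the same thing as saying "91, give or take a point".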
Cool chart! I was going to post something on the “100 pt scale really isn’t a 100 pt scale” but this is much better. I’m glad you beat me to the punch.
I would agree to a point. Things like letter grades or stars have far fewer gradations, and I agree that they therefore tend to function as a range (and serve a pretty good purpose for the entire rest of the critical world which seemingly isn’t as obsessed with putting an exact number on everything). I think we’d be better off if scoring was done in some sort of non-numerical fashion like stars or letter grades, but I don’t see that happening anytime soon.
I disagree about mathematically identical; I’d say functionally identical. Because stars and letter grades aren’t numbers. If you are giving an exact score, you are not giving a range, regardless of the boundaries of your scale. This would be as true of the 20 point scale as the 100 point scale, in particular, as you point out, because the 100 point scale really isn’t a 100 point scale. The point of a range is the admission of some error (which is of course why it will never happen). Someone using it has the balls to say “Today at this time in these conditions I think this wine is a 91 (or 17 if you prefer), but I recognize that if I did this same exercise tomorrow, I might say 90 or 92.” The smaller the scale the smaller the error of course, so if you only had a 10 point scale it might be insignificant for reporting purposes, but it would still be there.
What would be truly interesting (and absolutely impossible to do from the outside) is a study on how much different tasting factors add to the error in a score (error would be defined as difference in retastings). You’ve got blind vs. non-blind, number of wines in the lineup, conditions of the tasting (office vs. restaurant vs. winery), presence of others, varietal variation in the lineup, number of times tasted before scoring, and others I’m sure. Since the whole thing is pseudo-scientific to begin with, I don’t think you could take the resulting numbers as absolutes, but they’d certainly be good barometers of what someone’s process was.
That chart is even more pseudo-scientific than the points exercise to begin with! If a score is just a shorthand for someone to say how much they enjoyed a wine, then there’s expressive value in saying that a wine is a 90/100 or a ****/, but there’s no Rosetta Stone capable of proving that what one person means by 90/100 is the same thing that someone else means by ****/. It’s sort of like trying to translate text. Yeah, you can get a decent approximation of what someone meant to say, but you’ll never capture exactly the same thing. And the purported exactitude is the most annoying defect of the 100-point system in the first place.
Keith, I don’t think anyone seriously considers scores “scientific” or “exact”. They are useful if problematic shorthands when discussing or evaluating wines. Although clearly people will differ, I have found also that there are people who align well, at least with my palate, and use of scores can be helpful in that instance. Not perfect, not scientific or exact, but helpful. If you don’t like them, I’m sure you don’t use them. But most people do find them useful.
If you feel that the 100-point system is particularly defective in its purported exactitude, is that because you feel there is some, presumably coarser grained, scoring system which is better suited to writing tasting notes? Or rather is it that you don’t agree with any graded scale of ratings, even “thumbs up/thumbs down”, which amounts to a graded scale with two grades?
Re. the question as to whether stars, glasses or letter grades are identical to number grades, they are identical. The only questions are the coarseness of the grain, very roughly how they calibrate (of course this can’t be done exactly, but given the subjectivity of the whole enterprise at least we can come up with broad guidelines), and what the symbols look like. E.g.: four stars or four bicchieri could be written with four numbers rather than four symbols; letter grades are typically calibrated to numerical scales for grading exercises in schools.
I like the stars because there is less exactitude. Someone shouldn’t go look at my score and see whether or not I liked x wine x amount. I have had plenty of wines that merit high scores due to their craftsmanship but do not pull at me from an emotive standpoint. Points make everything so objective, when wine is hardly so.
I actually don’t have an objection to rating wine - as I said, it’s a shorthand useful for expressing how enjoyable something is. An imperfect shorthand, but obviously communicative of something despite its imperfections. I do have an objection to the way they’re sometimes treated as scientific (and I will dig up some egregious examples of that from Another Board if you don’t believe me!).
Re. the question as to whether stars, glasses or letter grades are identical to number grades, they are identical. The only questions are the coarseness of the grain, very roughly how they calibrate (of course this can’t be done exactly, but given the subjectivity of the whole enterprise at least we can come up with broad guidelines), and what the symbols look like. E.g.: four stars or four bicchieri could be written with four numbers rather than four symbols; letter grades are typically calibrated to numerical scales for grading exercises in schools.
That I can’t agree with, and here I think you’re slipping into the scientific fallacy. The difference between a five-star system and a 100-point system is not just in the coarseness of the grain. That only makes sense if you’re looking at scores like the amusement-park game where you see how hard you can swing a hammer and the marker travels up the pole and (maybe) hits the bell. In that situation you can replace the markers on the pole with a measure in kilograms or pounds and always know that when the marker hits x number of kilograms it equals y number of pounds. But that is because there is an objective definition of what a kilogram is and what a pound is. There is no objective definition of what a point is and what a star is. Also, kilograms and pounds are different measures of the same thing. It is not necessarily true that one person’s points, another person’s points, or some other person’s stars are different measures of the same thing. For example, Robert Parker says his point ratings are arrived at by adding a certain number of points for color, a certain number of points for aging potential, a certain number of points for overall enjoyment, etc. (Assume for the sake of argument that this is both sensible and possible.) Maybe some other guy arrives at his point ratings by only factoring in overall enjoyment. Parker’s rating might be 90/100 and the other guy’s rating 9/10, but just because they’re the same ratio doesn’t mean that one equals the other, because the two people aren’t saying the same thing. The number is a shorthand for something that’s not quantifiable, and that’s why you can’t (and shouldn’t) think you can convert from one system to another.