The Evolution of Sweet Maria's Cupping Descriptions


Our New Coffee Scoring System

In December 2008, I shifted our cupping scoring system. I know that for those of us living in a "base 10" society, a system based on 110 points does not seem to make much sense. And some people have expressed concern that customers will see a shift to high scores – a sort of "score inflation." I guess there is also something of a Spinal Tap issue – the scores now go to 11(0) so they are better than 10(0).

I want to lay out my reasons for changing this system – why I think it is better and will lead to more accuracy in the scores.

At one time, a couple of years back, I became so frustrated with cupping numbers that I was going to abandon them completely and migrate to a purely "descriptive" cupping system. In a sense, I still believe that descriptors are more important than numbers, and especially more important than the dreaded "total" score for each coffee. After all, would you buy a banana if someone told you it achieved a 94.7 score in "Banana Review" magazine ... but you hate bananas? Okay, silly example, but it has been difficult to get the numbers to accurately communicate the essential and meaningful qualities of the coffee they purport to describe. Our new system gives me hope, though.

In brief, the bane of my existence (in coffee cupping at least) is total points. If a coffee is an 87, does that mean you will like it more than an 86? No. The total's use is limited, but it does communicate something about quality ... right? The individual categories tell you more in detail about the coffee. For years, I resisted adding categories that SCAA and CoE use (for example, Clean Cup/Uniformity: if you make 5 cups, how similar are they?) for several reasons. I found those to be punitive to dry-processed (DP) coffees, for example. My thinking on this has changed, as I have come to find that including a Clean Cup/Uniformity score can round out the picture of a coffee. Now I see these categories as expressive: if a coffee scores 9 on flavor, 9 on complexity, and 5 in Clean Cup, you know you are dealing with a potentially fantastic DP coffee that has some funky, rustic, earthy, unconventional flavors.

The reason I want to change the scoring is that it has become too compact, the range too tight, and the numbers fail to be expressive when that happens. The goal, after all, is to communicate effectively, using numbers and graphs as a supplement to the written word. The SCAA system goes over 100 points as well (I believe 106 is the top score) because they discovered that the extra points give the judge the ability to score a coffee more aggressively in each category. Consider my example of a great DP coffee that has low clean cup, uniformity, and maybe acidity, and high complexity, flavor, body, etc. The result is a very expressive set of numbers, and a corresponding graph that has a really distinct form, with extreme spikes in the spider web. If these numbers total, say, 83, which would mean it is "Specialty" coffee but of plain, ordinary character, then the cup certainly merits a "cupper's correction" to communicate that no, this is a really fantastic, if somewhat odd and imbalanced, coffee. So, say, a 9 point cupper's correction brings it all the way to 92.
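The arithmetic of that example can be sketched in a few lines of Python. The individual category scores here are invented purely for illustration (they were picked to add up to the 83 mentioned above), and the category names are borrowed from the list given later in this article; only the 83, the 9 point correction, and the resulting 92 come from the text.

```python
# Hypothetical scores for a "great but odd" dry-processed coffee.
# These numbers are made up for illustration; they are not a real review.
scores = {
    "Dry Fragrance": 9,
    "Wet Aroma": 9,
    "Body/Movement": 9,
    "Brightness/Acidity": 7,   # "maybe" low acidity
    "Flavor/Depth": 10,
    "Finish/Aftertaste": 9,
    "Sweetness": 9,
    "Clean Cup": 5,            # low clean cup
    "Uniformity": 6,           # low uniformity
    "Complexity": 10,
}

category_total = sum(scores.values())
cuppers_correction = 9
final_score = category_total + cuppers_correction

# Spread between the highest and lowest category: a rough stand-in for
# how "spiky" the spider graph would look.
spread = max(scores.values()) - min(scores.values())

print(category_total)  # 83
print(final_score)     # 92
print(spread)          # 5
```

The spread of 5 points between the best and worst category is what would show up as spikes in the graph, while the correction carries the overall impression that the category total alone would miss.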

I know that for some this sounds like cheating, but let me tell you the reality, after 18 or so Cup of Excellence juries and countless other competitions and group cuppings. What is happening among judges is that they don't truly use the category scores to lead to a final score. As the cup cools, they form an idea of a total score, and they adjust the individual category scores to justify it. In a way, that is the correct thing to do ... after all, if you think a coffee is fantastic and somehow your numbers total 83, have you done a good job expressing the overall quality of that coffee? Likewise, if a coffee happens to rate highly in individual categories, but overall the cup is not attractive, do you do it justice to rate it 88 or 92?

A few days ago, I cupped all the expensive coffees from this year on one table. I roasted archived samples of the Esmeralda Geshas, Aida's Grand Reserve, Guatemala CoE#1 El Injerto and some others. The latter two have balance and complexity, and are highly enjoyable. Esmeralda #2 and #3 have soaring acidity, so different from all the other high-priced coffees, just astounding. But the body and the balance are not there. They are thin, which also affects the length of the aftertaste and the complexity a bit. Dinging them on those 3 scores could lead to an 86-88 on many forms. That would be sooooo wrong.

So, what are the actual changes that bring us to a 110 point scale? Previously, we rated coffees according to six categories (Dry Fragrance, Wet Aroma, Body/Movement, Brightness/Acidity, Flavor/Depth, and Finish/Aftertaste) along with a 5 point cupper's correction. In addition to those categories, we're now adding scores for Sweetness, Clean Cup, Uniformity, and Complexity. The other major change is that we're now grading every attribute from 1-10, whereas the old system graded some of them from 1-5.
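One way to see how these changes add up to 110 is the small Python sketch below. The category names and the 1-10 range come from the paragraph above; the size of the cupper's correction under the new system is not stated here, so the 10 point ceiling is an assumption inferred from 110 minus ten categories of 10 points each. The constant and function names are hypothetical.

```python
# Category names are taken from the article; the constants and function
# names are hypothetical, chosen only for this illustration.
CATEGORIES = (
    # the six original categories
    "Dry Fragrance", "Wet Aroma", "Body/Movement",
    "Brightness/Acidity", "Flavor/Depth", "Finish/Aftertaste",
    # the four newly added categories
    "Sweetness", "Clean Cup", "Uniformity", "Complexity",
)
MAX_PER_CATEGORY = 10        # every attribute is now graded 1-10
ASSUMED_MAX_CORRECTION = 10  # assumption: inferred from 110 - 10 * 10


def max_total():
    """Highest possible score under the assumptions above."""
    return len(CATEGORIES) * MAX_PER_CATEGORY + ASSUMED_MAX_CORRECTION


def total_score(scores, correction=0):
    """Sum the ten category scores and add the cupper's correction."""
    missing = [name for name in CATEGORIES if name not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    for name in CATEGORIES:
        if not 1 <= scores[name] <= MAX_PER_CATEGORY:
            raise ValueError(f"{name} must be between 1 and {MAX_PER_CATEGORY}")
    # Only the upper bound is checked: the text does not say whether a
    # negative correction is allowed.
    if correction > ASSUMED_MAX_CORRECTION:
        raise ValueError("correction exceeds the assumed maximum")
    return sum(scores[name] for name in CATEGORIES) + correction


print(max_total())  # 110
```

Requiring all ten categories before totaling keeps scores comparable from one coffee to the next, which is the same reason for standardizing the graphs.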

We will be graphing all of the scores so they can be compared (even overlaid). So the shapes of the spider graphs will give you a good visual clue about the cup character, and will be standardized (the lack of which was a key criticism of the current spider graphs). Below I have a comparison of the spider graphs of two coffees in the 100 and 110 point systems so you can see what difference it makes. I plan to revisit as many archived coffees as possible, so they can be compared under the new system.

Another point about adding the new category is that this is information that already existed in the review, and/or in more experienced home roasters' store of knowledge. That is, Costa Rican coffees are generally very clean cups, and Sumatran coffees are generally rustic. The new category gives a score to, and so quantifies, a key part of the experience of a coffee, so the numbers can develop a fuller picture/graph.



