[eDebate] 50 speaker point scale at Wake
Fri Nov 2 15:20:53 CDT 2007
Ross suggests the following guidelines:
49-50: Brilliant. Hard to imagine a better performance.
47-48: NDT elim worthy performance.
45-46: Powerful but not extraordinary. Workmanlike break round or early
43-44: Good stuff, but missing what it takes to break into the top
40-42: Decent. More than one area needs improvement.
This, at most, opens up the range by a couple more points.
50 = 30
49 = 29.5
48 = 29
47 = 28.75
46 = 28.25
45 = 28
44 = 27.75
43 = 27.5
42 = 27
41 = 26.5
40 = 26
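For reference, the proposed correspondence above can be written as a simple lookup. This is just an illustrative sketch of the table as posted; the function name is mine, not part of any proposal:

```python
# Proposed correspondence between the 50-point Wake scale and the
# conventional 30-point scale, taken directly from the table above.
WAKE_TO_STANDARD = {
    50: 30.0, 49: 29.5, 48: 29.0, 47: 28.75, 46: 28.25,
    45: 28.0, 44: 27.75, 43: 27.5, 42: 27.0, 41: 26.5, 40: 26.0,
}

def to_standard(wake_points):
    """Return the 30-scale equivalent of a 50-scale score, if listed."""
    return WAKE_TO_STANDARD.get(wake_points)

print(to_standard(47))  # 28.75
```

Note the mapping is not uniform: the gap between adjacent 50-scale scores is sometimes a quarter point and sometimes a half point on the 30 scale, which is part of why the wider scale only "opens up the range by a couple more points."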
And it does so at the expense of upsetting a relatively strong
community consensus about the general spread of points (a consensus
which admittedly includes a constant upward drift). We all know what
a 28 means, even if we have to accept that exactly how that is defined
by each individual judge is always in question. We don't know what a
48 means, except that it's pretty good.
Someone posted a year or two ago (maybe it was Dr. Larson?) that
humans really do have a fairly limited capacity to make judgments like
these, and that the more space you offer, the more people will
eventually bunch themselves back up in order to eliminate the
confusion created by gaps between points with no real capacity to
discriminate.
Maybe I'm the only one, but I don't really see the current system as
particularly broken. As has been pointed out, speaker points are a
better way to predict who will win a given debate than actual win-loss
record, which I think is pretty telling. For myself, I find it
occasionally difficult to pick between a 28 and a 28.5 or a 27/27.5 or
something, but I usually don't find it difficult to communicate my
general thoughts via points.
Frankly, most debaters are pretty good and very few are extraordinary.
Given that, the range of "needs some serious work" at 26.5 and
"extremely good" at 29 seems perfectly acceptable. That's six degrees
of gradation, which includes some room on the outside for "something
went terribly wrong" at 26 and "spectacular" at 29.5.
I think it is impossible for a tournament with 100 judges, each making
subjective decisions about where to draw these lines, to ever
objectively reflect how well a person spoke. Adding more gradations only seems
to encourage community confusion about exactly what distinguishes a
given speech from another.
This is not to say I'm against the Wake experiment. I'm curious to
see if it has a meaningful effect on points, speaker awards, clearing
teams, etc. But I don't think it will be any more "correct" than a
normal tournament, because there's a degree of precision desired here
that is simply impossible under any circumstances.
As for the season-long judge variance idea, the main problem with that
is that all things are not equal. If judging were totally random, it
would all balance out, but it's not random. MPJ (mutual preference
judging) means judges see
particular teams, particular styles, and particular levels of quality
debates more often than others.
The same problem applies to judge variance over a single tournament;
of course, the bigger the sample size, the less of a problem it becomes.
But I have a feeling the sample size will remain FAR too small to
produce data any more useful than just assessing the points the judge
decided to assign. Even the people who judge the most can't have more
than 80 or 90 prelim rounds over a year. With MPJ, and a variety of
tournament contexts (most people adjust their scales depending on the
general quality of debates they will be hearing that particular
weekend), I have a hard time believing even that is enough time to
generate a meaningful sample. And for the folks who hear 20 prelims a
year, a few more low-point debates than expected could radically
affect the rest of their points for the whole year.
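To put a rough number on that worry (a back-of-the-envelope sketch of my own, not data from anywhere): suppose a judge who normally gives 28s has three debates come in at 26. The shift in their season average depends heavily on how many prelims they judged:

```python
# Back-of-the-envelope: how much do a few low-point rounds move a
# judge's season average at different sample sizes?
def season_average(typical, n_rounds, low=26.0, n_low=3):
    """Average points if n_low rounds score `low` and the rest score `typical`."""
    return (low * n_low + typical * (n_rounds - n_low)) / n_rounds

# Judge who normally gives 28s, with three 26s mixed in:
shift_20 = 28.0 - season_average(28.0, 20)  # 20 prelims a year
shift_80 = 28.0 - season_average(28.0, 80)  # 80 prelims a year

print(shift_20)  # 0.3  -- over half a gradation on the usual scale
print(shift_80)  # 0.075
```

At 20 rounds the same three debates move the average four times as far as at 80, which is the core of the small-sample objection.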
Once again, it would certainly be VERY interesting to see this data,
but I'm skeptical it would produce more "accurate" results.
And I just can't help believing
Though believing sees me cursed
"You Are the Generation That Bought More Shoes and You Get What You Deserve"