[eDebate] Fwd: Re: Response to Kloster
Thu Nov 23 07:55:35 CST 2006
One data point (and Gary obviously has many more to use): at our JV/Novice Nationals last year, we used a 9 category preference system. There were 167 debates (about 40% of all prelims) where one team
preferred a judge more than the other team did. The team that preferred the
judge more won just 83 of those debates (49.7%). If we exclude the 14
debates where there was a difference of more than one category (8 of which
were won by the team that preferred the judge more), then the team that
preferred the judge slightly more won 75 out of 153 debates (49.0%).
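For anyone who wants to check those percentages, here is a minimal Python sketch. The counts are the ones quoted above; the exact two-sided binomial test against a 50/50 coin flip is my own addition for illustration, not something the tab software is known to report, and the function names are hypothetical.

```python
from math import comb

def win_rate(wins, debates):
    """Share of debates won by the team that preferred the judge more."""
    return wins / debates

def binom_two_sided_p(wins, n, p=0.5):
    """Exact two-sided binomial test (an assumed choice of test):
    add up the probability of every outcome that is no more likely
    than the observed one under win probability p."""
    probs = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    observed = probs[wins]
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)

# Counts taken from the message above.
all_wins, all_debates = 83, 167
big_gap_wins, big_gap_debates = 8, 14   # difference of more than one category

# Removing the large-gap debates leaves the "slightly more" group.
slight_wins = all_wins - big_gap_wins
slight_debates = all_debates - big_gap_debates

for label, w, n in [("all", all_wins, all_debates),
                    ("slight", slight_wins, slight_debates)]:
    print(f"{label}: {w}/{n} = {win_rate(w, n):.1%}, "
          f"two-sided p = {binom_two_sided_p(w, n):.2f}")
```

A large p-value here would only mean these counts are consistent with a coin flip; it would not show that preference has no effect at all.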
West Virginia University
----- Original Message -----
From: Jean-Paul Lacy<mailto:lacyjp at wfu.edu>
To: Gary Larson<mailto:Gary.N.Larson at wheaton.edu> ; Edebate<mailto:edebate at ndtceda.com>
Sent: Thursday, November 23, 2006 3:12 AM
Subject: Re: [eDebate] Fwd: Re: Response to Kloster
>As a final caveat, none of us are as smart as we think we are in our
>rankings. It is still the case that judge rankings are a very poor
>predictor of who wins a debate, being slightly worse than chance.
If rankings are slightly worse than chance:
Are we collectively bad at picking judges? Or does this statistic prove
that we can collectively pick good judges?
If we're only slightly off pure chance, maybe mutual preference is becoming
strong enough that we can pick fair judges.
Maybe debaters and coaches are getting smart enough to pick the judges who
will do their best to determine the fair winner. A mutual 100 judge can
only pick one winner.
The bottom line is the holy grail--every team in the tournament gets the
judge who they think can fairly judge their debate.
The real question is: How much lack of mutuality is a predictor of who
wins? Or, when does the difference predict an outcome?
The point where it becomes a significant difference should be the cut-off
for mutuality in the whole preference vs mutuality mess.
--JP "still learning statistics" Lacy
ps--While I agree in principle with having a "bright line" or "cap" for
strikes, shouldn't people be able to figure this out for themselves if they
filled out a sheet that made their Z-score of a 0 LESS than -1? The numbers
are on the sheet as you submit them.
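As a rough illustration of that check, the Python sketch below computes the Z-score of a 0 on a made-up sheet. The ratings are hypothetical, and using the population standard deviation is an assumption; the tab software may normalize differently.

```python
from statistics import mean, pstdev

def z_score(value, ratings):
    """Z-score of one rating relative to the whole sheet.
    Uses the population standard deviation (an assumption)."""
    return (value - mean(ratings)) / pstdev(ratings)

# Hypothetical 0-100 sheet with a single 0 (a strike).
sheet = [100, 95, 90, 90, 85, 80, 70, 60, 0]

z_of_zero = z_score(0, sheet)
print(f"Z-score of a 0 on this sheet: {z_of_zero:.2f}")
print("Below -1:", z_of_zero < -1)
```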
pps--Given an unfettered 0-100 system, I disagree with translating things
into ordinals for an additional comparison point for the tab room. Ordinals
are useful, but they don't reflect how teams fill out a sheet in an
unfettered 0-100 system. People are counting on the Z score to reflect
differences between clusters when they fill out an unfettered 0-100 sheet.
Ordinals can't reflect that.
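A small Python sketch of the difference, using a hypothetical six-judge sheet with two clusters. The ratings, the function names, and the choice of population standard deviation are all assumptions for illustration.

```python
from statistics import mean, pstdev

def z_scores(ratings):
    """Z-score of every rating on the sheet (population std dev assumed)."""
    mu, sigma = mean(ratings), pstdev(ratings)
    return [(r - mu) / sigma for r in ratings]

def ordinals(ratings):
    """Rank 1 for the highest rating; ties fall in sheet order."""
    order = sorted(range(len(ratings)), key=lambda i: -ratings[i])
    ranks = [0] * len(ratings)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks

# Hypothetical sheet: three judges near the top, three near the middle.
sheet = [100, 98, 96, 50, 48, 46]

z = z_scores(sheet)
ranks = ordinals(sheet)

# Gap inside the top cluster vs. gap between the two clusters.
z_within, z_between = z[0] - z[1], z[2] - z[3]
rank_within, rank_between = ranks[1] - ranks[0], ranks[3] - ranks[2]

print("ordinals:", ranks)
print("z-scores:", [round(v, 2) for v in z])
print("z gap within cluster vs between clusters:",
      round(z_within, 2), round(z_between, 2))
```

On a sheet like this the ordinals step by one everywhere, while the Z-scores keep the large gap between the two clusters.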
ppps--The 9-0 system isn't good enough. Has any system beat ordinals in
terms of overall preference? Despite whines to the contrary, ordinal ranking
is the easiest way to fill out a preference sheet. Get a stack of 4x6 cards
and put them in order if you can't figure out how to do it on a computer.
Honestly, it is much easier to figure out if X judge is better than Y than
if X judge should be deemed equal to your A+ judges. If sheet gaming (as
reflected in categorical 9-0 prefs) is valued by the community, it is
still preserved in an ordinal system. Add a guaranteed strike cap or
"cut-off" to that system and you have the best we can do for the time being.