[eDebate] Evolution of Mutual Preference - Part 1

Gary Larson Gary.N.Larson
Sun Apr 16 15:49:19 CDT 2006

In light of the ongoing conversations about the pros and cons of mutual
preference judge assignment and particularly discussions surrounding the
size of categories at tournaments such as the NDT, I would like to
initiate a discussion about the evolution of mutual preference and
potentially propose an experiment for the next big step in its evolution.

First, some background.  In the 30+ years I've been involved with
assigning judges to debate rounds, four options have presented
themselves:  tabroom preference, participant preference (typically
called mutual preference), random, and some version of any of those
three with the inclusion of participant strikes and preclusions.  Prior
to the genesis of participant preference systems, to be honest, most
tabrooms used some version of tabroom preference.  Even when "random"
judging was advertised, it was rather rare that tabrooms didn't put
their thumb on the scale with judgments that "that judge CAN'T hear that
round" in what they perceived were the most egregious problems.  Of
course, the weight of that thumb varied from tabroom to tabroom and
could never realistically impact all teams and judges with strictly
equal protection.  It was primarily for that purpose that I began to
develop computerized judge placement algorithms 20 years ago.  In
addition to the time advantage created by automation, my principal
objective was to insulate the judge placement process from the vagaries
of tabroom intervention, whether the tabroom was implementing random
judge assignment or some version of mutual preference.  Oddly enough, in
early CEDA tournaments using the computer, the principal criticism was
that all of a sudden random really did mean random with all of the
"strange" judge assignments that randomness might entail.

Our intuitions after 20+ years of at least limited use of mutual
preference are that it produces some significant advantages along with
some disadvantages.  Almost no one really has the intuition
that complete randomness is appropriate.  Most teams are always going to
prefer some judges more than others with the result that the decision in
the debate might be affected either directly or indirectly (by a change
in the team's performance).  And whether the outcome is directly
impacted, participant satisfaction and the perception of fairness
definitely is.  It is also the case that almost everyone admits that
some judges are objectively more "capable" of rendering good and
appropriate decisions in various debate rounds (while we will vehemently
disagree on who they might be and what might count as effective
adjudication).  So I think that matching judges to rounds using
something other than strict randomness is justified.

The question is HOW?  I also take as a given that tabroom preference is
no longer a genuinely viable option.  While almost all tabrooms have
fair and honorable people staffing them, no one is either wise enough or
virtuous enough to play God for every debate in a tournament.  And most
of our tournament schedules don't have the time between rounds that it
demands.  It's perhaps an open question as to whether strikes by
themselves solve the problems associated with random judging.  For a
variety of reasons, I think that they don't, particularly in a world
with alternative models of debate practice that are embraced by a
minority of participants.  If, in a clash of civilizations debate, one
team can strike nearly all of the judges sympathetic to the alternative
view while the other team can strike only a small percentage of
potentially hostile critics, we haven't really created level ground.  
(PS - it's intriguing that while I was editing this Ede was writing
about Louisville adopting a random judge strategy for 06-07)

But before we embrace mutual preference judging, we need to address
three potentially damning critiques.

First, mutual preference potentially permits participants too much
control over their judging, thereby sacrificing some of the potential
pedagogical value of judge adaptation and perhaps entrenching certain
argument styles or elitism in the community.  In its early days, mutual
preference didn't really have much of a limiting effect.  It was not
that long ago that teams still received CC judges in ABCX systems even
in rounds prior to elimination.  A decade ago, the operating principles
of the NDT virtually mandated at least one CC judge per panel in rounds
1-2.  But more recently, the increasing power of computerized algorithms
has significantly limited the range of judges that a team will likely
receive prior to elimination from a tournament.  At CEDA Nats, for
instance, just over 50% of all judge assignments were 1's (the top 11%
of the pool) while 93% were in the top third (1-3).  Only 1-2% were in
the bottom half and all but 1 of those involved teams that had been
eliminated.  It is an open question as to whether this is TOO good.  As
one who continually strives to make it better, I'll have to leave to
others the question of how good is good enough or potentially too good. 
But once again, for the potentially marginalized either because of
argument style, regional travel or some other factor, being assigned
judges in the top third of one's preference sheet might be much more
critical than we imagine.  And while we might wonder about the half of
the judges that teams might rarely if ever see, it remains the fact that
teams and their coaches still report dissatisfaction with judges ranked
4 or higher out of 9 and don't even get very excited by 3's.  Perhaps
this is the curse of rising expectations but for a lot of teams it is
still the discomfort of facing at a national tournament a pool of 180
judges most of whom they have no experience with (whether or not they've
been to tournaments without mutual preference).
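To make the category-based placement described above concrete, here is a toy sketch of mutual preference assignment.  It is a hypothetical illustration, not the actual algorithm used at CEDA Nats or the NDT: each team ranks every judge 1-9 (1 = most preferred), and for each debate the sketch greedily picks the available judge whose worse ranking between the two teams is lowest, breaking ties by the sum, so the pick is both highly preferred and mutual.  All team and judge names are invented.

```python
# Toy sketch of mutual-preference judge placement (hypothetical
# illustration only, not the actual tabroom algorithm).
# prefs[team][judge] is that team's 1-9 ranking (1 = most preferred).

def place_judges(debates, judges, prefs):
    """Assign one judge per debate, greedily maximizing mutuality."""
    available = set(judges)
    assignment = {}
    for aff, neg in debates:
        # Score each free judge by (worse ranking, total ranking):
        # minimizing the worse ranking keeps the assignment mutual.
        best = min(available,
                   key=lambda j: (max(prefs[aff][j], prefs[neg][j]),
                                  prefs[aff][j] + prefs[neg][j]))
        assignment[(aff, neg)] = best
        available.remove(best)
    return assignment

prefs = {
    "Team A": {"J1": 1, "J2": 3, "J3": 9},
    "Team B": {"J1": 2, "J2": 1, "J3": 9},
    "Team C": {"J1": 1, "J2": 2, "J3": 4},
    "Team D": {"J1": 3, "J2": 1, "J3": 5},
}
result = place_judges([("Team A", "Team B"), ("Team C", "Team D")],
                      ["J1", "J2", "J3"], prefs)
# Both debates end up with judges in both teams' top two categories.
```

A real implementation would of course optimize across all debates simultaneously (an assignment problem) rather than greedily, which is part of why the computerized algorithms can keep 93% of assignments in the top third of teams' sheets.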

Second, mutual preference is potentially exclusionary with respect to
critics (though potentially no more so than strikes if we permit 20%
strikes).  While tournaments can use mutual preference with a tight pool
that ensures that no obligated rounds are lost, most tournaments that use
mutual preference don't take this strategy to its conclusion.  Faced
with assigning C judges (or 6-8 out of 9) when teams are expecting A's,
tournament directors find it hard not to exclude the judges who would
only fit as C's.  Apart from the cost of replacing critics (mitigated by
volunteer rounds if the community is willing), it is important to ask
whether the composition of the group of "excluded" critics contributes
to the alienation of marginalized groups (whether new coaches, women,
ethnic minorities, or those with significantly different judging
paradigms).  Neil Berch has consistently highlighted the differential 
percentage of male and female critics particularly in outrounds. 
Experience at this year's CEDA Nats again reflected Neil's concerns,
though not always in the way that we might expect.  It was the case that
female critics in the pool had a lower aggregate pref than male critics
in the pool.  But it was not the case that they were excluded more
frequently from judging their full commitment.  It was also not the case
that they were assigned to less important prelim rounds than male
critics (within broad groups of essentially equal preference).  In fact,
it should be noted that groups of critics who are often perceived as
marginalized frequently heard their full commitment.  The Louisville
critics, for instance, not only heard their full commitment but four
additional rounds that they graciously volunteered to the community
(kudos to Ede and his staff for volunteering).  The most frequently
"excluded" critics were those who were unknown though no strong patterns
emerged this year.  The differential representation of women in elim
rounds represented not only the fact that fewer women had late elim
obligations, but also a tendency to remove them from panels when they
were on strike cards (for whatever reason).  The computer actually
assigned a higher percentage of women to panels in every elim round
other than doubles than were represented in the judging pool at large at
that point in the tournament.  But this still needs an open and honest
discussion.  While we might correctly note that the best solution is for
those completing pref sheets to change their rankings, it is also clear
that for years now we've made little progress in seeing this happen.  

Finally, mutual preference as currently designed might not treat all
teams that fill out sheets equally.  While we might argue that every
team is filling out the same sheet with the same constraints and that
the computer assigns judges to each round without any special favoritism,
the category systems that we currently use might not affect all teams
equally.  Since my "experiment" addresses this issue directly, I'll
defer this discussion to part two of an already too long posting.
