[eDebate] Solt on the point scale

Gary Larson Gary.N.Larson
Tue Nov 6 18:58:34 CST 2007

David makes some excellent points about attempting to create experimental controls.  But I do need to take exception to a couple of things.
David suggests that I'm merely speculating as to whether the process of recording two scores would itself impact one or both of the scores given.  While my "speculation" is based on a very well established literature base in experimental psychology, let's imagine for a moment that I'm wrong.  If people do give exactly the same scores on the 30-point scale and Wake uses those scores as the official scores for the tournament rather than the 50 or 100 point alternative, Ross hasn't at all addressed the issue he posed.  While not everyone agrees, there's a pretty strong consensus that the current points reflect two potential problems: inflation and compression.
David then suggests that if the experiment created different scores it would be because judges are being more thoughtful than usual.  I trust that this will be the case, but would then ask how to inject the same thoughtfulness into tournaments once the experiment ends.
Whether or not the scores on the 30-point scale prove to be the same or different, David then sets an interesting condition for evaluating whether the new scale works.  If judges translate the old exactly to the new then the new is unnecessary.  If the new scale, however, doesn't distribute normally as the old times 3.33 +/- 2, then the new scale isn't reliable.  But, of course, this assumes that the current scale is fully reliable, precisely one of the concerns on the table.  In fact, if David is right that the only problem is one of inadequate discrimination, then any of the strategies that start with the old scores, followed by a translation into new scale equivalents, followed by the tweaking up or down 1 or 2 points can be immediately adopted and no "research" is required.  In such a case the only difference between Wake's solution and USC's solution would be whether one or two additional discriminations between each option is better.  In either case, continuing to use the old scale without translation (during the research phase) serves no purpose.
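To make the arithmetic concrete, here is a minimal sketch of the translate-then-tweak strategy described above: multiply the old 30-point score by 3.33, then move up or down by at most 2 points.  The function names, the rounding to whole points, and the cap at 100 are assumptions made for illustration; none of them comes from Wake's or USC's actual instructions.

```python
def translate_to_100(old_score, tweak=0):
    """Map a 30-point score to a 100-point equivalent (old * 3.33), then tweak.

    Assumptions for illustration: results are rounded to whole points,
    the tweak is limited to +/- 2, and the result is capped at 100.
    """
    if not -2 <= tweak <= 2:
        raise ValueError("tweak must be between -2 and 2")
    base = round(old_score * 3.33)
    return min(100, base + tweak)


def tweak_window(old_score):
    """Lowest and highest new-scale scores reachable from one old score."""
    return translate_to_100(old_score, -2), translate_to_100(old_score, 2)


# Three adjacent half-point scores on the old scale.
for old in (27.5, 28.0, 28.5):
    print(old, translate_to_100(old), tweak_window(old))
```

Under these assumptions, 27.5, 28 and 28.5 translate to 92, 93 and 95, with tweak windows of 90-94, 91-95 and 93-97.  The windows overlap, so under this approach the new score is still anchored to the judge's old-scale judgment.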
The only outcome that David is rightly concerned about is if some judges use a 70-100 scale while others use a 90-100 scale even though the instructions might suggest otherwise.  Of course, even that's not unique since some currently use a scale from 26-28, others use 27.5-28.5 and some others use 29-30.
