In an ideal world, experiments involve the random assignment of evaluation participants, often called “subjects” in the design of such a study, to different treatments (e.g., two different digital library interfaces, one that uses only icons and another that combines icons with text labels). In the real world, it is more common for subjects in an experimental group to receive some sort of treatment (e.g., access to a digital library) while subjects in a control group receive no treatment. The following figure illustrates the design of the latter form of evaluation.
                      Time 1 (Pre)         Time 2 (Post)
Experimental Group    R    O          X    O
Control Group         R    O               O

R = Random Assignment; O = Observation; X = Treatment
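The design in the figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure taken from the sources cited here: the function names (assign_groups, mean_gain, treatment_effect) are hypothetical, observations are assumed to be numeric scores stored per subject, and the difference in mean gains is just one simple way to summarize the comparison between the two groups.

```python
import random
from statistics import mean


def assign_groups(subjects, seed=None):
    """The 'R' step: randomly split subjects into two groups.

    Returns (experimental, control). With an odd number of subjects,
    the control group receives the extra subject.
    """
    rng = random.Random(seed)  # seed is optional, for reproducible assignment
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_gain(pre, post, group):
    """Average change from the pre observation to the post observation.

    pre and post are assumed to be dicts mapping subject -> numeric score.
    """
    return mean(post[s] - pre[s] for s in group)


def treatment_effect(pre, post, experimental, control):
    """Mean gain of the experimental group minus mean gain of the control group."""
    return mean_gain(pre, post, experimental) - mean_gain(pre, post, control)
```

In use, an evaluator would call assign_groups on the list of participants, collect the pre observations from both groups, give only the experimental group the treatment (e.g., access to a digital library), collect the post observations, and then call treatment_effect. A real study would also test whether the difference is statistically significant, which this sketch omits.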
When methodological authorities such as Suchman (1967) (quoted on the previous page) and his adherents describe an evaluation approach as foolproof or infallible, people listen. Hence, it is not surprising that the experimental approach to evaluation remains deeply entrenched in the minds and actions of many social scientists and evaluators today, as well as many evaluation clients in the context of digital libraries. For decades, some experts have held up experimental methods as the “gold standard” for evaluation and viewed every other approach as inferior (Campbell & Stanley, 1966). This model remains the method of choice for educational research and evaluation in certain circles today (Shavelson, Towne, & the Committee on Scientific Principles for Education, 2002).
However, the continuing advocacy of experimental methods by many evaluators (e.g., Fitz-Gibbon & Morris, 1987) stands in contrast to the critique of these methods by contemporary evaluation theorists. For example, Guba and Lincoln (1989) claim that evaluation should be concerned with understanding the nature of human phenomena (such as digital libraries) from multiple perspectives, emphasizing the roles of culture, gender, context, and other factors in the construction of “reality.” With regard to evaluation methodology, many contemporary evaluation experts are more likely to recommend anthropological or ethnographic methods than experimental ones.
Nonetheless, it is important to understand experimental methods of evaluation. Many clients view experimental methods as the only way of providing credible evidence of the effectiveness or impact of educational innovations such as digital libraries. In addition, in a digital library development context, small-scale experiments can be useful for providing evidence of the relative effectiveness of some digital library design features over others (Maeda, 2002).