How do you do experiments?

If you use experiments in a digital library evaluation, you (or your clients) probably want to make causal statements about the library or some of its features. Doing so usually involves specifying a hypothesis. For example, you might hypothesize that undergraduate students with access to digital libraries will include more references in their term papers than students who only have access to traditional libraries. It would be feasible, but not necessarily advisable, to design an experiment in which college students are randomly assigned to different courses, some of which promote the use of digital libraries and others of which limit students to traditional libraries. Both groups of students could be given an identical term paper assignment, and after all the papers are collected, the number of references in each could be counted and the support for the hypothesis (or lack thereof) could be assessed. Statistical analysis would be applied to determine whether any differences found were statistically significant (i.e., unlikely to have occurred by chance alone).
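To make the statistical step concrete, the term paper comparison described above can be sketched as a permutation test, which asks how often a difference in mean reference counts at least as large as the observed one would arise if group labels were assigned purely at random. This is only one of several reasonable tests, not a method prescribed here. The function name `permutation_test` and the reference counts below are hypothetical and exist only for illustration.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference (mean of a minus mean of b) and an
    approximate p-value estimated from random relabelings of the data.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled counts and split them into two pseudo-groups,
        # mimicking what random assignment alone would produce.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical reference counts per term paper (illustrative data only).
digital = [12, 15, 9, 14, 11, 16, 13, 10]
traditional = [8, 11, 7, 10, 9, 12, 6, 9]

observed_diff, p_value = permutation_test(digital, traditional)
print(f"Observed difference in mean references: {observed_diff:.2f}")
print(f"Approximate two-sided p-value: {p_value:.4f}")
```

A small p-value here would indicate only that the difference in counts is unlikely under random relabeling. It would say nothing about the quality or relevance of the references.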

There are obvious weaknesses in this example of an experimental (or quasi-experimental) approach to evaluation. First, the control of treatment variables, as required by experimental methodologies, is impractical in most contexts where digital libraries are implemented. Although students might be admonished to use only digital libraries in some courses and only traditional libraries in others, there is no guarantee that library usage would not vary considerably within each of the two treatment groups. Second, the emphasis on what appears to be a clear-cut quantitative outcome measure, the number of references, is flawed by the failure to establish the importance or relevance of this outcome indicator. Suppose that the students using digital libraries were found to have more references than the students in the other courses. Such a result would say nothing about the quality of the references. It could be that the students using traditional libraries had fewer references but better ones in terms of quality and relevance to the topic of the term paper. Third, the experimental approach can only support or fail to support pre-stated hypotheses; it cannot discover unexpected effects of a digital library or other innovation. Perhaps access to digital libraries increased the number of references used in the term papers, but also increased plagiarism within the papers. Fourth, randomized experiments can be unethical in some situations. Restricting access to one type of library or the other might be viewed as limiting the learning potential of the participating students.

Perhaps the most serious problem with experimental methods is that their application often requires stripping away contextual variables. The use of digital libraries (or any other innovation) is greatly influenced by the context in which it occurs (Guba & Lincoln, 1981). Experimental evaluation designs demand that contextual aspects be controlled by random assignment of subjects to treatments, but it is these contextual factors that may be most important. In actuality, the vast majority of evaluations conducted with this model are “quasi-experimental,” a compromise that introduces many difficulties with respect to the analysis and interpretation of findings. As a result, evaluators operating within the experimental model frequently fall back upon designs that can be most easily managed, focus on variables that are easiest to measure, apply statistical methods without meeting the assumptions underlying their use, and draw conclusions that have little or no practical application (Schwab, 1970).