How do you do transaction log analysis?

The most common analysis of transaction logs is the generation of usage statistics that show which collections are accessed most often and which documents are retrieved most frequently. As Bollen and Luce (2002) argue, however, transaction data can be used for much more than generating usage statistics. It can be analyzed to determine the structure of relationships between documents, to determine the impact of documents on user communities, and to reveal other characteristics of those communities. Bollen and Luce believe that user log data can help inform acquisition policies for the digital library collection as well as the organization of the services provided.
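As a rough illustration of the most basic kind of usage statistic, the sketch below counts how often each document is retrieved. The log format (one tab-separated line per request, holding a timestamp, a user ID, and a document ID), the sample data, and the function name `count_retrievals` are all assumptions made for this example; your own logs will almost certainly look different.

```python
from collections import Counter

# Hypothetical log format (an assumption, not a standard): one request per
# line, tab-separated as "timestamp<TAB>user_id<TAB>document_id".
SAMPLE_LOG = """\
2002-03-04T09:15:00\tu1\tdoc-17
2002-03-04T09:16:30\tu1\tdoc-42
2002-03-04T10:02:10\tu2\tdoc-17
2002-03-05T14:40:00\tu3\tdoc-17
2002-03-05T14:41:00\tu3\tdoc-42
2002-03-06T08:00:00\tu2\tdoc-08
"""


def count_retrievals(log_text):
    """Return a Counter mapping each document ID to its number of retrievals."""
    counts = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        _timestamp, _user, document_id = line.split("\t")
        counts[document_id] += 1
    return counts


# The two most frequently retrieved documents in the sample data.
top_documents = count_retrievals(SAMPLE_LOG).most_common(2)
print(top_documents)  # [('doc-17', 3), ('doc-42', 2)]
```

Counting by collection instead of by document would work the same way, provided your logs record which collection each document belongs to.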

Before you decide to use user log data in the evaluation of your digital library, think about the kinds of decisions you hope to influence, because these will determine the type of data you collect. For example, information about the number of users and user session lengths might be used to inform decisions about the content of the library collection or the need to advertise the collection to attract more users. Information about users' navigation choices could be used to inform decisions about page design and layout. Error rates and the actions users take to recover from errors may provide useful information about the skill level of typical users, and this information in turn may influence future decisions about interface design and the library's help features.
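To make the first of these examples concrete, here is a minimal sketch of how the number of users and their session lengths might be derived from timestamped requests. Logs rarely mark where a session begins or ends, so the sketch assumes that a gap of more than 30 minutes between two requests from the same user starts a new session. That cutoff, the record layout, and the name `session_lengths` are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: more than 30 minutes of inactivity starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)

# Hypothetical records: (ISO timestamp, user ID).
RECORDS = [
    ("2002-03-04T09:00:00", "u1"),
    ("2002-03-04T09:10:00", "u1"),
    ("2002-03-04T09:25:00", "u1"),
    ("2002-03-04T13:00:00", "u1"),
    ("2002-03-04T09:05:00", "u2"),
    ("2002-03-04T09:07:00", "u2"),
]


def session_lengths(records, timeout=SESSION_TIMEOUT):
    """Return {user_id: [length of each session as a timedelta]}."""
    by_user = defaultdict(list)
    for stamp, user in records:
        by_user[user].append(datetime.fromisoformat(stamp))

    lengths = {}
    for user, times in by_user.items():
        times.sort()
        start = previous = times[0]
        user_lengths = []
        for current in times[1:]:
            if current - previous > timeout:
                # The gap is too long: close the session and start a new one.
                user_lengths.append(previous - start)
                start = current
            previous = current
        user_lengths.append(previous - start)  # close the final session
        lengths[user] = user_lengths
    return lengths


lengths = session_lengths(RECORDS)
print(len(lengths), "distinct users")
for user, sessions in sorted(lengths.items()):
    # Session lengths in whole minutes; a single-request session has length 0.
    print(user, [int(s.total_seconds() // 60) for s in sessions])
```

A session that consists of a single request has a length of zero here, because the log does not show how long the user spent on the last page viewed. Keep that limitation in mind when you interpret average session lengths.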

There is no step-by-step procedure to follow if you want to use user logs in the evaluation of your digital library. One real advantage of user logs is that the information is usually already collected for you. The difficult part is sorting through and making sense of the huge amounts of data collected. Here are some helpful hints for including user log information as part of your evaluation:

Know what you are looking for.

As with any evaluation or data collection technique, you should decide up front which questions you want the data to answer and which decisions you intend to influence down the road. Particularly when you are dealing with a large volume of user log files, the work will be much easier if you know exactly which kinds of information are important to you and which you can safely ignore.

Good software is the key.

Find a good software program to help you sort through your data. Being able to work through your log files efficiently will save you a lot of time and energy in the end.

Look beyond the obvious.

In your analysis of transaction logs, you will certainly generate many statistics that describe use, time of use, retrieval patterns, and so on, but be sure to consider the implications of this information. For example, from your transaction logs you can generate statistics that show the typical access pattern over a week. Suppose you discover that users access the digital library most frequently on Mondays and that usage is very low on Sunday evenings. You can use this information to make informed decisions about the best times to conduct system maintenance or upgrades.
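The weekly access pattern in this example could be computed along the following lines. This is only a sketch: the timestamps are invented sample data, and the name `requests_per_weekday` is hypothetical.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical request timestamps covering one week (4-10 March 2002).
TIMESTAMPS = [
    "2002-03-04T09:15:00", "2002-03-04T11:30:00",  # Monday
    "2002-03-04T14:00:00", "2002-03-04T16:45:00",  # Monday
    "2002-03-05T10:00:00", "2002-03-05T13:20:00",  # Tuesday
    "2002-03-06T09:40:00", "2002-03-06T15:10:00",  # Wednesday
    "2002-03-07T11:05:00", "2002-03-07T17:30:00",  # Thursday
    "2002-03-08T10:25:00", "2002-03-08T14:50:00",  # Friday
    "2002-03-09T12:00:00", "2002-03-09T19:15:00",  # Saturday
    "2002-03-10T20:10:00",                         # Sunday
]


def requests_per_weekday(timestamps):
    """Return {weekday name: number of requests}, Monday first."""
    # datetime.weekday() numbers the days from 0 (Monday) to 6 (Sunday).
    counts = Counter(datetime.fromisoformat(t).weekday() for t in timestamps)
    return {WEEKDAYS[day]: counts.get(day, 0) for day in range(7)}


per_day = requests_per_weekday(TIMESTAMPS)
busiest = max(per_day, key=per_day.get)
quietest = min(per_day, key=per_day.get)
print(per_day)
print("Busiest day:", busiest)    # Monday in this sample data
print("Quietest day:", quietest)  # Sunday in this sample data
```

To narrow the result down to something like "Sunday evenings", you could group the requests by weekday and hour rather than by weekday alone.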