USE OF ELECTRONIC JOURNALS AND BOOKS: EMPIRICAL STUDIES
Analysis of JSTOR
The Impact on Scholarly Practice of Access to On-line Journal Archives
Thomas A. Finholt and JoAnn Brooks
Innovations introduced over the past thirty years, such as computerized library catalogs and on-line citation indexes, have transformed scholarly practice. Today, the dramatic growth of worldwide computer networks raises the possibility of further changes in how scholars work. For example, attention has focused on the Internet as an unprecedented mechanism for expanding access to scholarly documents through electronic journals (Olsen 1994; Odlyzko 1995), digital libraries (Fox et al. 1995), and archives of prepublication reports (Taubes 1993). Unfortunately, the rapid evolution of the Internet makes it difficult to predict accurately which of the many experiments in digital provision of scholarly content will succeed. As an illustration, electronic journals have received only modest acceptance by scholars (Kling and Covi 1996). Accurate assessment of the scholarly impact of the Internet requires attention to experiments that combine a high probability of success with the capacity for quick dissemination. According to these criteria, digital journal archives deserve further examination. A digital journal archive provides on-line access to the entire digitized back archive of a paper journal. Traditionally, scholars make heavy use of journal back archives in the form of bound periodicals. Therefore, providing back archive content on-line may significantly enhance access to a resource already in high demand. Further, studying the use of experimental digital journal archives may offer important insight into the design and functionality of a critical Internet-based research tool. This paper, then, reports on the experience of social scientists using JSTOR, a prototype World Wide Web application for viewing and printing the back archives of ten core journals in history and economics.
The JSTOR System
JSTOR represents an experiment in the technology, politics, and economics of on-line provision of journal content. Details of JSTOR's evolution and development are covered elsewhere in this volume (see chapter 7). At the time of this study, early 1996, the faculty audience for JSTOR consisted of economists, historians, and ecologists, reflecting the content of JSTOR at that time. This paper focuses on reports of JSTOR use shortly after the system became officially available at the test sites. Respondents included historians and economists at five private liberal arts colleges (Bryn Mawr College, Denison University, Haverford College, Swarthmore College, and Williams College) and one public research university (the University of Michigan). The core economics journals in JSTOR at the time of this study included American Economic Review, Econometrica, Quarterly Journal of Economics, Journal of Political Economy, and Review of Economics and Statistics. The core history journals included American Historical Review, Journal of American History, Journal of Modern History, William and Mary Quarterly, and Speculum. In the future, JSTOR will expand to include more than 150 journal titles covering dozens of disciplines.
Journal Use in the Social Sciences
To understand JSTOR use requires a general sense of how social scientists seek and use scholarly information. In practice, social scientists apply five main search strategies. First, social scientists use library catalogs. Broadbent (1986) found that 69% of a sample of historians used a card catalog when seeking information, while Lougee, Sandler, and Parker (1990) found that 97% of a sample of social scientists used a card catalog. Second, journal articles are a primary mechanism for communication among social scientists (Garvey 1979; Garvey, Lin, and Nelson 1970). For example, in a study of social science faculty at a large state university, Stenstrom and McBride (1979) found that a majority of the social scientists used citations in articles to locate information. Third, social scientists use indexes and specialty publications to locate information. As an illustration, Stenstrom and McBride found that 55% of social scientists in their sample reported at least occasional use of subject bibliographies, and 50% reported at least occasional use of abstracting journals. Similarly, Olsen (1994) found that in a sample of sociologists, 37.5% reported regular use of annual reviews. Fourth, social scientists browse library shelves. For instance, Lougee et al. and Broadbent both found that social scientists preferred to locate materials by browsing shelves. Sabine and Sabine (1986) found that 20% of a sample of faculty library users reported locating their most recently accessed journal via browsing. On a related note, Stenstrom and McBride found that social scientists used departmental libraries more heavily than the general university library. Finally, social scientists rely on the advice of colleagues and students. For example, various studies show that colleagues have particular value when searching for a specific piece of information (Stenstrom and McBride; Broadbent; Simpson 1988). 
Also, students working on research projects often locate background material that social scientists find useful (Olsen; Simpson). Similarly, faculty report a valuable but infrequent role for librarians in seeking information (Stenstrom and McBride; Broadbent; Lougee et al.).
Computer-based tools do not figure prominently in the preceding description of how social scientists search for scholarly information. Results from previous studies show that the primary application of digital information technology for social scientists consists of computerized searching, which social scientists do at lower rates than physical scientists but at higher rates than humanists (Lougee et al. 1990; Olsen 1994; Broadbent 1986). Lougee et al. and Olsen both report sparse use of on-line catalogs by social scientists. Evidence of the impact of demographic characteristics on use of digital resources is mixed. For example, Lougee et al. found a negative correlation between age and use of digital information technology, while Stenstrom and McBride (1979) found no correlation. Finally, in a comparison of e-mail use by social scientists and humanists, Olsen found higher use rates among the social scientists, apparently correlated with superior access to technology.
In terms of journal access, previous studies indicate that economics faculty tend to subscribe to more journals than do faculty in other social science disciplines (Simpson 1988; Schuegraf and van Bommel 1994). Journal subscriptions are often associated with membership in a professional society. For example, in their analysis of a liberal arts faculty, Schuegraf and van Bommel found that 40.9% of faculty journal subscriptions (including 12 of the 15 most frequently subscribed-to journals) came with society memberships. Stenstrom and McBride (1979) found that membership-related subscriptions often overlapped with library holdings. However, according to Schuegraf and van Bommel, other personal subscriptions included journals not held in library collections. In terms of journal use, Sabine and Sabine (1986) found that only 4% of faculty in their sample reported reading the entire contents of journals, while 9% reported reading single articles, and 87% reported reading only small parts, such as abstracts. Similarly, at least among a sample of sociologists, Olsen (1994) found that all respondents reported using abstracts to determine whether to read an article. Having found a relevant article, faculty often make copies. For instance, Sabine and Sabine found that 47% of their respondents had photocopied the most recently read journal article, Simpson found that 60% of sampled faculty reported "always" making copies, and all the sociologists in Olsen's sample reported copying important articles.
Goals of this Study
The research described above consists of work conducted prior to the advent of the World Wide Web and widespread access to the Internet. Several recent studies suggest that Internet use can change scholarly practice (Finholt and Olson 1997; Hesse, Sproull, and Kiesler 1993; Walsh and Bayma 1997; Carley and Wendt 1991). However, most of these studies focused on physical scientists. A key goal of this study is to create a snapshot of the effect of Internet use on social scientists,
specifically baseline use of JSTOR. Therefore, the sections that follow will address core questions about the behavior of JSTOR users, including: (1) how faculty searched for information; (2) which faculty used JSTOR; (3) how journals were used; (4) how the Internet was used; and (5) how journal use and Internet use correlated with JSTOR use.
The population for this study consisted of the history and economics faculty at the University of Michigan and at five liberal arts colleges: Bryn Mawr College, Denison University, Haverford College, Swarthmore College, and Williams College. History and economics faculty were targeted because the initial JSTOR selections drew on ten journals, reflecting five core journals in each of these disciplines. The institutions were selected based on their status as Andrew W. Mellon Foundation grant recipients for the JSTOR project.
Potential respondents were identified from the roster of full-time history and economics faculty at each institution. With the permission of the respective department chairs at each school, faculty were invited to participate in the JSTOR study by completing a questionnaire. No incentives were offered to respondents, and participation was voluntary. Respondents were told that answers would be confidential, but not anonymous due to plans for matching responses longitudinally. The resulting sample contained 161 respondents, representing a response rate of 61%. In this sample, 46% of the respondents were economists, 76% were male, and 48% worked at the University of Michigan. The average respondent was 47.4 years old and had a Ph.D. granted in 1979.
Design and Procedure
Respondents completed a 52-item questionnaire with questions on journal use, computer use, attitudes toward computing, information search behavior, demographic characteristics, and JSTOR use. Respondents had the choice of completing this questionnaire via a telephone interview, via the Web, or via a hard-copy version. Questionnaires were administered to faculty at the five liberal arts colleges and to the faculty at the University of Michigan in the spring of 1996.
Journal Use Journal use was assessed in four ways. First, respondents reported how they traditionally accessed the journal titles held in JSTOR, choosing from: no use; at the library; through a paid subscription; or through a subscription received with membership in a professional society. Second, respondents ranked the journals they used in order of frequency of use for a maximum of ten journals. For each of these journals, respondents indicated whether they had a personal subscription to the journal. Third, respondents described their general use of
journals in terms of the frequency of browsing journal contents, photocopying journal contents, saving journal contents, putting journal contents on reserve, or passing journal contents along to colleagues (measured on a 5-point scale, where 1 = never, 2 = rarely, 3 = sometimes, 4 = frequently, and 5 = always). Finally, respondents indicated the sections of journals they used, including the table of contents, article abstracts, articles, book reviews, reference lists, and editorials.
Computer Use Computer use was assessed in three ways. First, respondents described their computer systems in terms of the type of computer (laptop versus desktop), the computer family (e.g., Apple versus DOS), the specific model (e.g., PowerPC), and the operating system (e.g., Windows 95). Second, respondents reported their level of use via a direct network connection (e.g., Ethernet) of the World Wide Web, e-mail, databases, on-line library catalogs, and FTP (measured on a 5-point scale, where 1 = never, 2 = 2-3 times per year, 3 = monthly, 4 = weekly, and 5 = daily). Finally, respondents reported their level of use via a modem connection of the Web, e-mail, databases, on-line library catalogs, and FTP (using the same scale as above).
Attitudes toward Computing Attitudes toward computing were assessed by respondents' reported level of agreement with statements about personal computer literacy, computer literacy relative to others, interest in computers, the importance of computers, confusion experienced while using computers, and the importance of programming knowledge (measured on a 5-point scale, where 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, and 5 = strongly agree).
Information Search Behavior Information search behavior was assessed in three ways. First, respondents indicated their use of general search strategies, including: searching/browsing on-line library catalogs; searching/browsing paper library catalogs; browsing library shelves; searching/browsing on-line indexes; searching/browsing paper indexes; browsing departmental collections; reading citations from articles; and consulting colleagues. Second, respondents described the frequency of literature searches within their own field and the frequency of on-line literature searches within their own field (both measured on a 5-point scale, where 1 = never, 2 = 2-3 times per year, 3 = monthly, 4 = weekly, and 5 = daily). Finally, respondents described the frequency of literature searches outside their field and the frequency of on-line literature searches outside their field (measured on the same 5-point scale used above).
Demographic Characteristics Respondents were asked to provide information on demographic characteristics, including age, sex, disciplinary affiliation, institutional affiliation, highest degree attained, and year of highest degree.
JSTOR Use Finally, JSTOR use was assessed in two ways. First, respondents reported whether they had access to JSTOR. Second, respondents described the
frequency of JSTOR use (measured on a 5-point scale, where 1 = never, 2 = 2-3 times per year, 3 = monthly, 4 = weekly, and 5 = daily).
The data were analyzed to address five core questions related to the impact of JSTOR: (1) how faculty searched for information; (2) which faculty used JSTOR; (3) how journals were used; (4) how the Internet was used; and (5) how journal use and Internet use correlated with JSTOR use.
Table 11.1 summarizes data on how faculty searched for information. The proportion of faculty using the search strategies did not differ significantly by institution or discipline, with the exception of three strategies. First, the proportion of Michigan economists who reported browsing library shelves (46%) was significantly less than the proportion of five-college historians who used this strategy (86%). Second, the proportion of Michigan economists who reported searching card catalogs (14%) was significantly less than the proportion of five-college historians who used this strategy (39%). And finally, the proportion of Michigan economists who reported browsing departmental collections (48%) was significantly greater than the proportion of five-college historians who used this strategy (4%).
Who Used JSTOR
Overall, 67% of the faculty did not use JSTOR, 14% used JSTOR once a year, 11% used JSTOR once a month, and 8% used JSTOR once a week. None of the faculty used JSTOR daily. Table 11.2 summarizes JSTOR frequency of use by type of institution and discipline. A comparison of use by type of institution shows a higher proportion of JSTOR users at the five colleges (42%) than at the University of Michigan (27%). A further breakdown by discipline shows that the five-college economists had the highest proportion of users (46%), followed by the Michigan economists (40%), the five-college historians (39%), and the Michigan historians (16%). One way to put JSTOR use into perspective is to compare this activity with similar, more familiar on-line activities, such as literature searching. Overall, 21% of the faculty did not do on-line searches, 25% searched once a year, 25% searched once a month, 25% searched once a week, and 4% searched daily. Table 11.3 summarizes data on the frequency of on-line searching by type of institution and discipline for the same faculty described in Table 11.2. A comparison of on-line searching by type of institution shows a higher proportion of on-line searchers at the five colleges (85%) than at the University of Michigan (76%). A further breakdown by discipline shows that the five-college economists had the highest proportion of searchers (89%), followed by the five-college historians (82%), and the Michigan economists and historians (both 76%).
Figure 11.1 shows a plot of the cumulative percentage of faculty per institution who used JSTOR and who did on-line searches versus the frequency of these activities. For example, looking at the values plotted on the y-axis against the "Monthly" category shows that over three times as many Michigan faculty searched once a month or more (51%) compared with those who used JSTOR at least once a month (15%). Similarly, over two times as many of the five-college faculty searched once a month or more (62%) compared with those who used JSTOR at least once a month (25%). A further breakdown by discipline shows that
over twice as many of the five-college economists searched once a month or more (73%) than used JSTOR at least once a month (31%), that over six times as many of the Michigan historians searched once a month or more (54%) than used JSTOR at least once a month (8%), that over twice as many of the five-college historians searched once a month or more (50%) than used JSTOR at least once a month (21%), and that over twice as many of the Michigan economists searched once a month or more (48%) than used JSTOR at least once a month (23%).
Table 11.4 summarizes how faculty used features of journals. Across all journal features, patterns of use were similar except in two areas. First, the proportion of Michigan historians who used article abstracts (31%) was significantly smaller than the proportion of Michigan economists (81%), five-college economists (89%), and five-college historians (61%) who used abstracts. Second, the proportion of Michigan economists who used book reviews (49%) was significantly smaller than the proportion of five-college historians (100%), Michigan historians (98%), and five-college economists (85%) who used book reviews.
Overall, faculty in the sample reported that they regularly used 8.7 journals, that they subscribed to 4.1 of these journals, and that 2.2 of these journals were also in JSTOR. Table 11.5 summarizes journal use by institution and discipline. There were no significant differences in the number of journals used across institution and discipline, although Michigan historians reported using the most journals (8.9). There were also no significant differences across institution and discipline in the number of paid journal subscriptions among the journals used, although again Michigan historians reported having the most paid subscriptions (4.6). There was a significant difference in the number of journals used regularly by the economists that were also titles in JSTOR (M = 2.9) compared with those used by the historians (M = 1.7; t = 5.71, p < .01).
Further examination of differences in use of journals shows a much greater consensus among the economists about the importance of the economics journals in JSTOR than among the historians about the history journals in JSTOR. For example, Table 11.6 shows the economists' ranking in order of use of the five economics journals chosen for JSTOR. The American Economic Review was cited among the top ten most frequently used journals by over 75% of both the Michigan and the five-college economists; the Journal of Political Economy was cited
among the top ten by over 60% of both the Michigan and the five-college economists; and the Quarterly Journal of Economics and the Review of Economics and Statistics were cited among the top ten by over 50% of the Michigan economists and by over 40% of the five-college economists. By contrast, Table 11.7 shows the historians' ranking in order of use of the five history journals chosen for JSTOR. The American Historical Review was cited among the top ten most frequently used journals by over 60% of both the Michigan and the five-college historians. However, none of the other four journals were used by a majority of the historians at Michigan or at the five colleges.
Overall, faculty reported weekly use of e-mail (M = 4.3), monthly use of on-line catalogs (M = 3.2) and the Web (M = 3.0), and two or three uses per year of FTP (M = 2.3) and on-line databases (M = 2.1). Table 11.8 summarizes the use of these Internet applications by institution and discipline. In terms of e-mail use, Michigan historians (M = 3.3) were significantly lower than the Michigan economists (M = 4.9), the five-college economists (M = 5.0), and the five-college historians (M = 4.7). In terms of World Wide Web use, Michigan historians (M = 1.8) were significantly lower than everyone, while the five-college historians (M = 2.9) were significantly lower than the five-college economists (M = 4.2) and the Michigan economists (M = 3.9). In terms of FTP use, the Michigan historians (M = 1.4) and the five-college historians (M = 1.7) differed significantly from the Michigan economists (M = 3.4) and the five-college economists (M = 2.7). In terms of on-line database use, the Michigan historians (M = 1.6) were significantly lower than the five-college economists (M = 2.9). Faculty did not differ significantly in terms of on-line catalog use.
The Relationship of Journal and Internet Use to JSTOR Use
Examination of the frequency of JSTOR use among faculty aware of JSTOR (n = 78) showed that 58% of the respondents reported some level of use, while 42% reported no use. With the frequency of JSTOR use as the dependent variable, the faculty who reported no use were censored cases. The standard zero-lower-bound tobit model was designed for this circumstance (Tobin 1958). Most important, by adjusting for censoring, the tobit model allows inclusion of negative cases in the analysis of variation in frequency of use among positive cases, which greatly enhances degrees of freedom. Therefore, hierarchical tobit regression analyses were used to examine the influence of demographic characteristics, journal use, search preferences, Internet use, and attitudes toward computing on the frequency of JSTOR use. Independent variables used in these analyses were selected on the basis of significance in univariate tobit regressions
on the frequency of use variable. Table 11.9 summarizes the independent variables used in the multiple tobit regression analyses.
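The censored-regression approach described above can be illustrated with a minimal sketch: a zero-lower-bound tobit fit by maximum likelihood on synthetic data, followed by a change-in-log-likelihood comparison against a null model in the spirit of the hierarchical analysis. The variable names and data here are hypothetical illustrations, not the study's actual data or code.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(0)

# Synthetic data: one illustrative predictor (e.g., frequency of
# on-line catalog searching) plus an intercept.
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([0.5, 1.0])
true_sigma = 1.0

# Latent propensity to use; observed frequency is censored at zero
# (non-users all report zero, as in the study's 42% non-use group).
y_star = X @ true_beta + rng.normal(scale=true_sigma, size=n)
y = np.maximum(y_star, 0.0)

def neg_log_lik(params, X, y):
    """Negative log-likelihood of the zero-lower-bound tobit (Tobin 1958)."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)  # parameterized on the log scale for stability
    xb = X @ beta
    # Censored cases contribute P(y* <= 0); uncensored cases the normal density.
    ll = np.where(
        y <= 0,
        stats.norm.logcdf(-xb / sigma),
        stats.norm.logpdf((y - xb) / sigma) - np.log(sigma),
    )
    return -ll.sum()

full = optimize.minimize(neg_log_lik, np.zeros(3), args=(X, y), method="BFGS")
beta_hat = full.x[:-1]
sigma_hat = np.exp(full.x[-1])

# Model comparison via the change in log likelihood, as in the
# hierarchical regressions: twice the improvement over the nested
# (intercept-only) model is asymptotically chi-square distributed.
null = optimize.minimize(neg_log_lik, np.zeros(2), args=(X[:, :1], y), method="BFGS")
lr = 2 * (null.fun - full.fun)
p_value = stats.chi2.sf(lr, df=1)
```

With a large synthetic sample, the maximum likelihood estimates recover the generating coefficients despite the censoring, and the likelihood-ratio statistic shows the predictor significantly improving fit, mirroring how Models 1 through 5 were compared.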
Table 11.10 summarizes the results of the hierarchical tobit regression of demographic, journal use, search preference, Internet use, and computing attitude variables on frequency of JSTOR use. The second line from the bottom of Table 11.10 reports the log likelihood score for each model. Analysis of the change in log likelihood score between adjacent models gives a measure of the significance of the independent variables added to the model. For example, in Model 1, the addition of the demographic variables failed to produce a significant change in the log likelihood score compared to the null model. By contrast, in Model 2, the addition of journal use variables produced a significant change in the log likelihood score compared to Model 1, suggesting that the journal use variables improved the fit of Model 2 over Model 1. Similarly, the addition of search variables in Model 3 and of Internet use variables in Model 4 both produced significant improvements in fit, but the addition of the computer attitude variable in Model 5 did not. Therefore, Model 4 was selected as the best model. In Model 4, the coefficients for gender, article copying, abstract reading, and searching on-line catalogs are all positive and significant. These results suggest that, controlling for other factors, men were 0.77 points higher on frequency of JSTOR use than were women; that there was a 0.29-point increase in the frequency of JSTOR use for every point increase in the frequency of article copying; that faculty who read article abstracts were 0.82 points higher on frequency of JSTOR use than were faculty who did not; and that there was a 1.13-point increase in the frequency of JSTOR use for every point increase in the frequency of on-line catalog searching. Also in Model 4, the coefficients for affiliation with an economics department and the number of paid journal subscriptions are both negative and significant. These results suggest that, controlling for other factors, economists were 0.88 points lower on frequency of JSTOR use than were historians and that there was a 0.18-point decrease in frequency of JSTOR use for every unit increase in the number of paid journal subscriptions.
This study addressed five questions related to the preliminary impact of JSTOR: (1) how faculty searched for information; (2) which faculty used JSTOR; (3) how journals were used; (4) how the Internet was used; and (5) how journal use and Internet use correlated with JSTOR use.
Summary of Findings
In terms of how faculty searched for information, results were consistent with earlier findings reported in the literature. Specifically, a strong majority of the faculty reported relying on citations from related publications, on colleagues, on electronic catalogs, and on browsing library shelves when seeking information. Faculty did not differ dramatically in selection of search strategies, except that Michigan
economists were less likely to browse library shelves and less likely to search card catalogs.
In terms of JSTOR use, Michigan faculty were less likely to know about JSTOR than were the five-college faculty, and Michigan faculty were less likely to use JSTOR than were the five-college faculty. These results probably reflected the delayed rollout and availability of JSTOR at Michigan. Economists were more likely to use JSTOR than historians were. Of the faculty who reported JSTOR use, frequency of use did not differ dramatically from frequency of use of a related, more traditional technology: on-line searching. That is, 58% of the faculty who used JSTOR said they used JSTOR once a month or more, while 69% of the faculty who did on-line searches reported doing searches once a month or more. Note, however, that over twice as many faculty reported doing on-line searches (75%) as reported use of JSTOR (33%).
In terms of journal use, faculty did not vary greatly in their use of journal features, except that Michigan historians were less likely to use article abstracts and that Michigan economists were less likely to use book reviews. Economists and historians did not differ in the total number of journals used; however, there was greater consensus among the economists about core journals. Specifically, two of the five economics titles included in JSTOR (the American Economic Review and the Journal of Political Economy) were cited among the top ten most frequently used journals by a majority of the economists, while four of the five titles (the two mentioned above plus the Quarterly Journal of Economics and the Review of Economics and Statistics) were cited among the top ten most frequently used journals by a majority of the Michigan economists. By contrast, only one of the five history titles included in JSTOR (the American Historical Review) was cited among the top ten most frequently used journals by a majority of the historians.
In terms of Internet use, the Michigan historians lagged their colleagues in economics at Michigan and the five-college faculty. For example, the Michigan historians reported less use of e-mail, the World Wide Web, FTP, and on-line databases than did the other faculty. The economists were more likely to use FTP and more likely to use the World Wide Web than the historians were. Faculty used online catalogs at similar rates.
In terms of factors correlated with JSTOR use, the tobit regressions showed that a model including demographic factors, journal use factors, search factors, and Internet use factors offered the best fit to the data on frequency of JSTOR use. The addition of the computer attitude variable did not improve the fit of this model. In the best fit model, gender, article copying, abstract reading, and searching on-line catalogs were all positively and significantly related to frequency of JSTOR use. Also from the best fit model, affiliation with an economics department and greater numbers of journal subscriptions were negatively and significantly related to frequency of JSTOR use.
Limitations of the Study
These data represent a snapshot of faculty response to JSTOR at an extremely early stage in the evolution of the JSTOR system. In the spring of 1996, JSTOR had been available to the five-college faculty for less than six months, while at Michigan, the system had not yet been officially announced to faculty. Therefore, the results probably underestimate eventual use of the mature JSTOR system. Further, because this was a survey study, self-reports of use were crude compared to measures that could have been derived from actual behavior. For example, we intended to match use reports with automated usage statistics from the JSTOR Web servers, but the usage statistics proved too unreliable. Another problem was that the survey contained no items on the frequency of traditional journal use. Therefore, it is unknown whether the low use of JSTOR reported by the faculty reflected dissatisfaction with the technology or simply a low base rate for journal use. Finally, the faculty at Michigan and at the five colleges were atypical in the extent of their access to the Internet and in the modernity of their computing equipment. Faculty with older computers and slower network links would probably be even less likely to use JSTOR.
Implications for the JSTOR Experiment
Although extremely preliminary, these early data suggest trends that merit further exploration as JSTOR expands. First, it is encouraging to discover that among faculty who have used JSTOR, rates of use are already comparable to rates for use of on-line searching, a technology that predates JSTOR by two decades. It will be interesting to see if JSTOR use grows beyond this modest level to equal the use of key Internet applications, like e-mail and Web browsing. Second, there appear to be clear differences in journal use across disciplinary lines. For example, economists focus attention on a smaller set of journals than is the case in history. Therefore, it may be easier to satisfy demand for on-line access to back archives in fields that have one or two flagship journals than in more diverse fields where scholarly attention is divided among dozens of journals. This conclusion may lead commercial providers of back archive content to ignore more diverse disciplines in favor of easier-to-serve, focused disciplines. Finally, the negative correlation between the number of journal subscriptions and JSTOR use suggests the possibility of a substitution effect (i.e., JSTOR for paper). However, the significance of this correlation is difficult to determine, since there is no way to know the direction of causality in a cross-sectional study.
Preparation of this article was supported by a grant to the University of Michigan from the Andrew W. Mellon Foundation. JSTOR is the proprietary product of JSTOR, a nonprofit
corporation dedicated to provision of digital access to the back archives of scholarly journals. For more information, please consult www.jstor.org.
We gratefully acknowledge the assistance of Kristin Garlock, Marcia Heringa, Christina Maresca, William Mott, Sherry Piontek, Tony Ratanaproeksa, Blake Sloan, and Melissa Stucki in gathering the data for this study. Also, we thank Ann Bishop, Joan Durrance, Kristin Garlock, Kevin Guthrie, Wendy Lougee, Sherry Piontek, Sarah Sully, and the participants of The Andrew W. Mellon Foundation Scholarly Communication and Technology Conference for comments on earlier drafts. Finally, we thank the history and economics faculty of Bryn Mawr College, Denison University, Haverford College, Swarthmore College, the University of Michigan, and Williams College for their patience and cooperation as participants in this research.
Requests for copies should be sent to: (1) Thomas Finholt, Collaboratory for Research on Electronic Work, C-2420 701 Tappan Street, Ann Arbor, MI 48109-1234; or (2) email@example.com.
Broadbent, E. A. (1986). Study of humanities faculty library information seeking behavior. Cataloging and Classification Quarterly, 6, 23-37.
Carley, K., & Wendt, K. (1991). Electronic mail and scientific communication: A study of the SOAR extended research group. Knowledge: Creation, Diffusion, Utilization, 12, 406-440.
Finholt, T. A., & Olson, G. M. (1997). From laboratories to collaboratories: A new organizational form for scientific collaboration. Psychological Science, 8, 28-36.
Fox, E. A., Akscyn, R. M., Furuta, R. K., & Leggett, J. J. (Eds.). (1995). Digital libraries [Special issue]. Communications of the ACM, 38(4).
Garvey, W. D. (1979). Communication: The essence of science. Toronto: Pergamon Press.
Garvey, W. D., Lin, N., & Nelson, C. E. (1970). Communication in the physical and the social sciences. Science, 170, 1166-1173.
Hesse, B. W., Sproull, L. S., & Kiesler, S. B. (1993). Returns to science: Computer networks in oceanography. Communications of the ACM, 36(8), 90-101.
Kling, R., & Covi, L. (1996). Electronic journals and legitimate media. The Information Society, 11, 261-271.
Lougee, W. P., Sandler, M. S., & Parker, L.L. (1990). The Humanistic Scholars Project: A study of attitudes and behavior concerning collection storage and technology. College and Research Libraries, 51, 231-240.
Odlyzko, A. (1995). Tragic loss or good riddance? The impending demise of traditional scholarly journals. International Journal of Human-Computer Studies, 42, 71-122.
Olsen, J. (1994). Electronic journal literature: Implications for scholars. Westport, CT: Mecklermedia.
Sabine, G. A., & Sabine, P. L. (1986). How people use books and journals. Library Quarterly, 56, 399-408.
Schuegraf, E. J., & van Bommel, M. F. (1994). An analysis of personal journal subscriptions of university faculty. Part II: Arts and professional programs. Journal of the American Society for Information Science, 45, 477-482.
Simpson, A. (1988). Academic journal usage. British Journal of Academic Librarianship, 3, 25-36.
Stenstrom, P., & McBride, R. B. (1979). Serial use by social science faculty: A survey. College and Research Libraries, 40, 426-431.
Taubes, G. (1993). Publication by electronic mail takes physics by storm. Science, 259, 1246-1248.
Tobin, J. (1958). Estimation of relationships for limited dependent variables. Econometrica, 26, 24-36.
Walsh, J. P., & Bayma, T. (1997). Computer networks and scientific work. In S. B. Kiesler (Ed.), Culture of the Internet. Hillsdale, NJ: Lawrence Erlbaum Associates.
Patterns of Use for the Bryn Mawr Reviews
Bryn Mawr Classical Review (BMCR), one of the first electronic journals in the humanities, was started in 1990 to provide timely reviews of books in the classics. To lend solidity, a paper version was produced as well, and the two were issued simultaneously until late 1995, when the electronic reviews began to be published individually, more or less as they were received, while the paper versions were issued four times a year. In 1993 a sister journal, Bryn Mawr Medieval Review (BMMR), was created to review books in medieval studies, and the two journals were combined to form the Bryn Mawr Reviews (BMR). After about two years of activity BMMR became dormant, and toward the end of 1996 both its location and its management were shifted. Since then it has become tremendously active, at one point even surpassing BMCR in its monthly output. Comparisons should be considered with this history in mind. (For more detail, see chapter 24.)
We have two sets of users: subscribers and gopher hitters. For data from the former we have subscription lists, which are constantly updated, and periodic surveys that we have conducted; for the latter we have monthly reports of gopher hits and gopher hitters (but not what the hitters hit). In considering this data our two main questions have been (1) how are we doing? and (2) how can we afford to keep doing it?
Our analysis of the monthly gopher reports has concentrated on the hitters rather than the hits. After experimenting rather fruitlessly in 1995 with microanalysis of
the data from the Netherlands and Germany hitter by hitter month by month for a year, we decided to collect only the following monthly figures:
• total number of users
• total by address (country, domain, etc.)
• list of top hits (those reviews that received 15+ hits/month and are over a year old)
• list of top hitters (those who use the system 30+ times/month)
Analysis shows that use has leveled off at about 3,800 users a month. With a second full year of gopher use to study, we can see the seasonal fluctuation more easily. The one area of growth seems to be non-English foreign sites. If we compare the top hitters in the first ten months of 1995 with the comparable period in 1996, we find that the total increased only 5% but the number of non-English heavy users increased 120% (Table 12.1). Three countries were among the heavy users in both 1995 and 1996 (France, Germany, Netherlands); two appeared only in 1995 (South Africa, Taiwan), and eight only in 1996 (Brazil, Italy, Ireland, Poland, Portugal, Russia, Spain, Venezuela).
In terms of total number of users from 1995 to 1996 there was an overall increase of 10.8%, although the increase among U.S. users was only 9.1%. By contrast, most foreign countries showed a marked increase in total use over the ten months of 1996 versus 1995: Argentina, 16 to 27; Australia, 542 to 684; Brazil, 64 to 165; Denmark, 80 to 102; Spain, 107 to 197; Greece, 41 to 80; Ireland, 50 to 69; Israel, 89 to 108; Italy, 257 to 359; Japan, 167 to 241; Korea, 26 to 40; Netherlands, 273 to 315; Portugal, 16 to 26; Russia, 9 to 27; (former) USSR, 13 to 20; and South Africa, 63 to 88. On the other hand, Iceland went from 22 to 8, Malaysia from 30 to 21, Mexico from 68 to 56, Sweden from 307 to 250, and Taiwan from 24 to 14. Also, among U.S. users there was a large drop in the .edu domain, from 7,073 to 5,962, and a corresponding rise in the .net domain, from 1,570 to 4,118, perhaps because faculty members are now using commercial providers for home access.
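The country-by-country comparisons above are simple percentage changes over the same ten-month window. A minimal sketch of the computation (the counts are those quoted above; the helper function name is mine):

```python
def pct_change(old: int, new: int) -> float:
    """Percentage change from one ten-month total to the next."""
    return (new - old) / old * 100.0

# a few of the 1995 -> 1996 gopher-user totals quoted in the text
totals = {
    "Australia": (542, 684),
    "Brazil": (64, 165),
    "Sweden": (307, 250),  # one of the countries that declined
}
for country, (y95, y96) in totals.items():
    print(f"{country}: {pct_change(y95, y96):+.1f}%")
```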
In the analysis of top hits (Table 12.2), a curious pattern emerges: BMMR starts out with many more top hits despite offering a much smaller number of reviews (about 15% of BMCR's), but toward the end of 1995 the pattern shifts and BMMR's dominance fades.
The shift is easily explained because it occurs about the time BMMR was becoming inactive, but the original high density is still surprising. Also surprising is that medieval books receive noticeably more attention per title: 32 medieval titles made the top-hits list 116 times (an average of 3.6 appearances each), while 81 classical titles made the list only 219 times (average 2.7), even though the classical titles included two blockbusters, Amy Richlin's Pornography and Representation (10 appearances) and John Riddle's Contraception and Abortion (14). My guess is that medievalists, being more widely dispersed in interests and location, have found the Net more important than have classicists, who are mostly located in classics departments and whose professional work is more circumscribed (and has a longer history).
Subscriptions to the e-journals continue to grow at a rate of 5% per quarter, although there is considerable seasonal fluctuation (see Table 12.3). Looking more broadly we see a steady slowdown in growth of all but the joint subscriptions (see Table 12.4).
If we look at the individual locations (Table 12.5), we find again that while U.S. subscriptions continue to grow, they make up a steadily smaller share of the whole, going from 77% of the total in 1993 to 68% in 1996. English-speaking foreign countries have remained about the same percentage of the whole; it is non-English-speaking foreign countries that have shown the greatest increase, going from 4% of the total in 1993 to 13% in 1996.
As opposed to the gopher statistics, which give breadth but little depth, our surveys offer the opportunity for deeper study of our users but at the expense of breadth. We cannot survey our subscribers too often or they will not respond. A further limitation is that we felt we could not survey those who take both BMCR and BMMR, a significant number, without skewing the results, since many subscribers lean heavily toward one journal or the other and the journals are significantly different in some ways. So far we have conducted five surveys:
1. a 20-question survey in November 1995 to BMCR subscribers
2. a 21-question survey in February 1996 to BMMR subscribers
3. a 2-question survey in October 1996 to all subscribers
4. a 15-question survey in January 1997 to all BMCR reviewers whose e-mail addresses we knew
5. a 2-question survey in March 1997 to those who have canceled subscriptions in the past year
Table 12.6 presents the subscriber profile as revealed in the surveys. Many of the differences are easily explained by the checkered history of BMMR or by the differing natures of the two readerships. I doubt many readers will be surprised to learn that medievalists are more often female and less often faculty. The paucity of reader-reviewers of BMMR reflects the paucity of BMMR reviews. To me, the most surprising statistic is the low use of gopher by subscribers to either journal.
The key question, of course, is willingness to pay for subscriptions. With that in mind, we did some correlation studies for the BMCR survey, first seeing what variables correlated with a willingness to pay $5 for a subscription. We found positive correlations (Pearson product-moment) with the following categories:
• ever found review useful for teaching (r = .19, .00037 likelihood of a chance correlation)
• ever found review useful for research (r = .21, .00005)
• ever heard a reference to BMCR (r = .23, .00001)
• ever written a review for BMCR (r = .17, .00084)
Further correlations were found, some not at all surprising:
• start to read many/most reviews//heard a reference to BMCR (r = .20, .00014)
• willing to review//heard a reference to BMCR (r = .22, .00002)
• get paper BMCR//have written review (r = .22, .00002)
• have written review//will write in future (r = .24, .00000)
• will write in future//library gets BMCR (r = .21, .00005)
• Ph.D.//willing to review (r = .24, .00000)
• institutional affiliation//useful for teaching (r = .21, .00009)
• useful for teaching//useful for research (r = .25, .00000)
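The r values reported above are ordinary Pearson product-moment coefficients computed over paired survey responses. A minimal sketch of the calculation, using invented yes/no answers (the variable names and data are mine, not the survey's):

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# invented binary responses: "found a review useful for teaching" (1 = yes)
# paired with "willing to pay $5 for a subscription" (1 = yes)
teaching = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
pay      = [1, 1, 0, 0, 0, 1, 1, 1, 0, 1]
print(f"r = {pearson_r(teaching, pay):.2f}")
```

The reported significance figures (e.g., .00037) are the corresponding probabilities of observing such an r by chance; in practice one would obtain them from a t-test on r given the sample size.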
A follow-up two-question survey done in October 1996 asked whether subscribers would prefer to pay for an e-mail subscription, receive advertisements from publishers, or cancel. Fourteen percent preferred to pay, 82% preferred to receive advertisements, and 4% preferred to cancel.
Our most recent survey, of those who had for one reason or another dropped from the list of subscribers, revealed that almost a third were no longer valid addresses and so were not true cancellations. Of those who responded, almost half (40, or 44%) of the unsubscriptions were only temporary (Table 12.7). The reason for cancellation was rarely the quality of the reviews.
If we return to our two questions (progress and cost recovery), we can see that our progress is satisfactory but that cost recovery is still uncertain.
BMCR is growing at the rate of 30% a year. The major American classics organization (The American Philological Association) has about 3,000 members, and on that basis we estimate very roughly that the total world population of classicists is somewhere between 7,000 and 10,000. BMCR, then, presently reaches between 22% and 32% of its total market. Presumably, only half of that market has access to computers, so BMCR's real penetration may be over 50%. If so, at its present rate of growth, BMCR may saturate its market in as few as five years. It is much more difficult to estimate the total world market for BMMR, but it is certainly greater than that for BMCR. With BMMR's present growth rate of perhaps 30%, it will take somewhat longer to reach saturation.
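The market-penetration arithmetic above can be made explicit. A rough sketch, where the subscriber count (about 2,200) is inferred from the percentages quoted and every figure is a loose assumption rather than a measured value:

```python
import math

def years_to_saturation(current_share: float, annual_growth: float) -> float:
    """Years n until current_share * (1 + annual_growth)**n reaches 1.0."""
    return math.log(1.0 / current_share) / math.log(1.0 + annual_growth)

subscribers = 2200                 # rough BMCR subscriber count (assumption)
market_low, market_high = 7000, 10000   # estimated world population of classicists

share = subscribers / market_high       # ~22% of the widest market estimate
# if only half the market has computer access, the reachable market shrinks:
penetration = subscribers / (market_low / 2)   # ~63% of the narrowest reachable market

print(f"nominal share: {share:.0%}; reachable-market penetration: {penetration:.0%}")
print(f"years to saturate full market at 30%/yr: {years_to_saturation(share, 0.30):.1f}")
```

At 30% annual growth, a 22% share reaches 100% in just under six years, which is consistent with the text's "as few as five years" once the computer-access adjustment is applied.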
BMCR's unrecovered costs are about $4,000 per year for over 700 pages of reviews. About half that cost goes to producing the paper version, and we anticipate costs of between $1,500 and $2,000 per year for preparing reviews for the Web. Uncompensated editorial time averages 34 hours per month. Total out-of-pocket expenses could therefore be as high as $6,000 if the paper version continues and if markup continues to be done by hand. A third possible reduction in costs, besides elimination of the paper version and automatic markup, is a "fast-track" system whereby the review never leaves the Net: it is e-mailed to the editor, who sends it to a member of the editorial board; when the two have made changes, it is sent back to the reviewer for approval and then published on the Net. The great advantage for the reviewer is that this system cuts publication time by a month; the disadvantage is that the reviewer is asked to do some simple markup on the text before sending it.
Possible revenue sources include advertising, subscriptions, and institutional support. As we have seen, our subscribers much prefer receiving advertising to paying for a subscription, but we have no idea how successful we will be in attracting advertising. Hal Varian has suggested that we try to arrange something with Amazon Books, and we have made a tentative agreement to list their URL on our Web reviews. We will not consider charging for subscriptions until BMCR is
on the Web; at that point we could charge for timely delivery of the review, perhaps several months before universal access. We also want to wait for wide acceptance of a simple electronic cash transfer system. Institutional support seems to us the most obvious way to cover costs, since the college gets considerable exposure for what seems to us a small cost.
The Crosscurrents of Technology Transfer
The Czech and Slovak Library Information Network
One would have no great difficulty in estimating the demand function, i.e., the relationship between the price and the quantity that can be sold at that price for, say, tomatoes. But one would have considerable problems in making sales predictions at various hypothetical alternative prices for a new product that looks like a blue tomato and tastes like a peach. (Quandt 1996, 20)
This vivid image of an odd-looking vegetable that tastes like a fruit is meant to highlight the difficulty of estimating the demand side in the overall cost picture of producing and distributing new products, such as electronic publications. Compared to traditional printed material, electronic products are new, from their internal architecture to the mechanisms of production, distribution, and access that stem from it. After all, the world of readers is not a homogeneous social group, a market with a simple set of specific needs. Yet we assume that a segment of this market, the scholarly community, takes easily and more or less quickly to supplementing its long-established habits (of associating the printed text with a paper object) with different habits, experienced as equally convenient, of searching for and reading electronic texts. While this observation may be correct, it should be emphasized that precisely in the expression "more or less" lies the opportunity, for those of us interested in transitions, to see what is involved in this change of habit and why it is not just a "matter of time." As anyone who has tried to explain the possibilities of electronic text delivery to an educated friend will attest, the idea can be viewed with anxiety and taken to mean the end of the book. The Minister of Culture of the Czech Republic, a well-known author and dissident, looked at me with surprise as I tried to explain the need for library automation (and therefore for his ministerial support): he held both hands clasped together as if in prayer and then opened them up like a book
close to his face. He took a deep breath, exhaled, and explained how much the scent of books meant to him. His jump from on-line cataloging and microfilm preservation to the demise of his personal library was a rather daring leap of the imagination but not an uncommon one, even among those who should know better. It is not just the community of scholars, then, but also politicians and even librarians who must change their attitudes and habits. The problem is further compounded if we consider that in the case of Eastern Europe this new product is being introduced into a setting where the very notion of a market is itself unsettled. The question of demand is quite different in a society that had been dominated by a political economy of command.
In the pages that follow I will give a brief account of an extensive interlibrary automation and networking project that The Andrew W. Mellon Foundation initiated and funded abroad, in the Czech and Slovak republics. While most of the papers in this volume deal with digital libraries, this one points to the complexities that affect the ability of any library to change its ways and expand its mandate to include access to digitized materials. My aim is critical rather than comprehensive. By telling the reader about some of the obstacles that were confronted along the way, I hope to draw attention to the kinds of issues that need to be kept in mind when we think of establishing library consortia (the seemingly natural setting for the new technologies) in other countries.
The CASLIN Projects
The Mellon-funded proposal to establish the Czech and Slovak Library Information Network (CASLIN) commenced in January 1993. In its original stage it involved four libraries in what has now become two countries: the National Library of the Czech Republic (in Prague), the Moravian Regional Library (in Brno), the Slovak National Library (in Martin), and the University Library of Bratislava. These four libraries had signed an agreement (a Letter of Intent) that they would cooperate in all matters pertaining to fully automating their technical services and, eventually, in developing and maintaining a single on-line Union Catalogue. They also committed themselves to introducing and upholding formats and rules that would enable a "seamless" integration into the growing international library community. For example, compliance with the UNIMARC format was crucial in choosing the library system vendor (the bid went to Ex Libris's ALEPH). Similarly, the Anglo-American cataloging rules (AACR2) have been introduced, and most recently there is discussion of adopting the LC subject headings. Needless to say, the implementation was difficult, and the fine-tuning of the system is not over yet, though most if not all of the modules are up and running in all four libraries. The first on-line OPAC terminals were made available to readers during 1996. At present, these electronic catalogs reflect only each library's own collection (there are no links to the other libraries, let alone to a CASLIN Union Catalogue), though they do contain a variety of other databases (for example, a periodicals distribution list is available on the National Library OPAC that lists the location of journals and periodicals in different libraries in Prague, including the years and numbers held). A record includes the call number (a point of no small significance) but does not indicate the loan status, nor does the system allow users to "Get" or "Renew" a book. In spite of these shortcomings, the number of users of these terminals has grown sharply, especially among university students, and librarians are looking for ways to finance more (including some graphics terminals with access to the WWW).
In the period between 1994 and 1996, several additional projects (conceived as extensions of the original CASLIN project) were presented to The Mellon Foundation for funding. It was agreed that the new partners would adopt the same cataloging rules as well as any other standards and that they would (eventually) participate in the CASLIN Union Catalogue. Each one of these projects poses a unique opportunity to use information technology as an integrator of disparate and incongruous institutional settings.
The Library Information Network of the Czech Academy of Sciences (LINCA) was projected as a two-tiered effort that would (1) introduce library automation to the central library of the Czech Academy of Sciences and thereby (2) set the stage for the building of an integrated library-information network that would connect the specialized libraries of all 60 scientific institutes into a single web with the central library as their hub. At the time of this writing the central library's LAN has been completed and most of the hardware installed, including the high-capacity CD-ROM (UltraNet) server. The ideal of connecting all the institutes will be tested against reality as the modular library system (BIBIS by Square Co., Holland) is introduced together with workstations and/or miniservers in the many locations in and outside the city of Prague.
The Košice Library Information Network (KOLIN) is an attempt to draw together three different institutions (two universities and one research library) into a single library consortium. If successful, this consortium in eastern Slovakia would comprise the largest on-line university and research library group in that country. The challenge lies in the fact that the two types of institutions come under two different government oversight ministries (of education and of culture), which further complicates the already strained budgetary and legislative setup. Furthermore, one of the universities, the University of Pavel Josef Safarik (UPJS), at that time had two campuses (in two cities 40 km apart) and its libraries dispersed among thirteen locations. UPJS is also the Slovak partner in the Slovak-Hungarian CD-ROM network (the Mellon-funded HUSLONET), which shares the usage and the costs of purchasing database licenses.
Finally, the last of the CASLIN "add-ons" involves an attempt to bridge incompatibilities between two established library software systems by linking two university and two state scientific libraries in two cities (Brno and Olomouc) into a
single regional network, the Moravian Library Information Network (MOLIN). The two universities-Masaryk University in Brno and Palacký University in Olomouc-have already completed their university-wide library network with TinLib (of the United Kingdom) as their system of choice. Since TinLib records do not recognize the MARC structure (the CASLIN standard adopted by the two state scientific libraries), a conversion engine has been developed to guarantee full import and export of bibliographic records. Though it is too soon to know how well the solution will actually work, it is clear already that its usefulness goes beyond MOLIN, because TinLib has been installed in many Czech universities.
Fortunately, storage, document preservation, retrospective conversion, and connectivity have all undergone substantial changes over the past few years. They are worth a brief comment:
1. Up until the end of the Communist era, access to holdings was limited not only by the increasingly ridiculous yet strict rules of censorship but also by the worsening condition of the physical plant and, in the case of special collections, the actual poor condition of the documents. The National Library in Prague was the most striking example of this situation; it was in a state of de facto paralysis. Of its close to 4 million volumes, only a small percentage was accessible. The rest were literally "out of reach" because they were either in milk crates and unshelved or in poorly maintained depositories in different locations around the country. This critical situation turned the corner in January 1996 when the new book depository of the NL was officially opened in the Prague suburb of Hostivar. Designed by the Hillier Group (Princeton, N.J.) and built by a Czech contractor, it is meant to house 4.5 million volumes and contains a rare book preservation department (including chemical labs) and a large microfilm department. Because more than 2 million volumes were cleaned, moved, and reshelved by the end of 1996, it is now possible to receive the books ordered at the main building (a book shuttle guarantees overnight delivery). Other library construction has been under way, or is planned, for other major scientific and university libraries in the Czech Republic. There is no comparable library construction going on in Slovakia.
2. The original CASLIN Mellon project included a small investment in microfilm preservation equipment, including a couple of high-end cameras (GRATEK) with specialized book cradles (one for each of the National Libraries) as well as developers, reader-printers, and densitometers. The idea was to (1) preserve the rare collection of nineteenth- and twentieth-century periodicals (which are turning to dust), (2) significantly decrease the turnaround time for processing a microfilm request (from several weeks to a few days), and (3) make it technically possible to meet the highest international standards in microfilm preservation. The program has since evolved into a full-scale digitization project (funded by the Ministry of Culture) that includes the collections of other libraries.
3. The most technologically ambitious undertaking, and the one with the most immediate and direct impact on document accessibility, is the project for the retrospective conversion of the general catalog of the National Library in Prague. Known under the acronym RETROCON, it involves a laboratory-like setup of hardware and software (covered by a Mellon Foundation grant) that converts the card catalog, in an assembly-line fashion, into ALEPH-ready electronic form (UNIMARC). RETROCON is designed around the idea of using sophisticated OCR in combination with specially designed software that semiautomatically breaks the converted ASCII record into logical segments and places them in the appropriate MARC fields. This software, developed by a Czech company (COMDAT) in cooperation with the National Library, operates in a Windows environment and allows the librarian to focus on editing the converted record (using a mouse and keyboard, if necessary) instead of laboriously typing in the whole record. As an added benefit, the complete scanned catalog has now been made available for limited searching (under author and title in a Windows environment), thereby replacing the original card catalog. One of the most interesting aspects of this project has been the outsourcing of the final step in the conversion to other libraries, a sort of division of labor (funded in part by the Ministry of Culture) that increases the pool of available expert catalogers.
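The semiautomatic segmentation step can be imagined roughly as follows. This is my own illustrative sketch, not COMDAT's software: it splits a drastically simplified OCR'd card record into author, title, and imprint segments and tags them with UNIMARC-style field numbers (700, 200, 210); real card records and the real field definitions are far richer, and the hypothetical card layout here is an assumption.

```python
# Hypothetical, simplified card layout: "Author. Title. Place: Publisher, Year."
CARD = "Hašek, Jaroslav. Osudy dobrého vojáka Švejka. Praha: Synek, 1926."

def segment_card(text: str) -> dict:
    """Break a simplified OCR'd card into UNIMARC-style fields (illustrative only)."""
    author, title, imprint = [s.strip() for s in text.split(". ", 2)]
    place, rest = imprint.split(": ", 1)
    publisher, year = rest.rstrip(".").rsplit(", ", 1)
    return {
        "700": author,                    # personal name, primary responsibility
        "200": title,                     # title proper
        "210": (place, publisher, year),  # publication data
    }

record = segment_card(CARD)
print(record)
```

In the real workflow the librarian then corrects these machine-proposed segments on screen, which is far faster than rekeying the record.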
4. For the most part, all installations of the LAN have proceeded with minimal problems, and the library automation projects, especially those involving technical services, are finally up and running. Unfortunately, the same cannot be said for the statewide infrastructure, especially the phone system. Up until the end of 1997, the on-line connections between libraries were so poor that it was difficult to imagine, let alone test, what an on-line library network would have to offer. Needless to say, this bottleneck has had an adverse effect on library management, especially of the CASLIN consortium as a whole.
A comparison of the present condition and on-line readiness of research and university libraries in Central Europe with the status quo as it arrived at the doorstep of the post-1989 era leaves no doubt that dramatic improvements have taken place. But even though the once empty (if not broken) glass is now half filled, it also remains half empty. Certainly that is how most of the participants tend to see the situation, perhaps because they are too close to it and because chronic dissatisfaction is a common attitude. Yet the fact remains that throughout the implementation and in all of the projects, obstacles appeared nearly every step of the way. While most of the obstacles were resolved, although not without some cost, all of them can be traced to three basic sources of friction: (1) those best attributed
to external constraints-the budgetary, legal, political, and for the most part, bureaucratic ties that directly affect a library's ability to function and implement change; (2) those caused by cultural misunderstandings-the different habits, values, and expectations that inform the activity of localization; and (3) the internal problems of the libraries themselves, no doubt the most important locus of micropolitical frictions and therefore of problems and delays. In what follows, I will focus on the first source of friction (with some attention paid to the second), since my emphasis here is on the changing relations between what are taken to be separate institutional domains (particularly between libraries and other government organizations or the market) as I try to make sense of the persistently problematic relationships between libraries (particularly within the CASLIN group). Obviously, while these analytical distinctions are heuristically valuable, in reality, these sources of friction are intertwined and further complicated by the fact that the two countries are undergoing post-Communist aftershocks and an endless series of corrections. Not only are the libraries being transformed, but so is the world of which they form a part. To make sense of this double transition and to describe the multifaceted process that the library projects have moved through may pose some difficulties. But the task also offers a unique opportunity to observe whether, and if so how, the friction points move over time. What could have been predicted when the initial project commenced-that implementation and system localization would also mean giving in to a variety of constraints-is only beginning to take on the hard contours of reality four years later. In several instances, the results differ from our initial conception, but I don't think it would be fair to assume that the final outcome will be a compromise. 
Instead, the success of the Mellon library projects in Eastern Europe (of which CASLIN is only one) should be judged by the extent to which they have been accepted and have taken on a life of their own, initially distinguishable but finally inseparable from the library traditions already in place. After all, if the projects were designed to effect a change in the library system (and by "system," we must understand a complex of organizational structures, a real culture, and an actually existing social network), then we must also expect that the library system will respond that way, that is, as a complex sociocultural system. What appeared at first as a series of stages (goals) that were to follow one another in logical progression and in a "reasonable" amount of time may still turn out to have been the right series. It's just that the progression will have followed another (cultural) logic, one in which other players, individuals and the organizational rules that they play by, must have their part. As a result, the time it actually takes to get things done seems "unreasonable," and some things even appear to have failed because they have not taken place as and when expected. What is the meaning of these apparent problems? A seemingly philosophical issue takes on a very real quality as we wonder, for example, about the future of the CASLIN consortium. If establishing a network of library consortia was one of the central aims of the Mellon project, then it is precisely this goal that we have failed to reach, at least now, when it was supposed to be long in place according to our
scheme of things. There is no legal body, no formal association of participating libraries in place. This deficiency is particularly important and, needless to say, frustrating for those of us who take for granted the central role that networking and institutional cooperation play in education and scholarly research. But behind this frustration another one hides: it is probably impossible to say whether what is experienced as the status quo, in this case as a failure or shortcoming, is not just another unexpected curve in a process that follows an uncharted trajectory.
As I have noted above, in 1992 a Letter of Intent had been signed by the four founding CASLIN members; it was a principal condition of the project proposal. In January 1996, when this part of the project was-for all intents and purposes-brought to a close, there was still no formally established and registered CASLIN association with a statute, membership rules, and a governing body in place. Although the four libraries had initially worked together to choose the hardware and software, the work groups that had been formed to decide on specific standards (such as cataloging rules, language localization, or the structure of the Union Catalogue record) had trouble cooperating, and their members often lacked the authority to represent their institutions. Tasks were accomplished more because of the enthusiasm of individuals and the friendly relations that developed among them than because of a planned, concerted effort on the part of the library leadership guided by a shared vision. The initial stages of the implementation process were characterized by an uneven commitment to the shifting of priorities that would be necessary in order to carry the intent through. There was even a sense, in some instances, that the prestige of the project was more important than its execution or, more exactly, that while the funding for library automation was more than welcome, so was the political capital that came with being associated with this U.S.-funded project, even if such an attitude meant using this political capital at a cost to the consortium. As is well documented from many examples of outside assistance in economic development, well-intentioned technology transfer is a prime target for subversion by other, local intentions; it can be transformed with ease into a pawn in another party's game. Potential rivalries and long-standing animosities that existed among some of the libraries, instead of being bridged by the project, seemed to be exacerbated by it.
In one instance, for example, affiliation with the Mellon project was used by a library to gain the attention of high government officials (such as the minister of culture) responsible for policies affecting its funding and, most important, its mandate. The aim, as it now turns out, was to gain the status of a national library. This library's target, that is, the library that already had this status, was the Slovak National Library, its primary CASLIN partner. While both libraries participated in the CASLIN project's implementation and even cooperated in crucial ways at the technical level (as agreed), their future cooperation was being undermined by a parallel, semiclandestine political plot. Needless to say, this situation has left the CASLIN partnership weakened and the management of both libraries dysfunctional.
As the additional library projects mentioned earlier were funded and the new
libraries joined the original CASLIN group, it became clear that the new, larger group existed more in rhetoric than in fact. From the newcomers' point of view there was not much "there" to join. "What is in this for us, and at what cost?" seemed to be the crucial question at the January 1996 meeting at which a written proposal for a CASLIN association was introduced by the National Library in Prague. This meeting was not the first time that such an initiative had been presented but failed to take hold. Nor was it the last. The discussion about the proposal resulted in a squabble. An e-mail discussion group was established to continue the conversation, but nothing came of it, nor of several other attempts.
If the point of a consortium is for libraries to cooperate in order to benefit (individually) from the sharing of resources so as to provide better on-line service, then a situation such as this one must be considered counterproductive. How does one explain the chronic inability of CASLIN to get off the ground as a real, existing organization? Where does the sense of apathy, reluctance, or even antagonism come from? Most of the answers (and there are many) lie hidden within the subtleties of society and history. But of these answers, a few stand out clearly: the fact that all the original CASLIN libraries come under the administrative oversight of the Ministry of Culture is one key piece of the puzzle. The dramatic cuts in the ministries' overall budgets are passed down to the beneficiaries, who find themselves competing for limited goods. Another answer lies in the lingering nature of the relationship: if the difference from the previous setup (under the "planned" socialist economy) lies in the fact that the library now has the status of a legal subject that designs and presents its own budget, its relationship to the ministry-very tense and marked by victimization-seems more like the "same old thing." In other words, certain aspects of organizational behavior continue not only by force of habit (a not insignificant factor in itself), but also because these aspects are reinforced by a continuing culture of codependency and increased pressure to compete for a single source of attention. It appears, from our point of view, as if the former command economy has been transformed into a market economy only to the extent that strategic and self-serving positioning is now more obvious and potentially more disruptive.
So-called healthy competition (so called by those whose voices dominate in the present government and who believe in the self-regulating spirit of "free market forces") seems to show only its ugly side: we see the Mellon project embraced with eagerness in part because of the way its prestige could be used to gain a competitive advantage over other libraries. In the case of CASLIN partners, we see it take the form of suspicion, envy, and even badmouthing expressed directly to the Mellon grants administrator (myself).
What are the constraints under which a research or national library operates, and in what way is the present situation different from the "socialist" era [1948-1989]? An answer to these questions will give us a better sense of the circumstances under which attempts to bring these institutions up to international standards-and get them to actively cooperate-must unfold.
Figures 13.1 and 13.2 illustrate the external ties between a library and other important domains of society that affect its functioning and co-define its purpose before and after 1989 (while keeping in mind that economic, legal, and regulatory conditions have been in flux in the years since 1989 and, therefore, that the rules under which a library operates continue to change).
1. Under "party" rule the library, like all other organizations, came under direct control of its ministry, in this case the Ministry of Culture [MK]. One could even say, by comparison with the present situation, that the library was an extension of the ministry. However, the ministry was itself an extension of the centralized political rule (the Communist party), including the watchful eye of the secret police [STB]. The director was appointed "from above" [PARTY] and the budget arrived from there as well. While requests for funding were entertained, it was hard to tell what would be funded and under what ideological disguise. For the most part the library was funded "just in order to keep it alive," though if the institution ran out of money in any fiscal year, more could be secured to "bail it out" [hence "Soft" Budget]. In addition to bureaucratic constraints (regarding job descriptions and corresponding wage tables, building maintenance and repairs, or the purchase of monographs and periodicals), many of which remain in place, there were political directives regarding employability and, of course, the ever-changing and continuously growing list of prohibited materials to which access was to be denied [Index]. In contrast, the library is now an independent legal body that can more or less decide on its priorities and is free to establish working relationships with other (including foreign) organizations. The decision making, including organizational changes, now resides within the library. While the budget is presented to the ministry and is public knowledge, it is also a "hard" budget that is set at the ministerial level as it matches its cultural policies against those of the Ministry of Finance [MF] (and therefore of the ruling government coalition). 
After an initial surge in funds (all marked for capital investment only), the annual budgets of the libraries have been cut consistently over the past five years (i.e., they are not even adjusted for inflation but each year are actually lower than the previous one). These cuts have seriously affected the ability of the libraries to carry out their essential functions, let alone purchase documents or be in the position to hire qualified personnel. For this reason, I prefer to speak of a relationship of codependence. The Ministry of Culture still maintains direct control over the library's ability to actualize its "independence"-though it has gradually shifted from an antagonistic attitude to one of genuine concern. The point is that whereas the Ministry of Culture is supposed to oversee the well-being of its institutions, it is, as is usually the case in situations of government supervision, perceived as the powerful enemy.
2. The publishing world was strictly regulated under the previous regime: all publishing houses were state enterprises (any other attempt at publishing was punishable by law), and all materials had to pass the scrutiny of the state (political) censor. Not everything that was published was necessarily political trash, and editions were limited; the resulting economy of shortage created a high demand for printed material, particularly modern fiction, translations from foreign languages, and the literary weekly [hence "Seller's Market"]. Libraries benefited from this situation. Because all state scientific and research libraries were recipients of the legal deposit, their (domestic) acquisitions were, de facto, guaranteed. At present the number of libraries covered by the deposit requirement has been reduced from some three dozen to half a dozen. This change was meant to ease the burden on publishers and give the libraries a freer hand in building their collection in a "competitive marketplace." But considering the severe cuts in the budget, many of the libraries cannot begin to fulfill even the most Spartan acquisitions policy. For the same reason publishers, of whom there are many and all of whom are private and competing for the readers' attention, do not consider libraries as important parts of their market. Furthermore, many of the small and often short-lived houses do not bother to apply for the ISBN or to send at least one copy (the legal deposit law is impossible to enforce) to the National Library, which, in turn, cannot fulfill its mandate of maintaining the national bibliographic record.
3. During the Communist era, access to materials was limited for several obvious reasons: political control (books on the Index, limited number of books from Western countries, theft) or deliberate neglect (the progressively deteriorating storage conditions eventually made it impossible to retrieve materials). Over the years there was less and less correspondence between the card catalogs in the circulation room and the actual holdings, and as a result, students and scholars stopped using the National Library in Prague because it was increasingly unlikely that their requests would be filled. This situation was also true for current Czech or Slovak publications because of an incredible backlog in cataloging or because the books remained unshelved. Of course, in such a system there was no place for user feedback. Since then, some notable improvements-many of them due to Mellon and other initiatives-have been made in public services, such as self-service photocopying machines and, to remain with the example of the National Library, quick retrieval of those volumes that have been reshelved in the new depository. Also, readers are now used to searching the electronic OPACs or using the CD-ROM databases in the reference room. On the other hand, the backlog of uncataloged books is said to be worse than before and, with acquisitions cut back and the legal deposit not observed, the reader continues to leave the circulation desk empty-handed. The paradoxical situation is not lost on the reader: if the books are out of print or, as is more often the case these days, their price beyond what readers could
afford, going to the library may not be a solution either. So far the basic library philosophy has remained the same as it has throughout its history: although there is concern for the user, libraries are not genuinely "user driven" (only a few university libraries have adopted an open stack policy) and, as far as I can tell, user feedback is not a key source of information actively sought, analyzed, and used in setting priorities.
4. Under the policies of socialist economy, full employment was as much a characteristic of the library as it was of the rest of society. In effect, organizations hoarded labor (as they did everything else) with a sort of just-in-case philosophy in mind, since the point was to fulfill "the plan" at just about any cost and provide full benefits for all, with little incentive for career development (other than through political advancement). Goods and services became known for their poor quality; the labor force became known for its extremely low productivity and its lousy work morale. More time seemed to be spent in learning how to trick the system than in working with it, to the point where micropolitical intrigue-the backbone of the "second" economy-competed very well with the official chain of command. The introduction of a market economy after 1990 did very little to help change this situation in a library, a state organization with no prestige. Simply put, the novelty and promise of the private sector, coupled with its high employment rate and good wages, has literally cut the library out of the competitive market for qualified labor. Between the budget cuts and the wage tables still in place, there is little space left for the new management to negotiate contracts that would attract and keep talented people in the library, certainly not those with an interest in information technologies and data management.
5. As mentioned above, the first information technologies arrived in the state scientific and national libraries in the late 1980s. Their impact on budgets was minimal (UNESCO's ISIS is freeware) as was their effect on technical services. On the other hand, the introduction of information technologies into these libraries, in particular the CASLIN group, was the single most visible and disruptive change-a sort of wedge that split the library organizations open-that has occurred since 1990 (or, according to some, during the last century). The dust has not yet settled, but in view of our present discussion, one thing is clear already: between the Mellon funds and the initial capital investment that followed, libraries have become a significant market for the local distributors of hardware and for the library software vendors (in contrast to the relationship with publishers). But as we all know, these purchases are not one-time purchases but only the first investments into a new kind of dependency, a new external tie that the library must continue to support and at no small cost. And the cost is not just financial. The ongoing complications with the technology and the chronic delays in systems localization only
contribute to the present sluggish state of affairs and thus lend support to the ever cynical factions within the organization that knew "all along" that "the whole automation project was a mistake." Obviously, the inability to attract qualified professionals doesn't help.
What I have painted here is but part of the picture (the other part would be made up of a detailed analysis of the micropolitics that actually go on, both inside the organization and in relation to other organizations, particularly other libraries). But the above discussion should help us see how and why the libraries feel trapped in a vicious circle from which they perceive little or no way out other than continuing to battle for their place in the sun. Of course, their tactics and battle cries only reinforce the relationship of codependency as well as their internal organizational problems. And that is exactly what the public and government officials see: that these institutions need to grow up and learn what real work is before more money is poured down the drain. Needless to say, a sizable portion of the blame must be carried by a government that has made a conscious choice against long-term investment into the educational, scientific, and information sectors.
If the long-standing administrative ties between libraries and the Ministry of Culture inform and override the building of new, potentially powerful ties to other libraries, then the flip side of this codependency, its result, is a lack of experience with building and envisioning the practical outcome of a horizontally integrated (i.e., nonhierarchical) association of independent organizations. The libraries had only limited exposure to automation, and the importance of long-term strategic planning was lost on some of them. At least two other factors further reinforced this situation: the slow progress (the notorious delays mentioned above) in the implementation of the new system, which had involved what seemed like impractical and costly steps (such as working in UNIMARC), and the sluggish Internet connection. These factors suggest that at present, a traditional understanding of basic library needs (which are themselves overwhelming) tends to take precedence over scenarios that appear much too radical and not grounded in a familiar reality. Since the on-line potential is not fully actualized, its impact is hard to imagine, and so the running of the organization in related areas continues to be predominantly reactive rather than proactive. In other words, in-house needs are not related to network solutions, especially when such solutions appear to be counterintuitive for the established (and more competitive) relationship between the libraries.
Cooperation among the libraries exists at the level of system librarians and other technical experts. Without this cooperation, the system would not have been installed, certainly not as an identical system in all four libraries. In addition (and, I should say, ironically) the CASLIN project has now received enough publicity to make it a household name among librarians. The acronym
has a life of its own, and there is a growing interest among other scientific libraries in joining this "prestigious" group (that both does and does not exist). But are we waiting for a moment at which the confluence of de facto advances in technical services and a growing interest of other libraries in logistical support (involving technology and technical services) will create a palpable need for a social organization that would exist (1) above and beyond the informal network of cooperation and (2) without association with the name and funds of The Andrew W. Mellon Foundation (its original reason for existence)? I have heard it said that "nothing more is needed," because the fundamentals of CASLIN are now embedded in the library process itself (the reference here, I gather, was to cataloging) and in the existing agreements between individual libraries on the importing and exporting of records into and from the CASLIN Union Catalogue that is to be serviced by the two national libraries. In fact, as the most recent meeting (June 1997) of the Union Catalogue group made clear, such processes are indeed where the seed of an association of CASLIN libraries lies. The import and export of records and the beginning of the Union Catalogue database have yet to materialize, but the preparations did bring together the individuals who represent the member libraries. If these people figure out a way to run their own show and stick to it, then there is a fair chance that an organization of CASLIN libraries will take off after all.
The above discussion raises three very important points. The first point regards cultural misunderstanding. The problem with the "misbehaving consortium" may lie to some extent with our (e.g., U.S.) expectations of what cooperation looks like and what basic fundamentals an on-line library consortium must embrace in order to do its job well. In the Czech and Slovak case, not only were the conditions not in place, they worked against it. While our naïveté caused no harm (the opposite is the case, I am repeatedly told!), it remains to be seen what the final result will look like. And in the final result resides the really intriguing lesson: maybe it is not so much that we should have or even could have thought differently and therefore ended up doing "this" rather than "that." Perhaps it is in the (information) technology itself-in its very organization-that the source of our (mis)understanding lies. After all, these technologies were developed in one place and not another. Our library automation systems obviously embody a particular understanding of technical and public services and an organization of work that share the same culture as a whole tradition of other technologies that emphasize speed, volume (just think of the history of railroads or the development of the "American system" of manufacturing), and finally, access. Every single paper in this volume exemplifies and assumes this world. In transferring a technology from one place to another, an implied set of attitudes and habits is being marketed as well. The intriguing question is whether the latter emerges logically from the former in the
new location. To this possibility, my second point lends some support: technology transfer involves a time lag, the duration of which is impossible to predict and which is accounted for by a complex series of micropolitical adjustments. It is this human factor that transforms the logical progression of the projected implementation process into a much less logical but essentially social act. Thanks to this human factor, the whole effort may fail. Without it, the effort would not exist. Only after certain problems and not others arise will certain solutions and not others seem logical. It is no secret that much social change is technology driven. It is less clear, ethnographically speaking, what exactly this process means, and even less is known about it when technology travels across cultural boundaries. There is much to be gained from looking carefully at the different points in the difficult process of implementing projects such as CASLIN. Apparently the ripple effect reaches far deeper (inside the institutions) and far wider (other libraries, the government, the market, and the users) than anyone would have anticipated. Even before delivering fully on its promise, the original Mellon project is demanding changes in library organization and management. Such changes are disruptive, even counterproductive, long before they "settle in."
Nevertheless, and this is my third point, internal organizational change involves a gradual but, in consequence, quite radical realignment of ties with the outside, that is, with the Ministry of Culture (which-at least on the Czech side-has taken a keen interest in supporting library automation throughout the country; on the Slovak side, unfortunately, the situation is rather different), with other libraries (there has been a slow but palpable increase in interlibrary cooperation on specific projects that involve the use of information technologies, e.g., retrospective conversion, newspaper preservation, and, I hope, the CASLIN Union Catalogue), and most important, with the public. How far-reaching and permanent these shifts are is difficult to say, especially when any accomplishments have been accompanied by a nagging feeling that they were achieved on a shoestring and against all odds. The persistent inability of the governments to pass legislation and appropriate funding that would support the newly emerging democracies' entrance into the global information age in a sustainable manner highlights a serious lack of vision as well as of political savvy.
At the beginning of this paper I argued that in discussing the introduction of new technologies, specifically information technologies, it is important to pay attention to the point of transition, to see all that is involved in this change of habit and why it is not just a "matter of time." The body of this paper, I hope, provided at least a glimpse of some of the friction points involved. For the time being, the last word, like the first, belongs to an economist, in this case to Václav Klaus, the prime minister of the Czech Republic (1993-1997), whose opinions expressed in a recent op-ed piece on "Science and our Economic Future" make him sound like someone who has just bitten into a blue tomato only to find that it tastes like a peach.
Science is not about information, but about knowing, about thinking, about the ability to generalize thoughts, make models of them and then testable hypotheses that
are to be tested. Science is not about the Internet and certainly not about its compulsory introduction. (Klaus 1997)
Consortial Access versus Ownership
This chapter reports on a consortial attempt, made possible by the advent of high-speed telecommunication networks throughout the world, to overcome the high costs of scholarly journals and to study the roots of the cost problem. The literature on the problem of journal costs includes both proposals for new ways of communicating research results and many studies on pricing.
Prominent members of the library profession have written proposals on how to disengage from print publishers. Others have suggested that electronic publications will soon emerge and bring an end to print-based scholarship. Yet another proposal is that libraries solve the problem by publishing journals themselves. These proposals, however, tend not to accommodate the argument that loosely coupled systems cannot be easily restructured. While access rather than ownership promises cost savings to libraries, the inflation problem requires further analysis of the factors that establish journal prices before it can be solved.
Many efforts to explain the problem occupy the literature of the library profession and elsewhere. The most exhaustive effort to explain journal price inflation, published by the Association of Research Libraries for The Andrew W. Mellon Foundation, provides ample data, but no solution. Examples of the problem appear frequently in the Newsletter on Serials Pricing Issues, which was developed expressly to focus discussion of the issue. Searches for answers appear to have started seriously with Hamaker and Astle, who provided an explanation based on currency exchange rates. Other analyses propose means to escape inflation by securing federal subsidies, complaining to publishers, raising photocopying charges, and convincing institutional administrators to increase budgets.
Many analyses attempt to isolate factors that determine prices and the difference in prices between libraries and individuals. Some studies look at the statistical
relevance of sundry variables, but especially publisher type. They confirm the belief that certain publishers, notably in Europe, practice price discrimination. They also show that prices are driven by the cost of production, which is related to frequency of issue, number of pages, and presence of graphics. Alternative revenue from advertising and the exchange rate risk borne by foreign publishers also affect price. Quality measures of the content, such as the number of times a periodical is cited, affect demand, which in turn affects price. Economies of scale available to some journals with large circulations also affect price. These articles help explain differentials between individual and library prices. Revenues lost to photocopying also account for some of the difference. Finally, differences in the way electronic journals may be produced compared to print provide a point on which some cost savings could be based.
The costs of production and the speed of communication may be the driving forces that determine whether new publications emerge in the electronic domain to replace print. However, in a framework shaped by copyright law, the broader interaction of demand and supply more likely determines the price of any given journal. Periodical prices remain quite low over time when magazine publishers sell advertising as the principal generator of revenue. When, for political or similar reasons, publication costs are borne by organizations, usually not scholarly societies, periodical prices tend to be lower. Prices inflate in markets with high demand, such as the sciences, where the multiple users include practicing physicians, pharmaceutical firms, national laboratories, and so forth.
Unfortunately for libraries, the demand from users for any given journal is usually inelastic. Libraries tend to retain subscriptions regardless of price increases, because the demand originates with nonpaying users. In turn, price increases charged to individual subscribers to scholarly journals drive user demand toward libraries. Therefore, it might be expected that as publishers offer currently existing print publications in electronic form, they will retain both their prices and their inelastic demand. Commercial publishers, who are profit maximizers, will seek to retain or improve their profits when expanding into the electronic market. However, some properties associated with electronic journals could relax this inelasticity.
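The revenue logic behind this inelasticity can be made concrete with a small numerical sketch. The elasticity values below are hypothetical, chosen only to illustrate the mechanism: when quantity demanded falls less than proportionally to a price increase, total revenue rises, so a publisher facing inelastic library demand has little reason to restrain prices.

```python
# Illustrative sketch with hypothetical elasticity values: under
# constant-elasticity demand, q scales as p**e, so a price increase
# raises total revenue whenever demand is inelastic (-1 < e < 0).

def revenue_change(elasticity: float, price_increase: float) -> float:
    """Fractional change in revenue p*q after a fractional price
    change, assuming quantity scales as price**elasticity."""
    quantity_factor = (1 + price_increase) ** elasticity
    return (1 + price_increase) * quantity_factor - 1

# Library demand assumed highly inelastic (e = -0.1, hypothetical):
# a 10% price increase still raises revenue, by roughly +9%.
print(f"libraries:   {revenue_change(-0.1, 0.10):+.1%}")

# Elastic individual demand (e = -2.0, hypothetical): revenue falls,
# which is consistent with publishers losing individual subscribers.
print(f"individuals: {revenue_change(-2.0, 0.10):+.1%}")
```

The numbers are invented, but the sign of the result depends only on whether the elasticity lies above or below -1, which is the substance of the claim in the text.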
This chapter describes a multidisciplinary study of the impact of electronic publishing on the pricing of scholarly periodicals. A brief overview of the pricing issue comparing print and electronic publishing is followed by a summary of the access approach to cost containment. A preliminary report on an attempt at this technique by a consortium and on an associated econometric study also is included.
Overview of Pricing Relevant to Electronic Journals
The industry of scholarly print publishing falls into the category of monopolistic competition, which is characterized by the presence of many firms with differentiated products and by no barriers to entry of new firms. As a result of product
differentiation, scholarly publishers do not encounter elastic aggregate demand typically associated with competitive markets. Rather, each publisher perceives a negatively sloped individual demand curve. Therefore, each supplier has the opportunity to partially control the price of its product, even though barriers to entry of new, competing periodical titles may be quite low. Given this control, publishers have raised their prices to libraries with some loss of sales but with consequent increases in profits that overwhelm those losses. They segment their market between individuals and libraries and charge higher prices to the latter in order to extract consumer surplus.
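The market segmentation just described can be sketched with a toy model of third-degree price discrimination (all figures below are hypothetical). A publisher facing linear demand in each segment sets each segment's price independently; the less price-sensitive segment, here the libraries, pays far more even though the production cost is identical.

```python
# Toy model of price discrimination across two segments (all numbers
# are hypothetical, for illustration only). Linear demand per segment:
#     q(p) = a - b*p
# Profit (p - cost) * q(p) is maximized where its derivative is zero,
# which gives p* = (a/b + cost) / 2.

def optimal_price(a: float, b: float, cost: float) -> float:
    """Profit-maximizing price for linear demand q = a - b*p."""
    return (a / b + cost) / 2

unit_cost = 20.0  # assumed marginal cost per subscription (same for both)

# Individuals: price-sensitive, so demand falls quickly (large b).
p_individual = optimal_price(a=100.0, b=1.0, cost=unit_cost)

# Libraries: inelastic demand (small b), since the users do not pay.
p_library = optimal_price(a=100.0, b=0.2, cost=unit_cost)

print(f"individual subscription price: {p_individual:.2f}")  # 60.00
print(f"library subscription price:    {p_library:.2f}")     # 260.00
```

The spread between the two prices arises from demand conditions alone, with no difference in production cost, which is why segmenting the market lets publishers extract consumer surplus from libraries.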
As publishers lose sales to individuals, scholars increase their dependency on libraries, which then increase interlibrary borrowing to secure the needed articles. Photocopies supplied via library collections constitute revenue lost to publishers, but this loss is recaptured in the price differential. Additional revenue might accrue if publishers could offer their products in electronic databases where they could monitor all duplication. This potential may rest on the ability of publishers to retain control, in the electronic domain, of the values they have traditionally added to scholarship.
Scholars need two services from the scholarly literature: (1) input in the form of documentation of the latest knowledge and/or information on scholarly subjects and (2) outlets for their contributions to this pool of scholarship. Partly in exchange for their trade of copyright, scholars receive value in four areas. First, scholars secure value in communication when every individual's contribution to knowledge is conveyed to others, thus enhancing the reputation of each author. Second, although not provided by publishers directly, archiving provides value by preserving historically relevant scholarship and fixing it in time. Third, value accrues from the filtering of articles into levels of quality, which improves the allocation of search costs and establishes or enhances reputation. Fourth, the segmenting of scholarship into disciplines reduces scholars' input search costs. The exchange of copyright ownership for value could be affected by the emergence of electronic journals.
Electronic journals emerge as either new titles exclusively in electronic form or existing print titles transformed to electronic counterparts. Some new journals have begun exclusively as electronic publications with mixed success. The directory published by the Association of Research Libraries listed approximately 27 new electronic journals in 1991. By 1995 that figure had risen to over 300, of which some 200 claim to be peer reviewed. Since then hundreds more electronic journals have been added, but the bulk of these additions appear to be electronic counterparts of previously existing print journals. In fact, empirical work indicates that exclusively electronic publications have had little impact on scholarship.
The infrastructure of scholarly print publishing evolved over a long time. In order for a parallel structure to emerge in the electronic domain, publishers have to add as much value to electronic journals as they do print. Value must be added
in archiving, filtering, and segmenting in addition to communication. Establishing brand quality requires tremendous energy and commitment. Some electronic titles are sponsored by individuals who are fervent in their efforts to demonstrate that the scholarly community can control the process of communicating scholarship. However, it is unrealistic to expect an instantaneous, successful emergence of a full-blown infrastructure in the electronic domain that overcomes the obstacles to providing the values required by scholars. The advantage of higher communication speed is insufficient to drive the transformation of scholarship; thus traditional publishing retains an edge in the electronic domain.
A transformation is being achieved effectively by duplicating existing print journals in the electronic sphere, where publishers face less imposing investments to provide electronic counterparts to their product lines. For example, the Adonis collection on CD-ROM contains over 600 long-standing journals in medicine, biology, and related areas covering about seven years. Furthermore, EBSCO, University Microfilms (UMI), Information Access Company (IAC), Johns Hopkins University Press, OCLC, and other companies are implementing similar products. OCLC now offers libraries access to the full text of journal collections of more than 24 publishers. Johns Hopkins University Press has made all 46 plus titles that it publishes available on-line through Project MUSE.
During the past 15 years, libraries have experienced a remarkable shift from acquiring secondary sources in print to accessing them through a variety of electronic venues, which suggests that many scholarly periodicals will become available electronically as an automatic response to the economies available there. However, some monopoly power of publishers could be lost if barriers to the entry of new journals are lower in the electronic domain than in the print domain. With full text on-line, libraries may take advantage of the economies of sharing access when a group of libraries contracts for shared access to a core collection. Sharing a given number of access ports allows economies of scale to take effect. Were one access port provided to each member of a consortium of 15 libraries, the vendor would tie up a total of 15 ports, yet any given library in the group would have difficulty servicing its user population with a single port. By combining access, however, the 15 libraries together might get by with as few as 10 ports, because the statistical likelihood is small that more than 10 ports would be needed by the consortium at any given moment. This saves the vendor computer resources, which can then lead to a discount for the consortium that nets out to a lower cost for the libraries.
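The port-sharing argument can be made concrete with a small probability sketch. The calculation below is purely illustrative: it assumes, hypothetically, that each library's users keep a port busy about half the time at peak, and models simultaneous demand across the 15 libraries as a binomial random variable.

```python
import math

# Back-of-the-envelope sketch of the port-sharing argument.
# The 50% peak-hour utilization figure is an assumption for illustration.
n_libraries = 15
p_busy = 0.5        # assumed chance a given library needs a port at any instant
shared_ports = 10

def prob_overflow(n, p, ports):
    """Probability that more than `ports` of n libraries need a port at once,
    treating simultaneous demand as a binomial random variable."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(ports + 1, n + 1))

print(f"P(demand exceeds {shared_ports} shared ports) = "
      f"{prob_overflow(n_libraries, p_busy, shared_ports):.3f}")
```

Under these assumed figures, the chance that demand exceeds the 10 shared ports at any given moment is only about 6 percent, which is the statistical cushion that lets the consortium buy fewer ports than the sum of its members would individually.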
Numerous models for marketing exist, but publishers can price their products in the electronic domain in fundamentally only two ways. Either they will offer their products on subscription to each title or group of titles, or they will price the content on an article-by-article transaction basis. Vendor collections of journals offered for one flat fee based on the size of the user population represent a variant on the subscription approach. Commercial publishers, who are profit maximizers, will choose the method with the higher potential to increase their profit. Transaction-based
pricing offers the possibility of capturing revenue lost to interlibrary lending. Also, demand for content could increase because of the ease of access afforded on-line. On the risk side, print subscription losses would occur where the cumulative expenditure for transactions from a given title is less than its subscription price.
Potentially, two mechanisms could flatten demand functions in the electronic domain. First, making articles available individually to consumers creates quality competition that increases the elasticity of demand, because quality varies from article to article. Presumably, like individual grocery items, demand for particular articles is more elastic than demand for periodical titles. Economists argue that the demand for tortillas is more elastic than the demand for groceries in general because other bakery goods can be substituted for tortillas, whereas there is no substitute for groceries in general except higher priced restaurant eating. Similarly, when scholars must buy individual articles, price increases will dampen demand more quickly than would be the case for a bundle of articles that is of interest to a group of consumers.
Second, offering articles in an environment where the consuming scholar must pay directly (or at least observe the cost to the library) diminishes the separation of payer and demander that is common with library collections and that produces high inelasticity. This mechanism will increase elasticity because scholars will no longer face a zero price. Even if the scholar is not required to pay directly for the article, increased awareness of price will have a dampening effect on demand. However, publishers may find it possible to price individual articles so that the sum of individual article fees paid by consumers exceeds the bundled subscription price experienced by libraries formerly forced to purchase a whole title to get articles in print.
For a product like Adonis, which is a sizable collection of periodicals in the narrow area of biomedicine, transaction-based pricing works out in favor of the consumer versus the provider, since there will likely be only a small number of articles of interest to consumers from each periodical title. This result makes purchasing one article at a time more attractive than buying a subscription, because less total expenditure will normally result. In the case of a product composed of a cross section of general purpose periodicals, such as the UMI Periodical Abstracts, the opposite will be true. The probability is higher that a user population at a college may collectively be interested in every single article in general purpose journals. This probability makes subscription-based pricing more favorable for libraries, since the cumulative cost of numerous transactions could easily exceed the subscription price. Publishers will seek to offer journals in accordance with whichever of these two scenarios results in the higher profit. Scientific publishers will tend to bundle their articles together and make products available as subscriptions to either individual journals or groups. Scholarly publishers with titles of general interest will be drawn toward article-by-article marketing.
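The two scenarios reduce to a simple break-even comparison between a title's subscription price and the cumulative cost of the articles a user population actually wants. The sketch below uses made-up prices and usage figures purely to illustrate the logic.

```python
# Hypothetical break-even arithmetic for the two pricing scenarios above.
# All prices and usage figures are illustrative assumptions.
def cheaper_option(subscription_price, per_article_price, expected_articles):
    """Return the cheaper way for a library to obtain one title's content."""
    transaction_total = per_article_price * expected_articles
    return "per-article" if transaction_total < subscription_price else "subscription"

# Narrow specialty title (Adonis-like case): few articles of local interest.
print(cheaper_option(600, 15, 8))    # prints "per-article"  (8 * $15 = $120 < $600)
# General-purpose title: broad campus interest in most of its articles.
print(cheaper_option(120, 15, 40))   # prints "subscription" (40 * $15 = $600 > $120)
```

The crossover point, subscription price divided by article price, is what drives each publisher toward whichever pricing mode yields the larger revenue for its kind of title.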
An Elsevier effort to make 1,100 scientific titles available electronically will be
priced on a title-by-title subscription basis and at prices higher than the print version when only the electronic version is purchased. On the other hand, the general purpose titles included in UMI's Periodical Abstracts full text (or in the similar products of EBSCO and IAC), as an alternative interface to their periodicals, are available on a transaction basis by article. These two approaches seek to maximize profit in accordance with the nature of the products.
Currently, UMI, EBSCO, and IAC, which function as the aggregators, have negotiated arrangements that allow site licenses for unlimited purchasing. These companies are operating as vendors who make collections of general purpose titles available under arrangements that pay the publishers royalties for each copy of their articles printed by library users. UMI, IAC, and EBSCO have established license arrangements with libraries for unlimited printing with license fees based on expected printing activity, thus offering some libraries a solution to the fundamental pricing problem created by the monopoly power of publishers.
New research could test whether publishers are able to retain monopoly power with electronic counterparts to their journals. Theory predicts that in a competitive market, even one characterized as monopolistic competition, demand at the price offered to individuals will tend to remain elastic. Faced with a change in the price of subscriptions purchased from their own pockets, scholars will act discriminately. Raise the price to individuals and some will cancel their subscriptions in favor of access to a library. In other words, the price of periodicals to individuals is a determinant of demand for library access. By exercising a measure of monopoly power over price, publishers have some ability to influence their earnings through price discrimination.
In contrast, publishers can set prices to libraries higher than the price to individuals as a means to extract consumer surplus. The difference in prices provides a reasonable measure of the extent of monopoly power, assuming that the individual subscription price is an acceptable proxy for the marginal cost of production. Even if not perfect, the difference in prices represents some measure of monopoly power. Extending this line of research may show that monopoly power is affected by the medium.
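The price-differential measure described above can be written as a Lerner-style index: with the individual subscription price standing in for marginal cost, monopoly power is the markup as a share of the library price. The prices below are hypothetical.

```python
# Monopoly-power proxy from the text: treat the individual subscription
# price as a stand-in for marginal cost, then compute a Lerner-style index
# (P_library - P_individual) / P_library. Prices are hypothetical.
def lerner_index(library_price, individual_price):
    return (library_price - individual_price) / library_price

# A title priced at $450 to libraries but $90 to individuals.
print(round(lerner_index(450.0, 90.0), 2))   # prints 0.8
```

A value near zero would indicate little price discrimination; values approaching one indicate that most of the library price is markup over the individual-subscriber proxy for marginal cost.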
In monopolistic competition, anything that differentiates a product may increase monopoly power. Historically, tremendous amounts of advertising money are expended to create the impression that one product is qualitatively distinguishable from others. It may be that electronic availability of specific titles will create an impression of superior quality that could lead to higher prices. However, the prices of journals across disciplines also may be driven by different factors. In general, prices are higher in the sciences and technical areas and lower in the humanities. This price differential is understandable considering that there is essentially no market for scholarly publications in the humanities outside of academe, whereas scientific publications are used heavily in corporate research. As a result, monopoly power will likely be stronger in the sciences than in other areas. This
power would reflect additional price discrimination in the electronic environment by publishers who are able to capture revenue lost to photocopying.
Access Versus Ownership Strategy
Clearly, if commercial publishers continue to retain or enhance their monopoly power with electronic counterparts of their journals, the academic marketplace must adjust or react more effectively than it has in the past. With an appropriate strategy, the reaction of universities could erode the success publishers have previously achieved with price discrimination. Instead of owning the periodicals needed by their patrons, some libraries have experimented with replacing subscriptions with document delivery services. Louisiana State University reports canceling a major portion of its print journals, replacing the cancellations by offering faculty and students unlimited subsidized use of a document delivery service. The first-year cost for all the articles delivered through this service was much less than the total cost to the library for the former subscriptions. Major savings for the library budget via this approach would appeal to library directors and university administrators as a fruitful solution. However, it may turn out to be a short-term solution at best.
Carried to its logical conclusion, this approach produces a world in which each journal is reduced to one subscription shared by all libraries. This situation is equivalent to every existing journal having migrated to a single copy in an on-line file accessible to all interested libraries. Some libraries will pay a license fee in advance to allow users unlimited printing access to the on-line title, while others will require users to pay for each article individually. Individual article payment requires the entire fixed-cost-plus-profit component of a publisher's revenue to be distributed over article prints alone. With print publications, by contrast, the purchase of a physical artifact containing many articles not needed immediately brought with it a bonus: the library acquired and retained many articles with potential future use. Transaction-based purchasing sacrifices this bonus and increases the marginal cost of articles in the long run. In sum, the marginal cost of a journal article in the print domain was suppressed by the spread of expenditure over many items never read. In the electronic domain under transaction-based pricing, users face a higher, more direct price and therefore are more likely to forgo access. While the marginal benefit to the user may be equivalent, the higher marginal cost makes it less likely that users will ask for any given article. The result may show up in diminished scholarly output or notably higher prices per article.
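The cost-shifting argument can be illustrated with simple arithmetic. Every figure below is an assumption invented for the sketch: the point is only that the same fixed cost recovered over far fewer units yields a much higher per-article price.

```python
# Illustrative arithmetic: the same publisher fixed cost recovered over
# subscription copies versus over article prints alone.
# Every figure here is an assumption made up for the sketch.
fixed_cost = 100_000          # annual first-copy cost of producing the journal
articles_per_year = 200

# Print world: 500 subscribing libraries each buy all 200 articles,
# so the fixed cost is spread over 100,000 article-copies.
print_world = fixed_cost / (500 * articles_per_year)

# Transaction world: only the 20,000 articles actually printed carry the cost.
transaction_world = fixed_cost / 20_000

print(f"per-article fixed cost: print ${print_world:.2f}, "
      f"transaction ${transaction_world:.2f}")
```

Under these assumed numbers the per-article burden rises fivefold, from $1.00 to $5.00, once the never-read articles no longer share the fixed cost.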
In the long term, should a majority of libraries take this approach, it carries a benefit for publishers. There has been no means available in the past for publishers to count the actual number of photocopies made in libraries and thus to set their price accordingly. The electronic domain could make all those hidden transactions readily apparent. As a result, publishers could effectively maintain their
corporate control of prices and do so with more accurate information with which to calculate license fees. Given this attempted solution, publishers would be able to regain and strengthen their monopoly position.
A more promising approach lies in consortial projects such as that conducted by the Associated Colleges of the South (ACS). Accompanying the Periodical Abstracts and ABI/Inform indexes of UMI, which are made available on-line from the vendor or through OCLC, are full-text collections of over 1,000 existing journals with backfiles. The ACS contracted an annual license for these two products (ABI/Inform and Periodical Abstracts) for the 13 schools represented. Trinity University pays $11,000 per year for the electronic periodicals in the UMI databases, a cost similar to that paid by each ACS library. Trinity subscribes to the print version of about 375 titles covered by these products; canceling its subscriptions to the print counterparts of the journals provided would save $25,000. Even considering that Trinity's library will subsidize user printing for paper, toner, and so forth at an expected cost of several thousand dollars per year to service its 230 faculty and 2,500 students, it appears likely that favorable economies accrue from switching to these electronic products. Of course, these savings will be accompanied by a significant decrease in nondollar user cost to patrons, so unmet demand will emerge to offset some of the savings. Moreover, there is a substantial bonus for Trinity users inherent in this arrangement.
There are a number of titles made available in the UMI product for which subscriptions would be desirable at Trinity but have not been purchased in the past because of budget limitations. From some of these titles, users would have acquired articles through the normal channels of interlibrary loan. The interlibrary loan process, however, imposes costs in the form of staff time and user labor and is sufficiently cumbersome that many users avoid it for marginally relevant articles. If marginal articles could be easily viewed on screen as a result of electronic access, users would consider the labor cost of acquiring them sufficiently reduced to encourage printing the articles from the system. Therefore, the net number of article copies delivered to users will increase significantly, simultaneous with a substantial net decrease in the cost of subscriptions delivered to libraries.
Included in this equation are savings that accrue to the consortial libraries by sharing access to electronic subscriptions. Shared access will result in a specific number of print cancellations, which will decrease publisher profit from subscriptions. Publishers offering their journals in the electronic domain will be confronted by a change in the economic infrastructure that will flatten the scholar's demand functions for their titles while simultaneously increasing the availability of articles to the direct consumers. By lowering the user's nondollar cost of accessing individual articles, demand will increase for those items. Scholars, therefore, will be more likely to print an article from an electronic library than they would be to request it through interlibrary loan. However, depending on library policy, those scholars may be confronted with a pay-per-print fee, which will affect their demand function. If publishers raise the price to scholars for an article, they are more likely to lose a sale. Users will be more cautious with their own money than with a library's. That is, in the electronic domain, where scholars may be paying directly for their consumption, demand functions will be more elastic. This elasticity will occur to some extent even when users do not pay for articles but merely note the article price paid by their subsidizing library. Therefore, price discrimination may be more difficult to apply and monopoly power will be temporarily lost.
The loss might be temporary because this strategy is functionally the same as merging several libraries into one large library and providing transaction-based access versus ownership. This super library could ultimately face price discrimination similar to that currently existing in the print domain. This discrimination will lead, in turn, to the same kind of inflation that has been suffered for many years.
Preliminary Analysis of Financial Impact
This paper reports on the early stages of a three-year study funded by The Andrew W. Mellon Foundation. The study includes analysis directed at testing the viability of consortial access versus ownership for cost savings as well as the potential long-term solution that would derive from emergence of a new core of electronic titles. A complete financial analysis of the impact of consortial, electronic access to a core collection of general purpose periodicals and an econometric analysis of over 2,000 titles on the impact of electronic availability on pricing policy will issue from the study conducted under this grant. Some interesting issues have emerged with preliminary results of the study.
The Palladian Alliance is a project of the Associated Colleges of the South funded by The Andrew W. Mellon Foundation. This consortium of 13 liberal arts colleges (not just libraries) has a full-time staff and organizational structure. The Palladian Alliance came about as a result of discussions among the library directors, who were concerned about the problem described in this paper. As the project emerged, it combined the goals of several entities, which are shown in Table 14.1 along with the specific objectives of the project.
The Andrew W. Mellon Foundation awarded a grant of $1.2 million in December 1995 to the ACS. During the first half of 1996, the librarians upgraded hardware, selected a vendor to provide a core collection of electronic full-text titles, and conducted appropriate training sessions. Public and Ariel workstations were installed in libraries by July 1996 and necessary improvements were made to the campus networks to provide access for using World Wide Web technology. Training workshops were developed under contract with Amigos and SOLINET on technical aspects and were conducted in May 1996. During that same time, an analysis was conducted to isolate an appropriate full-text vendor.
After comparison of the merged print subscription list of all institutions with three products (IAC's InfoTrac, EBSCO's EBSCOHOST, and UMI's Periodical Abstracts and ABI/Inform), the project team selected UMI with access through OCLC. A contract with OCLC was signed in June for July 1, 1996, start-up of FirstSearch for the nine core databases (WorldCat, FastDoc, ERIC, Medline, GPO Catalog, ArticleFirst, PapersFirst, ContentsFirst, and ProceedingsFirst) and for UMI's two core indexes, Periodical Abstracts and ABI/Inform, along with their associated full-text databases. This arrangement for the UMI products provides a general core collection with indexing for 2,600 titles, of which approximately 1,000 are full-text titles.
The UMI via OCLC FirstSearch subscription was chosen because it offered several advantages including the potential for a reliable, proprietary backup to the Internet, additional valuable databases at little cost, and easy means to add other databases. The UMI databases offered the best combination of cost and match with existing holdings. UMI also offered the future potential of full-image as well as ASCII text. After the first academic year, the project switched to access via ProQuest Direct in order to provide full image when available.
Students have had access to the core electronic titles since the fall semester in 1996. As experience builds, it is apparent that the libraries do have some opportunity to cancel print subscriptions with financial advantages. The potential costs, savings, and added value are revealed in Tables 14.2 through 14.4. Specific financial impact on the institutions during the first year is shown in Tables 14.5 and 14.6. It should be noted that the financial impact is based on preliminary data that has been extremely difficult to gather. Publisher and vendor invoices vary considerably
between schools on both descriptive information and prices. Therefore, these results will be updated continually throughout the project.
At the outset, the project benefits the libraries in a significant way because of the power of consortial purchasing. Only a few of the larger libraries might be able to afford to purchase access to both full-text databases were they constrained to individual purchases. Added together, individual subscriptions to ABI/Inform and Periodical Abstracts accompanied by the full text would collectively cost the 13 libraries $413,590 for 1997/98. By arranging consortial purchase, the total cost to the ACS is $129,645 for this second year. Because the libraries can then afford their share of the collective purchase, the vendor benefits from added sales otherwise not available, and the libraries add numerous articles to the resources provided to their students. A more detailed accounting of the benefits is given in the accompanying tables.
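The scale of the consortial discount follows directly from the two figures reported above:

```python
# Consortial saving implied by the figures reported in the text.
individual_total = 413_590   # sum of 13 separate subscriptions, 1997/98
consortial_total = 129_645   # negotiated consortial price for year two

savings = individual_total - consortial_total
print(f"consortial savings: ${savings:,} "
      f"({savings / individual_total:.0%} of the individual total)")
```

That is, consortial purchasing cuts the collective bill by $283,945, roughly 69 percent of what the 13 libraries would pay separately.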
These tables are based on actual financial information for the consortium. Table 14.2 summarizes the project costs. These calculations will be corrected to reflect revised enrollment figures immediately prior to renewal for the third year. The project was designed to use grant funds exclusively the first year, then gradually shift to full support on the library accounts by the fourth year.
As the project started, the ACS libraries collectively subscribed through EBSCO, FAXON, and Readmore to approximately 14,600 subscriptions as shown in Table 14.3. Of these subscriptions, 6,117 are unique titles; the rest are duplicates of these unique titles. Were the ACS libraries collectively merged into one collection, it would therefore be possible to cancel more than 8,000 duplications and save over $1,000,000. Since this merger was not possible, the libraries contracted for electronic access to nearly 1,000 full-text titles from UMI. Over 600 of these UMI titles match the print subscriptions collectively held by the libraries. As Table 14.3 indicates, canceling all but one subscription to the print counterparts of the UMI titles could save the libraries about $137,000 for calendar year 1996. Canceling all the print counterparts to the electronic versions would save nearly $185,000, which is about equal to the licensing costs for the first year per Table 14.2.
For calendar year 1996, the libraries canceled very few titles. In part, this came about because of reluctance to depend upon an untested product. There was no existing evidence that UMI (or any other aggregator) could maintain a consistent list of offerings. To date, cancellations for 1997 have also been fewer than expected at the outset. Furthermore, the project has begun to show that products such as ProQuest Direct come closer to offering a large pool of journal articles than they do to offering full electronic counterparts to print subscriptions. However, these products do provide significant benefits to the libraries.
The project adds considerable value to the institutional resources. The schools had not previously subscribed to many of the titles available through UMI. As an
illustration, Table 14.4 lists the number of print subscriptions carried by each institution and indicates how many of those are available in the UMI databases electronically. The fourth column reveals the potential savings available to each school were the print counterparts of all these electronic journals to be canceled. The column labeled Added E-Titles shows the number of new journals made available to each institution through the grant. The final column indicates the total titles now available at each institution as a result of the consortial arrangement. Comparison of the final column with the first reveals that the electronic project nearly doubles the journal resources available to students.
Table 14.5 details the preliminary financial impact on the ACS institutions for the first and second calendar year of the project. While the opening premise of
the project suggests that canceling print subscriptions would pay for consortial access to UMI's aggregated collections, actual practice shows otherwise. The data is still being collected in the form of invoices, but preliminary summaries of cancellations show meager savings. Total savings across the 13 campuses amount to little more than $50,000 per year. This is not enough to pay for the first two years of the project, which cost over $350,000. However, the added value to the combined collections exceeds $2,000,000 per year as measured by the cost of print counterparts to the UMI titles. Furthermore, additional action by some institutions, as shown in Table 14.6, reveals better outcomes.
Comparing the savings shown in Table 14.6 with the subsidized cost reveals that in the cases of Trinity and Millsaps, even without Mellon support, the consortial provision of the OCLC/UMI databases could be paid for by canceling indexes along with a few print subscriptions. In Trinity's case, two indexes previously purchased as CD-ROMs or direct links to another on-line source were canceled for savings of over $5,000 in the first year. Trinity canceled a CD-ROM subscription to a site license of ABI/Inform, which saved expenditures totaling over $6,000 per year, and an on-line general purpose index that previously cost over $12,000. The Trinity share to the Palladian Alliance project would have been just over $13,000 per year for the first three years. Similarly, Millsaps canceled one index and 74 periodical titles that overlapped the UMI content for net first-year savings of nearly $9,000. On this basis, the project more than pays for itself.
Additional interesting outcomes of the project at this point include a couple of new
pieces of important information. First, canceling individual subscriptions to indexes provides a viable means for consortial pricing to relieve campus budgets, at least in the short run. Were it necessary for Trinity to pay its full share of the cost, canceling indexes alone provided sufficient savings to pay for the project. Just considering trade-offs with indexes alone, Trinity's net savings over the project life span total nearly $18,000.
Second, on the down side, canceling journals and replacing them with an aggregator's collection of electronic subscriptions may not be very reliable. It is apparent that the aggregators suffer from the vagaries of publishers. In just the first few months of the project, UMI dropped and added a number of titles in both full-text databases. This readjustment means that instead of full runs of each title, the databases often contain only partial runs. Furthermore, in some cases the publisher provides only significant articles, not the full journal. Therefore, the substitution of UMI provides the libraries with essentially a collection of articles, not a collection of electronic subscription substitutes. This result diminishes reliability and prevents libraries from securing truly significant cost savings.
It should be noted, however, that several of the libraries independently subscribed to electronic access to Johns Hopkins Project MUSE. In contrast to an aggregated collection, this project provides full-image access to every page of the print counterparts and guarantees access indefinitely to any subscription year once it is paid for. This guarantee substantially improves the reliability of the product and provides reasonable incentives to the libraries to substitute access for
collecting. While it may be acceptable to substitute access to a large file of general purpose articles for undergraduate students, Project MUSE promises better results than the initial project for scholarly journal collections. The final report of this project will include information on the impact of the Project MUSE approach as well as on the original concept.
Third, on-line full-text content may or may not have an impact on interlibrary loan activity. Table 14.7 summarizes the searching and article delivery statistics for the first six months of the project compared to the total interlibrary borrowing as well as nonreturn photocopies ordered through the campus interlibrary loan offices. The change in interlibrary loan statistics for the first six months of the project compared to the previous year shows that in some cases interlibrary borrowing increased and in other cases it decreased. Several variables besides the availability of full text seem to affect use of interlibrary loan services. For instance, some of the institutions had full-text databases available before the project started. Some made more successful efforts to promote the project services than others. It seems likely that improved access to citations from on-line indexes made users more aware of items that could be borrowed, an effect that probably offset the decrease in interlibrary loans that the availability of full text would predict. Regardless, statistics on this issue yield inconclusive results early in the project.
Fourth, it is curious that secondary journals in many fields are published by commercial firms rather than by professional organizations and that their publications are sold at higher prices. Libraries typically pay more for Haworth publications than they do for ALA publications. Haworth sells largely to libraries, responding not to demand for content but to demand for publication outlets, and libraries are willing to pay for the Haworth publications. This fact helps explain why secondary titles cost more than primary ones: demand may be more for exposure of the contributor than for reading of content by subscribers. The econometric analysis included in the project may confirm this unintended hypothesis.
At this point, a meaningful econometric analysis is many months away. A model based on Lerner's definition of monopoly power will be used to examine pricing as journals shift into the electronic sphere. The model calls for regressing the price of individual titles on a variety of independent variables such as number of pages, advertising content, circulation, and publisher type, and for including a dummy variable for whether a journal is available electronically. Data is being collected on over 2,000 of the subscriptions held by Trinity for the calendar years 1995 through 1997. Difficulties with financial data coupled with the time-consuming nature of data gathering have delayed progress on the econometric analysis.
It would be desirable to conduct an analysis on time series data to observe the consequences for journal prices as the shift to electronic products is made. Such an analysis would provide a forecast of how publishers react. Lacking the opportunity at the outset to examine prices over time, a straightforward model applying ordinary least squares (OLS) regression to cross-section data, similar to the analyses reported by others, will form the basis of the analysis. Earlier models have
typically regressed price on a number of variables to distinguish the statistical relevance of publisher type in determining price. By modifying the earlier models, this analysis seeks to determine whether monopoly power may be eroded in the electronic market. The methodology uses two specifications of an ordinary least squares regression model. The first regresses price on the characteristics of a set of journal titles held by the ACS libraries. Because this data set is considerably larger than those used in previous studies, we propose to test whether earlier findings, which concentrate on economics journals, hold across a larger set of disciplines. This specification includes the variables established earlier: frequency of publication, circulation, pages per year, and several dummy variables that control for whether the journals contain advertising and for country of publication. Four dummy variables are included for type of publisher, with commercial publishers as the residual category. A second specification regresses the difference between the price charged to libraries and the price charged to individuals on the same set of variables, with an additional dummy indicating whether a given journal is available electronically.
The ACS libraries collectively subscribe to approximately 14,000 titles. Where titles are duplicated, an electronic set has been substituted, providing shared access. We anticipate that, at the margin, the impact on publishers would be minimal if ACS canceled subscriptions to the print counterparts of this set. However, the national availability of the electronic versions will precipitate cancellations among many institutions in favor of electronic access, and prices will be adjusted accordingly. Since most publishers will offer some products in print only and others within the described electronic set, we expect the prices of the electronic versions to reflect an erosion of monopoly power. Thus the cross-section data will capture the effect of electronic availability on monopoly power.
Since the data set comprises several thousand periodical titles, including general and more popular items, several concerns experienced by other investigators will be mitigated. The only study found in the literature so far that examines publishers from the standpoint of the exercise of monopoly power investigated price discrimination. This project intends to extend that analysis in two ways. First, we will use a much broader database, since most of the previous work was completed on limited data sets of fewer than 100 titles narrowly focused on a single academic discipline. Second, we will extend the analysis by assuming the existence of price discrimination, given the difference in price to individuals versus libraries for most scholarly journals. With controls in the model for previous findings regarding price discrimination, we will attempt to test the null hypothesis that monopoly power does not decrease in the electronic domain.
In the data set available, we were unable to distinguish the specific price of each journal for the electronic replacement, because UMI priced the entire set for a flat fee. This pricing scheme may reflect an attempt by publishers to capture revenue lost to interlibrary lending. Alternatively, it may reflect publisher expectations that
article demand will increase when users' nondollar costs decrease. Thus, monopoly power will be reflected back onto the subscription price of print versions. As a result, we will use the price of print copies as a proxy for the specific electronic price of each title.
An alternative result could emerge. In monopolistic competition, anything that differentiates a product may increase its monopoly power. For example, firms expend tremendous amounts of money on advertising to create the impression that their product is qualitatively distinguishable from others. Analogously, electronic availability of specific titles may create an impression of superior quality.
The general model of the first specification is written:

y_j = b_0 + b_1 x_1j + b_2 x_2j + ... + b_17 x_17j + e_j

where y equals the library price (LPRICE) for journal j = 1, 2, 3, ... n, the x_kj are the independent variables, and e_j is the error term. The definitions of the independent variables appear in Table 14.8, along with their expected signs and the calculation of each variable; the parameters b_1 through b_17 will be estimated by traditional single-equation regression techniques.
The general model of the second specification is written:

y_ij = b_0 + b_1 x_1j + b_2 x_2j + ... + b_17 x_17j + e_ij

where y equals two different forms of monopoly power (MPOWER1; MPOWER2), defined as measures i = 1 and 2, for journal j = 1, 2, 3, ... n. Again, the definitions of the independent variables appear in Table 14.8, along with their expected signs and the calculation of each variable; the parameters b_1 through b_17 will be estimated by traditional single-equation regression techniques.
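The first specification can be sketched as a plain least-squares fit. The sketch below is illustrative only: it generates synthetic journal data for a handful of the named variables (pages, circulation, an advertising dummy, an electronic-availability dummy), builds a synthetic library price with the signs the text expects, and recovers the coefficients. The data, effect sizes, and variable subset are all invented; the real data set and the full Table 14.8 variable list are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500  # synthetic journal titles (the real set covers ~2,000 subscriptions)

# A few of the regressors named in the text, with hypothetical values.
pages = rng.integers(200, 2_000, n).astype(float)   # PAGES printed per year
circ = rng.integers(300, 20_000, n).astype(float)   # circulation
adv = rng.integers(0, 2, n).astype(float)           # dummy: journal carries ads
elec = rng.integers(0, 2, n).astype(float)          # ELECTRONIC dummy

# Synthetic library price built with the expected signs:
# pages +, circulation -, advertising -, electronic availability -.
lprice = (50.0 + 0.10 * pages - 0.002 * circ
          - 15.0 * adv - 8.0 * elec + rng.normal(0.0, 10.0, n))

# First specification: regress LPRICE on the journal characteristics.
X = np.column_stack([np.ones(n), pages, circ, adv, elec])
beta, *_ = np.linalg.lstsq(X, lprice, rcond=None)
print(dict(zip(["const", "pages", "circ", "adv", "electronic"],
               np.round(beta, 3))))
```

With real data, the second specification would replace `lprice` with the library-minus-individual price difference and keep the same design matrix.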
The variables listed in Table 14.8 are suggested at this point on the basis of previous studies that have demonstrated their appropriateness. Testing with the regression model is required to determine which variables are ultimately useful to this study, and additional variables will be introduced should experiments suggest them. A very brief rationale for the expected sign and the importance of each variable is in order. If the difference between what publishers charge libraries and what they charge individuals represents price discrimination, then a variable for the individual price (IPRICE) will be a significant predictor of the price to institutions (LPRICE). Higher individual prices will shift users toward the library, raising demand for library subscriptions and pulling institutional prices higher. The sign on this variable is expected to be positive.
One group of variables deals with the issue of price discrimination based on
the monopoly power that can be exercised by foreign publishers. Publishers in Great Britain (GBRITAIN), western Europe (EUROPE), and other countries outside the United States (OTHER) may have enough market power to influence price, so these variables will carry a positive sign if a sizable market influence is exerted. Some of these publishers will also be concerned with currency exchange risks (RISK), which they will adjust for in prices. However, since they offer discounts through vendors for libraries that prepay subscriptions, this variable will carry a negative sign if the price to individuals captures most of the financial burden of risk adjustment.
It is expected that commercial publishers discriminate by price more than their nonprofit counterparts do. Therefore, in comparison to the commercial residual, associations (ASSOC), government agencies (GOVERN), university presses (UNIVPR) and foundations (FOUNDTN) will capture generally lower prices of these nonprofit publishers. Negative signs are expected on these.
All the publishers will experience production costs, which can be exposed through variables that control for frequency (FREQ), total pages printed per year (PAGES), peer review (PEERREV), submission fees (SUBMISSFEE), processing/communication and copyright clearance registration expenses (CCCREG), and the presence of graphics, maps, and illustrations (ILLUS), all of which will positively affect price to the extent they are passed along through price discrimination. Circulation (CIRC) will capture the economies of scale experienced by publications distributed in larger quantities, so this variable is expected to be negative. Similarly, the inclusion of advertising (ADV) provides revenue in addition to sales, so this variable is also expected to be negative: journals that include ads have less incentive to extract revenue through sales. New entrants into the publishing arena are expected to incur advertising costs to increase awareness of their products, costs that will be partially passed on to consumers. Therefore age (AGE), the difference between the current date and the date the journal started, is expected to be a negative predictor of price and monopoly power.
Previous studies have developed measures of quality based on rankings of publications compared with each other within a given discipline. Most of these comparisons work from information available from the Institute for Scientific Information. Data acquired from this source showing the impact factor, immediacy index, half-life, total cites, and cites per year will be summarized in one variable capturing journal quality (QUALITY), which is expected to be a positive predictor of both price and monopoly power.
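The text does not specify how the five measures are collapsed into the single QUALITY variable. One plausible construction, sketched below purely as an assumption, averages column-wise z-scores so that no single measure's scale dominates the composite.

```python
import numpy as np

def quality_index(metrics: np.ndarray) -> np.ndarray:
    """Collapse several journal measures (columns: impact factor, immediacy
    index, half-life, total cites, cites per year) into one score per journal
    by averaging column-wise z-scores. Hypothetical construction."""
    z = (metrics - metrics.mean(axis=0)) / metrics.std(axis=0)
    return z.mean(axis=1)

# Three hypothetical journals, five measures each (all values invented).
m = np.array([[0.8, 0.1, 4.0,  120.0,  30.0],
              [2.5, 0.4, 6.0,  900.0, 180.0],
              [5.1, 0.9, 8.0, 3100.0, 620.0]])
scores = quality_index(m)
print(np.round(scores, 3))
```

Standardizing first matters because raw total cites (in the thousands) would otherwise swamp the impact factor (in single digits).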
The prices of journals across disciplines may be driven by different factors. In general, prices are higher in the sciences and technical areas and lower in the humanities. This discrepancy is understandable when we consider the market for science versus humanities. As stated earlier, there is essentially no market for scholarly publications in the humanities outside of academe, whereas scientific
publications are used heavily in corporate research by pharmaceutical firms and other industries highly dependent on research. As a result, two additional dummies are included in the model to segment the specification along discipline lines. HUMAN and SOCSCI will control for differences in price among the humanities and social sciences as compared to the residual category of science. These variables are expected to be negative and strong predictors of price.
Finally, a dummy variable is included to determine whether the electronic availability of each journal (ELECTRONIC) affects publishers' ability to discriminate by price. Since we have predicted that monopoly power will erode in the electronic arena, ELECTRONIC should be statistically significant and a negative predictor of monopoly power. However, to the extent that electronic availability distinguishes a journal from its print counterparts, there is some expectation that this variable could be positive. A positive sign would indicate additional price discrimination by publishers who are able to recapture in the electronic environment revenue otherwise lost.
The data set will be assembled by enhancing the subscription data gathered during the planning project. Most of the additional elements, including prices, will come from examination of the journals and invoices received by the libraries. Impact and related factors will be acquired from the Institute for Scientific Information. The number of subscriptions supplied in print by two major journal vendors, FAXON and EBSCO, will be used as a proxy for circulation, and an alternative measure of circulation will be compiled from a serials bibliography. The remaining variables will be obtained by examination of the print subscriptions retained by the libraries or from a serials bibliography.
There may be other ways to attack the problem of price inflation of scholarly periodicals. Some hope arises from the production cost differences between print and electronic periodicals. The marginal cost of each added print copy diminishes steadily from the second to the nth copy, whereas for electronic publications the marginal cost of the second and subsequent copies is approximately zero. Although the cost of distributing each additional electronic copy is not quite zero, since computer resources can be strained by the volume of access, it is so close to zero that technical solutions to the problem of unauthorized free redistribution of pirated copies might give publishers in the electronic domain an incentive to distribute the cost of the first copy equitably across all consumers. If the total cost of producing an electronic publication is lower than it would be for a printed publication, some publishers may share the savings with consumers. However, there is no certainty that they will, because profit maximizers will continue to be profit maximizers. Therefore, it is appropriate to look for a decoupled solution lying in the hands of consumers.
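The first-copy-cost argument can be put in simple arithmetic. The figures below are entirely hypothetical, chosen only to show that once the fixed first-copy cost is spread equitably across all subscribers, the electronic price collapses toward the per-subscriber share of that fixed cost while the print price still carries a meaningful per-copy cost.

```python
def per_subscriber_price(first_copy_cost: float, marginal_cost: float,
                         subscribers: int) -> float:
    """Price if the fixed first-copy cost is spread equitably across all
    subscribers and each copy adds only its own marginal cost."""
    return first_copy_cost / subscribers + marginal_cost

# Hypothetical journal: $100,000 to produce the first copy, 2,000 subscribers.
print_price = per_subscriber_price(100_000, 8.00, 2_000)  # ~$8 to print and mail a copy
elec_price = per_subscriber_price(100_000, 0.25, 2_000)   # near-zero marginal cost
print(print_price, elec_price)  # 58.0 50.25
```

The gap between the two prices is exactly the difference in marginal cost; whether publishers pass it on is the open question the text raises.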
In the meantime, the outcomes of this research project will include a test of the
benefits of consortial access versus ownership. In addition, earlier work on price discrimination will be extended with this cross-discipline study to determine whether electronic telecommunications offers hope of relief from monopoly power of publishers.
The author wishes to acknowledge with thanks the financial support of The Andrew W. Mellon Foundation and the participation of several colleagues from libraries of the Associated Colleges of the South. Thanks also to my associate Tanya Pinedo for data gathering and analysis. All errors remain the responsibility of the author.
The Use of Electronic Scholarly Journals
Models of Analysis and Data Drawn from the Project MUSE Experience at Johns Hopkins University
James G. Neal
Project MUSE is a collaborative initiative between the Johns Hopkins University Press and the Milton S. Eisenhower Library at Johns Hopkins University to provide network-based access to scholarly journals including titles in the humanities, social sciences, and mathematics. Launched with electronic versions of 40 titles still published in print, Project MUSE coverage has now been expanded to include electronic-only publications. Funded initially by grants from The Mellon Foundation and the National Endowment for the Humanities, Project MUSE seeks to create a successful model for electronic scholarly publishing characterized by affordability and wide availability. It has been designed to take advantage of new technical capabilities in the creation and storage of electronic documents. It has been developed to provide a range of subscription options for individual libraries and consortia. It is based on a very liberal use and reuse approach that encourages any noncommercial activity within the bounds of the subscribing organization.
Project MUSE has been produced from the outset for usability, with a focus on user-centered features. This focus has evolved as a participative and interactive process, soliciting input and feedback from users and integrating user guidance components into the system. An on-line survey is available to all users, and libraries are providing information about the local implementation and the results of campus and community focus group discussions on Project MUSE. As the number of subscribing libraries expands and the activity grows, a valuable database of user experiences, attitudes, and behaviors will accumulate. A new feature will be the ability to track and analyze individual search sessions and to observe closely user activities. This feature will monitor the impact of new capabilities and the efficiency of searching practices.
This paper discusses six models of use analysis, covering both the macro, or library-level, activity and the micro, or individual user-level, activity:
1. subscribing organizations: which libraries are subscribing to Project MUSE, and how do they compare with the base of print journal customers?
2. subscriber behaviors: how do libraries respond as access to electronic journals is introduced and expanded, and in particular, how are acquisitions like Project MUSE accommodated in service and collection development programs and budgets?
3. user demography: what are the characteristics of the individual user population in such areas as status, background/experience, motivation, attitudes, and expectations?
4. user behaviors: how do individuals respond to the availability of scholarly materials in electronic format as they explore the capabilities of the system and execute requests for information?
5. user satisfaction: what objectives do users bring to network-based access to scholarly information, and how do users evaluate system design and performance and the quality of search results?
6. user impact: how are user research and information-seeking activities being shaped by access to full-text journal databases like Project MUSE?
One of the objectives of Project MUSE is to achieve full cost recovery status by the completion of the grant funding period in 1998. Therefore, it is important to monitor the growth in the base of subscribing libraries and to evaluate the impact on the print journal business of the Hopkins Press. An analysis of those libraries subscribing to the full Project MUSE database as of June 1997 (approximately 400 libraries) demonstrates a very significant expansion in the college, community college, and now public library settings with very low or no history of subscriptions to the print journals (see Table 15.1). The result is a noteworthy expansion in access to Hopkins Press titles, with 70% of the subscribing libraries currently purchasing less than 50% of the titles in print and over one-fourth acquiring no print journals from the Hopkins Press.
One explanation for these patterns of subscription activity is the purchase arrangement for Project MUSE. Over 90% of the libraries are subscribing to the full Project MUSE database of 43 titles. And due to very favorable group purchase rates, nearly 80% of Project MUSE subscribers are part of consortial contracts. The cooperative approach to providing access to electronic databases by libraries in a state or region is widely documented, and the Project MUSE experience further evidences this phenomenon.
Another objective of Project MUSE is to enable libraries to understand the use of collections and thus to make informed acquisitions and retention decisions. The impact on collection development behaviors will be critical, as libraries do indicate intentions to cancel print duplicates of MUSE titles and to monitor carefully the information provided on individual electronic title and article activity. Use information is beginning to flow to subscribing libraries, but there is no evidence yet of journal cancellations for Hopkins Press titles.
An important area of analysis is user demography, that is, the characteristics of the individuals searching the Project MUSE database. An on-line user survey and focus group discussions are beginning to provide some insights:
• The status of the user, that is, undergraduate student, graduate student, faculty, staff, community member, or library employee. As Project MUSE is introduced, library staff are typically the heaviest users, followed by a growth in student use as campus awareness and understanding expands.
• Type of institution, that is, research university, comprehensive university, liberal arts college, community college, or public library setting. As Project MUSE subscriptions have increased and access has extended into new campus settings, heavier use has initially been in the research universities and liberal arts colleges where there is either traditional awareness of Project MUSE titles or organized and successful programs to promote availability.
• The computer experience of users, that is, familiarity with searching full-text electronic databases through a Web interface. Project MUSE users tend to be knowledgeable Internet searchers who have significant comfort with Web browsers, graphical presentations of information, and construction of searches in textual files.
• The location of use, that is, in-library, on-campus in faculty office and student residence hall, or off-campus. Preliminary data indicates that the searching of Project MUSE is taking place predominantly on library-based equipment. This finding can be explained by the inadequate network infrastructure that persists at many campuses or by the general lack of awareness of Project MUSE until a user is informed by library staff about its availability during a reference exchange.
• The browsers used to search the Project MUSE database. An analysis of searches over an 18-month period confirms that Netscape browsers are used now in over 98% of the database activity, with a declining percentage of Lynx and other nongraphical options.
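A browser breakdown of this kind is typically derived from the user-agent field of Web server logs. The sketch below, with invented log entries, shows one crude way to tally shares; it assumes a mid-1990s log in which the `Mozilla/` token effectively meant Netscape, which is not true of later browsers.

```python
from collections import Counter

def browser_family(user_agent: str) -> str:
    """Crudely classify a user-agent string from a 1990s-era server log."""
    if user_agent.startswith("Lynx/"):
        return "Lynx"
    if user_agent.startswith("Mozilla/"):
        return "Netscape"  # assumption: the Mozilla token means Netscape here
    return "other"

# Invented user-agent strings standing in for real server log entries.
log = [
    "Mozilla/3.01 (X11; I; SunOS 5.5)",
    "Mozilla/4.04 [en] (Win95; I)",
    "Mozilla/3.0 (Macintosh; I; PPC)",
    "Lynx/2.7.1 libwww-FM/2.14",
]
shares = Counter(browser_family(ua) for ua in log)
pct = {k: 100.0 * v / len(log) for k, v in shares.items()}
print(pct)
```

Aggregating over an 18-month log in this way yields exactly the kind of graphical-versus-nongraphical percentage reported above.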
Project MUSE enables searching by author, title, or keyword, in the table of contents or the full text of the journals, and across all the journals or just selected titles. All articles are indexed with Library of Congress subject headings. Hypertext links in tables of contents, articles, citations, endnotes, author bibliographies, and illustrations allow efficient navigation of the database. User searching behavior is an important area for investigation, and some preliminary trends can be identified:
• The predominant search strategy is by keyword, with author and title inquiries occurring much less frequently. This strategy can be partially explained by the heavy undergraduate student use of the database and the rich results enabled by keyword strategies.
• Use of the database is equally distributed across the primary content elements: tables of contents, article abstracts, images linked to text, and the articles. An issue for future analysis is the movement of users among these files.
• Given the substantial investment in the creation of LC subject headings and the maintenance of a structured thesaurus to enhance access to articles, their value to search results and user success is being monitored carefully.
• With the expansion of both internal and external hypertext links, the power of the Web searching environment is being observed, the user productivity gains are being monitored, and the willingness to navigate in an electronic journal database is being tested.
• Users are directed to the Project MUSE database through several channels. Libraries are providing links from the bibliographic record for titles in the on-line catalog. Library Web sites highlight Project MUSE or collections of electronic journals. Subject pages list the Project MUSE titles that cluster in a particular discipline.
• Users are made aware of Project MUSE through a variety of promotional and educational strategies. Brochures and point-of-use information are
being prepared. In some cases, campus media have included descriptive articles. Library instructional efforts have focused on Project MUSE and its structure and searching capabilities.
• Printing and downloading to disk are important services linked to the effective use of Project MUSE, given the general unwillingness of users to read articles on-line. Libraries have an interest in maximizing turnover on limited computer equipment and are focused on implementing cost-recovery printing programs.
• Project MUSE is increasingly enabling users to communicate with publishers, journal editors, and the authors of articles through e-mail links embedded in the database. Correspondence has been at a very low level but is projected to expand as graduate student and faculty use increases and as familiarity and comfort with this feature grows.
With over 400 subscribing libraries and over three million potential users of Project MUSE in the communities served, it is possible to document global use trends and the changing intensity of searching activity (see Table 15.2). The progression of use over time as a library introduces access to Project MUSE is being monitored. Early analysis suggests that the first two quarters of availability produce low levels of use, while third quarter use expands significantly.
Data is also being collected on the number of requests for individual journal titles. During the 12-month period ending August 1, 1997, the total number of requests to the MUSE database was just over nine million, for an average of just under 25,000 hits per day. For data on the average number of requests per month for individual journal titles, see Table 15.3.
In addition, data is now being collected on the number of requests for individual journal articles. During the 12-month period ending August 1, 1997, the 100 most frequently requested articles accounted for 16.5% of total article requests. The article receiving the largest number of requests was hit 3,944 times. Two journals, Postmodern Culture (33 articles) and Configurations (22 articles), accounted for 55% of the most frequently requested articles.
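Concentration figures of this kind fall out of a simple computation over per-article request counts. The counts below are invented (apart from reusing 3,944 as the top value) purely to illustrate the calculation on a skewed distribution like the one described.

```python
def top_k_share(request_counts, k):
    """Fraction of all article requests captured by the k most-requested articles."""
    ranked = sorted(request_counts, reverse=True)
    return sum(ranked[:k]) / sum(ranked)

# Invented skewed distribution; 3,944 mirrors the reported top article.
counts = [3944, 2100, 900, 400, 200, 150, 100, 80, 60, 40] + [10] * 500
share = top_k_share(counts, 10)
print(round(100 * share, 1))
```

The same function applied to the full MUSE log, with k = 100, would reproduce the reported 16.5% figure.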
User satisfaction with the quality and effectiveness of Project MUSE will be the central factor in its long-term success. Interactions with users seek to understand expectations, response to system design and performance, and satisfaction with results. The degree to which individuals and libraries are taking advantage of expansive fair use capabilities should also be gauged.
Project MUSE has focused on various technical considerations to maximize the dependability and efficiency of user searching. Detailed information on platforms and browsers is collected, for example, and access denials and other server responses that might indicate errors are automatically logged and routed for staff investigation.
Expectations for technology are generally consistent: more content, expanded access, greater convenience, new capabilities, cost reduction, and enhanced productivity. It will be important to monitor the impact of Project MUSE in the subscribing communities and to assess whether it is delivering a positive and effective experience for users.
It is also important to maximize the core advantages of using information in digital formats:
• accessibility, that is, delivery to locations wherever users can obtain network connections
• searchability, that is, the range of strategies that can be used to draw relevant information out of the database
• currency, that is, the ability to make publications available much earlier than is possible for print versions
• researchability, that is, the posing of questions in the digital environment that could not even be conceived with print materials
• interdisciplinarity, that is, the ability to conduct inquiries across publications in a range of diverse disciplines and discover new-but-related information
• multimedia, that is, access to text, sound, images, video in an integrated presentation
• linkability, that is, the hypertext connections that can be established among diverse and remote information sources
• interactivity, that is, the enhancement of user control and influence over the flow of information and the communication that can be integrated into the searching activity
Project MUSE will be evaluated against these quantitative and qualitative models. Its success will ultimately be determined by its support for the electronic scholarly publishing objectives outlined in the work of the American Association of Universities and the Association of Research Libraries:
• foster a competitive market for scholarly publishing by providing realistic alternatives to prevailing commercial publishing options
• develop policies for intellectual property management emphasizing broad and easy distribution and reuse of material
• encourage innovative applications of information technology to enrich and expand the means for distributing research and scholarship
• ensure that new channels of scholarly communication sustain quality requirements and contribute to promotion and tenure processes
• enable the permanent archiving of research publications and scholarly communication in digital formats
A New Consortial Model for Building Digital Libraries
Raymond K. Neff
Libraries in America's research universities are being systematically depopulated of current subscriptions to scholarly journals. Annual increases in subscription costs are consistently outpacing the growth in library budgets. This problem has become chronic for academic libraries that collect in the fields of science, engineering, and medicine, and by now the problem is well recognized (Cummings et al. 1992). At Case Western Reserve University, we have built a novel digital library distribution system and focused on our collections in the chemical sciences to investigate a new approach to solving a significant portion of this problem. By collaborating with another research library that has a strong chemical sciences collection, we have developed a methodology to control costs of scholarly journals and have planted the seeds of a new consortial model for building digital libraries. This paper summarizes our progress to date and indicates areas in which we are continuing our research and development.
For research libraries in academia, providing sufficient scholarly information resources in the chemical sciences represents a large budgetary item. For our purposes, the task of providing high-quality library services to scholars in the chemical sciences is similar to providing library services in other sciences, engineering, and medicine; if we solve the problem in the limited domain of the chemical sciences, we can reasonably extrapolate our results to these other fields. Thus, research libraries whose mission it is to provide a high level of coverage for scholarly publications in the chemical sciences are the focus of this project, although we believe that the principles and practices employed in this project are extensible to the serial collections of other disciplines.
A consortium depends on having its members operating with common missions, visions, strategies, and implementations. We adopted the tactics of developing a consortial model by having two neighboring libraries collaborate in the initial project. The University of Akron (UA) and Case Western Reserve University
(CWRU) both have academic programs in the chemical sciences that are nationally ranked, and the two universities are less than 30 miles apart. It was no surprise to find that both universities have library collections in the chemical sciences that are of high quality and nearly exhaustive in their coverage of scholarly journals. To quantify the overlap between these two collections, we counted the journals that both collected and found the common set to be 76% by number of titles and 92% by cost. The implication of this overlap in collecting patterns is plain: if the two libraries collected only one copy of each journal, with the exception of the most heavily used titles, approximately half of the cost of these subscriptions could be saved. For these two libraries, the potential savings are $400,000 per year. This seemed like a goal worth pursuing, but to do so would require building a new type of information distribution system.
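The overlap measurement generalizes to any pair of collections. The sketch below, with invented journal titles and prices, computes the jointly held titles' share of the combined collection by count and by cost; the exact base used for the reported 76%/92% figures is an assumption here, but the sketch shows why the cost overlap can exceed the count overlap when the expensive core journals are the duplicated ones.

```python
def overlap(col_a: dict, col_b: dict):
    """col_a and col_b map journal title -> annual subscription cost.
    Returns the jointly held titles' share of the combined collection,
    by title count and by subscription cost."""
    shared = col_a.keys() & col_b.keys()
    union = col_a.keys() | col_b.keys()
    cost = lambda titles: sum(col_a[t] if t in col_a else col_b[t] for t in titles)
    return len(shared) / len(union), cost(shared) / cost(union)

# Invented mini-collections: pricey core journals held by both libraries.
a = {"J. Org. Synth.": 5000, "Chem. Rev. Lett.": 1000, "Akron Chem. Bull.": 200}
b = {"J. Org. Synth.": 5000, "Chem. Rev. Lett.": 1000, "Cleveland Chem. J.": 100}
by_count, by_cost = overlap(a, b)
print(by_count, round(by_cost, 3))
```

Here half the titles are shared but over 95% of the combined cost is, the same pattern as the reported 76%-versus-92% gap.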
The reason scholarly libraries collect duplicative journals is that students and faculty want to be able to use these materials by going to the library and looking up a particular volume or by browsing the current issues of journals in their field. Eliminating a complete set of the journals at all but one of our consortial libraries would deprive local users of this walk-up-and-read service. We asked ourselves whether it would be possible to construct a virtual version of the paper-based journal collection that would be simultaneously present at each consortium member institution, allowing any scholar to consult the collection at will even though only one copy of the paper journal was on the shelf. The approach we adopted was to build a digital delivery system that would provide a scholar on the campus of a consortial member institution, on a demand basis, with either a soft or hard copy of any article from a journal to which a consortial member library subscribed. According to this vision, information technology would make it possible to collect one set of journals among the consortium members and to have them simultaneously available at all institutions. Although the cost of building the new digital distribution system is substantial, it was considered an experiment worth undertaking. The generous support of The Andrew W. Mellon Foundation covers approximately one-half of the costs of constructing and operating the digital distribution system, with Case Western Reserve University covering the remainder. The University of Akron Library has contributed its expertise and the use of its chemical sciences collections to the project.
It also seemed necessary to invite the cooperation of journal publishers in a project of this kind. To make a digital delivery system practical would require having the rights to store the intellectual property in a computer system, and when we started this project, no consortium member had such rights. Further, we needed both the ongoing publications and the backfiles so that complete runs of each serial could be constructed in digital form. The publishers' cooperation would thus be essential: they would need to work out agreements with the consortium to provide their scholarly publications for inclusion in a digital storage system connected to our network-based transmission system. The chemical sciences are disciplines in which previous work with electronic libraries
had been started. The TULIP Project of Elsevier Science (Borghuis et al. 1996) and the CORE Project undertaken by a consortium of Cornell University, the American Chemical Society, Bellcore, Chemical Abstracts, and OCLC were known to us, and we certainly wanted to benefit from their experiences. Publications of the American Chemical Society, Elsevier Science, Springer-Verlag, Academic Press, John Wiley & Sons, and many other publishers were central to our proposed project because of the importance of their journal titles to the chemical sciences disciplines.
We understood from the beginning of this effort that we would want to monitor the performance of the digital delivery system under realistic usage scenarios. The implementation of our delivery system has built into it extensive data collection facilities for monitoring what users actually do. The system is also sensitive to concerns of privacy in that it collects no items of performance information that may be used to identify unambiguously any particular user.
Given the existence of extensive campus networks at both CWRU and UA and substantial internetworking among the academic institutions in northeastern Ohio, there was sufficient infrastructure already in place to allow the construction and operation of an intra- and intercampus digital delivery system. Such a digital delivery system has now been built and made operational. The essential aspects of the digital delivery system will now be described.
A Digital Delivery System
The roots of the electronic library are found in landmark papers by Bush (1945) and Kemeny (1962). Most interestingly, Kemeny foreshadowed what prospective scholarly users of our digital library told us was their essential requirement, which was that they be able to see each page of a scholarly article preserved in its graphical integrity. That is, the electronic image of each page layout needed to look like it did when originally published on paper. The system we have developed uses the Adobe Acrobat® page description language to accomplish this objective.
Because finding aids and indices for specialized publications are too limiting, users also require that an article's text be searchable with limited or unlimited discipline-specific thesauri. Our system therefore complements the page images with optical character recognition (OCR) of the complete text of each article. In this way, the user may enter words and phrases whose presence in an article constitutes a "hit" for the scholar.
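A toy illustration of this word-and-phrase "hit" mechanism over OCR output is a minimal inverted index; the article identifiers and titles below are invented, and the production search engine is of course far more elaborate.

```python
from collections import defaultdict

def build_index(articles: dict) -> dict:
    """Map each lowercased word to the set of article IDs containing it."""
    index = defaultdict(set)
    for article_id, ocr_text in articles.items():
        for word in ocr_text.lower().split():
            index[word].add(article_id)
    return index

# Hypothetical OCR output for two articles:
articles = {
    "jacs-1995-117-01": "synthesis of chiral catalysts",
    "jacs-1995-117-02": "kinetics of heterogeneous catalysts",
}
index = build_index(articles)

sorted(index["catalysts"])   # both articles are hits
sorted(index["chiral"])      # only the first article is a hit
```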
One of the most critical design goals for our project was the development of a scanning subsystem that would be easily reproducible and cost-efficient to set up and operate in each consortium member library. Not only did the equipment need to be readily available, but it had to be adaptable to a variety of work flows and staff work patterns in many different libraries. Our initial design has been successfully tailored to the needs of both the CWRU libraries and the Library at the University of Akron. Our approach to the sharing of paper-based collections is to use a scanning device to copy the pages of the original into a digital image format that may be readily transmitted across our existing telecommunications infrastructure. In addition, the digital version of the paper is stored for subsequent retrieval; thus, repeated viewing of the same work necessitates only a one-time transformation of format. This procedure is an advantage in achieving faster response times for scholars, and it promotes the development and use of quality control methods. The scanning equipment we have used in this project and its operation are described in Appendix E. The principal advantage of this scanner is that bound serials may be scanned without damaging the volume and without compromising the resulting page images; in fact, the original journal collection remains intact and accessible to scholars throughout the project. This device is also sufficiently fast that a trained operator, even a student worker, may scan over 800 pages per average workday. For a student worker making $7.00 per hour, the labor cost of scanning is under $0.07 per page; the cost of conversion to searchable text adds $0.01 per page. Appendix E also gives more details regarding the scanning processes and work flow. Appendix F gives a technical justification for a digitization standard for the consortium. Thus, each consortium member is expected to make a reasonable investment in equipment, training, and personnel.
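The per-page labor figure follows directly from the wage and throughput numbers; the eight-hour workday is an assumption used to make the arithmetic explicit.

```python
def scan_cost_per_page(hourly_wage: float, pages_per_day: float,
                       hours_per_day: float = 8.0) -> float:
    """Labor cost of scanning one page (equipment amortization excluded)."""
    return hourly_wage * hours_per_day / pages_per_day

# A $7.00/hour student scanning 800 pages in an assumed 8-hour day:
# 7.00 * 8 / 800 = $0.07 per page.  Throughput above 800 pages/day
# pushes the figure under $0.07; OCR conversion adds about $0.01/page.
scan_cost_per_page(7.00, 800)
```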
The target equipment for viewing an electronic journal was taken to be a common PC-compatible computer workstation, hereafter referred to as a client. This client is also the user platform for the on-line library catalog systems found on our campuses as well as for the growing collections of CD-ROM-based information products. Appendix B gives the specification of the workstation standards for the project. The implication of using readily available equipment is that the client platform for our project would also work outside the library, indeed wherever a user wanted to work. Therefore, by selecting the platform we did, we extended the project to encompass a full campuswide delivery system. Because our consortium involves multiple campuses (two at the outset), the delivery system is general purpose in its availability as an access facility.
Just as we needed a place to store paper-based journals within the classical research library, we needed to specify a place to store the digital copies. In technical parlance, this storage facility is called a server. Appendixes B and C give some details regarding the server hardware and software configurations used in this project.
Appendix C also gives some information regarding the campuswide networks on both our campuses and the statewide network that connects them. It is important to note that any connected client workstation that follows our minimum standards will be able to use the digital delivery system being constructed.
Because the key to minimizing the operating costs within a consortium is interoperability and standardization, we have adopted a series of data and equipment standards for this project; they are given in Appendixes A and B.
Rights Management System
One of the most significant problems in placing intellectual property in a networked environment is that with a few clicks of a mouse thousands of copies of the original work can be distributed at virtually zero marginal cost, and as a result, the owner may be deprived of expected royalty revenue. Since we recognized this problem some years ago and realized that solutions outside of the network itself were unlikely to be either permanent or satisfactory to all parties (e.g., author, owner, publisher, distributor, user), we embarked on the creation of a software subsystem now known as Rights Manager™. With our Rights Manager system, we can now control the distribution of digitally formatted intellectual property in a networked environment while ensuring that each stakeholder receives its proper due.
In this project, we use the Rights Manager system with our client server-based content delivery system to manage and control intellectual property distribution for digitally formatted content (e.g., text, images, audio, video, and animations).
Rights Manager is a working system that encodes license agreement information for intellectual property at a server and distributes the intellectual property to authorized users over the Internet or a campuswide intranet along with a Rights Manager-compliant browser. The Rights Manager handles a variety of license agreement types, including public domain, site licensing, controlled simultaneous accesses, and pay-per-use. Rights Manager also manages the functionality available to a client according to the terms of the license agreement; this is accomplished by use of a special browser that enforces the license's terms and that permits or denies client actions such as save, print, display, copy, excerpt, and so on. Access to a particular item of intellectual property, with or without additional functionality, may be made available to the client at no charge, with an overhead charge, or at a royalty plus overhead charge. Rights Manager has been designed with enough flexibility to capture essentially arbitrary charging rules and policies.
The Rights Manager is intended for use by individuals and organizations who function as purveyors of information (publishers, on-line service providers, campus libraries, etc.). The system is capable of managing a wide variety of agreements from an unlimited number of content providers. Rights Manager also permits customization of licensing terms so that individual users or user classes may be defined and given unique access privileges to restricted sets of materials. A relatively common example of this customization for CWRU would be an agreement to provide (1) view-only capabilities to an electronic journal accessed by an anonymous user located in the library, (2) display/print/copy access to all on-campus students enrolled in a course for which a digital textbook has been adopted, and (3) full access to faculty for both student and instructor versions of digital editions of supplementary textbook materials.
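This three-tier CWRU example can be encoded as a small rule table. The field names and rule structure below are illustrative assumptions, not the actual Rights Manager schema.

```python
# Hypothetical encoding of the three example agreements above.
RULES = [
    # (1) anonymous library users: view-only
    {"who": "anonymous", "where": "library",
     "actions": {"display"}},
    # (2) enrolled students on campus: display/print/copy
    {"who": "student", "where": "on-campus",
     "actions": {"display", "print", "copy"}},
    # (3) faculty: full access from anywhere
    {"who": "faculty", "where": "any",
     "actions": {"display", "print", "copy", "save", "excerpt"}},
]

def allowed(who: str, where: str, action: str) -> bool:
    """True if any rule grants `action` to this user class at this location."""
    return any(rule["who"] == who
               and rule["where"] in (where, "any")
               and action in rule["actions"]
               for rule in RULES)

allowed("anonymous", "library", "display")   # True: view-only in the library
allowed("anonymous", "library", "print")     # False
allowed("faculty", "off-campus", "save")     # True: full access anywhere
```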
Fundamental to the implementation of Rights Manager are the creation and maintenance of distribution rights, permissions, and license agreement databases.
These databases express the terms and conditions under which the content purveyor distributes materials to its end users. Relevant features of Rights Manager include:
• a high degree of granularity, which may be below the level of a paragraph, for publisher-defined content
• central or distributed management of rights, permissions, and licensing databases
• multiple agreement types (e.g., site licensing, limited site licensing, and pay-per-use)
• content packaging where rights and permission data are combined with digital format content elements for managed presentation by Web browser plug-in modules or helper applications
Rights Manager maintains a comprehensive set of distribution rights, permissions, and charging information. The premise of Rights Manager is that each publication may be viewed as a compound document. A publication under this definition consists of one or more content elements and media types; each element may be individually managed, as may be required, for instance, in an anthology.
Individual content elements may be defined as broadly or narrowly as required (i.e., the granularity of the elements is defined by the publisher and may go below the level of a paragraph of content for text); however, for overall efficiency, each content element should represent a significant and measurable unit of material. Figures, tables, illustrations, and text sections may reasonably be defined as individual content elements and be treated uniquely according to each license agreement.
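The compound-document premise can be sketched as a simple data model; the class and field names here are assumptions for illustration, not the production schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ContentElement:
    """A publisher-defined unit: a text section, figure, table, etc."""
    element_id: str
    media_type: str                             # "text", "image", "table", ...
    rules: Dict = field(default_factory=dict)   # element-level overrides

@dataclass
class Publication:
    """A compound document: one or more individually managed elements."""
    title: str
    elements: List[ContentElement]
    rules: Dict = field(default_factory=dict)   # publication-level rules

# An anthology whose sections carry different terms can then be managed
# element by element rather than as a single licensing unit.
anthology = Publication("Example Anthology", [
    ContentElement("sec1", "text"),
    ContentElement("sec1-fig1", "image"),
])
```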
To manage the distribution of complete publications or individual content elements, two additional licensing metaphors are implemented. The first of these, a Collection Agreement, is used to specify an agreement between a purveyor and its supplier (e.g., a primary or secondary publisher); this agreement takes the form of a list of publications distributed by the purveyor and the terms and conditions under which these publications may be issued to end users (one or more Collection Agreements may be defined and simultaneously managed between the purveyor and a customer).
The second abstraction, a Master Agreement, is used to broadly define the rules and conditions that apply to all Collection Agreements between the purveyor and its content supplier. Only one Master Agreement may be defined between the supplier and the institutional customer. In practice, Rights Manager assumes that the purveyor will enter into licensing agreements with its suppliers for the delivery of digitally formatted content. At the time the first license agreement is executed between a supplier and a purveyor, one or more entries are made into the purveyor's Rights Manager databases to define the Master and Collection Agreements. Optionally, Publication and/or Content-Element usage rules may also be
defined. Licensed materials may be distributed from the purveyor's site (or perhaps by an authorized service provider); both the content and associated licensing rules are transferred by the supplier to the purveyor for distributed license and content management.
Depending on the selected delivery option, individual end users (e.g., faculty members, students, or library patrons) may access either a remote server or a local institutional repository to search and request delivery of licensed publications. Depending on the agreement(s) between the owner and the purveyor, individual users are assigned access rights and permissions that may be based on individual user IDs, network addresses, or both.
Network or Internet Protocol addresses are used to limit distribution by physical location (e.g., to users accessing the materials from a library, a computer lab, or a local workstation). User identification may be exploited to create limited sitelicensing models or individual user agreements (e.g., distributing publications only to students enrolled in Chemistry 432 or, perhaps, to a specific faculty member).
At each of the four levels (Master Agreement, Collection Agreement, Publication, and Content-Element), access rules and usage privileges may be defined. In general, the access and usage permission rules are broadly defined at the Master and Collection Agreement level and are refined or restricted at the Publication and Content-Element levels. For example, a Master or Collection Agreement rule could be defined to specify that by default all licensed text elements may be printed at some fixed cost, say 10¢ per page; however, high value or core text sections may be individually identified using Publication or Content-Element override rules and assessed higher charges, say 20¢ per page.
When a request for delivery of materials is received, the content rules are evaluated in a bottom-up manner (i.e., content element rules are evaluated before publication rules, which are, in turn, evaluated before license agreement rules, etc.). Access and usage privileges are resolved when the system first recognizes a match between the requester's user ID (or user category) and/or the network address and the permission rules governing the content. Access to the content is granted only when an applicable set of rules specifically granting access permission to the end user is found; in the case where two or more rules permit access, the rules most favorable to the end user are selected. Under this approach, site licenses, limited site licenses, individual licensing, and pay-per-use may be simultaneously specified and managed.
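A minimal sketch of this bottom-up resolution, using the 10¢/20¢ printing example from the preceding section; the rule representation is an assumption made for illustration.

```python
def resolve_charge(action, element_rules, publication_rules, agreement_rules):
    """Evaluate rules bottom-up: element before publication before
    agreement.  At the first level with a match, pick the rule most
    favorable to the end user (lowest charge); None means no access."""
    for rules in (element_rules, publication_rules, agreement_rules):
        charges = [r["charge"] for r in rules if r["action"] == action]
        if charges:
            return min(charges)
    return None

# Agreement-level default: printing licensed text costs 10 cents/page.
agreement = [{"action": "print", "charge": 0.10}]
# Element-level override for a high-value core section: 20 cents/page.
core_section = [{"action": "print", "charge": 0.20}]

resolve_charge("print", core_section, [], agreement)  # 0.20 (override wins)
resolve_charge("print", [], [], agreement)            # 0.10 (default)
resolve_charge("save", [], [], agreement)             # None (not granted)
```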
The following use of the Rights Manager rules databases is recommended as an initial guideline for Rights Manager implementation:
1. Use Master Agreement rules to define the publishing holding company or imprint, the agreement's term (beginning and ending dates), and the general "fair use" guidelines negotiated between a supplier and the purveyor. Because of the current controversy over the definition of "fair use," Rights
Manager does not rely on preprogrammed definitions; rather, the supplier and purveyor may negotiate this definition and create rules as needed. This approach permits fair use definitions to be redefined in response to new standards or regulatory definitions without requiring modifications to Rights Manager itself.
2. Use Collection Agreement rules to define the term (beginning and ending dates) for specific licensing agreements between the supplier and the purveyor. General access and permission rules by user ID, user category, network address, and media type would be assigned at this level.
3. Use Publication rules to impose any user ID or user category-specific rules (e.g., permissions for students enrolled in a course for which this publication has been selected as the adopted textbook) or to impose exceptions based on the publication's value.
4. Use Content-Element rules to grant specific end users or user categories access to materials (e.g., define content elements that are supplementary teaching aids for the instructor) or to impose exceptions based on media type or the value of content elements.
The Rights Manager system does not mandate that licensing agreements exploit user IDs; however, maximum content protection and flexibility in license agreement specification is achieved when this feature is used. Given that many institutions or consortium members may not have implemented a robust user authentication system, alternative approaches to uniquely identifying individual users must be considered. While there are a variety of ways to address this issue, it is suggested that personal identification numbers (PINs), assigned by the supplier and distributed by trusted institutional agents at the purveyor's site (e.g., instructors, librarians, bookstore employees, or departmental assistants) or embedded within the content, be used as the basis for establishing user IDs and passwords. Using this approach, valid users may enter into registration dialogues to automatically assign user IDs and passwords in response to a valid PIN "challenge."
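One way to implement the suggested PIN "challenge" registration is sketched below, under the assumptions that each PIN is single-use and that passwords are stored as salted hashes; none of these details come from the deployed system.

```python
import hashlib
import secrets

ISSUED_PINS = {"48271", "90355"}   # PINs handed out by trusted agents
ACCOUNTS = {}                      # user_id -> (salt, password digest)

def register(pin: str, user_id: str, password: str) -> bool:
    """Create an account only in response to a valid, unused PIN."""
    if pin not in ISSUED_PINS or user_id in ACCOUNTS:
        return False
    ISSUED_PINS.discard(pin)       # each PIN is good for one registration
    salt = secrets.token_hex(8)
    digest = hashlib.sha256((salt + password).encode()).hexdigest()
    ACCOUNTS[user_id] = (salt, digest)
    return True

def verify(user_id: str, password: str) -> bool:
    """Check a login attempt against the stored salted hash."""
    if user_id not in ACCOUNTS:
        return False
    salt, digest = ACCOUNTS[user_id]
    return hashlib.sha256((salt + password).encode()).hexdigest() == digest
```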
While Rights Manager is designed to address all types of multimedia rights, permissions, and licensing issues, the current implementation has focused on distribution of traditional print publication media (text and images). Extensions to Rights Manager to address the distribution of full multimedia, including streaming audio and video, are being developed at CWRU.
The key to understanding our approach to intellectual property management is that we expect that each scholarly work will be disseminated according to a comprehensive contractual agreement. Publishers may use master agreements to cover a set of titles. Further, we do not expect that there will be only one interpretation of concepts such as fair use, and our Rights Manager system makes provision for arbitrarily different operational definitions of fair use, so that specific contractual agreements can be "enforced" within the delivery system.
A New Consortial Model
The library world has productively used various consortial models for over 30 years, but until now, there has not been a successful model for building a digital library. One of the missing pieces in the consortial jigsaw puzzle has been a technical model that is both comprehensive and reproducible in a variety of library contexts. To begin our approach to a new consortial model, we developed a complete technical system for building and operating a digital library. Building such a system is no small achievement. Similar efforts have been undertaken with the Elsevier Science TULIP Project and the JSTOR project.
The primary desiderata for a new consortial model are as follows:
• Any research library can participate using agreed upon and accepted standards.
• Many research libraries each contribute relatively small amounts of labor by scanning a small, controlled number of journal issues. Scanning is both systematic and based on a request for an individual article.
• Readily available off-the-shelf equipment is used.
• Intellectual property is made available through licensing and controlled by the Rights Manager software system.
• Publishers grant rights to libraries to scan and store intellectual property retrospectively (i.e., already purchased materials) in exchange for the right to license use of the digital formats to other users. Libraries provide publishers with digital copies of scholarly journals for their own use, thus enabling publishers to enrich their own electronic libraries.
A Payments System For The Consortium
It is unrealistic to assume that all use of a future digital library will be without any charging mechanisms, even though the research library of today charges for little except photocopying and user fines. This is not to say that the library user would be charged for each use, although that would be possible. A more likely scenario is that the library would pay on behalf of the members of the scholarly community (i.e., students, professors, researchers) that it supports. According to our proposed consortial model, libraries would be charged for use of the digital library according to the total pages "read" in any given user session. It could easily be arranged that users who consult the digital library on the premises of the campus library would not be charged themselves, but that if they used the digital library from another campus location or from off-campus through a network, they would pay a per-page charge analogous to the cost of photocopying. A system of charging could include categorization by type of user, and the Rights Manager system provides for a wide variety of charging models, including distinctions among usage in soft-copy format, usage in hard-copy format, and downloading of
a work in whole or in part. Protecting the rights of the owner is an especially interesting problem when the entire work is downloaded in a digital format. Both visible and invisible watermarking are techniques with which we have experience for protecting rights in the case of downloading an entire work.
We also have in mind that libraries that provide input via scanning to the decentralized digital library would receive a credit for each page scanned. It is clear that the value of the digital library to the end user will increase as greater completeness of the digitized holdings is achieved. Therefore, the credit system for originating libraries should recognize this value and reward these libraries according to a formula with a credit-to-charge ratio of perhaps ten to one; that is, an originating library might receive a credit for scanning one page equal to the charge for reading ten soft-copy pages.
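The proposed charge-and-credit formula reduces to a few lines. The per-page charge below is an assumed unit; only the ten-to-one ratio comes from the text.

```python
CREDIT_TO_CHARGE_RATIO = 10   # one scanned page offsets ten pages read

def net_balance(pages_read: int, pages_scanned: int,
                charge_per_page: float = 0.10) -> float:
    """Amount a library owes (positive) or is credited (negative)."""
    charge = pages_read * charge_per_page
    credit = pages_scanned * CREDIT_TO_CHARGE_RATIO * charge_per_page
    return charge - credit

net_balance(pages_read=10, pages_scanned=1)   # 0.0: the credit for one
                                              # scanned page cancels the
                                              # charge for ten pages read
```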
The charge-and-credit system for our new consortial model is analogous to that used for the highly successful Online Computer Library Center's cataloging system. Member libraries within OCLC contribute original cataloging entries in the form of MARC records for the OCLC database as well as draw down a copy of a holding's data to fill in entries for their own catalog systems. The system of charging for "downloads" and crediting for "uploads" is repeated in our consortial model for retrospective scanning and processing of full-text journal articles. Just as original cataloging is at the heart of OCLC, original scanning is at the heart of our new consortial model for building the library of the future.
One of the most important aspects of this project is that the underlying software system has been instrumented with many data collection points. In this way we can find out through actual usage by faculty, students, and research staff what aspects of the system are good and which need more work and thought. Over the past decade many people have speculated about how the digital library might be made to work for the betterment of scholarly communications. Our system as described in this paper is one of the most comprehensive attempts yet to build up a base of usage experience data.
To appreciate the detailed data being collected by the project, we will describe the various types of data that the Rights Manager system captures. Many types of transactions occur between the Rights Manager client and the Rights Manager server software throughout a user session. The server software records these transactions, which will permit detailed analysis of usage patterns. Appendix D gives some details regarding the data collected during a user session.
Publishers And Digital Libraries
The effects of the new consortial model for building digital libraries are not confined to the domain of technology. During the period when the new digital distribution system was being constructed, Ohio LINK, an agency of the Ohio
Board of Regents, commenced an overlapping relationship with Academic Press to offer its collection of approximately 175 electronic journals, many of which were in our chemical sciences collections. Significantly, the Ohio LINK contract with Academic Press facilitated the development of our digital library because it included a provision covering the scanning and storage of retrospective collections (i.e., backfiles) of their journals that we had originally acquired by subscription. In 1997, Ohio LINK extended the model of the Academic Press contract to an offering from Elsevier Science. According to this later agreement, subscriptions to current volumes of Elsevier Science's 1,153 electronic journals would be available for access and use on all of the 57 campuses of Ohio LINK member institutions, including CWRU and the University of Akron. The cost of the entire collection of electronic journals for each university for 1998 was set by the Ohio LINK-Elsevier contract to be approximately 5.5% greater than the institution's Elsevier Science expenditure level for 1997 subscriptions regardless of the particular subset these subscriptions represented; there is a further 5.5% price increase set to take effect in 1999. Further, the agreement between Ohio LINK and Elsevier constrains the member institutions to pay for this comprehensive access even if they cancel a journal subscription. Notably, there is an optional payment discount of 10% when an existing journal subscription (in a paper format) is limited to electronic delivery only (eliminating the delivery of a paper version). Thus, electronic versions of the Elsevier journals that are part of our chemical sciences digital library will be available at both institutions regardless of the existence of our consortium; pooling collections according to our consortial model would be a useless exercise from a financial point of view.
Other publishers are also working with our consortium of institutions to offer digital products. During spring 1997, CWRU and the University of Akron entered into an agreement with Springer-Verlag to evaluate their offering of 50 or so electronic journals, some of which overlapped with our chemical sciences collection. A similar agreement covering backfiles of Elsevier journals was considered and rejected for budgetary reasons. During the development of this project, we had numerous contacts with the American Chemical Society with the objective of including their publications in our digital library. Indeed, the outline of an agreement with them was discussed. As the time came to render the agreement in writing, they withdrew and later disavowed any interest in a contract with the consortium. At the present time, discussions are being held with other significant chemical science publishers about being included in our consortial library. This is clearly a dynamic period in journal publishing, and each of the society and commercial publishers sees much at stake. While we in universities try to make sense of both technology and information service to our scholarly communities, the publishers are each trying to chart their own course, both competitively and strategically, while improvements in information technology continually raise the ante for staying in the game.
The underlying goal of this project has been to see if information technology
could control the costs of chemical sciences serial publications. In the most extreme case, it could lower costs by half in our two libraries and even more if departmental copies were eliminated. As an aside, we estimated that departmentally paid chemical sciences journal subscriptions represented an institutional expenditure of about 40% of the libraries' own costs, so each institution paid in total 1.4 times each library's costs. For both institutions, the total was about 2.8 times the cost of one copy of each holding. Thus, if duplication were eliminated completely, the resulting expenditures for the consortium for subscriptions alone would be reduced by almost two-thirds from that which we have been spending. Clearly, the journal publishers understood the implications of our project. But the implications of the status quo were also clear: libraries and individuals were cutting subscriptions each year because budgets could not keep up with price increases. We believed that to let nature take its course was irresponsible when a well-designed experiment using state-of-the-art information technology could show a way to make progress. Thus, the spirit of our initial conversations with chemical sciences publishers was oriented to a positive scenario: libraries and the scholars they represented would be able to maintain or gain access to the full range of chemical sciences literature, and journals would be distributed in digital formats. We made a crucial assumption that information technology would permit the publishers to lower their internal production costs. This assumption is not unreasonable in that information technology has accomplished cost reductions in many other businesses.
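The arithmetic behind the "almost two-thirds" figure can be made explicit, with each library's own serials spend normalized to 1.0.

```python
library_cost = 1.0                      # one library's serials spend
institution_cost = 1.4 * library_cost   # + departmental copies (~40% extra)
consortium_cost = 2 * institution_cost  # two institutions: 2.8 units total

single_copy_cost = 1.0                  # one copy of each holding
savings_fraction = 1 - single_copy_cost / consortium_cost
# savings_fraction is about 0.643, i.e. almost two-thirds of current spending
```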
In our preliminary discussions with the publishers, we expressed both the long-term objective we were seeking (controlling and even lowering our costs through the elimination of duplication, our approach to solving the "cancellation conundrum") and our short-term objective: to receive the rights to scan, store, and display electronic versions of both current issues and backfiles of their publications, which we would create from materials we had already paid for (several times over, in fact). Current and future subscriptions would be purchased in only one copy, however, to create the desired financial savings. In exchange, we offered the publishers a complete copy of our PDF-formatted current issues and backfiles for their own use, from which they could derive new revenue through licensing to others. Since these once-sold backfiles were carried on the publishers' corporate balance sheets as a depleted resource, we thought that the prospect of deriving additional revenue from selling them again as a digital resource would be an attractive idea. In the end, however, not one publisher was willing to take us up on this exchange. To them, the backfiles that we would create were not worth what we were asking. One chemical sciences journal publisher was willing to grant the rights to backfiles in exchange for additional revenue from our consortium. But this offer made no sense unless the exchange could be justified on the basis of savings in costs of library storage space and the additional convenience of electronic access (the digital library is never closed, network access from remote locations would likely increase marginal usage, etc.). When we saw the proposed charge, we
rejected this offer as being too expensive. Another publisher did grant us the rights we sought as part of the overall statewide Ohio LINK electronic and print subscription contract, but this arrangement locked in the current costs (and annual increments) for several years, so the libraries could not benefit directly in terms of cost savings. With that particular limited agreement, however, there still is the definite possibility for savings on departmentally paid, personal subscriptions.
When we began to plan this project, it was not obvious what stance the publishing community would take toward it. Our contacts in some of the leading publishing houses and in the Association of American Publishers (AAP) led us to believe that we were on the right track. Clearly, our goal was to reduce our libraries' costs, which meant that publishers would receive less revenue. However, we also believed that the publishers would value receiving the scanned backfiles we would accumulate. Thus, the question was whether the backfiles had significant economic value. Libraries paid for the original publications in paper format and have been extremely reluctant to pay a second time for the convenience of access to digital versions of the backfiles. In our discussions, the publishers and the AAP also seemed interested in experiments to learn whether a screen-based digital format could be made useful to our chemical sciences scholars. Thus, there was a variety of positive incentives favoring experimentation, and these initial contacts with publishers evinced a benign attitude toward the project. Their interest in the CWRU Rights Management system seemed genuine and sincere, and their willingness to help us with an experiment of this type was repeatedly averred. Yet after many months of discussion with one publisher, it became clear that they were unwilling to participate at all. In the end, they revealed that they were developing their own commercial digital journal service and did not want another venue that might compete with it. A second publisher expressed repeated interest in the project and, in the end, proposed that our consortium purchase a license to use the backfiles at a cost of 15% more than the price of the paper-based subscription; that is, we would have to pay more for the rights to store backfiles of these journals in our system.
A third publisher provided the rights to scan, store, display, and use the backfiles as part of the overall statewide Ohio LINK contract; thus this publisher provided all the rights we needed at no extra cost to the consortium. We are continuing discussions with other chemical sciences journal publishers on behalf of both our consortium and Ohio LINK, conversations complicated by the overlap of our dual memberships.
It is interesting to see the idea that digital distribution could control publisher costs being challenged with statements such as "the costs of preparing journals for World Wide Web access through the Internet are substantially greater than the costs of distributing print." Questions about such statements abound: for example, are the one-time development costs front-loaded in these calculations, or are they amortized over the product's life cycle? If these claims are true, then they reflect poorly on the way chemical sciences publishers are using information technology, because other societies and several commercial publishers have been able to realize cost savings in moving from print to nonprint distribution. Although we do not yet have detailed data (a study is presently under way in our libraries), we expect to show significant cost savings in library staff productivity when we distribute journals in nonprint rather than print versions.
As a result of these experiences, some of these publishers have given us the impression that their narrowly circumscribed economic interests are dominating the evolution of digital libraries, that they are not fundamentally interested in controlling their internal costs through digital distribution, and that they are still pursuing tactical advantages over our libraries at the expense of a different set of strategic relationships with our scholarly communities. As with many generalizations, these are not universally true within the publishing community, but the overall message seems clear nonetheless.
A digital distribution system for storing and accessing scholarly communications has been constructed and installed on the campuses of Case Western Reserve University and the University of Akron. This low-cost system can be extended to other institutions with similar requirements because the system components, together with the way they have been integrated, were chosen to facilitate the diffusion of these technologies. This distribution system successfully separates ownership of library materials from access to them.
The most interesting aspect of the new digital distribution system is that libraries can form consortia that share specialized materials rather than duplicate them in parallel, redundant collections. When a consortium can share a single subscription to a highly specialized journal, we have the basis for controlling and possibly reducing the total cost of library materials, because we can eliminate duplicative subscriptions. We believe that the future of academic libraries points to the maintenance of a basic core collection, the selective acquisition of specialty materials, and the sharing of standard scholarly works across telecommunications networks. The consortial model that we have built and tested is one way to accomplish this goal. Our approach contrasts with the common behavior of building up ever larger collections of standard works, so that over time academic libraries begin to look alike in their collecting habits, offer nearly duplicative services, and require larger budgets. This project is attempting to find another path.
Over the past decade, several interesting experiments have been conducted to test different ideas for developing digital libraries, and more are under way. With many differing ideas and visions, an empirical approach is a sound way to make
progress from this point forward. Our consortium model, with its many explicit standards and integrated technologies, seems to us an experiment worth continuing. Over the next few years it will surely generate a base of performance data that should provide insights for the future. In this way, experience will inform vision.
References
Borghuis, M., Brinckman, H., Fischer, A., Hunter, K., van der Loo, E., Mors, R., Mostert, P., and Zilstra, J.: TULIP Final Report: The University LIcensing Program. New York: Elsevier Science, 1996.
Bush, V.: "As We May Think," The Atlantic Monthly, 176, 101-108, 1945.
Cummings, A. M., Witte, M. L., Bowen, W. G., Lazarus, L. O., Ekman, R. H.: University Libraries and Scholarly Communication: A Study Prepared for The Andrew W. Mellon Foundation. The Association of Research Libraries, 1992.
Fleischhauer, C., and Erway, R. L.: Reproduction-Quality Issues in a Digital-Library System: Observations on the Reproduction of Various Library and Archival Material Formats for Access and Preservation. An American Memory White Paper, Washington, D.C.: Library of Congress, 1992.
Kemeny, J. G.: "A Library for 2000 A.D." in Greenberger, M. (Ed.), Computers and the World of the Future. Cambridge, Mass.: The M.I.T. Press, 1962.
Appendix A Consortial Standards
• Enumeration and chronology standards from the MARC serials holdings standards (fields 853 and 863)
- Specifies up to 6 levels of enumeration and 4 levels of chronology, for example
• Linking from bibliographic records in library catalog via an 856 field
-URL information appears in subfield u and anchor text in subfield z; for example,
856 7¦uhttp://beavis.cwru.edu/chemvl¦zRetrieve articles from the Chemical Sciences Digital Library
Would appear as
Retrieve articles from the Chemical Sciences Digital Library
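The subfield layout in the example above can be split mechanically. The sketch below is illustrative only, not a full MARC parser; it simply treats the broken bar (¦) as the subfield delimiter, with the subfield code as the first character after it.

```python
# Illustrative sketch: split an 856 field like the example above into its
# indicator portion and a {subfield-code: value} map. The broken bar (¦)
# stands in for the MARC subfield delimiter; this is not a full MARC parser.
def parse_856(field: str):
    parts = field.split("¦")
    indicators = parts[0].strip()                    # e.g. "856 7"
    subfields = {p[0]: p[1:] for p in parts[1:] if p}
    return indicators, subfields

tag, sub = parse_856(
    "856 7¦uhttp://beavis.cwru.edu/chemvl"
    "¦zRetrieve articles from the Chemical Sciences Digital Library"
)
# sub["u"] holds the URL; sub["z"] holds the anchor text shown in the catalog
```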
TIFF (Tagged Image File Format)
• The most widely used multipage graphic format
• Support for tagged information ("Copyright", etc.)
• Format is extensible by creating new tags (such as RM rule information, authentication hints, encryption parameters)
• Standard supports multiple kinds of compression
• Container for article images
PDF (Portable Document Format)
• Page description language
• PDF files are searchable by the Adobe Acrobat browser
• Encryption and security are defined in the standard
SICI (Serial Item and Contribution Identifier)
• SICI definition (standards progress, overview, etc.)
• Originally a key part of the indexing structure
• All the components of the SICI code are stored, so it could be used as a linking mechanism between an article database and the Chemical Sciences Digital Library
• Ohio LINK is also very interested in this standard and is urging database creators and search engine providers to add SICI number retrieval to citation database and journal article repository systems
• Future retrieval interfaces into the database: SICI number search form, SICI number search API, for example
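Since all the components of the SICI code are stored, a search interface needs to decompose an incoming SICI into those components. The sketch below covers only the common item-level form, ISSN(chronology)enumeration&lt;contribution&gt;control; real SICIs per ANSI/NISO Z39.56 have additional variants, so treat the regular expression as an illustrative assumption rather than a complete implementation of the standard.

```python
import re

# Hedged sketch of decomposing a SICI into the stored components. Covers
# the common item-level shape only; Z39.56 defines further variants.
SICI_RE = re.compile(
    r"^(?P<issn>\d{4}-\d{3}[\dX])"     # ISSN of the serial
    r"\((?P<chronology>[^)]*)\)"       # date of issue, e.g. 199502/03
    r"(?P<enumeration>[^<]*)"          # volume:issue, e.g. 21:3
    r"(?:<(?P<contribution>[^>]*)>)?"  # page:title-code for an article
    r"(?P<control>.*)$"                # control segment (version, check char)
)

def parse_sici(sici: str) -> dict:
    m = SICI_RE.match(sici)
    if not m:
        raise ValueError(f"not a recognizable SICI: {sici!r}")
    return m.groupdict()
```

A parser like this would let a citation database link to the Chemical Sciences Digital Library by ISSN plus enumeration, as the appendix suggests.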
Appendix B Equipment Standards for End Users
Minimum Equipment Required
Hardware: An IBM PC or compatible computer with the following components:
• 80386 processor
• 16 MB RAM
• 20 MB free disk space
• A video card and monitor with a resolution of 640 × 480 and the capability of displaying 16 colors or shades of gray
• Windows 3.1
• Win32s 1.25
• TCP/IP software suite including a version of Winsock
• Netscape Navigator 2.02
• Adobe Acrobat Exchange 2.1
Win32s is a software package for Windows 3.1 that is distributed without charge and is available from Microsoft.
The requirement for Adobe Acrobat Exchange, a commercial product, is expected to be relaxed in favor of a requirement for Adobe Acrobat Reader, which is distributed without charge.
The software will also run on newer versions of compatible hardware and/or software.
Recommended Configuration of Equipment
This configuration is recommended for users who will be using the system extensively. Hardware: A computer with the following components:
• Intel Pentium processor
• 32 MB RAM
• 50 MB free disk space
• A video card and monitor with a resolution of 1280 × 1024 and the capability of displaying 256 colors or shades of gray
• Windows NT 4.0 Workstation
• TCP/IP suite that has been configured for a network connection (included in Windows NT)
• Netscape Navigator 2.02
• Adobe Acrobat Exchange 2.1
The requirement for Adobe Acrobat Exchange, a commercial product, is expected to be relaxed in favor of a requirement for Adobe Acrobat Reader, which is distributed without charge.
Other software options that the system has been tested on include:
• IBM OS/2 3.0 Warp Connect with Win-OS/2
• IBM TCP/IP for Windows 3.1, version 2.1.1
• Windows NT 3.51
Additional Hardware Specifications
Storage for Digital Copies
To give us the greatest possible flexibility in developing the project, we decided to form the server out of two interlinked computer systems, a standard IBM System 390 with the OS/390 Open Edition version as the operating system and a standard IBM RS/6000 System with the AIX version of the UNIX operating system. Both these components may be incrementally grown as the project's server requirements increase. Both systems are relatively commonplace at academic sites. Although only one system pair is needed in this project, it is likely that eventually two pairs of systems would be needed for an effort on the national scale. Such redundancy is useful for providing both reliability and load leveling.
Both campuses' networks and the statewide network that connects them use the standards-based TCP/IP protocols. Any networked client workstation that follows our minimum standards will be able to use the digital delivery system being constructed. The minimum transmission speed on the CWRU campus is 10 megabits per second (Mbps) to each client workstation and at least 155 Mbps on each backbone link. The principal document repository is on the IBM System 390, which uses a 155 Mbps ATM (asynchronous transfer mode) connection to the campus backbone. The linkage to the University of Akron is by way of the statewide network, in which the principal backbone connection from CWRU also operates at 155 Mbps; the linkage from UA to the statewide network is at 3 Mbps. The on-campus linkage at UA is likewise a minimum of 10 Mbps to each client workstation within the chemical sciences scholarly community and to client workstations in the UA university library.
Appendix D System Transactions as Initiated by an End User
A typical user session generates the following transactions between client and server.
1. User requests an article (usually from a Web browser). If the user is starting a new session, the RM system downloads and launches the appropriate viewer, which will process only encrypted transactions. In the case of Adobe Acrobat, the system downloads a plug-in. The following transactions take place with the server:
a. Authenticate the viewer (i.e., ensure we are using a secure viewer).
b. Get permissions (i.e., obtain a set of user permissions, if any. If it is a new session, the user is set by default to be the general purpose category of PUBLIC).
c. Get Article (download the requested article. If step b returns no permissions, this transaction does not occur. The user must sign on and request the article again).
2. User signs on. If the general user has no permissions, he or she must log on. Following a successful logon, transactions 1b and 1c must be repeated. Transactions during sign-on include:
a. Sign On
3. Article is displayed on-screen. Before an article is displayed on the screen, the viewer enters the RM protocol, a step-by-step process wherein a single Report command is sent to the server several times with different state flags and use types. RM events are processed similarly for all supported functions, including display, print, excerpt, and download. The transactions include:
a. Report Use BEGIN (just before the article is displayed).
b. Report Use ABORT (sent in the event that a technical problem, such as "out of memory," prevents display of the article).
c. Report Use DECLINE (sent if the user declines display of the article after seeing the cost).
d. Report Use COMMIT (just after the article is displayed).
e. Report Use END (sent when the user dismisses the article from the screen by closing the article window).
4. User closes viewer. When a user closes a viewer, an end-of-session process occurs, which sends transaction 3e for all open articles. Also sent is a close viewer transaction, which immediately expires the viewer so it may not be used again.
a. Close Viewer
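The transaction sequence above can be sketched as a small client-side state machine. The class and method names below are stand-ins for the actual CWRU RM wire protocol, which is not specified here; the sketch only mirrors the ordering of transactions 1a-1c, the Report Use states, and the close-viewer step.

```python
# Schematic sketch of the client-side RM transaction sequence described
# above: authenticate viewer (1a), get permissions (1b), get article (1c),
# Report Use with state flags (3a-3e), and close viewer (4). The server
# interface is a stand-in, not the actual CWRU RM protocol.
from enum import Enum

class UseState(Enum):
    BEGIN = "BEGIN"      # just before the article is displayed
    ABORT = "ABORT"      # technical problem prevented display
    DECLINE = "DECLINE"  # user declined after seeing the cost
    COMMIT = "COMMIT"    # just after the article is displayed
    END = "END"          # user dismissed the article window

class RMViewer:
    def __init__(self, server):
        self.server = server
        self.log = []            # (command, detail) pairs sent to the server
        self.open_articles = []

    def _send(self, command, detail=""):
        self.log.append((command, detail))

    def request_article(self, article_id, user="PUBLIC"):
        self._send("AuthenticateViewer")                 # 1a
        perms = self.server.get_permissions(user)        # 1b
        self._send("GetPermissions", user)
        if not perms:
            # no permissions: 1c does not occur; user must sign on and retry
            return None
        self._send("GetArticle", article_id)             # 1c
        self._send("ReportUse", UseState.BEGIN.value)    # 3a
        self._send("ReportUse", UseState.COMMIT.value)   # 3d
        self.open_articles.append(article_id)
        return article_id

    def close(self):
        for _ in self.open_articles:                     # 3e for each open article
            self._send("ReportUse", UseState.END.value)
        self.open_articles.clear()
        self._send("CloseViewer")                        # viewer expires immediately
```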
The basic data being collected for every command (with the exception of 1a) and being sent to the server for later analysis includes the following:
• Viewer ID
• User ID (even if it is PUBLIC)
• IP address of request
These primary data may be used to derive additional data: Transaction 1b is effectively used to log unsuccessful access attempts, including failure reasons. The time
interval between transactions 3a and 3e is used to measure the duration that an article is on the screen. The basic data collection module in the RM system is quite general and may be used to collect other information and derive other measures of system usage.
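The on-screen duration measure described above reduces to a difference of timestamps between the BEGIN and END report events. A minimal sketch, with illustrative epoch-second timestamps rather than the RM system's actual log format:

```python
# Derive the time an article was on screen from Report Use events
# (transactions 3a and 3e). Event names follow the transaction list;
# the (state, epoch_seconds) tuple format is an assumption.
def display_seconds(events):
    """events: list of (state, epoch_seconds) tuples for one article."""
    begin = next(t for s, t in events if s == "BEGIN")
    end = next(t for s, t in events if s == "END")
    return end - begin
```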
Scanning and Work Flow
Article Scanning, PDF Conversion, and Image Quality Control
The goal of the scan-and-store portion of the project is to develop a complete and tested system of hardware, software, and procedures that can be adopted by other members of the consortium with a reasonable investment in equipment, training, and personnel. If a system is beyond a consortium member's financial means, it will not be adopted. If a system cannot perform as required, it is a waste of resources.
Our original proposal stressed that all existing scholarly resources, particularly research tools, would remain available to scholars throughout this project. To that end, the scan-and-store process is designed to leave the consortium's existing journal collection intact and accessible.
Scan-and-Store Process Resources
• Scanning workstation, including a computer with sufficient processing and storage capacity, a scanner, and a network connection. Optionally, a second workstation can be used by the scanning supervisor to process the scanned images. The workstation used in this phase of the project includes:
-Minolta PS-3000 Digital Planetary Scanner
-Two computers with Pentium 200 MHz CPU, 64 MB RAM, 4 GB HD, 21" monitor
-Windows 3.11 OS (required by other software)
-Minolta Epic 3000 scanner software
-Adobe Acrobat Capture, Exchange, and Distiller software
-Image Alchemy software
-Network interface cards and TCP/IP software for campus network access
• Scanner operator(s), typically student assistants, with training roughly equivalent to that required for interlibrary loan photocopying. Approximately 8 hours of operator labor will be required to process the average 800 pages per day capacity of a single scanning workstation.
• Scanning supervisor, typically a librarian or full-time staff, with training in image quality control, indexing, and cataloging, and in operation of image processing software. Approximately 3 hours of supervisor labor will be required to process 800 scanned pages per day.
Scan-and-Store Process: Scanner Operator
• Retrieve scan request from system
• Retrieve materials from shelves (enough for two hours of scanning)
• Scan materials and enter basic data into system
-Evaluate size of pages
-Evaluate grayscale/black and white scan mode
-Test scan and adjust settings and alignment as necessary
-Log changes and additions to author, title, journal, issue, and item data on request form
-Repeat for remaining requested articles
• Transfer scanned image files to Acrobat conversion workstation
• Retrieve next batch of scan requests from system
• Reshelve scanned materials and retrieve next batch of materials
Scan-and-Store Process: Acrobat Conversion Workstation
• Run Adobe Acrobat Capture to automatically convert sequential scanned image files from single-page TIFF to multi-page Acrobat PDF documents, as they are received from scanner operator
• Retain original TIFF files
Scan-and-Store Process: Scanning Supervisor
• Retrieve request forms for scanned materials
• Open converted PDF files
• Evaluate image quality of converted PDF files
-Scanned article matches request form citation
-Completeness, no clipped margins
-Legibility, especially footnotes and references
-Clarity of grayscale or halftone images
-Appropriate margins, no excessive white space
• Crop fingertips, margin lines, and so on, missed by Epic 3000 scanner software
-Retrieve TIFF image file
-Mask unwanted areas
-Resave TIFF image file
-Repeat PDF conversion
-Evaluate image quality of revised PDF file
• Return unacceptable scans to scanner operator for rescan or correction
• Evaluate, correct, and expand entries in request forms
• Forward corrected PDF files to the database
• Delete TIFF image files from conversion workstation
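The operator and supervisor steps above form a pipeline with a rescan loop on quality-control failure. The function arguments below are placeholders for the manual steps and the Acrobat Capture conversion, not real software components; the sketch only captures the control flow.

```python
# Sketch of the scan-and-store control flow: scan -> TIFF pages ->
# PDF conversion -> supervisor QC, re-scanning until QC passes.
# scan, convert_to_pdf, and passes_qc are placeholder callables.
def process_request(request, scan, convert_to_pdf, passes_qc, max_attempts=3):
    for attempt in range(max_attempts):
        tiff_pages = scan(request)        # operator: scan the article pages
        pdf = convert_to_pdf(tiff_pages)  # Acrobat Capture conversion step
        if passes_qc(pdf):                # supervisor: image quality control
            return pdf                    # forward corrected PDF to the database
    raise RuntimeError(f"quality control failed after {max_attempts} attempts")
```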
Notification to and Viewing by User of Availability of Scanned Article
Insertion of the article into the database
• The scanning technician types the scan request number into a Web form.
• The system returns a Web form with most of the fields filled in. The technician has an opportunity to correct information from the paging slip before inserting the article into the database.
• The Web form contains a "file upload" button that when selected allows the technician to browse the local hard drive for the article PDF file. This file is automatically uploaded to the server when the form is submitted.
• The system inserts the table of contents information into the database and transfers the PDF file to the Rights Manager system.
Notification/delivery of article to requester
• E-mail to requester with URL of requested article (in first release)
• No notification (in first release)
• Fax to requester an announcement page with the article URL (proposed future enhancement)
• Fax to requester a copy of the article (proposed future enhancement)
Technical Justification for a Digitization Standard for the Consortium
A major premise in the technical underpinnings of the new consortial model is that a relatively inexpensive scanner can be located in the academic libraries of consortium members. After evaluating virtually every scanning device on the market, including some in laboratories under development, we concluded that the 400 dot-per-inch (dpi) scanner from Minolta was fully adequate for the purpose of scanning all the hundreds of chemical sciences journals in which we were interested. Thus, for our consortium, the Minolta 400 dpi scanner was taken to be the
digitization standard. The standard that was adopted preserves 100% of the informational content required by our end users.
More formally, the standard for digitization in the consortium is defined as follows:
The scanner captures 256 levels of gray in a single pass with a density of 400 dots per inch and converts the grayscale image to black and white using threshold and edge-detection algorithms.
We arrived at this standard by considering our fundamental requirements:
• Capture the smallest significant information presented in the source documents of the chemical sciences literature, namely a lowercase e in superscript or subscript, as occurs in footnotes
• Satisfy both legibility and fidelity to the source document
• Minimize scanning artifacts or "noise" from background
• Operate in the range of preservation scanning
• Be affordable by academic and research libraries
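The conversion rule in the standard's definition (grayscale capture followed by threshold and edge-detection conversion to bilevel) can be sketched in pure Python. The threshold and edge-boost values, and the crude horizontal-gradient detector, are illustrative assumptions; the scanner firmware's actual algorithms are proprietary.

```python
# Illustrative sketch of the digitization rule: 256 gray levels
# (0 = black, 255 = white) thresholded to black and white, with a simple
# edge boost so thin strokes (e.g. a subscript "e") survive thresholding.
# Threshold, edge_boost, and the gradient test are assumed values.
def to_bilevel(gray, threshold=128, edge_boost=32):
    """gray: 2-D list of 0-255 values; returns 2-D list of 0/1 pixels."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            left = gray[y][x - 1] if x > 0 else gray[y][x]
            right = gray[y][x + 1] if x < w - 1 else gray[y][x]
            # darken pixels sitting on a sharp horizontal edge
            effective = gray[y][x] - (edge_boost if abs(right - left) > 64 else 0)
            row.append(0 if effective < threshold else 1)  # 0 = black ink
        out.append(row)
    return out
```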
The scanning standard adopted by this project was subjected to tests of footnoted information, and 100% of the occurrences of these characters were captured in both image and character modes and recognized for displaying and searching.
At 400 dpi, the Minolta scanner works in the range of preservation quality scanning as defined by researchers at the Library of Congress (Fleischhauer and Erway 1992).
We were also cautioned about the problems unique to very high resolution scanning in which the scanner produces artifacts or "noise" from imperfections in the paper used. We happily note that we did not encounter this problem in this project because the paper used by publishers of chemical sciences journals is coated.
When more is less: images scanned at 600 dpi produce larger files than those scanned at 400 dpi, so 600 dpi is less efficient. Further, in a series of tests that we conducted, a 600 dpi scanner actually produced an image of effectively lower resolution than the 400 dpi scanner. This loss of information appears when the scanned image is viewed on a computer screen whose display makes relatively heavy use of anti-aliasing. When viewed with software that permits zooming in on details of the scanned image (supported by both PDF and TIFF viewers), the 600 dpi anti-aliased image actually had lower resolution than an image produced from the same source document by the 400 dpi Minolta scanner under our consortium's digitization standard. With the 600 dpi scanner, the only way for the end user to see the full resolution was to download the image and print it out. In a side-by-side comparison of the soft-copy displayed images, our end users deemed the presentation quality of the 600 dpi image unacceptable; the 400 dpi image was just right. Thus, our delivery approach is more useful to the scholar who needs to examine
fine details on-screen. We conducted some tests on reconstructing the journal page from the scanned image by printing it out on a Xerox DocuTech 6135 (600 dpi). We found that the smallest fonts and fine details of the articles were uniformly excellent. Interestingly, in many of the tests we performed, our faculty colleagues judged the end result by their own "acid test": how the scanned image, when printed out, compared with the image produced by a photocopier. For the consortium standard, they were satisfied with the result and pleased with the improvement in quality that the 400 dpi scanner provided in comparison with conventional photocopying of the journal page.
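The file-size half of the "more is less" point can be checked with a quick calculation: uncompressed pixel count, and hence file size, grows with the square of the scanning resolution, so a 600 dpi page carries 2.25 times the data of a 400 dpi page before compression. The page dimensions below are illustrative.

```python
# Back-of-the-envelope check: uncompressed bilevel size scales with dpi^2,
# so 600 dpi vs. 400 dpi is (600/400)**2 = 2.25x the data per page.
def bilevel_bytes(width_in, height_in, dpi):
    """Uncompressed 1-bit-per-pixel size of a scanned page, in bytes."""
    pixels = (width_in * dpi) * (height_in * dpi)
    return pixels / 8

ratio = bilevel_bytes(8.5, 11, 600) / bilevel_bytes(8.5, 11, 400)
# ratio == 2.25 regardless of page size
```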
On-line Books at Columbia
Early Findings on Use, Satisfaction, and Effect
Mary Summerfield and Carol A. Mandel
with Paul Kantor, Consultant
The Online Books Evaluation Project at Columbia University explores the potential for on-line books to become significant resources in academic libraries by analyzing (1) the Columbia community's adoption of and reaction to various on-line books and delivery system features provided by the libraries over the period of the project; (2) the relative life-cycle costs of producing, owning, and using on-line books and their print counterparts; and (3) the implications of intellectual property regulations and traditions of scholarly communications and publishing for the on-line format.
On-line books might enhance the scholarly processes of research, dissemination of findings, teaching, and learning. Alternatively, or in addition, they might enable publishers, libraries, and scholars to reduce the costs of disseminating and using scholarship. For example:
• If the scholarly community were prepared to use some or all categories of books for some or all purposes in an on-line format instead of a print format, publishers, libraries, and bookstores might be able to trim costs as well as enhance access to these books.
• If on-line books made scholars more efficient or effective in their work of research, teaching, and learning so as to enhance revenues or reduce operating costs for their institutions, on-line books might be worth adopting even if they were no less costly than print books.
• If an on-line format became standard, publishers could offer low-cost on-line access to institutions that would not normally have purchased print copies, thus expanding both convenient access to scholarship to faculty and students at those institutions and publishers' revenues from these books.
This paper focuses on user response to on-line books and reports on:
1. the conceptual framework for the project
2. background information on the status of the collection and other relevant project elements, particularly design considerations
3. the methodology for measuring adoption of on-line books by the Columbia community
4. early findings on use of on-line books and other on-line resources
5. early findings on attitudes toward on-line books
The variables representing usage of a system of scholarly communication and research are both effects and causes. Since scholars, the users of the system, are highly intelligent and adaptive, the effect of the system will influence their behavior, establishing a kind of feedback loop. As the diagram in Figure 17.1 shows, there are two key loops. The upper one, shown by the dark arrows, reflects an idealized picture of university administration. In this picture, the features of any system are adjusted so that, when used by faculty and students, they improve institutional effectiveness. This adjustment occurs in the context of continual adaptation on the part of the users of the system, as shown by the lighter colored arrows in the lower feedback loop.
These feedback loops are constrained by the continual change of the environment, which affects the expectations and activities of the users, affects the kind of features that can be built into the system, and affects the very management that is bringing the system into existence. The dotted arrows show this interaction.
Our primary research goal, in relation to users, uses, and impacts, is to understand these relationships, using data gathered by library circulation systems, Internet servers, and surveys and interviews of users themselves.
The On-line Books Collection
The project began formal activity in January 1995. However, discussions with publishers began in 1993, if not earlier. As noted in the project's Analytical Principles and Design document, "The Online Books Evaluation Project is a component of the developing digital library at Columbia University. As part of its digital library effort, the Columbia University Libraries is acquiring a variety of reference and monographic books in electronic format to be included on the campus network; in most cases, those books will be available only to members of the Columbia community. Some of the books are being purchased; others are being provided on a pilot project basis by publishers who are seeking to understand how the academic community will use online books if they become more widely available in the future."
Design of the On-line Books Collection
When this project was proposed, the World Wide Web was just emerging, and we expected to develop custom SGML browsers, just as other on-line projects were doing at the time. However, by the time the project was ready to mount books on-line, the Web seemed the best delivery system for maximizing availability of the books to scholars.
Many other on-line projects are providing users with materials in PDF, scanned, or bitmapped format. These formats are effective for journal articles, which are finely indexed through existing sources and which are short and easily printed. However, the greatest potential for added value from on-line books comes with truly digital books. Only this on-line format allows the development of interactive books that take advantage of the current and anticipated capabilities of Web technology, such as the inclusion of sound and video, data files and software for manipulating data, and links to other on-line resources. Perhaps only such enhanced on-line books will offer sufficient advantages over traditional print format that scholars will be willing to substitute them for the print format for some or all of their modes of use and for some or all classes of books.
As of June 1997, the project included 96 on-line texts. The libraries have each book in print form (circulating from the regular collection or from reserves, or noncirculating in reference) as well as in one or more on-line formats. Appendix A summarizes the print access modes for all the modern books in the collection.
Methodology for Studying Use of and Reactions to Various Book Formats
The project's Analytical Principles and Design document lays out the evaluation methodology. Formulated in the first year of the project, this methodology remains the working plan. Here are some of the key measures for documenting use of the on-line books:
• The records of the Columbia computing system provide, for the most part, the use data for the on-line books. For books accessed via the World Wide Web, information on date, time, and duration of session involving an on-line book, user's cohort, location of computer, number of requests, amount of the book requested, and means of accessing the book will be available. These data became available in summer 1997 with the full implementation of the authentication system and related databases.
• Circulation data for each print book in the regular collection provides information on number of times a book circulates, circulation by cohort, duration of circulation, number of holds, and recalls. For most libraries, the data available for reserve books is the same as that for books in the regular collection as the CLIO circulation system is used for both.
• The records of the Columbia computing system provide, for the most part, the use data for the books accessed via CNet, Columbia's original, gopher-based Campus Wide Information System, including the number of sessions and their date and time. These records do not include the duration of the session, the activity during the session (e.g., printing or saving), or anything about the user. Thus, all we can analyze are the patterns of use by time of day, day of week, and over time.
• Until March 15, 1997, for books accessed via CWeb, we knew the use immediately preceding the hit on the book and the day and time of the hit. For data collected through that point, our analysis is constrained to patterns of use by time of day, day of the week, and over time. By manual examination of server data, we counted how many hits a user made on our collection during one session and the nature of those hits.
• Since March 15, 1997, we are able to link user and usage information and conduct a series of analyses involving titles used, number of hits, number of books used, and so on by individual and to group those individuals by department, position, and age. These data do not yet include sessions of use, just the magnitude of overall use during the period. Session-specific data are available starting in fall 1997.
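The kind of linkage described above, from raw hit records to per-individual and per-department tallies, can be sketched as follows. The record fields (user_id, dept, title) are illustrative assumptions, not the actual schema of the Columbia authentication databases:

```python
from collections import Counter, defaultdict

# Hypothetical hit records of the kind available once user and usage
# information could be linked (after March 15, 1997). Field names are
# illustrative, not the actual schema.
hits = [
    {"user_id": "u1", "dept": "Social Work", "title": "Task Strategies"},
    {"user_id": "u1", "dept": "Social Work", "title": "Task Strategies"},
    {"user_id": "u2", "dept": "Physics", "title": "Bangs, Crunches, Whimpers, and Shrieks"},
]

def usage_by_user(hit_records):
    """Tally total hits and distinct titles per individual user."""
    per_user = defaultdict(lambda: {"hits": 0, "titles": set()})
    for h in hit_records:
        rec = per_user[h["user_id"]]
        rec["hits"] += 1
        rec["titles"].add(h["title"])
    return per_user

def hits_by_dept(hit_records):
    """Group the overall magnitude of use by department."""
    return Counter(h["dept"] for h in hit_records)

assert usage_by_user(hits)["u1"]["hits"] == 2
assert len(usage_by_user(hits)["u1"]["titles"]) == 1
assert hits_by_dept(hits)["Social Work"] == 2
```

Grouping by position or age would follow the same pattern, keyed on whatever demographic fields the linked databases supply.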
We are using a wide range of tools in trying to understand the factors that influence use of on-line books. Table 17.1 summarizes our complex array of surveys and interviews.
Use of Books in On-line Collection
At this point we will report on (1) trends in use of the on-line books; (2) user location and cohort; and (3) use of the on-line books by individuals. Summarized
below separately are findings for reference works and nonreference books, e.g., monographs and collections.
Three reference works have been available on-line long enough to have generated substantial usage data: The Concise Columbia Electronic Encyclopedia, Columbia Granger's World of Poetry, and The Oxford English Dictionary. Three other titles (Chaucer Name Dictionary, African American Women, Native American Women) have been on-line only since early 1997, so usage data for these titles are very short-term. All three are accessible both through CNet and CWeb.
Most available reference books are used more heavily on-line than in print.
Of the six reference works in the collection, only The Oxford English Dictionary receives sizable use in its print form in the Columbia libraries. At most a handful of scholars use the library copies of the others each month. As the accompanying tables and figure show, each of these books receives much more use on-line. On-line availability seems to increase awareness of these resources as well as make access more convenient.
Early on-line reference books have experienced falling usage over time, substitution of a new delivery system for an old one, or a smaller rate of growth in use than might be expected given the explosion in access to and use of on-line resources in general.
In the early to mid-1990s, novelty may have brought curious scholars to the on-line format with little concern for design, the utility of the delivery system, or the qualities of the books. With enhancements in delivery systems and expansion in the number of on-line books, being on-line is no longer a guarantee that a book will attract users. As access to the Web spreads, new graphical Web delivery systems are offering superior performance that is increasingly likely to draw scholars away from these early, text-based systems. In addition, as more competing resources come on-line and provide information that serves the immediate needs of a user better or offer a more attractive, user-friendly format, scholars are less likely to find or to choose any single resource.
The Oxford English Dictionary is the most heavily used reference work in the collection. Its CNet format offers good analytic functionality but it is difficult to use. The CWeb format is attractive and easy to use, but its functionality is limited to looking up a definition or browsing the contents.
OED CNet usage dropped 59% from fourth quarter 1994 (2,856 sessions) to first quarter 1997 (1,167 sessions). OED CWeb use increased by 27% from fall semester 1996 (1,825 hits) to spring semester 1997 (2,326 hits). The OED had 173 unique users in the period from March 15 to May 31, 1997, with an average of 2.8 hits per user.
The Concise Columbia Electronic Encyclopedia remains on the text-based platform CNet. As Figure 17.2 shows, usage declined 84% over the past three years, from 1,551 sessions in April 1994 to 250 sessions in April 1997. Usage declined most in the 1996-97 academic year; 7,861 sessions were registered from September 1995 to May 1996 and 2,941 sessions (63% fewer) from September 1996 to May 1997.
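The percentage changes quoted in this section follow the usual convention of measuring change against the earlier value. A minimal helper, checked against the figures above, makes the convention explicit:

```python
def pct_change(old, new):
    """Percentage change from old to new; negative values indicate decline."""
    return round(100 * (new - old) / old)

# Figures from the text:
assert pct_change(2856, 1167) == -59   # OED CNet sessions, 4Q1994 -> 1Q1997
assert pct_change(1825, 2326) == 27    # OED CWeb hits, fall 1996 -> spring 1997
assert pct_change(1551, 250) == -84    # Concise Encyclopedia, Apr 1994 -> Apr 1997
assert pct_change(7861, 2941) == -63   # Sept-May sessions, 1995-96 -> 1996-97
```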
Columbia now provides CWeb access to the Encyclopedia Britannica (directly from the publisher's server); many scholars may be using this resource instead of the Concise Encyclopedia. Recently the Columbia community has registered about 5,000 textual hits a month on the Encyclopedia Britannica.
Columbia Granger's World of Poetry is available on both CNet and CWeb. The CNet version is a Lynx-based, nongraphical rendering of the CWeb version. This resource, which became available to the community in on-line form in October 1994, locates a poem in an anthology by author, subject, title, first line, or keywords in its title or first line. In addition, it provides easy access to the 10,000 most often anthologized poems. In first quarter 1995, CNet sessions totaled 718; in first quarter 1997, they totaled 90 (or about one a day). CWeb hits totaled about 700 in the first quarter of 1997. Thus, even though it has declined, total usage of Granger's is still considerable.
Garland's Chaucer Name Dictionary was added to the CWeb collection at the end of 1996. Native American Women was added in January 1997, and African American Women went on-line in February 1997. Their early usage on CWeb is shown in Table 17.2.
The Online Books Evaluation Project includes two types of nonreference books: Past Masters, 54 classical texts in social thought; and modern monographs and collections from Columbia University Press, Oxford University Press, and Simon and Schuster Higher Education. Most of these books came on-line during the 1996-97 academic year.
On-line scholarly monographs are available to and used by more people than their print counterparts in the library collection.
Once a print book is in circulation, it can be unavailable to other scholars for hours (the reserve collection) or weeks or months (the regular collection). An on-line book is always available to any authorized user who has access to a computer with an Internet connection and a graphical Web browser.
Table 17.3 tracks usage of the contemporary nonreference books in the on-line collection for the last part of the spring semester 1997 and the print circulation of these titles for the first six months of 1997. Fourteen of these books had no on-line use during this 2.5-month measurement period; 12 had no print circulations during their 6-month measurement period. In total, the on-line versions had 122 users while the print versions had 75 users. Looking at only the on-line books that circulated in print form, we find 122 on-line users and 45 print circulations: nearly three times as many on-line users as circulations. These data suggest that, compared for an equal period, these books will have many more users in on-line form than in print form.
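The comparison above reduces to a simple ratio; a short sketch using the figures from Table 17.3 as given in the text:

```python
online_users = 122        # on-line users of titles that also circulated in print
print_circulations = 45   # print circulations of those same titles

ratio = online_users / print_circulations
assert round(ratio, 1) == 2.7   # "nearly three times as many on-line users"
```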
An on-line book may attract scholars who would not have seen it otherwise.
Once a group within the community becomes aware of the on-line books, they are likely to review books in the collection that seem related to their interests, at least while the collection is small. For example, half of the use of Autonomous Agents: From Self Control to Autonomy was from social work host computers. This title might seem related to social work issues even though it is not a social work book or part of the collection of the Social Work Library.
The fifth and sixth most used books, Self Expressions: Mind, Morals, and the Meaning of Life and Bangs, Crunches, Whimpers, and Shrieks, are both philosophy titles.
• Self Expressions is listed in the Current Social Science Web page along with the social work titles. Five of its seven users were from the School of Social Work, one from the Center for Neurobiology and Behavior, and one from Electrical Engineering.
• Bangs, Crunches, Whimpers, and Shrieks is listed under Physics in the Current Science Web page. Two of its seven users were from the Physics department, another two from unidentified departments, and one each from Electrical Engineering, Engineering, and General Studies.
It is not clear whether scholars' productivity or work quality will be enhanced by such serendipity. The important concept of collocation is transformed, in the networked environment, into a diversity of finding and navigational systems. As the on-line collection expands, browsing will require the focused use of on-line search tools rather than project-oriented Web pages. However, the Web's search systems may uncover a book's relevance to a scholar's work when the library catalog or browsing the library or bookstore shelves would not have done so. This new ability to identify relevant books should improve scholars' research and teaching.
Scholars use some on-line books relatively little.
As Table 17.3 shows, use of on-line books is not evenly balanced among titles. Instead it is driven by course demands or other interest in a book. The initial use of the 54 on-line Past Masters classic texts in social thought confirms this finding. In academic year 1996-97, these texts registered a total of about 2,460 hits from the Columbia scholarly community. However, 1,692 (69%) of these hits were on only eight (15%) of the titles, for an average total of 212 hits each, or 24 hits each per month. The other 46 texts averaged about 17 hits each over this period, or about two hits each per month.
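The concentration of use among the Past Masters texts can be verified directly from the aggregates reported above (the nine-month academic year is an assumption used to derive the monthly rates):

```python
# Aggregates reported in the text for academic year 1996-97.
total_hits, total_titles = 2460, 54
top_hits, top_titles = 1692, 8
months = 9  # assumed length of the academic year, September through May

top_share = top_hits / total_hits                               # share of hits
title_share = top_titles / total_titles                         # share of titles
avg_top = top_hits / top_titles                                 # hits per top title
avg_rest = (total_hits - top_hits) / (total_titles - top_titles)

assert round(top_share, 2) == 0.69    # 69% of hits ...
assert round(title_share, 2) == 0.15  # ... on 15% of the titles
assert round(avg_top) == 212          # ~212 hits each for the top eight
assert round(avg_top / months) == 24  # ~24 hits each per month
assert round(avg_rest) == 17          # ~17 hits each for the other 46
assert round(avg_rest / months) == 2  # ~2 hits each per month
```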
Patterns of usage may be expected to change over time as various texts are used in courses or by researchers and as the Columbia community becomes more aware of the on-line collections. In general, it is likely that nonreference books that are being used in courses, but which the students need not own, will be in greater demand on-line than will books that students must own or books that are of interest to only a small set of scholars.
The data to date suggest that to the extent that there are meaningful costs to creating on-line books and to maintaining them as part of a collection, publishing and library planners must select items for the on-line collection carefully. The decision rules will vary depending on what type of organization is taking on the risks of providing access to the on-line books.
Some scholars, especially students with a reading assignment that is in the on-line collection, are looking at on-line books in some depth, suggesting that they find value in this means of access.
As Table 17.3 shows, the on-line books averaged 5.5 hits per unique user, suggesting that some users are looking at several elements of the book or at some elements repeatedly. In fall 1996, three social work books were most intensively used because they were assigned reading for courses. We analyzed the server statistics through the end of 1996 for these books in an effort to learn how deeply the books
were used: to what extent use sessions included book chapters, the search engine, the pagination feature, and so on.
Table 17.4 shows that relatively few sessions (7%-24%) involved someone going only to the Title Page/Table of Contents file for a book. Many sessions (28%-59%) involved use of more than one chapter of the book; sessions averaged 1.4 to 3.5 hits on chapters, depending on the book used. Some sessions (9%-17%) did not include a hit on the Title Page/Table of Contents at all, suggesting repeat users who had set up a bookmark for a chapter in the book or had made a note of its URL.
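A session-classification step of the sort just described might be sketched as below; the file labels ("toc", "ch1", ...) are hypothetical stand-ins for the actual server-log paths:

```python
def classify_session(files_hit):
    """Classify one use session by the book files it touched.

    files_hit: ordered list of file labels from the server log, e.g.
    ["toc", "ch1", "ch2"]. Labels are illustrative, not actual log paths.
    """
    chapters = [f for f in files_hit if f.startswith("ch")]
    return {
        "toc_only": files_hit == ["toc"],          # went no further than contents
        "multi_chapter": len(set(chapters)) > 1,   # used more than one chapter
        "no_toc": "toc" not in files_hit,          # likely a bookmarked URL
        "chapter_hits": len(chapters),
    }

s = classify_session(["toc", "ch1", "ch3", "ch1"])
assert s["multi_chapter"] and not s["no_toc"] and s["chapter_hits"] == 3
```

Aggregating these flags across all sessions for a title yields the percentages reported in Table 17.4.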
Table 17.5, illustrating the distribution of hits on the on-line books collection per unique user over the last part of the spring 1997 semester, indicates that while many users are making quite cursory use of the on-line books, more are looking at multiple files (e.g., reference entry, chapter) in the collection. As Table 17.6 shows, the distribution of unique titles viewed by these users over this period indicates that most users come to the collection to look at a single book. The greatest number of books used by a single person was seven (by two persons).
Not surprisingly, there is a certain correlation between number of hits and number of titles used. Those users with only one hit could only have looked at one title (42% of those using one book). The range of hits among those who used only one book is wide-20 (9%) had more than 10 hits. Six users had more than 25 hits; two of them looked at only one book, one each at two and three books, and two at seven books. These statistics indicate some significant use of the collection as measured by average number of hits per title used.
However, hits on several titles need not indicate heavy use of the on-line books collection. The individual who looked at five books had a total of only six to ten hits, as did four of the seven people who looked at four books (one to two hits each). The person who looked at six books had 11 to 15 hits in total (an average of about two hits per book).
Table 17.7 shows that graduate students tended to have more hits, and undergraduates and faculty, fewer hits.
The preceding discussion highlights the current data on usage by individuals. Using newer data on sessions, we will be able to derive valuable information on user behavior: not only the number of books used and hits on those books but also the parts of the book used and repeat use. We will begin to see revealed preference in user behavior and will be less reliant on responses to questionnaires.
Data for the last half of the spring 1997 semester suggest that when a social work book available in both print and on-line formats was used in a course, the share of students using the on-line version was at most one-quarter.
Table 17.3 shows that the four most used nonreference books were all in the field of social work. Almost 91% of the users of these books were from the School of Social Work; they accounted for 98% of the hits on those books. The vast majority of these users (56 of 64) were graduate students. With the exception of the
most used book, Task Strategies, these texts were on reserve for social work courses during the spring 1997 semester.
• Three sections, with a total of about 70 students, used Supervision in Social Work as a key text. Thus, potentially, if all seven graduate students who used this book were participants in these courses, about 10% of the most likely student user group actually used this book on-line during this period.
• Three other course sections, again with about 70 students in total, used Mutual Aid Groups. This book was a major reading; in fact, one of its authors taught two of the sections in which it was used. Sixteen graduate students used this title for a potential penetration of about 23%.
• Philosophical Foundations of Social Work (as well as Qualitative Research in Social Work) was on reserve for a doctoral seminar that had an enrollment of 11 students. The instructor reported that this book was a major text in the course that students traditionally would have bought. She did not know how many of her students used the on-line version. If all eight users (seven graduate students and the one professional student) were class members, that suggests a substantial penetration for that small class. However, it is likely that some of these users were not enrolled in that course.
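The penetration estimates in these bullets are upper bounds computed from cohort users over course enrollment; a minimal sketch:

```python
def penetration(users_in_cohort, enrollment):
    """Upper-bound share of a course's students using the on-line text,
    assuming every cohort user was actually enrolled in the course."""
    return users_in_cohort / enrollment

assert round(penetration(7, 70) * 100) == 10    # Supervision in Social Work
assert round(penetration(16, 70) * 100) == 23   # Mutual Aid Groups
```

Because some users may not have been enrolled, the true penetration can only be lower than these figures.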
Location of Use of On-line Books
Scholars are not using on-line books from off-campus locations to the extent expected.
One of the key potential advantages to on-line books is their instant availability to scholars at any location at which they have access to a computer with a modem and a graphical Web browser. This benefit might well lead to substantial use of the on-line books from locations other than the Columbia campus. So far we are seeing only modest use of the books from off-campus.
From May 1996 to March 1997, 11% of the hits on the Columbia University Press nonreference books were dial-up connections from off-campus. Looking at the use of the social work titles, we find that computers in the School of Social Work were responsible for the following shares of hits on the social work titles:
Closer analysis of the usage data finds substantial use from the computer lab in the School of Social Work as well as from faculty computers. This finding suggests that many of the graduate students, most of whom do not live on or near campus, may not have Web access in their homes and hence are not yet equipped to take full advantage of the on-line books. Students who use the on-line books at the School of Social Work, however, avoid walking the several blocks to the social work library, worrying about the library's hours, or encountering nonavailability of the book in its print form. In our interviews, scholars report that the key factors constraining use of the on-line books and other Web resources from home are the expense of dialing in to campus or maintaining an Internet account, the lack of sufficiently powerful home computers and Web software, the frequency of busy signals on the dial-up lines, and the slowness of standard modems.
Students residing on campus may have Ethernet connections to the campus network, providing both speedy and virtually free access to the on-line collection.
At the end of the 1996-97 academic year, approximately 2,300 students were registered for residence hall network connections. With the exception of the three Garland reference books, a very small share of reference collection use occurs on
computers in the libraries; the Columbia community is taking advantage of the out-of-library access to these resources. For example, 42% of the hits on The Oxford English Dictionary in the ten months following May 1996 were from residence hall network connections.
However, these undergraduates have shown little interest in the nonreference books on-line. Residence hall connections accounted for only 1% of the use of the Columbia University Press titles in social work, earth and environmental science, and international relations and 3% of the use of the Oxford University Press titles in literary criticism and philosophy from May 1996 to May 1997. These small shares are not surprising given that few of these books are aimed at the undergraduate audience. The undergraduates' use of the Past Masters classical texts in social thought from Ethernet connections in their rooms is somewhat higher: 654 hits, or almost 13% of the total use of those texts from May 1996 to March 1997.
Scholars' Access to On-line Resources
We theorize that scholars with greater perceived access to networked computers and with greater familiarity with on-line resources are more likely first to sample on-line books and later to adopt them for regular use (assuming that books of interest are available on-line). All project questionnaires ask about both these factors. The access question is, "Is there a computer (in the library or elsewhere) attached to the campus network (directly or by modem) that you can use whenever you want?" The question about use of on-line resources asks, "On average this semester, how many hours per week do you spend in on-line activities (Email, Listservs & Newsgroups, CLIO Plus, Text, Image or Numeric Data Sources, Other WWWeb Uses)?" In some cases, the question asks for a single value; in others, it has five spaces in which respondents are asked to enter their hours for each of these activities.
Over 80% of Columbia library users report adequate access to a networked computer.
In the Columbia Libraries annual survey of on-site users in March 1997, 2,367 individuals responded to this question on access to networked computers. Almost 81% answered "Yes." Masters students were least likely to respond positively (67%) while the other scholarly cohorts-faculty, doctoral students, and undergraduate students-ranged from 85% to 87%. Users of science libraries were generally more likely to respond affirmatively.
Columbia library users report an average of about six hours a week in on-line activities with no significant difference across scholarly cohorts.
Even many of the survey respondents who did not claim easy access to a networked computer reported spending considerable time in on-line activities: 22% spent four to six hours a week and 23% spent more than six hours a week.
Scholars' Choice among Book Formats
Scholars' patterns of using books in their various formats and their reactions to on-line books are being tracked through a variety of surveys, individual interviews, and focus groups (see Table 17.1).
One survey involves visiting a class session for which an assigned reading was in an on-line book. A question asks which format(s) of the book the student used for this assignment. Responses were distributed as shown in Table 17.8.
In 70% of the responses for fall 1996, as seen in Table 17.8, the student had used his or her own copy of the text. The next most common method was to use a friend's copy (14%). The shares for those two modes are not significantly different in spring 1997. We are obtaining course syllabi from instructors so that, in the future, we can analyze these responses based on what portion of the book is being used in a course and whether students are expected to purchase their own copies.
Preferences for Studying Class Reading
We obtained far fewer responses (119 in fall 1996 and 88 in spring 1997) as to the preferred mode of studying. Table 17.9 shows that in both semesters, about two-thirds of respondents reported a preference for reading their own copy.
Scholars' Reactions to Book Formats and Characteristics
Scholars reporting easy access to a networked computer spend more time on-line and are more likely to prefer to use one of the forms of the on-line book.
In our in-class surveys in spring 1997, students claiming easy access to a networked computer (74% of the 209 respondents) were greater users of on-line resources overall. Only 27% of students claiming easy access reported as few as one to two hours on-line a week, while 53% of those lacking easy access had this low level of on-line activity. About 31% of the former group spent six or more hours a week on-line while 18% of the latter group did.
About 26% of the easy access group gave some form of on-line book (reading directly on-line, printout of text, or download of text and reading away from the Web) as their preferred method of reading an assignment for which an on-line version was available, while only 13% of the students lacking easy access did so.
This combination of responses suggests that, over time as members of the scholarly community obtain greater access to computers linked to the Web, on-line books will achieve greater acceptance.
Students report that they particularly value easy access to the texts that are assigned for class and an ability to underline and annotate those texts.
Students seek the ability to print out all or parts of the on-line texts that they use for their courses, again indicating their desire to have the paper copy to use in their
studying. Computer access to a needed text is not equivalent to having a paper copy (whole book or assigned portion) in one's backpack, available at any time and at any place (see Table 17.10).
The cross-tabulation of preferred method of use and reasons for that preference produces logically consistent results. For example, all the respondents who gave "Printout using non-JAKE printer" or "Download of on-line text to disk to be read away from CWeb" as their preferred method gave "Less costly" as one of their reasons, while few of those students who preferred their own copy gave that reason.
If the effective choice for completing a required reading is between borrowing a book from the library, probably on a very short-term basis from reserves, and accessing the book on-line, the student faces a parallel situation: obtaining a portable copy that can be annotated requires photocopying or printing out the reading either way. However, the on-line book's advantages are that it will never be checked out when the student wants to use it and that it will be accessible from a computer anywhere in the world at any time (as long as that computer has an Internet connection and a graphical Web browser).
In surveys and interviews, scholars indicate that they value the ability to do searches, to browse, and to quickly look up information in an on-line book.
They also like the ability to clip bits of the text and put them in an electronic research notes file. Willingness to browse and to read on-line for extended periods varies from person to person, but it does not seem to be widespread at this time.
Some scholars perceive gains in the productivity and quality of their work in using on-line books, particularly reference books.
Two key questions asked on all our questionnaires, other than those distributed in class, seek to determine the effect of on-line books on scholarly work:
1. In doing the type of work for which you used this book, do paper books or on-line books help you be more productive?
2. Do you find that you are able to do work of higher quality when you use paper books or on-line books?
The questionnaire offers a range of seven responses from "Much greater productivity (quality) with paper" through "No difference" to "Much greater productivity (quality) with on-line" plus "Cannot say."
As Table 17.11 shows, 52% of OED users felt that they were as productive or more productive using the on-line OED, while 39% of the users of the other on-line books felt that they are as productive or more productive using the on-line format. These responses are somewhat puzzling because the reference book most used on-line is The OED, suggesting that scholars do value it, and the CWeb version of the on-line OED provides as much if not more utility than does the print version (with the exception of being able to view neighboring entries at a
glance). Thus, one might expect the productivity rating for the on-line OED to be higher.
The distribution of responses to the quality of work question supports the print format in general, although 47% of OED users and 43% of the users of all the other books felt that quality was as good or better with on-line books.
Table 17.12 shows considerable correlation in the responses to these two questions-those who supported the paper version for productivity tended to support it for quality as well.
In the last part of the spring 1997 semester, only 15% of on-line book users chose to click on the button taking them to the on-line survey, but 52% of those who went to the survey responded to it.
Designing an on-line survey that is available to the reader without overt action might enhance the response rate significantly. We are working on doing that using HTML frames on the on-line books. We are also experimenting with other methods of reaching the users of the on-line books, e.g., registration of users that will bring e-mail messages about new books in their field while also enabling us to query them about their reactions to on-line books.
These preliminary results of the Online Book Evaluation Project suggest that, at this early point in its development, the on-line format is finding a place in the work patterns of scholars who have had an opportunity to try it.
Interviews and focus groups substantiate the findings from the server data and surveys. Together they suggest the following about scholars' reactions to the online format:
• It is a convenient way to access information in reference books and potentially to do textual analyses in individual books or whole databases like the OED.
• Using a search function, one can quickly determine if a book or set of books addresses a topic of interest and warrants further investigation.
• It is an easy way to browse through a book to determine whether it is worth deeper exploration or whether only a small section is pertinent to one's work. If the latter is the case, it is as easy to print out that small section of the on-line book as it is to take the typical next step of photocopying that section of the paper book.
• A scholar who wants to read and annotate only a modest section of a book, say a chapter or an essay for a course assignment, will find that accessing and printing out the section from the on-line book can be quicker than doing the equivalent with a library copy of the paper book.
• Ready access from any location at any hour and not worrying about whether the book sought is on the library shelf are valued features of the on-line format.
On the other hand, if scholars want to read much or all of a book, they are likely to prefer the traditional format. If the book is core to their research or to a course, scholars are likely to prefer to own a copy. If they cannot afford such a copy, if the book is of more passing interest, or if they cannot obtain a print copy, scholars would typically prefer to retain a library copy for the duration of their interest in the book. If they cannot do so, say because the book is on reserve, scholars must decide among their options, e.g., buying their own copy or using an on-line copy, and decide which option is next preferred.
Over the duration of this project, we will continue to add books to the on-line collection and to pursue our explorations of scholars' reactions to this format. We will look for trends in the perceived accessibility of on-line books and in the desirability of this format for various uses. We will seek to measure the frequency with which scholars read such substantial portions of books borrowed from libraries that they will continue to seek library access to paper copies. In a related effort, we will assess the extent to which libraries now satisfy scholars' desires for access to such copies. If a library did not have a book in its collection in print format but did offer on-line access, a scholar would face a different trade-off between the two formats.
At the same time we will pursue our analyses of the cost and intellectual property issues involved in scholarly communication in an effort to determine whether the on-line book format can contribute to the perpetuation of research and learning and to the dissemination and preservation of knowledge.
The Library and the University Press
Two Views of Costs and Problems in Scholarly Publishing
Susan Rosenblatt and Sandra Whisler
The conflicts becoming apparent in scholarly communication have been anticipated for almost two decades. In 1982 Patricia Battin wrote:
During the decade of the 1970's, librarians faced declining budgets, increasing volume of publication, relentless inflation, space constraints, soaring labor costs, a horrifying recognition of the enormous preservation problems in our stacks, increasing devastation of our collections by both casual and professional theft, and continuing pressure from scholars for rapid access to a growing body of literature. It is ironic that both librarians and publishers introduced computer applications into libraries and publishing houses to save the book, not to replace it. Both were looking for ways to reduce labor costs rather than for visionary attempts to redefine the process of scholarly communication.... The former coalition shattered and publishers, scholars and librarians became adversaries in a new and unprecedented struggle to survive in the new environment, each trying in his or her own way to preserve the past and each seeing the other as adversary to that objective.
Library Materials: Print
The results of the economic crisis in scholarly publishing were documented statistically in 1992 in University Libraries and Scholarly Communication. Some of the principal findings included the fact that although materials and binding expenditures remain a relatively constant percentage of total library expenses, there has been a hidden, but significant, change in the ratio of books to serials expenses. Although materials expenditures have steadily risen, the average numbers of volumes added to library collections annually continue to decline. Not only are libraries spending more and receiving fewer items in absolute terms, but also libraries are collecting
an ever smaller percentage of the world's annual output of scholarly publications. Since 1974, even increases in university press outputs have outstripped increases in library acquisition rates.
Moreover, the study documents that some of the fields experiencing the greatest increases in their share of the total output are precisely those with the highest average per-volume hardcover prices: business, law, medicine, science, and technology. According to the report, science had the highest average prices; social sciences and business experienced price increase rates closer to the GNP deflator (p. xix).
Another finding was that serials prices consistently increase faster than general inflation. Serials had an overall annual inflation rate of more than 11% from 1986 to 1990. Prices of scientific and technical journals rose at the highest rates (13.5% per year, on average, from 1970 to 1990), and the most expensive serials experienced the largest relative price increases. In contrast, book prices inflated at 7.2% per year, while the general inflation rate averaged approximately 6.1%. In some institutions, science journals could comprise only 29% of the total number of journal subscriptions but consumed 65% of the serials budget. According to the Mellon report, "three European commercial publishers (Elsevier, Pergamon, and Springer ...) accounted for 43% of the increase in serials expenditures at one university between 1986 and 1987" (p. xxi). The report does not introduce the question of the extent to which these inflation rates in the prices of scientific journals reflect increasing costs of production, expansion in content, price gouging, or the market value of the information itself, a value that might extend well beyond the university.
Brian Hawkins's 1996 study of library acquisition budgets at 89 schools finds that although budgets nearly tripled from 1981 to 1995 and increased by an average of 82% when corrected for inflation using the Consumer Price Index, the average library in the study lost 38% of its buying power. In the 15 years covered by his study, the inflation rate for acquisitions was consistently in the mid-teens. Confirming the Mellon study, he finds that the costs of some science journals increase more than 20% per year. He also notes that the trend line for average increases in library acquisition budgets is downward, accelerating the rate of decline in volumes added to collections.
Harrassowitz regularly alerts libraries to subscription pricing information so that its customers can plan in advance to adjust purchasing patterns to stay within budget. In November 1996, Harrassowitz provided firm 1997-98 subscription pricing for six publishers publishing the majority of the STM (science, technology, and medicine) journals. The announced price increases ranged from 1.2% to 22%, averaging 11%. According to Harrassowitz's analysis, libraries categorized as "General Academic/including Sci-Tech" could expect average price increases from the six publishers of almost 14%.
Peter Brueggeman from the Scripps Institution of Oceanography (SIO) Library at UCSD has discussed the problem from the perspective of a science library. During the five-year period from 1992 to 1996, journal subscription costs at SIO rose 57% but the recurring collections budget increased 2.3%. Brueggeman singles out Elsevier and Pergamon for particular analysis: "Elsevier titles had a 28% increase between 1995 and 1996 and a 32% increase between 1992 and 1993. Pergamon titles had a 29% price increase between 1995 and 1996 and a 17% price increase between 1992 and 1993."
Various authors have demonstrated that not only do the most expensive journals experience the highest rates of inflation, but they are also among the most used. Chrzastowski and Olesko found that over a period of eight years, the cost of acquiring the ten most used chemistry journals increased 159% compared to an increase of 137% for the 100 most used journals. During the same period, usage of the top ten journals increased 60% compared to an increase of 41% for the top 100 journals.
When library budgets grow more slowly than the prices of scholarly journals, the number of titles held in each library steadily declines. Because libraries generally cancel journals on the basis of use, high-use, high-inflation titles tend to be protected. This protection results in a gradual homogenization of collections among libraries: lesser-used titles, many with low prices and low inflation rates, are crowded out faster than the general rate of decline in library subscriptions.
Figure 18.1 illustrates a hypothetical scenario. This scenario assumes that the collections budget is inflated by 4% per year. However, the average rate of inflation in the cost of scholarly publications is greater. The graph shows that if science journals, because they demonstrate high usage patterns, are canceled more slowly than other titles, then science journals will eventually crowd out other journals. In the example, the budget for science journals is allowed to inflate at approximately 8% per year (slightly less than one-half the actual inflation rate, but twice the rate of inflation in the total serials budget). Other, lesser-used journals, with lower subscription prices and lower rates of inflation, therefore must be canceled more rapidly in order for the collections budget to be balanced. Within a few years, the crowding-out effect from protection of high-use/high-price/high-inflation journals is quite noticeable. While no particular library may implement a budget strategy exactly like that depicted, all libraries tend to retain subscriptions to the highest use journals and to cancel first the lesser-used journals. Although the curve may change as the time line lengthens or shortens, the eventual result will be similar to that shown.
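The arithmetic of this scenario can be sketched directly. The figures below (a $1 million serials budget, a 50% initial science share) are invented for illustration; only the two growth rates come from the scenario in the text.

```python
# Illustrative sketch of the budget scenario behind Figure 18.1.
# The starting budget and the 50% science share are hypothetical
# assumptions; the 4% and 8% growth rates come from the text.

def crowd_out(years=10, budget=1_000_000.0, science_share=0.5,
              budget_growth=0.04, science_growth=0.08):
    """Return, for each year, the share of the serials budget left for
    non-science titles when protected science titles inflate at 8%
    while the total collections budget grows at only 4%."""
    science = budget * science_share
    shares = []
    for _ in range(years):
        budget *= 1 + budget_growth    # whole budget inflates at 4%
        science *= 1 + science_growth  # protected science titles at 8%
        shares.append((budget - science) / budget)
    return shares

shares = crowd_out()  # share left for lesser-used journals, years 1-10
```

With these assumptions, the residual share available for lesser-used journals falls from roughly 48% in the first year to about 27% in the tenth, the crowding-out effect the figure depicts.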
Library Materials: Electronic
As yet, there is no evidence that the emergence of electronic journals will ease the fundamental economic problems in the cycle of scholarly communication. The basic premise of publishers is that they must protect their current revenue base and secure guarantees to cover future inflation and increases in content.
Thus, publishers frequently base their initial subscription pricing for digital journals on the actual cost of the paper subscriptions held by the institution with which the publisher is negotiating. The proposed base subscription rate may include all subscriptions: library, departmental, personal, and other types identified with the campus, thereby greatly increasing the price that the library would have to pay to receive the digital journals. Clearly, publishers are concerned that availability of electronic journals on the campus network will undermine nonlibrary subscriptions to the print versions.
In early 1996, Ann Okerson reported that
In general electronic licenses so far have cost on average 1/3 more than print equivalents.... Publishers are setting surcharges of as much as 35% on electronic journals, and libraries simply do not have the capacity to pay such monies without canceling a corresponding number of the journals of that particular publisher or dipping into other publishers' journals.
Moreover, during license negotiations for certain electronic journals, libraries may be asked to consent to such provisos as the following:
1. That there be multiyear price increase guarantees to compensate for inflation, often at somewhat lower rates than historical rates of price increase for print materials
2. That there be upward price adjustments for increases in content, often capped at lower rates than typical for print journals
3. That the publisher be protected against declines in revenue through cancellation
4. That fair use rights typical for print journals be abrogated for the digital journals
Maintaining a combination of print and electronic subscriptions for a multiyear period without incurring substantial new marginal costs for electronic versions and ensuring a cap on inflation are attractive to libraries. But neither "feature" of these new licenses will alter the basic economic difficulty facing libraries: inflation in the price of scholarly information outstrips libraries' ability to pay. In fact, by being locked into multiyear agreements that ensure price increases to particular publishers, libraries hasten the rate at which other journals and monographs are crowded out of the market.
Not all scientific publishers have negotiated as described above. For example, the American Physical Society and the American Mathematical Society offer electronic versions of their journals free to subscribers to the print editions. Clearly, publishers must find revenue streams that will enable them to survive, and the pricing structures for both print and digital journals are the key to those revenue streams. To base a pricing structure for electronic publishing on the costly print model will not be economically viable in the long run (it may, in fact, be unsustainable in the short term as well). Libraries' declining budgets will result inevitably in cancellations to avoid the structural problems associated with double-digit inflation, thereby undermining the publishers as well.
The current economic model for scholarly publication cannot be sustained. Continued escalation in the prices for scholarly journals, stagnation in library budgets, and isolation of the creators and consumers of scholarly information (the faculty) from the effects of the economy could lead to the collapse of the system of scholarly communication itself.
Operations Costs in Libraries
Library operations costs associated with printed scholarly journals include the costs to acquire, process, preserve, and circulate journals. Each library's costs differ based on the organizational structure, degree of centralization or decentralization of processes, differentials in salary and benefit scales, effectiveness of automated systems, success at reengineering internal operations, and other factors.
University Libraries and Scholarly Communication reports that "salaries as a percentage of total library expenditures have declined over the past two decades, while other operating expenditures (heavily reflecting computerization) have risen
markedly" (p. xxii). The report infers that the increases in other operating expenditures reflect automation of technical service operations such as acquisition, cataloging, serials control, and circulation. It also notes that despite the decline in salaries as a percentage of total library expenses and the increase in other expenditures, "the number of volumes added per staff member has declined" (p. xxii), implying that there has not been a measurable staff efficiency gain from the investments in automation. In fact, on average, library staff increased by a total of 7% from 1970 to 1985 and by 6% from 1985 to 1991. Thus the rise in the nonsalary operations portion of the total operating expenses did not occur through staff reductions but rather as a result of budget augmentation for nonsalary items.
Presumably, greater efficiency in processing and circulation coupled with declining acquisitions should have resulted either in staff reductions or in substantial shifts of personnel away from the back room of technical processing into service to faculty and students, but it is not possible to discern from ARL statistics whether this is so. The ARL did not report service transactions until 1991, so one cannot discern changes in user demand for the earlier periods. Between 1991 and 1996, the ARL reports steady increases in interlibrary borrowing, library instruction, reference transactions, and total circulation. During the same period, total staff declined by 2%.
The inability to learn from the ARL reports or other reliable studies how libraries might be changing their staff allocation among operations and services reflects a serious flaw common to almost all analyses of library costs relating to both collections and operations. It is not obvious to what extent nonsalary investments, for example in automated systems, have actually improved processing productivity or the quality of services rendered by staff; nor is it clear whether or to what degree these investments have moderated the rate of rise of operations costs.
Library rankings typically reflect inputs such as budget, volumes acquired, number of serial subscriptions maintained, size of staff; or operational statistics such as the number of circulation transactions, titles cataloged, hours of opening, or items borrowed through interlibrary services. Ironically, the ARL Index ranks research libraries in part on the number of staff they employ; improving productivity and reducing staff accordingly would have the paradoxical effect of reducing a library's ranking vis-à-vis its peers. Developing measurements of service outcomes related to the mission of the institution would be more helpful as comparative data. For example, how do a library's collections and services contribute to research quality, faculty productivity, or student learning? The problem of defining productivity of knowledge workers was mentioned 30 years ago by Peter Drucker and is further examined by Manuel Castells in his recent book The Rise of the Network Society. Library operations represent a clear example of this productivity paradox.
William Massy and Robert Zemsky, discussing the use of information technology to enhance academic productivity in general, remark on its transformational potential, calling it a "modern industrial revolution for the university" that can
create economies of scale, deliver broad information access at low marginal cost, and allow for mass customization. The analysis they provide for the academy at large would appear to be even more pertinent to libraries, many of whose functions are of a processing nature similar to those in industry and whose services can also be generalized to a greater degree than is possible for teaching and research.
Massy and Zemsky suggest that although capital investments in technology to enhance productivity will increase the ratio of capital cost to labor cost, they may not actually reduce overall costs. But the writers argue that those investments will save money in the long term, because labor costs rise over time with productivity gains while technology costs decline.
The primary purposes of automating processing operations in libraries have been to reduce the rate of rise of labor costs and to improve timeliness and accuracy of information. From the point of view of faculty and students, the service improvements are the most important result of automation. For example, on-line catalogs and automated circulation services provide users with more rapid access to information about the library's collections, reduce errors in borrowing records, and support more timely inventory control. Use of on-line indexing and abstracting services rather than the print versions preserves the scarce time of scholars and effectively extends the library's walls throughout the network.
Despite the efficiencies that automation has brought, labor costs to perform library processing operations such as ordering and receiving, cataloging, maintenance of the physical inventory, and certain user services including interlibrary lending and borrowing remain substantial. A transition to electronic publishing of journals (accompanied by the elimination of print subscriptions) could enable libraries to reduce or eliminate many of the costs of these labor-intensive operations. The freed-up resources might then be moved into higher priority services, necessary capital investments in technology, or provision of technology-based information resources. The benefits to end users could also be significant: less time spent in finding and retrieving physical issues of journals.
In the very long term, restructuring of library operations in response to electronic scholarly publishing could, in theory, result in major improvements to the quality of library services and also reduce operations costs. However, to maximize operations costs reductions, libraries will need to define better the desired outcomes of their operations investments, measure those outcomes effectively, and engage in rigorous reengineering of processes.
Several studies have attempted to quantify typical costs of acquiring journals. In a study funded by CLR (The Council on Library Resources), Bruce Kingma found the average fixed cost of purchasing a journal subscription to be $62.96. In discussing the economics of JSTOR, Bowen estimates the costs of processing, check-in, and binding to be approximately $40.00. In 1996, the library of the University of California, Berkeley estimated the physical processing costs, including check-in of individual issues, bindery preparation, and binding, for print serial subscriptions received and housed in the main library to be as low as $17.47 for
a quarterly journal and up to $113.08 for a weekly journal. Berkeley's figures exclude the costs of ordering and order maintenance under the assumption that those costs will not differ significantly for electronic journals. The figures also exclude staff benefit costs and overhead and therefore understate the true cost to the university of receiving print subscriptions. Assuming an average annual processing cost of $50.00 per print serial subscription, a research library subscribing to 50,000 titles may incur an operations cost of $2.5 million per year simply to acquire journals.
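The $2.5 million estimate follows directly from the assumptions stated above; a one-line check:

```python
# The chapter's estimate: an assumed average processing cost per print
# subscription, multiplied by the number of titles a research library holds.
avg_processing_cost = 50.00   # assumed average annual cost per subscription
subscriptions = 50_000        # titles held by the hypothetical library

annual_processing_cost = avg_processing_cost * subscriptions
```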
Once the library acquires these journals, it begins to incur the costs of making them available to students, faculty, and other users. In the late 1980s, Michael Cooper reviewed the costs of alternative book storage strategies. He found that circulation costs ranged from a low of $.53 per transaction in a medium-sized, open-stack research library to a high of $9.36 per transaction from a remote storage facility. Adjusted for inflation of 3% per year, these costs would range from approximately $.67 to $11.86 per transaction today. Berkeley calculates that an average circulation transaction costs approximately $1.07, and Bowen's estimate is $1.00. According to ARL Statistics, 1995-96, the mean number of initial circulations per library was 452,428. Using a circulation transaction cost of $1.00, the average ARL library spent almost $500,000 to circulate materials during the fiscal year 1995/96.
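Cooper's per-transaction figures and the ARL circulation total can be checked with the same arithmetic the text uses. The 8-year compounding period (late 1980s to the mid-1990s) is an assumption consistent with, but not stated in, the text.

```python
# Reproducing the circulation-cost figures cited above. The 8-year
# compounding period is an assumption about "today" relative to Cooper's
# late-1980s study.

def inflate(cost, rate=0.03, years=8):
    """Compound a per-transaction cost forward at a flat annual rate."""
    return cost * (1 + rate) ** years

low_today = inflate(0.53)    # open-stack research library
high_today = inflate(9.36)   # remote storage facility

# Average ARL library: mean initial circulations at $1.00 per transaction.
annual_circulation_cost = 452_428 * 1.00
```

The adjusted range comes to $0.67 and $11.86 per transaction, matching the figures in the text, and the circulation total of roughly $452,000 is what the text rounds up to "almost $500,000."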
A review of the costs of acquiring and circulating print journals indicates that a transition from print to electronic journals would eventually reduce annual library operations costs related to providing the university community with the fruits of recent scholarship, but it is not clear how much of these savings might be offset by costs of technology infrastructure and equipment replacement. Large recurring expenses in support of historical print collections would continue but gradually diminish over time as the aging of the collection reduced the rate of usage. The long-term cost reductions could be substantial in the sciences where currency of information is of utmost importance. Costs associated with traditional operations and physical facilities might be more rapidly reduced were high-use print collections converted to digital form. Ultimately, the shift from labor-intensive processing operations to capital investments in electronic content (current journals and retrospective conversion of high-use print collections) might bring about the kinds of effects envisioned by Massy and Zemsky.
However, caution must be exercised in forecasting these types of savings. Despite the potential for long-range cost reductions, savings are unlikely to occur to any significant degree in the short term. The pace of transition from print to digital journals is moving slowly, and only those publishers with a strong financial base will be likely to succeed in quickly providing on-line access. As noted above and in the section of this paper relating to publishers' cost structures, there is no clearly viable economic path to move from print to digital publishing. Moreover, because user acceptance of digital journals may not occur rapidly and because of the many uncertainties about archiving digital information, libraries will need to
maintain print collections, both historical and prospective, into the foreseeable future, requiring continued investment in operations and facilities.
Interlibrary borrowing and lending is a growing cost within research libraries, and its rate of increase promises to accelerate as inflation drives more serials cancellations. According to the ARL, faculty and students borrowed more than twice as many items through interlibrary loan in 1996 as they did in 1986. The University of California Libraries recently reported an annual increase approaching 10% per year. Interlibrary services are labor-intensive operations; in 1993, the ARL conducted a cost study that determined the average cost of a borrowing transaction to be $18.62 and that of a lending transaction to be $10.93. The average ARL university library processed 17,804 interlibrary borrowing transactions and 33,397 interlibrary lending transactions during 1995-96, incurring an average annual cost of approximately $700,000. Given the rate of rise of interlibrary resource sharing transactions as well as the rate of rise of labor costs, research libraries are likely to experience increases in interlibrary borrowing and lending costs of about 10% per year. The rate of rise of interlibrary lending costs could be reduced through use of on-line journals rather than print journals; but if traditional print-based fair use practices are abrogated in the on-line environment, publishers might create pay-per-view contracts that would actually increase costs beyond the costs of manual interlibrary loans. Thus there are unknown cost implications in interlibrary resource sharing of digital information.
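The "approximately $700,000" figure follows from the 1993 ARL unit costs applied to the average transaction volumes:

```python
# Reproducing the interlibrary loan estimate from the 1993 ARL cost study
# applied to 1995-96 average ARL transaction volumes.
borrow_n, borrow_unit = 17_804, 18.62   # borrowing transactions, $ each
lend_n, lend_unit = 33_397, 10.93       # lending transactions, $ each

ill_total = borrow_n * borrow_unit + lend_n * lend_unit
```

The total comes to about $696,500, which the text rounds to approximately $700,000.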
Capital assets in libraries are of three basic types: buildings, collections, and equipment. Expenditures for the most costly of these assets, buildings, are typically not a part of library budgets and therefore are not generally considered by librarians in their discussions of library costs. This paper will not attempt to discuss capital costs for library buildings in any depth except to cite several relevant studies. In the late 1980s Cooper estimated the construction cost per volume housed in an on-campus open-stack library to range from $4.33 for compact shelving to $15.84 for traditional open stacks; he calculated the construction cost per volume of a remote regional storage facility to be $2.78. Bowen uses Cooper's construction costs, adjusted for inflation, and Malcolm Getz's life cycle estimates to calculate an annual storage cost of $3.07 per volume. Lemberg's research substantiates Bowen's premises regarding the capital cost that might be avoided through digitization of high-use materials. He demonstrates that, even considering the capital costs of technology necessary to realize a digital document access system, research libraries as a system could accrue substantial savings over time if documents are stored and delivered electronically rather than in print form.
Extrapolating from Bowen's estimate of an annual storage cost of $3.07 per volume, a research library subscribing to 50,000 journal titles per year, each of which constitutes one volume, accrues $153,500 in new storage costs each year.
Over 10 years the cumulative cost to house the volumes received through the 50,000 subscriptions would exceed $8 million.
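The "exceed $8 million" figure follows if each year's intake of volumes continues to incur the $3.07 annual storage charge in every subsequent year; a sketch of that arithmetic:

```python
# Bowen's per-volume storage cost applied to a 50,000-title library.
per_volume = 3.07             # annual storage cost per volume (Bowen)
new_volumes_per_year = 50_000

annual_increment = per_volume * new_volumes_per_year  # new cost added yearly

# Each year's cohort of volumes keeps incurring its storage cost in every
# later year, so over 10 years the outlay is 1 + 2 + ... + 10 = 55 increments.
years = 10
cumulative = annual_increment * years * (years + 1) / 2
```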
The growing dependence on information technologies to deliver scholarly information requires that universities make new investments in capital equipment and allocate recurring operations resources to the maintenance of that equipment and the network infrastructure. Although universities have invested heavily in network technologies, the true costs are still inadequately understood, and it is clear that increasing dependence on digital, rather than print, scholarly information will require that reliable funding for technology be developed. While capital costs for print libraries entail buildings, whose construction costs fall within known ranges and whose life cycle is long, and collections, the long-term costs of which can be rather reliably estimated, capital costs for the digital library are distributed across the campus and indeed the world. As yet, there is no clear formula to indicate how much initial capital investment in technology might be required to deliver a given number of digital documents to a given size academic community. Moreover, the life cycle for capital assets relating to delivery of digital library content is typically very short, perhaps shorter than five years. Thus funding allocations must be made frequently and regularly to ensure continued access to digital information. The Berkeley library, for example, estimates that annual equipment replacement costs for the existing installed base of equipment would be approximately $650,000, assuming a five-year life cycle. But the library has never had an explicit budget to support that expense, so investments in computer equipment, networking, and equipment replacement have been made through periodic redirection of the operating budget. Similar technology funding and renewal problems exist across the campus. 
Berkeley's situation is not unusual, and further work needs to be done to understand more fully the capital cost differentials between the physical plant investments required for print collections and the network investments required to make digital information available to the campus community.
It is possible that if libraries and their parent institutions, universities, could avoid some of the capital and operations costs associated with print-based dissemination of scholarly publications, these resources could be reallocated to capital investments in technology, provision of additional information resources available to the academic community, service improvements within libraries, and restoration of control of the system of scholarly publishing to universities and scholarly societies rather than the commercial sector.
The Economics of Electronic Publishing: A View from the University of California Press
The market realities described in the first portion of this paper are sobering, but the basic outlines have been well known to libraries and scholarly publishers for more than a decade. This section discusses the realities for nonprofit journal publishers (university presses and scholarly societies) as a way of answering the question "So why don't publishers just reduce their prices, at least for electronic publications?" Although the focus is on nonprofit presses, the basic economics are equally true for commercial publishers, except that they require profits and have the considerable advantage of greater access to capital to fund innovation.
The largest constraint on all publishers' ability to radically change the price structure for electronic publications is the first-copy costs, which commonly range from 70% to 85% of the print price (see Table 18.1 for an example of first-copy costs for University of California Press journals).
These first-copy costs will remain whether the format is electronic, paper, or both. Any pricing model must provide sufficient income to cover these costs in addition to the unique costs associated with publishing in any particular medium. Publishers are not wedded to maintaining print revenues per se but to maintaining enough revenues to cover their first-copy and unique-format costs and to cover the costs of the technological shift. In the transition period, when both print and electronic editions must be produced, this objective will inevitably result in prices that are higher than print-only prices. Whether wholly electronic publications are, in the long run, more economical will depend on the costs of producing a uniquely electronic product and on the size of the market. If substantially fewer libraries subscribe to electronic publications than subscribed to their print predecessors, the cost per subscription will inevitably increase in order to cover a larger share of first-copy costs.
Electronic Pricing Models
There are a number of models for pricing electronic resources, but all of them boil down to various ways of obtaining revenue to cover the same set of costs. They all ultimately depend on the same formula of first-copy costs plus print costs plus electronic costs. Table 18.2 illustrates this formula for humanities journal x.
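The dependence of the per-subscription price on the size of the subscriber base can be made explicit. All figures below are invented for illustration; only the structure of the formula comes from the text.

```python
# Hypothetical illustration of the shared pricing formula: each
# subscription must carry its share of first-copy costs plus the unique
# per-copy costs of its format. All numbers below are invented.

def subscription_price(first_copy, subscribers, format_unit_cost):
    """Break-even price per subscription for a given subscriber base."""
    return first_copy / subscribers + format_unit_cost

# Shrinking the subscriber base raises each subscriber's share of
# first-copy costs, so the break-even price must rise.
price_1000_subs = subscription_price(100_000, 1_000, 20.0)
price_500_subs = subscription_price(100_000, 500, 20.0)
```

With these invented figures, the break-even price jumps from $120 to $220 when the subscriber base halves, the effect described at the end of the preceding section.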
Electronic Access Provided "Free" Publishers that are providing electronic access "free" with print subscriptions are, in fact, subsidizing the costs of the electronic edition out of the surplus revenues generated by the print publication; the print publication already covers the first-copy costs allocated to each subscription. For relatively high-priced scientific journals with high first-copy costs, this subsidization can be done without inflating the price too substantially; the uniquely electronic costs are then subsidized by all institutional subscribers and hidden as a percentage of the total cost of publication. Because the basic subscription price is high enough, relatively modest additional increases will also cover the cost of lost individual subscriptions (since individual subscriptions typically cover the run-on costs of producing additional issues but make only a partial contribution to first-copy costs). This approach has the added advantage of sidestepping for now the problems of negotiating prices and guarantees with libraries (and the associated
overhead costs). However, it does not contribute to developing commonly understood and agreed upon cost recovery models that will cover the costs of electronic scholarly communication in the long run.
Extra Charge for Electronic Access, Bundled with Paper An electronic edition that is provided with a print subscription for an extra charge is essentially the same as the previous cost recovery model, but the increase to cover electronic costs is made explicit. This approach may be especially necessary for journals whose base rate is low and whose markup for electronic costs cannot be covered by a typical inflationary increase. This model still has the advantage, for publishers, of spreading the cost over all institutional subscribers and of simplifying licensing negotiations.
Negotiated Price by Library Based on Paper Subscription Base Some publishers take the basic institutional print subscription base and guarantee this revenue for a period of years (typically three). Publishers are willing to guarantee limits to inflationary increases for this period in exchange for the guaranteed income and protection from cancellations to help cover transition costs. Again, this approach works better with higher priced journals for which the added costs of electronic publishing are a smaller proportion of the total cost.
Separate Price and Availability for Electronic and Paper, with an Incentive for Bundling Offering paper and electronic editions separately but with an incentive for bundling is the method deployed by SCAN and by Project MUSE. This model offers more flexibility to libraries, because libraries are allowed to cancel print and take only electronic or to select among the publications offered. Discount incentives encourage maintaining paper and electronic subscriptions (a strategy used by both projects) and ordering larger groups of journals (the entire list for MUSE; discipline clusters for SCAN). The advantage of this approach is that the costs of electronic publishing are made clear. (See the revenues section below for a discussion of the adequacy of this model for supporting humanities publishing in the long run and of the impact of consortia discounts.)
In all these models, the ultimate economic effect in the transition period is the same: costs for libraries go up. Publishers must cover their first-copy costs; continue to provide paper editions for individuals, many libraries, and international markets; and generate revenue to cover the infrastructure and overhead costs of electronic innovation. For nonprofit publishers, at least, these costs must all be supported by the revenues from current journal subscriptions.
It is likely, in the long run, that eliminating print editions entirely will reduce costs somewhat for some kinds of journals. However, for journals that are trying fully to exploit the new capabilities offered by electronic technologies, it seems likely that the additional costs of generating links, specialized formats, and so on will continue to equal, or nearly equal, the cost of printing and binding. But even for simpler humanities journals, the experience at the University of California Press raises questions about the assumption that ongoing electronic costs will be substantially lower.
Covering Costs of Development The University of California Press's original economic model assumed that the development costs were largely one-time expenses and that there was a single learning curve and set of skills to master, after which electronic publishing would be largely routinized; additional expenses would be easily absorbed by the margin generated by the savings in the paper edition. On the basis of the past three years, it seems apparent that this assumption was flawed. UC Press dedicated 3,500 staff hours to the SCAN project in 1994 (gopher site development); 4,100 hours in 1995 (WWW site development); and 3,700 hours in 1996 (largely on WWW development and on laying the groundwork for SGML implementation). It is apparent from ongoing trends in technological innovation that Internet technology and expectations for electronic publishing will continue to evolve very rapidly for at least the next 20 years. The Press's "bad luck" in initially developing for an outmoded platform (gopher) is an inevitable occurrence over the long term for electronic publishing projects. As a result, it seems foolhardy to assume that substantially less investment will be necessary for technical research, experimentation, and site redesign and revision in the future. Any viable economic model for the University of California Press must thus assume one or two technical FTE positions as part of ongoing overhead. (Please note that these positions will not include file server maintenance and enhancement, since the
costs of file service for the SCAN project are presently borne by University of California/Berkeley Library.)
The SCAN project has experienced ongoing instability in technical staff at the library and at UC Press. Being located in a region with such a strong high-technology industry has actually proven to be a disadvantage, since current and potential employees can make so much more money at other jobs. This situation results in long staff vacancies and repeated training on the specifics of the project. In this way, the project again faces not one but rather a continual series of learning curves.
There is a third implication to this vision of a constantly evolving future. Continually changing functionality and platforms, combined with the Press's commitment to archiving and to long-term responsibility for viable electronic access, demand implementation of a coding system that is totally portable and highly functional. As a result, the commitment to SGML seems increasingly wise as time goes on. This commitment leads the Press to reject image-based solutions like Acrobat, which would require less work and would be faster to implement but which do not have long-term migration paths. Having experienced the painful process of completely recoding individual files, the Press does not want to face the same problem with a much larger set of files in the future. The necessity and the difficulty of repeated conversions of legacy text are sadly underestimated by many publishers and librarians. Scalability, an important and underrated issue in any case, becomes even more vital in a scenario in which larger and larger amounts of material must be converted each time the technological environment evolves.
Electronic publishing is adding new duties (and requiring new resources) within the Press, without removing present duties. For example, the Press has added 0.5 FTE in the journals production staff (a 25% increase) to handle liaison with suppliers, scanning and archiving of all images being published, archiving of electronic files, and routine file conversion duties. This position will clearly grow into a full-time position as all the journals are mounted on-line; only the slowness of the on-line implementation permits the luxury of this low staffing level. The seven people working on Project MUSE and the seven working on The Astrophysical Journal Electronic Edition confirm this assumption. In addition, clearing electronic rights for images in already published books and journals and maintaining an ongoing rights database creates a new staff responsibility, since many rights holders are requiring renewal of rights and payments every five to ten years. The need for technical customer support is still unknown but surely represents some portion of an FTE.
Marketing is another area requiring addition of new expertise and staff. Successfully selling electronic product requires a series of changes within the publishing organization. The marketing necessary to launch a new print journal successfully or to sell a book is expensive and time-consuming, but the approaches and tasks are familiar and can be performed by existing marketing staff as part of their
existing marketing jobs. In contrast, successfully establishing a customer base of licensed libraries for electronic product requires new skills and abilities, a substantial staff commitment, a higher level of staff expertise and authority, and substantial involvement from the licensing libraries. Marketing electronic services requires all the brochures and ads that print publications do. In addition, it requires substantial publicity efforts, a travel schedule to perform demonstrations at a wide range of library and end user meetings, and participation in appropriate LISTSERVs. There must be at least one staff member who has the requisite knowledge and authority and who can dedicate a large portion of time to outreach, negotiations, and liaison with potential and actual license customers and subscription agents. There are also demands for ongoing customer relations work, including the provision of quarterly or annual use reporting. The Press has found it very difficult to fit those new functions into its traditional marketing and distribution job descriptions and workloads. As the Press moves more seriously into electronic publication of current season books, it will surely need to hire a new person to market on-line books; these functions cannot possibly be integrated into the already busy jobs of books marketing professionals with their focus on current season bookstore sales.
In short, the Press anticipates a permanent addition of at least three or four full-time staff to the overhead of the publishing operation. For now, some of these positions are covered by the Mellon Foundation grant, and some of them have been deferred (to the detriment of the project), but in the long run the electronic publishing model must absorb this additional $200,000 in annual costs.
Finally, UC Press and the UC Library have just begun to step up to the costs of long-term archiving, including periodic refreshing of technology and the requisite reconversion of files, which is another argument for structured, standardized coding of text.
Income for Electronic Product
Unfortunately, in a period when electronic publishing generates additional costs that must be funded, several trends apparent in the emerging purchase patterns of electronic products limit the income available to support publication costs and create further pressures on publishers to increase prices.
Slowness to Adopt University presses attempting to sell electronic product directly (as opposed to bundling it automatically in the paper price and offering "free" access) are finding that sales to universities are progressing more slowly than projected. Project MUSE sales, for example, are at 378 after two years; sales to MIT's electronic-only journals hover at around 100; in no case are there more than 50 library subscriptions. There are under 25 subscriptions to the online edition of The Cigarette Papers at the University of California/San Francisco Library's Brown and Williamson Tobacco site after nine months (http://www.library.ucsf.edu/tobacco/cigpapers/). Sales to SCAN are a handful (although access
has been restricted for less than one month at the time this paper is written). Even for publications for which no additional charge is being made, library adoptions are slow in coming. The Astrophysical Journal Electronic Edition, for example, has 130 libraries licensed to date. There are, of course, good reasons for this slowness: libraries face the same difficulties in building infrastructure, funding, and staff expertise that publishers do. But the low sales nevertheless make funding the transition more difficult, because publishers can't count on sales income from the electronic product to help to cover the costs of electronic publication. The growth curves to which publishers are accustomed from launching paper journals (even in this age of low library adoptions) are too optimistic when applied to electronic publications. This slowness has real consequences for funding electronic innovation.
New Discount Structures Emerging business practices and discount expectations lessen the income per subscribing institution (at the same time that the efforts necessary to obtain that subscription are intensified). The expectations of consortia for deep discounting (both for number of consortia members and for adopting a bundle of publications) can go as high as 40% for academic institutions, with nontraditional markets receiving even deeper discounts. If 70-85% of the list price represents first-copy costs, a 40% discount means that these subscriptions are no longer carrying their full share of the first-copy costs. Deep discounting cannot be a long-term pricing strategy.
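The discount arithmetic above can be made concrete with a brief numeric sketch. All figures below are hypothetical, chosen only to match the ranges given in the text (first-copy costs at 70-85% of list price, consortium discounts up to 40%), and the function name is illustrative, not part of any real pricing system.

```python
# Illustrative sketch of the consortium-discount arithmetic described above.
# All figures are hypothetical, within the ranges given in the text.

def covers_first_copy(list_price, first_copy_share, discount):
    """Return True if the discounted subscription price still covers
    the first-copy costs embedded in the list price."""
    revenue = list_price * (1 - discount)
    first_copy_cost = list_price * first_copy_share
    return revenue >= first_copy_cost

# A subscription whose first-copy costs are 75% of list price,
# sold to a consortium at a 40% discount, no longer carries its share:
print(covers_first_copy(100.0, 0.75, 0.40))  # False: 60 < 75
```

The sketch shows why deep discounting cannot be a long-term strategy: once the discount exceeds the non-first-copy share of the price, every such subscription is sold below its fixed-cost contribution.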
In addition, other consortial demands (for example, demands that inflationary increases not exceed a certain percentage for several years or that access be provided to high schools free of charge) further lessen the ability of publishers to fund electronic innovation out of electronic product sales. Again, it is easy to empathize with these library positions and to understand why they are evolving. But these efforts by libraries to control costs actually exert inflationary pressure on overall prices, since the base price must increase to make up the losses.
Loss of Subscriptions Publishers are also worried about losing subscriptions. Some losses will surely happen. At a minimum, subscriptions will be reduced by another major wave (or waves) of cancellations as libraries try to cope with the ongoing costs of paper and electronic subscriptions from the major commercial science publishers and by the loss of any duplicate subscriptions still remaining on campuses. In addition, publishers are haunted by the potential for substantial shrinkage of individual subscriptions or society memberships as more and more scholars have "free" access from their campuses, though loss of individual subscriptions is less sure than library cancellations. (By December 1996, almost 60% of SCAN uses were coming from U.S. non-.edu addresses as more and more people obtain access from home workstations; it is possible that individuals will pay for the convenience of noncampus access, just as they now do for nonlibrary print access.) Nevertheless, because individual subscriptions play an increasingly important role in financing many journals (especially journals launched within the past
ten years, when library support has been so eroded), widespread cancellation would have a substantial impact that would force journal prices higher.
Possible Increases in Sales Two possible new revenue sources may somewhat balance the losses in income described above, although both are highly speculative at this point. First, publishers may obtain new institutional markets and wider distribution as consortia bring institutions like junior colleges and high schools to scholarly publications. Project MUSE has begun to see this trend. It is not clear, however, that these customers will be long-term subscribers. Given the present nature of scholarship, many of these new subscribers may conclude that any amount of money is too much to pay after two or three years of low use statistics, especially when on-demand access by article becomes widely available. There will be a substantial market for scholarship at junior college, high school, and public libraries only when the possibility of wider audiences through the Internet fundamentally changes the ways in which scholars write and present their work, a change that will surely take many years to materialize. Other publishers are more optimistic about this potential source of income.
Second, a substantial revenue stream may exist in sale of individual chapters and articles to scholars whose institutions do not have access, who do not have an institutional base, or who are willing to pay a few dollars for the convenience of immediate access at their workstations (people who are now presumably asking their research assistants to make photocopies in the stacks). And there may be substantial sales among the general public. This new product may represent enough income to relieve some of the pressure on journal finances, if the process can be entirely automated (at $6 or $7 per article, there is no room for the cost of an employee ever touching the transaction). This solution needs substantial traffic, because it takes seven or eight article sales to cover the first-copy costs of one typical humanities subscription.
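The break-even claim above (seven or eight article sales to cover the first-copy costs of one typical humanities subscription) can be checked with a small sketch. The subscription price and the function name below are hypothetical, chosen to be consistent with the $6-7 per-article figure in the text.

```python
# Hypothetical break-even sketch for on-demand article sales,
# consistent with the figures mentioned in the text.
import math

def articles_to_replace_subscription(sub_price, first_copy_share, article_price):
    """Whole number of article sales needed to recover one canceled
    subscription's first-copy costs."""
    first_copy_cost = sub_price * first_copy_share
    return math.ceil(first_copy_cost / article_price)

# A $60 humanities subscription with 80% first-copy costs, sold at $6 per article:
print(articles_to_replace_subscription(60.0, 0.80, 6.0))  # 8
```

The arithmetic also makes plain why the transaction must be entirely automated: at these per-article prices there is no margin left for staff time on any individual sale.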
Of course, the ability to purchase single chapters or articles will also diminish subscription revenues, as some libraries choose to fill user needs on demand and to cancel their present subscriptions. It is too soon to tell what the mix of new audiences and subscription cancellations will be, and whether the revenue stream from new sources will replace that from canceled subscriptions.
Aggregators So far, the models we have examined have all assumed that the publisher is providing access to electronic resources. Publishers could, of course, avoid many of these costs by providing electronic files to aggregators and leaving marketing, file service, file conversion, and archiving to outside suppliers who would provide a single point of purchase for libraries and individuals. This scheme offers a number of advantages from a library point of view. The instant connection between search engine and ordering ability offered by the larger services like UnCover and OCLC may bring more end users.
But from a publishing point of view, this model has two very large disadvantages. The first is strategic. In an electronic world, one of the major values that
publishers have to offer is the branding value of our imprints as symbols of excellence resulting from peer review and gatekeeping functions, which will be ever more valuable in the time-starved world of the Internet. This brand identity is inevitably diluted in an aggregated world, especially if the aggregator is handling marketing and distribution.
Second, and more relevant to the discussion at hand, it is hard to see how the royalties most typically offered by aggregators (for institutional licenses or for on-demand use) can begin to replace the revenue lost from direct subscriptions. A 30-40% royalty does not cover first-copy costs of 80%. Only by retaining the entire fee can publishers hope to generate enough revenue for on-demand sales to make a sufficient contribution to the costs of publication. A wide-scale move to aggregation would have the effect of making the first-copy costs for the few remaining subscriptions very large indeed, in addition to reducing the perceived value of what we sell (yes, it is possible for a humanities quarterly to cost $1,200 annually!).
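The royalty shortfall described above can be sketched in the same way as the discount arithmetic. The figures are hypothetical, drawn from the ranges in the text (a 30-40% royalty against first-copy costs of 80% of list price), and the function name is illustrative.

```python
# Sketch of the aggregator-royalty shortfall described above;
# all figures are hypothetical, within the ranges given in the text.

def royalty_shortfall(list_price, first_copy_share, royalty_rate):
    """Amount by which an aggregator royalty falls short of
    first-copy costs, per subscription-equivalent of list price."""
    royalty = list_price * royalty_rate
    first_copy_cost = list_price * first_copy_share
    return first_copy_cost - royalty

# A 35% royalty against 80% first-copy costs leaves nearly half
# the list price uncovered:
print(royalty_shortfall(100.0, 0.80, 0.35))
```

This is the quantitative core of the argument: unless the publisher retains the entire fee, each aggregated sale contributes less than half of its share of fixed costs, and the shortfall must be loaded onto the remaining direct subscriptions.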
The University of California Press and most other nonprofit scholarly publishers would like nothing better than to price electronic products substantially lower than print. However, the low margins under which they operate, the demands of users that print continue to be provided, the high first-copy costs typical of scholarly publishing, the need to fund the development of electronic product, and the expenses of producing full-featured electronic publications all mitigate against low prices, at least during the transition period.
The university press and the library face economic pressures that neither can address alone. So long as journal prices escalate more rapidly than library collection budgets, libraries will continue to reduce serial subscriptions to balance the collections budget. These reductions will adversely affect the revenues of university presses. Pressure from science, technology, medicine, and business faculties to retain high-cost, high-use journals will crowd out less-used scholarly journals, many of which are published by university presses. Because libraries must continue to provide access to and preserve print inventories, housing them in large physical plants that must be maintained, they will be unable to implement large-scale, cost-reducing changes in operations to free up resources for investments in technology. The trends noted in University Libraries and Scholarly Communication and in Hawkins's paper will result in a catastrophic decline in the system of scholarly communication unless there is a fundamental shift in the way in which its processes, products, and costs are analyzed. Each of the two partners, the library and the press, serves as an inadequate unit of analysis for the system of scholarly communication as a whole.
Sandra Braman's description of the three stages in the conceptualization of the information society provides a useful context in which to view today's problems of
press and library within the system of scholarly communication. In her conceptualization, the first stage of the information economy is recognized by the increasing importance of information sector industries. In the second stage, certain forms of information never before recognized as commodities become so recognized. In this stage, political controversy about information's value as a public good versus its market value as a commodity is highlighted. The rising commercialization of scholarly publishing and the declining ability of libraries to provide access to scholarly information may be interpreted as a second-stage information society phenomenon.
Braman postulates that the third stage of the information society produces a more sophisticated understanding of the flow of information: the flow may replace the market as the primary feature of the information economy. This stage represents a paradigm shift in which the information economy operates in a qualitatively different manner than in the two previous stages. According to Braman: "key insights of this perspective include identification of a new unit of analysis, the project, involving multiple interdependent organizations, as more useful than either the industry or the firm for analytical purposes" (p. 112). She further describes the third-stage conceptualization of the information economy as including a production chain, or "harmonized production flows," including information creation, processing, storage, transportation, distribution, destruction, seeking, and use; in short, all the stages of the system of scholarly communication from author to user, including the library. In the third-stage, networked information economy, economic viability stems not from maximizing profit or economic stability within each component of the system but from building long-term relationships and a stable system or flow of information.
Michael Hammer makes a similar point with respect to industrial or business reengineering but applicable to the operations of libraries and presses as well. He notes that automation and other reengineering efforts frequently have not yielded the improvements that companies desire. He believes that heavy investments in information technology deliver disappointing results because the technology is used primarily to speed up traditional business processes, along with their inefficiencies, rather than to transform them. "Instead of embedding outdated processes in silicon and software, we should obliterate them and start over. We should ... use the power of modern information technology to radically redesign our business processes in order to achieve dramatic improvements in their performance" (p. 104).
Both Braman and Hammer emphasize the disquieting qualities that characterize this kind of paradigm shift implied by the third stage of the information economy and by radical reengineering. According to Hammer,
Reengineering cannot be planned meticulously and accomplished in small and cautious steps. It's an all-or-nothing proposition with an uncertain result.... At the heart of reengineering is the notion of discontinuous thinking, of recognizing and
breaking away from the outdated rules and fundamental assumptions that underlie operations. Unless we change these rules, we are merely rearranging the deck chairs on the Titanic. We cannot achieve breakthroughs in performance by cutting fat or automating existing processes. Rather, we must challenge old assumptions and shed the old rules that made the business underperform in the first place ... Reengineering requires looking at the fundamental processes of the business from a cross-functional perspective.
Manuel Castells takes a different approach, suggesting that technology-driven productivity increases in the informational economy have not thus far been evident. His thesis is that technology-driven productivity increases were steady in the industrial sector between 1950 and 1973, but since 1973 productivity, particularly in the service sector, has stagnated despite the intensive investment in technology. He suggests three factors that appear to be relevant to the library and press sector as well as to the service sectors of the economy in general. These factors include the following.
1. Diffusion: Before technological innovation can improve productivity markedly, it must have permeated the whole economy, including business, culture, and institutions.
2. Measuring productivity: Service industries traditionally find it difficult to calculate productivity statistically; thus the lack of observable productivity enhancements may in part be a symptom of the absence of relevant measures.
3. The changing informational economy: Productivity cannot easily be measured because of the broad scope of its transformation under the impact of information technology and related organizational change.
If Castells, Braman, and Hammer are correct, then libraries and presses, alone or together, cannot implement technological solutions that can transform the processes, productivity, and economics of scholarly publishing.
The Mellon projects have been useful in introducing two players in the information flow to the problems of the other, and in forging collaborative relationships to aid in sustaining the system of scholarly communication. These cooperative projects between university libraries and presses have helped participants begin to understand the system of scholarly publishing as an information flow rather than as separate operational processes. But their effectiveness is limited because, outside the parameters of the projects, the partners must still maintain their separate identities and economic bases.
A fuller exploration of the potential of transforming the flow of scholarly information would incorporate a more integrated approach, including the creators of the information, the university administration, and the information consumers as well as the publisher and the library. In this approach, costs and subsidies of the entire process of scholarly communication could be better understood and resources made more flexibly available to support it. For example, it might be possible to view operational and capital savings to libraries resulting from a transition
to electronic publication as resources ultimately available to sustain the publication chain, or consumers could be asked to pay some or all of the costs of creating, storing, archiving, and delivering scholarly information. A critical flaw in the current system is the existence of a part of the gift economy, in the form of the library, within a monetary economy for commercial publishers. Because the consumers of the information within the university do not pay for it, they and the campus administration see the library as a "problem" when it cannot provide the information needed within the budget allotted.
A key problem in securing the future of scholarly communication is that both presses and libraries are undercapitalized. Although libraries incur huge capital costs over time in both inventory and facilities, they are not free individually nor as parts of the system of scholarly communication to reallocate present or future capital expenditures to investments in new modes of publication. However, such reallocation, if it occurs at all, will take place very slowly because the transition to digital publication will also be slow. It is possible that a more rapid transition to electronic publishing would reduce libraries' recurring operations costs, thereby enabling them to invest greater resources in information itself. But a more rapid transition is feasible for presses only if there is a rise in demand for digital publications from libraries and from end users or a substantial increase in subsidies from their parent universities. Presses can offer electronic publications, but they cannot change the demand patterns of their customers-libraries-nor the usage patterns of the end consumers in order to hasten a transition from print to electronic dissemination. As long as a substantial portion of their market demands print (or fails to purchase electronic product), presses will be forced to incur the resulting expenses, which, in being passed on to libraries as costs that inflate more rapidly than budgets, will reduce the purchases of scholarly publications.
Ironically, in the present environment, universities tend to take budgetary actions that worsen the economics of scholarly communication as experienced by both libraries and presses. University administrators increasingly interpret any subsidy of university presses as a failure of the press itself as a business; as university subsidies are withdrawn, presses must increase prices, which reduces demand and exacerbates the worsening fiscal situation for the presses. But in the networked economy where everyone can be an author and publisher, the value added by presses (for example, gatekeeping, editorial enhancement, distribution) may be more important than ever in helping consumers select relevant, high-quality information. At the same time, university administrators see the library as a black hole whose costs steadily rise faster than general inflation. Since library materials budgets grow more slowly than inflation in the costs of scholarly publications, the inevitable result is reduced purchasing of scholarly publications of all types, but particularly of university press materials, which in general are of lesser commercial value in the commodity market. Unless the system as a whole changes, both university presses and university libraries will continue to decline, but at accelerated rates.
Although it is not possible to envision with certainty exactly how a successful transition from the present system to a more sustainable system might occur, one plausible scenario would be for universities themselves to invest capital resources more heavily in university-based information flows and new forms of scholarly publication as well as place increased market pressures on the commercial sector. If universities were to make strategic capital and staffing investments in university presses during the short term, the presses could be more likely to make a successful and rapid transition to electronic publication. At the same time, intensive university efforts (i.e., investments) to recover scientific, technical, medical, and business publishing from the private sector could be made to reduce the crowding out of university press publications by for-profit publishers. These efforts to recover scholarly publishing could be accompanied by libraries' placing strong market pressures on commercial publishers through cancellation of journals whose prices rise faster than the average rates for scholarly journals in general. The investments in these two areas (converting publication processes to electronic form and returning commercial scholarly publishing to the university) could be recovered over time through reductions in capital investments in library buildings. Ultimately, the university itself would encompass most of the information flow in scholarly communication through its networked capability. That information having commodity value outside the academy could be sold in the marketplace and the revenues used as a subsidy to the system itself.
Another way of accomplishing a harmonization of the scholarly information economy was suggested by Hawkins: the independent nonprofit corporation model in which universities and colleges would invest together in a new organization that would serve as a broker, negotiator, service provider, and focus for philanthropy. It would leverage individual resources by creating a common investment pool.
However the solution to the problem of the economic crisis in scholarly communication is approached, there must be a fundamental change in how the process as a whole is conceived and how intellectual property rights of both authors and universities are managed. Such a change cannot be made unilaterally by university libraries and presses but will require the strategic involvement and commitment of university administrators and faculty within the university and among universities. Patricia Battin, envisioning an integrated scholarly information flow, said almost ten years ago:
Commitment to new cooperative interinstitutional mechanisms for sharing infrastructure costs, such as networks, print collections, and database development and access, in the recognition that continuing to view information technologies and services as a bargaining chip in the competition for students and faculty is, in the end, a counterproductive strategy for higher education. If the scholarly world is to maintain control of and access to its knowledge, both new and old, new cooperative ventures must be organized for the management of knowledge itself, rather than the ownership of formats.