Electronic Agent
Publishers could establish their own digital distribution function by creating a Uniform Resource Locator (URL) for each title. The publisher would deal directly with libraries and individual readers. For a number of reasons, however, the publisher is likely to prefer to work with an agent for electronic distribution. Just as typesetting and printing are usually performed by contractors, so the design and distribution of electronic products is likely to involve specialized agents. The role of the electronic distribution agent, though, is becoming more important than that of the printer, for two reasons. The first arises because of economies of scale in managing access to electronic services. The second concerns the potential advantages of integrating individual journals into a wider database of academic information. The electronic agent accepts materials, say journal titles, from publishers and mounts them on electronic services to be accessed over the Internet. The agent captures economies of scale in maintaining the service and in supporting a common payment mechanism and a common search interface and search engine, and may take other steps to integrate articles and journal titles so that the whole is greater than the sum of the parts.
OCLC was an early entrant in the market for electronic distribution of academic journals with Online Clinical Trials. Online Clinical Trials was priced at $220 for institutions and $120 for individuals.[26] OCLC shifted to a World Wide Web interface in January 1997. As of 1998, OCLC's FirstSearch Electronic Collections Online offers access to hundreds of titles from many publishers. Most of the journals deliver page images using Adobe's PDF. OCLC's new approach offers publishers the opportunity to sell electronic access to journals by both subscription and pay-per-look.[27] OCLC charges libraries an access fee based on the number of simultaneous users to be supported and the number of electronic journals to which the library subscribes. Libraries buy subscriptions from publishers. Publishers may package multiple titles together and set whatever rates they choose. The following discussion puts the strategies of OCLC and other electronic agents in a broader context.
Storage and Networks
With electronic documents, there is a basic logistical choice. A storage-intensive strategy involves using local storage everywhere. In this case, the network need not be used to read the journal. At the other extreme, the document might be stored once-for-the-world at a single site with network access used each time a journal is read. Between these two extremes is a range of choices. With the cost saving of fewer storage sites comes the extra cost of increased reliance on data communication networks.
Data storage is an important cost. Although the unit costs of digital storage have fallen and will continue to fall sharply through time, there is still a considerable advantage to using less storage. Data storage systems involve not simply the storage medium itself, but a range of services to keep the data on-line. A data center typically involves sophisticated personnel, backup and archiving activities, and software and hardware upgrades. If 10 campuses share a data storage facility, the storage cost per campus should be much less than if each provides its own. Having one storage site for the world might be the lowest storage cost per campus overall.
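The sharing argument can be made concrete with a small sketch. It assumes, purely for illustration, that a data center has a fixed annual operating cost (personnel, backup, upgrades) plus a cost for the storage medium, and that campuses sharing the facility split the total evenly. The function name and all dollar figures are hypothetical and are not drawn from any actual data center.

```python
def cost_per_campus(fixed_cost: float, media_cost: float, campuses_sharing: int) -> float:
    """Annual storage cost per campus when campuses split one facility evenly.

    fixed_cost: hypothetical annual cost of staff, backup, and upgrades.
    media_cost: hypothetical annual cost of the storage medium itself.
    """
    if campuses_sharing < 1:
        raise ValueError("at least one campus must use the facility")
    return (fixed_cost + media_cost) / campuses_sharing


# Hypothetical figures chosen only to show the shape of the relationship.
FIXED = 200_000.0
MEDIA = 20_000.0

for n in (1, 10, 1000):
    print(f"{n:>5} campuses sharing: ${cost_per_campus(FIXED, MEDIA, n):,.2f} per campus")
```

Under these assumptions the cost per campus falls in proportion to the number of campuses sharing the site. The sketch leaves out the network cost of reaching a remote site, which is the other half of the trade-off.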
To use a remote storage facility involves data communication. The more remote the storage, the greater the reliance on data networks. A central problem for data communication is congestion. Data networks typically do not involve traffic-based fees. Indeed, the cost of monitoring traffic so as to impose fees may be prohibitive. Monitoring network traffic so as to bill individuals on the basis of use would require keeping track of the origin of each packet of data and accounting for it by tallying a register that notes source, time, and date. Because even simple mail messages may be broken into numerous packets for network shipment, the number of items to be tracked is far greater than in tracking telephone calls. If every packet must go through the toll plaza, the opportunity for delay and single points of failure may be substantial. Because each packet may follow a different route, tracking backbone use with a tally on each leg would multiply the complexity. Traffic-based fees seem to be impractical for the Internet. Without traffic-based fees, individual users do not face the cost of their access. Like a driver on an urban highway at rush hour, each individual sees only his or her own trip, not the adverse effect of that trip in slowing others down. An engineering response to highway congestion is often to build more highways. Yet the added highways are often congested as well. In data networking, an engineering solution is to invent a faster network. Yet individuals deciding to use the network will see only their personal costs and so will have little incentive to economize. The demand for bandwidth on networks will surely grow with the pace of faster networks, for example, with personal videophones and other video-intensive applications. Without traffic-based pricing, congestion will be endemic in data networks.
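A rough sketch shows how quickly the accounting burden grows under per-packet billing. It assumes a payload of 1,460 bytes per packet (a common figure for TCP over Ethernet, used here only as an assumption) and one tally record per packet per network leg. The function names and the message size and hop count in the example are hypothetical.

```python
import math


def packets_needed(message_bytes: int, payload_bytes: int = 1460) -> int:
    """Number of packets a message is split into, given an assumed payload size."""
    return max(1, math.ceil(message_bytes / payload_bytes))


def billing_records(message_bytes: int, legs: int, payload_bytes: int = 1460) -> int:
    """Tally records required if every packet is logged on every leg of its route."""
    return packets_needed(message_bytes, payload_bytes) * legs


# A hypothetical 100,000-byte message crossing 10 network legs.
message = 100_000
print("packets:", packets_needed(message))
print("records with a tally on each leg:", billing_records(message, legs=10))
```

A telephone call, by contrast, generates a single billing record. Under these assumptions one modest message generates hundreds, which illustrates why the text regards traffic-based fees as impractical.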
Another response to network congestion is to build private networks with controlled access. Building networks dedicated to specific functions seems relatively expensive, but may be necessary to maintain a sufficient level of performance. Campus networks are private, and so access can be controlled. Perhaps investments in networking and technical change can proceed fast enough on individual campuses so as to allow the campus network to be reliable enough for access to journals and other academic information.
Because the telephone companies have launched data network services, they seem likely to introduce time-of-day pricing. Higher rates in prime time and higher rates for faster access speeds are first steps in giving incentives to economize on the use of the network and so to reduce congestion. America Online (AOL) ran into serious difficulty when, in late 1996, it shifted from a per-hour pricing strategy to a flat monthly rate to match other Internet service providers. AOL was swamped with peak period demand that it could not easily manage. The long distance telephone services seem to be moving to simpler pricing regimes, dime-a-minute, for example. The possibility of peak period congestion, however, likely means that some use of peak period pricing in telephones and in network services will remain desirable. In the end, higher education's ability to economize on data storage will depend on the success of the networks in limiting congestion.
Figure 6.4. Network Intensity and Database Integration
Some milestones in the choice of storage and networks are illustrated along the horizontal margin of Figure 6.4. The rapid growth of the World Wide Web in the last couple of years has represented a shift toward the right along this margin, with fewer storage sites and more dependence on data communication. The World Wide Web allows a common interface to serve many computer platforms, replacing proprietary tools. Adobe's Portable Document Format (PDF) seems to offer an effective vehicle to present documents in original printed format with equations, tables, and graphics, yet allow text searching and hypertext links to other Web sites. The software for reading PDF documents is available without charge, is compatible with many Web browsers, and allows local printing. Some of the inconveniences of older network-based tools are disappearing.
That rightward shift may offer the electronic agent an advantage over either the publisher or the library. That is, the electronic agent may acquire rights from publishers and sell access to libraries, while taking responsibility for an optimal choice of storage sites and network access. Storage might end up in a low-cost location with the electronic agent responsible for archiving the material and migrating the digital files to future hardware and software environments.
Integration into a Database
The second advantage for an electronic agent is in integrating individual journal titles and other electronic materials into a coherent database. The vertical margin of Figure 6.4 sketches a range of possibilities. At root, a journal title stands as a relatively isolated vehicle for the distribution of information. In the digital world, each title could be distributed on its own CD or have its own URL on the Web. Third party index publishers would index the contents and provide pointers to the title and issue and, perhaps, to the URL. Indeed, the pointer might go directly to an individual article.
However, relatively few scholars depend on a single journal title for their work. Indeed, looking at the citations shown in a sampling of articles of a given journal reveals that scholars typically use a range of sources. A database that provides coherent access to several related journals, as in the second tier of Figure 6.4, offers a service that is more than the sum of its parts.
At yet a higher level, an agent might offer a significant core of the literature in a discipline. The core of journals and other materials might allow searching by words and phrases across the full content of the database. The database then offers new ways of establishing linkages.
At a fourth level, the organizing engine for the database might be the standard index to the literature of the discipline, such as EconLit in economics. A search of the database might achieve a degree of comprehensiveness for the published literature. A significant fraction of the published essays might be delivered on demand by hitting a "fulfill" button. Fulfillment might mean delivery of an electronic image file via network within a few seconds or delivery of a facsimile within a few minutes or hours.
At a fifth level, the database might include hot links from citations in one essay to other elements of the database. The database might include the published works from journals with links to ancillary materials, numeric data sets, computer algorithms, and an author's appendices discussing methods and other matters. The database might invite commentary, and so formal publications might link to suitably moderated on-line discussions.
In integrating materials from a variety of sources into a coherent database, the electronic agent may have an advantage over publishers who offer only individual journal titles. The agent might set standards for inclusion of material that specifies metatags and formats. The agent might manage the index function; indeed, the index might be a basis for forward integration with database distribution, as EI has done. This issue is discussed more fully below.
Integration of diverse materials into a database is likely to come with remote storage and use of networks for access. Integrating the material into a database by achieving higher levels of coherence and interaction among diverse parts may be at lower cost for an electronic agent than for publishers of individual journals or for individual libraries. The agent is able to incur the cost of integration and storage once for the world.
Agent's Strategy
Given the interest of publishers in licensing their products for campus intranets and the universities' interest in securing such licenses, there is opportunity for enterprises to act as brokers, to package the electronic versions of the journals in databases and make them accessible, under suitable licenses, to campus intranets. The brokers may add a markup to reflect their cost of mounting the database. The size of the markup will reflect the extent of integration as well as the choice of storage strategy.
SilverPlatter became the most successful vendor of electronic index databases by making them available on CDs for use on campus intranets with proprietary software. OCLC plays an important role in offering such databases from its master center in Ohio. Ovid, a third vendor, supports sophisticated indexing that integrates full text with Standard Generalized Markup Language (SGML) and Hypertext Markup Language (HTML) tagging. A number of other vendors have also participated in the index market and are likely to seek to be brokers for the electronic distribution of journals.
A core strategy will probably be to mount the database of journals on one or more servers on the World Wide Web, with access limited to persons authorized for use from licensed campuses or through other fee-paid arrangements. This strategy has three important parts: (1) the database server, (2) the Internet communication system, and (3) the campus network.
The advantage of the World Wide Web approach is that the data can be made accessible to many campuses with no server support on any campus. A campus intranet license can be served remotely, saving the university the expense of software, hardware, and system support for the service.
The risk of the Web strategy is with the Internet itself and its inherent congestion. OCLC has used a private data communication network so as to achieve a higher level of reliability than the Internet and will do the same to ensure high-quality TCP/IP (the Internet protocol suite) access. Some campuses may prefer to mount database files locally, using CD-ROMs and disk servers on the campus network. Some high-intensity campuses may prefer to continue to mount the most used parts of databases locally, even at extra cost, as a method of insuring against deficiencies in Internet services.
The third element, after storage and the Internet, is the campus network. Campus networks continue to evolve. Among the hundred universities seeking to be top-ten universities, early investment in sophisticated networking may play a strategic role in the quest for rank. On such campuses, network distribution of journals should be well supported and popular. Other campuses will follow with some lag, particularly where funding depends primarily on the public sector. Adoption within 10 years might be expected.[28]
The electronic agent, then, must choose a strategy with two elements: (1) a storage and network choice and (2) an approach to database integration.
Journal publishers generally start at the bottom left of Figure 6.4, the closest to print. They could make a CD and offer it as an alternative to print for current subscribers. The AEA offers the Journal of Economic Literature on CD instead of print for the same price.
Moves to the upper left seem to be economically infeasible. Integrating more materials together increases local storage costs and so tilts the storage-network balance toward less storage and more network. With more data integration, the agent's strategy will shift to the right.
Moves to the lower right, with reduced storage costs and more dependence on networks, should involve considerable cost savings but run risks. One risk is of network congestion. A second is of loss of revenues because traditional subscribers drop purchases in favor of shared network access. The viability of these strategies depends on the level of fees that may be earned from network licenses or pay-per-look.
Moves along the diagonal up and to the right involve greater database integration with cost savings from lower storage costs and more dependence on networks. The advantage of moves upward and to the right is the possibility that integration creates services of significantly more value than the replication of print journals on the Internet. When database integration creates significantly more value, subscribers will be willing to pay premium prices for products that rely on remote storage and network access. Of course, network congestion will remain a concern.
A move toward more database integration raises a number of interesting questions. The answers to these questions will determine the size of the markup by the electronic agent. How much should information from a variety of sources be integrated into a database with common structure, tags, and linkages? For a large database, more effort at integration and coherence may be more valuable. Just how much effort, particularly how much hand effort, remains an open question. If the electronic agent passively accepts publications from publishers, the level of integration of materials may be relatively low. The publisher may provide an abstract and metatags and might provide URLs for linking to other network sites. The higher level of integration associated with controlled vocabulary indexing and a more systematic structure for the database than comes from journal titles would seem to require either a higher level of handwork by an indexer or the imposition of standard protocols for defining data elements. Is a higher level of integration of journal material from a variety of sources sufficiently valuable to justify its cost? The index function might be centralized with storage of individual journals distributed around the Net. Physical integration of the database is not necessary to logical integration, but will common ownership be necessary to achieve the control and commonality necessary for high levels of integration?
A second question concerns how an agent might generate a net revenue stream from its initial electronic offerings sufficient to allow it to grow. The new regime will not be born as a whole entity; rather, it will evolve in relatively small steps. Each step must generate a surplus to be used to finance the next step. Early steps that generate larger surpluses will probably define paths that are more likely to be followed. Experimentation with products and prices is already under way. Those agents finding early financial success are likely to attract publishers and libraries and to be imitated by competitors.
JSTOR has captured the full historic run of a significant number of journals, making the promise of 100 titles in suites from major disciplines within three years. However, it does not yet have a program for access to current journals. Its program is primarily to replace archival storage of materials that libraries may or may not have already acquired in print.
OCLC's approach is to sell libraries access services while publishers sell subscriptions to the information. The publisher can avoid the cost of the distribution in print, a saving if the electronic subscriptions generate sufficient revenue. The unbundling of access from subscription sales allows the access to be priced on the basis of simultaneous users, that is, akin to the rate of use, while the information is priced on the basis of quantity and quality of material made available. Of course, the information may also be priced on a pay-per-look basis and so earn revenue as it is used. What mix of pay-per-look and subscription sales will ultimately prevail is an open question.
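The choice between subscription and pay-per-look can be framed as a simple break-even comparison. The sketch below assumes a flat annual subscription price and a constant price per look; the function names and the prices in the example are hypothetical and do not describe any actual OCLC or publisher rate.

```python
def break_even_looks(subscription_price: float, price_per_look: float) -> float:
    """Number of looks per year at which the two pricing schemes cost the same."""
    if price_per_look <= 0:
        raise ValueError("price per look must be positive")
    return subscription_price / price_per_look


def cheaper_option(subscription_price: float, price_per_look: float,
                   expected_looks: int) -> str:
    """Return the cheaper scheme for a library expecting a given number of looks."""
    pay_per_look_total = price_per_look * expected_looks
    return "subscription" if subscription_price < pay_per_look_total else "pay-per-look"


# Hypothetical prices: a $220 subscription versus $10 per look.
print(break_even_looks(220.0, 10.0))    # looks per year at which costs are equal
print(cheaper_option(220.0, 10.0, 30))  # a heavily used title
print(cheaper_option(220.0, 10.0, 5))   # a lightly used title
```

Under these assumptions, heavily used titles favor subscription and lightly used titles favor pay-per-look, which suggests why a mix of the two might persist.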
A third question is whether publishers will establish exclusive arrangements with electronic agents or whether they will offer nonexclusive licenses so as to sustain competition among agents. Some publishers may prefer to be their own electronic agents, retaining control of the distribution channels. If database integration is important, this strategy may be economic only for relatively large publishers with suites of journals in given disciplines. Many publishers may choose to distribute their products through multiple channels, to both capture the advantages of more integration with other sources and promote innovation and cost savings among competing distributors.
As the electronic agents gain experience and build their title lists, competition among them should drive down the markups for electronic access. If the store-once-and-network strategy bears fruit, the cost savings in access should be apparent. If higher levels of database integration prove to be important, the cost savings may be modest. Cost savings here are in terms of units of access. As the cost of access falls, the quantity of information products used may increase. The effect on total expenditure, the product of unit cost and number of units used, is hard to predict. If the demand for information proves to be price elastic, then as unit costs and unit prices fall, expenditures on information will increase.
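The link between price elasticity and total expenditure can be illustrated numerically. The sketch assumes a constant-elasticity demand curve, quantity = scale × price^(−elasticity), which is a standard textbook form chosen here for convenience and not one proposed in the text. The function name, the scale, and the prices are hypothetical.

```python
def expenditure(price: float, elasticity: float, scale: float = 1000.0) -> float:
    """Total spending (price times quantity) under constant-elasticity demand.

    quantity = scale * price ** (-elasticity); `elasticity` is the absolute
    value of the price elasticity of demand.
    """
    quantity = scale * price ** (-elasticity)
    return price * quantity


# Halve the unit price and compare total expenditure before and after.
for e in (0.5, 1.0, 2.0):
    before = expenditure(1.0, e)
    after = expenditure(0.5, e)
    print(f"elasticity {e}: expenditure {before:.1f} -> {after:.1f}")
```

With elasticity above 1 (elastic demand), expenditure rises as the price falls. With elasticity below 1 it falls, and at exactly 1 it is unchanged.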
The electronic agents will gather academic journals from publishers and distribute them in electronic formats to libraries and others. They will offer all available advantages of scale in managing electronic storage, optimize the use of networks for distribution, offer superior search interfaces and engines, and take steps to integrate materials from disparate sources into a coherent whole. The agent will be able to offer campus intranet licenses, personal subscriptions, and pay-per-look access from a common source. The agent may manage sales, accounting, billing, and technical support. Today, agents are experimenting with both technical and pricing strategies. It remains to be seen whether single agents will dominate given content areas, whether major publishers can remain apart, or whether publishers and universities can or should sustain a competitive market among agents.