As a bookseller and a (future) librarian, I'm deeply concerned with the future of books. Books are convenient, functional, relatively inexpensive, portable, semi-permanent and often beautiful physical objects, and I'm convinced that for those reasons the codex will be with us for many decades to come. (Image by Philippe Kurlapski.)
But there are immense pressures building to shift books into digital formats. And as you might guess, huge international media and consumer electronics conglomerates aren't doing this for your convenience. They're pushing for electronic books to replace print because digital licenses give them more control.
The Copyright Act of 1976 gives book owners rights--fair use and the first sale rule--which limit the restrictions copyright holders can place on the use of copyrighted material. Fair use means that limited portions of a copyrighted work can be reproduced and re-used without permission for purposes such as education, commentary and parody; the first sale rule means that once you own a legitimately obtained copy of a copyrighted work you can generally give it away or sell it to someone else.
But in the digital realm, licenses, electronic protection measures, and proprietary software allow publishers to prevent fair use or transfer. Now, you can't copy even a small portion of your e-book; once you're done with it, you can't resell it or even give it away. And even your right to keep a copy of a work you've paid for can be revoked at any time by the seller.
Big Brother, thy name is Amazon. This became rudely apparent this summer to owners of Amazon's Kindle e-book reader who purchased certain editions of George Orwell's novels Animal Farm and 1984. As reported by Brad Stone in the New York Times, when Amazon discovered that the vendor of the titles did not actually own the U.S. rights (the books are in the public domain in Canada but not the U.S.), it remotely deleted the books from its customers' Kindles. Amazon's Kindle terms of service state that customers are buying the "right to keep a permanent copy of the applicable Digital Content"--but the 1984 incident shows how hollow that promise is. The irony that the books involved were Orwell's dystopian visions of totalitarianism is almost too delicious: in 1984, of course, Winston Smith's job at the Ministry of Truth is to delete records of the past that don't conform to current Party orthodoxy by dropping them down the "memory hole" where they will vanish forever.
Stone quotes Kindle owner Bruce Schneier, chief security technology officer for British Telecom: “It illustrates how few rights you have when you buy an e-book from Amazon...I can’t lend people books and I can’t sell books that I’ve already read, and now it turns out that I can’t even count on still having my books tomorrow.”
It wasn't just the texts that were deleted. Kindle allows readers to make their own notes, highlighting and annotations to electronic texts, and those were disappeared along with the e-books themselves. Stone writes of high-school student Justin Gawronski, who had a summer assignment to read 1984 and woke up to find that all of the annotations he'd made to his Kindle version of the book were gone. Gawronski said, "They didn’t just take a book back, they stole my work." Two months later, as reported by Miguel Helft in the New York Times, Amazon offered to replace readers' copies of the deleted Orwell works, including their annotations--an offer that probably comes a bit late for Justin Gawronski.
And as Harvard professor Jonathan Zittrain points out in Helft's article, Amazon's ability to remotely delete Kindle content raises the specter of texts you've purchased being permanently deleted or altered in the future in response to lawsuits or government action. Only with Kindle editions, there's no need for anything so crude as an incinerator--a few keystrokes will suffice.
So is the Kindle any good? In a recent New Yorker piece, novelist Nicholson Baker compared the experience of reading a book on his Kindle 2 to driving "a white 1982 Impala with blown shocks." He details the Kindle's shortcomings--its clunky design, the low contrast of its screen, the "black flash" that occurs every time it displays a new page, the lack of page numbers (instead each section of text has a "location range"), the poor reproduction of illustrations such as photographs, tables and charts. Worst of all, Baker writes that a comparison of the print, online and Kindle versions of the New York Times reveals that the Kindle version is missing content: not only photographs, but "its Web-site links, its listed names of contributing reporters, and almost all captioned pie charts, diagrams, weather maps, crossword puzzles, summary sports scores, financial data....Sometimes whole articles and op-ed contributions aren’t there." In a check of the July 8th and 9th editions of the Times, Baker found five articles that are entirely missing from the Kindle versions. (Image by ShakataGaNai.)
Speaking of monopolies... Presumably, better-designed e-book readers will solve many of the Kindle's issues. And books and other texts that are created specifically for e-book readers, rather than converted from print versions, will offer photographs, tables, charts, and other illustrations that are clearer and more legible. But for the foreseeable future the vast majority of electronic texts will remain digitized print versions.
The largest print-to-digital conversion project is Google Books, and its history to date is not reassuring. For one thing, with any Google service there are significant privacy concerns. For another, Google now has monopoly control over a huge collection of digitized works taken from library collections. As Anthony Grafton writes in the online New Yorker,
"The out-of-print books Google has digitized come from nonprofit institutions that built their collections as a public good....These public treasures will now be monetized for the benefit of a private corporation. True, Google will give every public and university library one terminal where readers can access its entire collection. But these machines won’t be able to download or print texts--and you can imagine the lines. Libraries that want full access to all the books in Google will have to pay for the privilege, and for every download."
Finding the book you want isn't so easy, though. As UC Berkeley's Geoffrey Nunberg details in a recent article in the Chronicle of Higher Education, Google Books' metadata is a mess. Nunberg found that titles were garbled ("Moby-Dick: or the White Wall"), authors mismatched to books ("Madame Bovary By Henry James"), publications misdated (Robert Shelton's No Direction Home: The Life And Music Of Bob Dylan was dated 1899), subject headings misassigned (Jill Watts's biography Mae West: An Icon In Black And White was in the Religion category--although she might indeed be a holy figure to some), and text links misdirected (the link for the 1818 tract Theorie de l'Univers took Nunberg instead to Barbara Taylor Bradford's 1983 novel Voice of the Heart). A 1995 book about the web browser Mosaic was given the publication date of 1939 and attributed to Sigmund Freud.
These examples point to two main problems with the way Google creates metadata. The first is that instead of paying for the thorough, detailed, and accurate metadata painstakingly created by librarians for their collections, Google is clearly relying on automatic assignment of metadata via computer scanning. Nunberg finds that an 1890 guidebook, London of To-Day, was given the publication date of 1774 (a very different "to-day" than 1890) because the first pages contained an ad for a clothing manufacturer founded in 1774. The medieval studies journal Speculum was assigned to the subject category Health and Fitness, because the computer didn't understand the difference between the Latin word for mirror and the medical instrument. These are the sorts of errors that arise from trying to automate a process that requires human judgment.
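To make the London of To-Day error concrete, here is a minimal Python sketch of how a purely automatic date guess can go wrong. It is an illustration only, not a description of Google's actual process: the sample front-matter text, the function names, and the "imprint line" rule are all assumptions invented for this example.

```python
import re

# Hypothetical front matter for an 1890 guidebook. The wording is invented;
# the only detail taken from the example above is that an advertisement
# mentioning 1774 appears before the real publication date.
FRONT_MATTER = [
    "Advertisement: fine clothing, established 1774.",
    "LONDON OF TO-DAY. An illustrated handbook for the season.",
    "London: published 1890.",
]

# Any four-digit year from 1500 through 2099.
YEAR = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def naive_publication_year(pages):
    """Return the first four-digit year found anywhere in the opening pages."""
    for page in pages:
        match = YEAR.search(page)
        if match:
            return int(match.group(1))
    return None


def imprint_publication_year(pages):
    """Only accept a year from a line that looks like an imprint statement."""
    for page in pages:
        if re.search(r"\bpublished\b", page, re.IGNORECASE):
            match = YEAR.search(page)
            if match:
                return int(match.group(1))
    return None


print(naive_publication_year(FRONT_MATTER))    # 1774
print(imprint_publication_year(FRONT_MATTER))  # 1890
```

Even the second rule is only a slightly less crude heuristic. A cataloger would read the title page and know at once which date belongs to the book and which to the advertiser.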
The second problem is that instead of using library classification systems, such as call numbers and the Library of Congress Subject Headings, Google has chosen to use the Book Industry Standards and Communications (BISAC) system. The BISAC system was designed for bookstores containing thousands or perhaps tens of thousands of titles, not for research collections of millions of volumes. BISAC's idiosyncrasies get magnified by the sheer size of the Google Books collection. Call numbers permit distinctions between and groupings of similar works; for example, when you find a book in an online library catalog, most allow you to browse a virtual "shelf" to examine all the books that are classified in nearby call number ranges. But the BISAC categories are too crude to be a useful browsing tool when the collection consists of millions of volumes.
In the absence of usable metadata, the only efficient way to access Google Books is through text searching. That doesn't work very well if you are looking for a specific edition of a work (since the text will be identical in each), or are trying to investigate the literature of a particular period. Nunberg doesn't say so, but it's clear that Google needs to hire some librarians to sort out its metadata.
Meanwhile, libraries are eliminating books. David Abel of Boston.com reports that the administrators of Cushing Academy, a Massachusetts prep school, have decided to eliminate the books from the school's library:
"In place of the stacks, they are spending $42,000 on three large flat-screen TVs that will project data from the Internet and $20,000 on special laptop-friendly study carrels. Where the reference desk was, they are building a $50,000 coffee shop that will include a $12,000 cappuccino machine.
"And to replace those old pulpy devices that have transmitted information since Johannes Gutenberg invented the printing press in the 1400s, they have spent $10,000 to buy 18 electronic readers made by Amazon.com and Sony. Administrators plan to distribute the readers, which they’re stocking with digital material, to students looking to spend more time with literature.
"Those who don’t have access to the electronic readers will be expected to do their research and peruse many assigned texts on their computers."
Perhaps this isn't an issue for students whose families are rich enough to send them to Cushing (which costs nearly $43,000 a year for boarding students), but 18 e-book readers seem laughably inadequate as a replacement for an entire library. Students doing research and perusing assigned texts will evidently have to purchase their own copies to do so. (Clicking on the catalog link on Cushing's Fisher-Watkins Library page led only to an error message the night I tried it.)
Of course, many textbook publishers are moving into e-book formats because digital protection measures prevent students from sharing or selling their used textbooks. Now every student must buy their books new and only new, which means greater profits for publishers. (Image by Mark Wilson for the Boston Globe.)
The good old days. Greater profits are what book publishing has been about for centuries. As Richard Nash writes in his review of Ted Striphas' The Late Age of Print (Columbia University Press, 2009), e-books are just the latest strategy in publishers' long-term struggle to control consumers by "inducing demand" and "creating artificial scarcity." Restrictive digital licenses are "the apotheosis of the publishing industry’s capacity to restrict a reader’s ability to do what they want with their books." In other words, we shouldn't mourn a genteel, non-commercial literary culture that is largely illusory. But if we want to retain fair use and first sale rights, we must organize.
Update 24 September 2009: The New York Times' Miguel Helft is reporting that the settlement between Google, the Authors Guild and the Association of American Publishers is being renegotiated. The settlement in the copyright infringement case brought by the authors' and publishers' groups against Google was reached last year, but is now being revised to address privacy and antitrust issues raised by many concerned individuals and groups, including a coalition of authors and publishers represented by the American Civil Liberties Union, UC Berkeley's Samuelson Clinic, and the Electronic Frontier Foundation.