Neverending Search

Wikipedia + the Internet Archive offer a new reason to include Wikipedia in student research

The Internet Archive and an army of Wikipedians are working towards achieving improved consensus around knowledge and greater historical accuracy by, as founder Brewster Kahle shares, weaving books into the fabric of the web itself.

Last week, at its annual celebration, the Internet Archive announced an initiative called Weaving Books into the Web—Starting with Wikipedia. The initiative responds to Brewster Kahle’s question, “What should we do to help the internet work better?”, which he asked after hearing Wikimedia Executive Director Katherine Maher’s post-election worry that truth might fracture.

Brewster Kahle’s tweet about the announcement

In its blog post, the Archive announced that it has been enhancing Wikipedia’s book citations with links to two-page digitized previews of scanned books. The move will allow researchers to go well beyond examining citation lists, which contain items of varied formats and quality, and to view cited passages in their original surrounding context. Many of the current book links in Wikipedia articles lead to Google Books, which often shows researchers only snippets or blank pages. This partnership should provide Wikipedia researchers with increased access to full-text digital context.

In addition to being better able to validate the credibility of claims in Wikipedia entries, users will be able to borrow digital copies of books for two weeks under a Controlled Digital Lending (CDL) model, also described as “lend like print” or “own one, loan one.” (The term CDL was coined by IA’s Open Library.) The CDL model was the theme of discussion at the recent Library Leaders Forum at San Francisco Public Library. (Note: This model is controversial and has been disputed by a number of author and publisher groups.)

Unless the books are in the public domain (generally, if they were written prior to 1923), the availability of texts will depend on whether other researchers have already borrowed the scanned copy of the book. Internet Archive’s ambitious plan is to link to every book reference in Wikipedia, and ultimately to digitize every book ever published.

The IA Blog shares the progress to date:

The Internet Archive has transformed 130,000 references to books in Wikipedia into live links to 50,000 digitized Internet Archive books in several Wikipedia language editions including English, Greek, and Arabic. And we are just getting started. By working with Wikipedia communities and scanning more books, both users and robots will link many more book references directly into Internet Archive books. In these cases, diving deeper into a subject will be a single click.

How is this goal being scaled?

Mark Graham, who directs the Internet Archive’s Wayback Machine, described the effort to connect book reference links directly to scanned Internet Archive books. The InternetArchiveBot scans Wikipedia for dead links and automatically connects broken links to versions of books archived in the Wayback Machine. Graham estimates that, of 14 million edited links, more than 11 million point to the Internet Archive.

It’s not just about the bots. Moving forward, the Internet Archive will enlist the Wikipedia community in employing standard citation formats, specifying page numbers for their references, noting editions, and including ISBNs whenever they are available.

At 1:21:00 in the celebration video, you can view Brewster Kahle’s description of the progress that needs to be made after the 1923 public domain boundary. He points to a critical digital gap in availability of books not available on Amazon, and often not available in local libraries. “We are raising a generation without a century of books.”

The gap Kahle describes was apparent as I sought example citations. For nearly all my sample searches, Web links dominated, and most book references I discovered were not yet linked with two-page context by IA. Instead, they led to WorldCat and Google Books, with either limited page views or snippets. In the case of WorldCat, of course, I could easily click on the “Find a copy in the library” link for local holdings.

But the project is in its infancy. As the efforts gain steam and funding, results will increasingly lead to digital previews like this:

Internet Archive shared this example from Martin Luther King’s Wikipedia page, which leads to a two-page book preview.

One critical partner in the effort to span the missing book gap has been Better World Books, a mission-driven used book store that receives many of the weeded, excess books libraries no longer need. With needed funding, it plans to send millions of the right books to IA.

If the Internet Archive / Wikipedia partnership is successful, the effort will provide serious support to students and researchers wherever they are working and will allow librarians to promote better use of Wikipedia as a starting point for knowledge building.

At the San Francisco Library Leaders Forum, Detroit School of the Arts librarian Karen Lemmons spoke of the value of offering high school students books that are immediately available:

This might give them an opportunity to read in between practices. They can pull out their phone and read a few pages. It’s mobile and flexible. . . Our students really do want to be the best.

Lemmons also noted that she wants to be a model for other urban schools:

We want to be a driving force to get other libraries involved. This is a data-driven district and we will need data to show reading more makes a difference in student performance.

It costs the Internet Archive $20 to digitize and preserve a print title and present it to Web readers. If you’d like to help, contribution information is located here.


About Joyce Valenza

Joyce is an Assistant Professor of Teaching at Rutgers University School of Information and Communication, a technology writer, speaker, blogger and learner. Follow her on Twitter: @joycevalenza


  1. That’s a good initiative from the Internet Archive to enhancing Wikipedia’s book citations that will help Wikipedia researchers with increased access to a full-text digital context. I will also share it with the students of Chula vista district school to increase its awareness. Thank you, Joyce, for sharing this update.

  2. That’s a great step taken by Internet Archive to enhance Wikipedia’s book, it will help students in research work. This is something that I will share with the students of GGUSD to help them in research. Thanks for the post, Joyce.

  3. “The effort to connect book reference links directly to scanned Internet Archive books” will surely uplift the trust on Wikipedia. Joyce, thanks for connecting all the possible dots and creating an informative article. Several schools like EGUSD will push their students towards research field as they can rely upon Wikipedia in order to access point to point knowledge in depth.
