Present and future

Google Books - PowerPoint PPT Presentation


Presentation Transcript

Present and Future

Google Books

James Crawford

Engineering Director

Google Books

Overview

Why and how Google scans books

Challenges

The Future


Google Confidential and Proprietary

Google’s mission

To organize the world’s information and make it universally accessible and useful.

Online content: billions of web pages

Offline content: billions of items becoming indexed

Book Team's Mission

To organize the world’s books and make them universally accessible and useful.

Make accessible – Limited previews from publishers & authors (30,000 publisher partners)

Make accessible – Embedded Viewer for Libraries

Vital stats

Number of books scanned: fifteen million

Number of pages: five billion

Number of words: over two trillion

Libraries: forty

Publishers: thirty thousand
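The round figures above imply some rough averages. The short sketch below works them out; the variable names are illustrative, and because the word count is given only as "over two trillion", the per-page and per-book word figures are lower bounds rather than exact values.

```python
# Round figures quoted on the slide.
books = 15_000_000
pages = 5_000_000_000
words = 2_000_000_000_000  # "over two trillion", so treat this as a lower bound

pages_per_book = pages / books
words_per_page = words / pages
words_per_book = words / books

print(f"about {pages_per_book:.0f} pages per book")
print(f"at least {words_per_page:.0f} words per page")
print(f"at least {words_per_book:,.0f} words per book")
```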

Google Books in a nutshell

Challenges

478 languages

Kabardian: 16, Khasi: 78, Khoisan: 53, Khotanese: 21, Kikuyu: 48, Kinyarwanda: 77

Kyrgyz: 702, Kimbundu: 14, Konkani: 83, Komi: 48, Kongo: 134, Korean: 35905

Kosraean: 10, Kpelle: 6, Karachay-Balkar: 17, Karelian: 28, Kru: 26, Kurukh: 30, Kuanyama: 9, Kumyk: 16, Kurdish: 220, Kutenai: 0, Klingon: 3, Kalmyk: 26

Kashubian: 14, Kara-Kalpak: 102, Kabyle: 50, Kachin: 18, Kalaallisut: 82, Kamba: 29, Kannada: 2600, Karen: 50, Kashmiri: 289, Kanuri: 25, Kawi: 106, Kazakh: 1871

A diversity of dates

  • 18??

  • [196-?]

  • 1957/8

  • late 14th century

  • finita quarto nonas Januarias [1490]

  • mense Septembri: Anno Millesimo q[ui]ngentesimo decimonono

  • mense iulio, anno M.D.XXXX

  • התשנ״א (Hebrew year 5751 = Gregorian 1990/1 CE)

  • ١٣٧٣ (either Islamic year 1373 AH = Gregorian 1953/4 CE or Persian year 1373 AP = Gregorian 1994/5 CE)
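To make the problem concrete, here is a minimal sketch of how a few of the date forms above might be reduced to an (earliest, latest) range of Gregorian years. It is an illustration only, not a description of how Google Books handles dates. The function names `year_range`, `roman_to_int` and `arabic_script_year_candidates` are hypothetical, the rules cover only some of the patterns in the list, spelled-out Latin years and Hebrew letter-numerals are left unhandled, and the Islamic conversion uses a common linear approximation rather than a real calendar calculation.

```python
import re

ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral, tolerating additive forms such as XXXX."""
    values = [ROMAN[ch] for ch in numeral]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (IV, XC).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


def year_range(raw: str):
    """Return (earliest, latest) Gregorian years, or None if no rule applies."""
    s = raw.strip().strip("[]")
    m = re.fullmatch(r"(\d{4})/(\d{1,3})", s)  # 1957/8
    if m:
        start, tail = m.group(1), m.group(2)
        return int(start), int(start[: 4 - len(tail)] + tail)
    m = re.fullmatch(r"(\d{2,3})[-?]+", s)  # 18?? or 196-?
    if m:
        missing = 4 - len(m.group(1))
        low = int(m.group(1)) * 10 ** missing
        return low, low + 10 ** missing - 1
    m = re.search(r"(\d{1,2})(?:st|nd|rd|th) century", s)  # late 14th century
    if m:
        century = int(m.group(1))
        return (century - 1) * 100 + 1, century * 100
    m = re.search(r"\b(\d{4})\b", s)  # a plain year inside a longer imprint
    if m:
        return int(m.group(1)), int(m.group(1))
    m = re.search(r"[MDCLXVI](?:\.?[MDCLXVI])+", s)  # anno M.D.XXXX
    if m:
        year = roman_to_int(m.group(0).replace(".", ""))
        return year, year
    return None


def arabic_script_year_candidates(raw: str):
    """Gregorian ranges a bare year could mean under two calendars."""
    year = int(raw)  # int() accepts Unicode decimal digits
    # Lunar years are shorter than solar ones, so this is an approximation.
    islamic_start = int(year * 0.970224 + 621.5774)
    persian_start = year + 621
    return {
        "AH": (islamic_start, islamic_start + 1),
        "AP": (persian_start, persian_start + 1),
    }


for sample in ["18??", "[196-?]", "1957/8", "late 14th century"]:
    print(sample, "->", year_range(sample))
print(arabic_script_year_candidates("١٣٧٣"))
```

Returning a range instead of a single year keeps the uncertainty visible, and returning every candidate calendar for a bare number leaves the choice to whatever other evidence the record provides, such as language or place of publication.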

Works, Expressions, Manifestations, and Items

Library of Congress: Lord of the Rings, v.1; John Roland Reuel Tolkien; Houghton Mifflin

Books in Print: The Fellowship of the Ring; J.R.R. Tolkien; Ballantine Books
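The two catalog records on this slide describe the same work under different titles, author forms and publishers. As a minimal sketch of one small piece of that matching problem, the code below reduces an author name to initials plus surname so that both forms produce the same key. The `author_key` helper, the field names, and the assignment of each value to a source are assumptions made for illustration; nothing here is confirmed as Google's method, and a shared author key only nominates records for comparison, since linking "Lord of the Rings, v.1" to "The Fellowship of the Ring" needs knowledge that string matching cannot supply.

```python
import re
from collections import defaultdict


def author_key(name: str) -> str:
    """Reduce a personal name to 'initials surname' in lower case."""
    parts = re.findall(r"[A-Za-z]+", name)  # "J.R.R." becomes J, R, R
    if not parts:
        return ""
    *given, surname = parts
    initials = "".join(part[0] for part in given)
    return f"{initials} {surname}".lower().strip()


# Field names and the source of each value are assumed for illustration.
records = [
    {
        "source": "Library of Congress",
        "title": "Lord of the Rings, v.1",
        "author": "John Roland Reuel Tolkien",
        "publisher": "Houghton Mifflin",
    },
    {
        "source": "Books in Print",
        "title": "The Fellowship of the Ring",
        "author": "J.R.R. Tolkien",
        "publisher": "Ballantine Books",
    },
]

# Group records whose author names collapse to the same key.
clusters = defaultdict(list)
for record in records:
    clusters[author_key(record["author"])].append(record)

for key, group in clusters.items():
    print(key, "->", [record["title"] for record in group])
```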




Annotations

The Future

Google Editions

Buy Anywhere:

Purchase directly on Google Books, devices, retail partner sites, affiliates, and brick and mortar stores.

Read Anywhere:

Users can read eBooks on desktop, tablets, iPhone, Android phone, and eInk readers. Cloud storage and cloud sync.

More to Read:

Target is 400K+ paid books and over 2M free public domain books.

Google Book Settlement (US only)

  • If approved, resolves lawsuit brought against Google

  • Benefits:

    • Rightsholder control

    • Snippets => 20%

    • Library subscriptions

    • Free terminal in every US public library building

    • Downloadable books for purchase

    • Access for the print-disabled

    • Book Rights Registry: a non-profit organization to find and pay rightsholders

    • Research corpus

Books as a corpus of human knowledge

  • Understand one book

  • Understand all books

  • Understand relations between books

Linguistic analysis

  • "Research that performs linguistic analysis over the Research Corpus to understand language, linguistic use, semantics and syntax as they evolve over time and across different genres or other classifications of Books."

Digital Humanities

Steven Abney and Terry Szymanski, University of Michigan. Automatic Identification and Extraction of Structured Linguistic Passages in Texts.

Elton Barker, The Open University, Eric C. Kansa, University of California-Berkeley, Leif Isaksen, University of Southampton, United Kingdom. Google Ancient Places (GAP).

Dan Cohen and Fred Gibbs, George Mason University. Reframing the Victorians.

Gregory R. Crane, Tufts University. Classics in Google Books.

Miles Efron, Graduate School of Library and Information Science, University of Illinois. Meeting the Challenge of Language Change in Text Retrieval with Machine Translation Techniques.

Brian Geiger, University of California-Riverside, Benjamin Pauley, Eastern Connecticut State University. Early Modern Books Metadata in Google Books.

David Mimno and David Blei, Princeton University. The Open Encyclopedia of Classical Sites.

Alfonso Moreno, Magdalen College, University of Oxford. Bibliotheca Academica Translationum: link to Google Books.

Todd Presner, David Shepard, Chris Johanson, James Lee, University of California-Los Angeles. Hypercities Geo-Scribe.

Amelia del Rosario Sanz-Cabrerizo and José Luis Sierra-Rodríguez, Universidad Complutense de Madrid. Collaborative Annotation of Digitalized Literary Texts.

Andrew Stauffer, University of Virginia. JUXTA Collation Tool for the Web.

Timothy R. Tangherlini, University of California-Los Angeles, Peter Leonard, University of Washington. Tools & Techniques for Automated Literary Analysis, Based on the Scandinavian Corpus in Google Books.

Insights into human progress

Old-fashioned trigrams

oxide of lead

may be thus

a heavy fire

a striking proof

miles distant from

terms of peace

presents the appearance

more than mortal

vexation of spirit

zeal and devotion

New-fangled trigrams

lesbian and gay

health care professionals

abuse and neglect

the overall process

shift away from

the power elite

a research project

the poor countries

probability of failure

increased awareness of

Google is preparing trigram data for release for research purposes

Source: Matthew Gray & Yuan K. Shen
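A trigram is a run of three consecutive words, and lists like the two on this slide can be produced by counting trigrams separately for older and newer books and comparing the counts. The sketch below shows the idea on two invented sample sentences. The sentences, the function names and the "appears in one period only" test are all assumptions for illustration, not the method behind the slide; a real comparison would use relative frequencies over millions of books and would need proper tokenization instead of splitting on whitespace.

```python
from collections import Counter


def trigrams(text: str):
    """Return every run of three consecutive whitespace-separated words."""
    words = text.lower().split()
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]


def count_trigrams(texts):
    """Count trigrams across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(trigrams(text))
    return counts


def period_only(counts_a, counts_b):
    """Trigrams that occur in period A but never in period B."""
    return sorted(gram for gram in counts_a if gram not in counts_b)


# Invented sample sentences, standing in for text from two periods.
old_texts = ["the oxide of lead may be thus prepared"]
new_texts = ["training for health care professionals is a research project"]

old_counts = count_trigrams(old_texts)
new_counts = count_trigrams(new_texts)

print("old only:", period_only(old_counts, new_counts))
print("new only:", period_only(new_counts, old_counts))
```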

Thank You!