Sunday, December 19, 2010

Browse cultural trends through 500 years of history with Google's new tool

Google Books Ngram Viewer tool lets you browse cultural trends throughout history, as recorded in books.

In recent years, Google has built a database drawn from nearly 5.2 million digitized books. It consists of the 500 billion words contained in books published between 1500 and 2008 in English, French, Spanish, German, Chinese and Russian. So far, Google has scanned more than 11 percent of the entire corpus of published books, about two trillion words.

Erez Lieberman Aiden, a junior fellow at the Society of Fellows at Harvard, and Jean-Baptiste Michel, a postdoctoral fellow at Harvard, assembled the data set with Google and spearheaded the research project. Their study can be found in the journal Science.

“The goal is to give an 8-year-old the ability to browse cultural trends throughout history, as recorded in books,” Mr. Lieberman Aiden told The New York Times. Mr. Lieberman Aiden, whose expertise is in applied mathematics and genomics, calls the method “culturomics.”

According to the scientists, this tool will expand our understanding of language, culture and the flow of ideas. Here's one interesting finding: looking at inventions, they found that technological advances took, on average, 66 years to be adopted by the larger culture in the early 1800s, but only 27 years between 1880 and 1920. They also estimated that the English lexicon has grown by 70 percent, to more than a million words, in the last 50 years.

The Google Books Ngram Viewer can be found at this link. The data set can be downloaded here, so you can build your own search tools.
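If you want to try building a small search tool of your own, here is a rough Python sketch of the idea. It assumes each row of a downloaded file is tab-separated with the n-gram first, then the year, then the number of matches, and it ignores any further columns. That layout is an assumption, not something described in this post, so check the documentation that comes with the download. The function names and the sample rows are made up for illustration.

```python
from collections import defaultdict


def parse_ngram_lines(lines):
    """Sum match counts per n-gram and year from tab-separated rows.

    Assumed row layout (hypothetical, verify against the real files):
    ngram <TAB> year <TAB> match_count [<TAB> further columns, ignored]
    """
    counts = defaultdict(dict)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        ngram, year, matches = fields[0], int(fields[1]), int(fields[2])
        counts[ngram][year] = counts[ngram].get(year, 0) + matches
    return counts


def yearly_series(counts, ngram, start, end):
    """Return (year, match_count) pairs for start..end, using 0 for missing years."""
    by_year = counts.get(ngram, {})
    return [(year, by_year.get(year, 0)) for year in range(start, end + 1)]


# Invented sample rows, only to show the shape of the data.
sample = [
    "telephone\t1900\t120\t80\t40",
    "telephone\t1901\t150\t90\t45",
    "radio\t1901\t30\t20\t10",
]

counts = parse_ngram_lines(sample)
print(yearly_series(counts, "telephone", 1899, 1901))
```

With a real file you would pass an open file handle instead of the sample list, since the parser only needs something it can loop over line by line.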

Image credit: Cover of Trend: illustrated Japanese-English dictionary of things Japanese, printed in 1999. ISBN 4095050713, 9784095050713

(Via New York Times)
