Accessing Knowledge From Clay Tablets to Google

by Ronald L. Burgess

Founded 24 January 1895
Meeting Number 1699 – 4:00 P.M. April 29, 2004
Assembly Room, A. K. Smiley Public Library

Part I

One of mankind’s most profound distinctions from other living forms is our creation, accumulation, and dissemination of knowledge. It has been said, “knowledge is power.” And so, throughout the ages, the control of knowledge has meant the control of power itself.

This control of knowledge may well have started before the Assyrian scribes developed a simple written nomenclature to represent details of the harvest, trade, or war (to remember the killed and injured). We know that these first written languages developed far beyond accounting and reporting, to full stories that had been verbalized for generations, or represented in elaborate friezes and paintings of events meant to “tell the story”.

In those ancient days certainly most of the day-to-day knowledge of manufacturing, agriculture, and construction was handed down on the job through apprenticeships. But, as civilization became more complex, so did the methods of knowledge retention. One wonders which led and which followed, much like the proverbial chicken or the egg.

When the accumulation of knowledge in written, arithmetic, and pictographic form began to overburden the memories of its custodians, the first organizational consultants must have seen an opportunity!

The first libraries were “Houses of Writings” or “Places of the records of the Palace of the King” in about 3200 BC. The first librarians were called scribes. Libraries began to appear in Greece about 500 BC. Demetrius of Phaleron was the first librarian of Alexandria.

By 330 BC Ptolemy I and Ptolemy II had built the library of Alexandria into the largest repository of knowledge in the known world. According to World Book (on-line version), “The Ptolemys borrowed books from libraries in Athens and other cities and had them copied. According to legend, Ptolemy II shut 72 Jewish scholars in cells on the island of Pharos until they produced the Septuagint, the first known Greek translation of the Hebrew Old Testament.” “The Alexandrian Library had a copy of every existing scroll known to the library’s administrators. The library housed more than 400,000 scrolls. A succession of famous scholars headed this library, which became famous for the scholarly studies it supported as well as for its collection.”

All these scrolls became cumbersome to find, so Demetrius or Callimachus of Cyrene developed categories for organization. The scrolls were organized into 10 subject areas: poetry, the drama, laws, philosophy, history, oratory, medicine, mathematical science, natural science and miscellany. He also developed an alphabetical author index. The evidence is not clear, but this may have been the first large scale indexing of knowledge.

One might assume this wonderful indexing provided access to whoever wanted it. But, the ruling classes had no fear. If the knowledge is in one place, it can be controlled, but further, one first had to be able to read. Reading, in the ancient world, was so rare that some kings thought it was a special spiritual gift that only scribes could obtain. Eventually, kings overcame this perception and learned of the power of deciphering this invaluable resource themselves.

Ruling classes and families have carefully educated their children for millennia. Surely, it was considered a civic duty to prepare qualified leaders; but whether they intended to do so or not, they also controlled the power, through control of the knowledge. Intelligence has always provided superior military and political tactics. The lack of intelligence can hinder even the strongest government, as our ventures into Iraq show even today! It therefore may be no surprise that Alexander the Great was also educated by one of the greatest teachers of all time, Aristotle, the best that his king father could provide.

The destruction of the great library of Alexandria was as much a symbol of the destruction of organized knowledge, as just bringing down a building.

If knowledge is power, then the organization of knowledge facilitates power. Europe’s dark ages were probably not seen as such by those who lived between 450 and 1066. However, the systems of retaining, cataloging, and disseminating knowledge did not begin to recover during this period, having been kept alive not by the ruling classes but by the monasteries, almost intentionally remote to cloister rare books, scrolls, and the knowledge they represented.

Following the Norman Conquest of England by William the Conqueror, it became the fashion to send sons off to France for education. And why not? French kings ruled England, and for 300 years French was the government’s official language. By the mid-1200s, French lost its edge in England, as the hybrid English language was commonly spoken by all but a few noblemen and kings.

After all, the first known library in a university was in 1450 at the University of Paris, and in short order the university became the center of recorded knowledge, although the wealthy also collected large numbers of books, to show their knowledge . . . or perhaps really their power!

Oxford began to thrive after 1167 when Henry II banned English students from attending the University of Paris. Oxford began a library of its own. Until this era, knowledge was imparted by learned men and a few very expensive, hand-engraved, woodblock-printed books. Books were kept in libraries as much for their rarity and central location as to catalog them. There just were not enough to go around. But, when Gutenberg invented movable metal type in 1440, everything began to change. By 1452 the famous Gutenberg Bible was in production. The following century found Bibles and other religious books and pamphlets in quantities hundreds of times higher than in the mid-15th century. As more content was available throughout Europe, the coveted skill of reading became more useful and, as thinkers had more access to knowledge, the Reformation would follow in 1517.

Still, it took many years to fill libraries throughout Europe, and while the knowledge in them would evoke new thought from the advantaged, the common man had little access to the books.

Meanwhile, the educated class needed to find the information. Classification systems were still used by some; others assigned a place on the bookshelf for each book, mixing subjects, authors, categories, etc. This proved to be fine for the assignment of bookshelf real estate, but tough on the researcher. Abstracts by discipline and publishing of subject indexes did not evolve until the 18th century. It was not until William Frederick Poole, in 1848, published Poole’s Index, and Melvil Dewey, in 1876, developed the modern Dewey Decimal System, that we really had a workable system.

Public access began in Europe in the 19th century, although the first rental library opened in Edinburgh as early as 1725. The modern public library movement began in England in 1847 with the Committee on Public Libraries, which produced the 1850 Public Libraries Act. The Act enabled cities with over 10,000 in population to levy taxes to support libraries. However, progress was slow, with only 75 libraries by 1877 and just 300 by 1900.

In the United States, Boston opened its Boston Public Library in 1854, but the real access to the public was not until philanthropy drove the institution. In 1881 Andrew Carnegie gave the money for a library in Pittsburgh, PA and by 1920 he had donated money for 2500 libraries in the United States.

It took the ruling classes, scribes, monks, and universities four millennia to classify knowledge, and in just 40 years Andrew Carnegie and others brought this knowledge directly to the public. . . at least in the United States.

We, of the Fortnightly Club, are the recipients of similar generosity and foresight right now, sitting in the A.K. Smiley Public Library. As you all know through Larry Burgess’ continual public praise for the Smiley philanthropy, Andrew Carnegie himself visited this very library, stating, “Mr. Smiley went out of his way and borrowed money in order to make his gift of love.” We, as members of the Redlands Fortnightly Club, are all beneficiaries of the countless people who went before us and provided the reference materials for our primary activity.

Part II

And so . . . is this the end? Of course not. Most of you already know the background material just recited. Now the subject begins.

Even visionaries who understood the personal computer in 1979 missed the innovation that would impact the world even more than the PC: the Internet. Known only in the lab, academia, and the military prior to the 1980s, it drew general interest only after 1994, thanks to the World Wide Web protocol. That same year saw the invention of a web browser, Mozilla, by Marc Andreessen, and the founding of Netscape. The Internet had become a hot media topic by 1996 and the subject of still more media attention when it blew apart the stock market.

But while the media frenzy about the overrated Internet continued, millions of “web surfers” continued to use it and to increase the amount of time they spent online. As of the writing of this paper, nearly two-thirds of American households are “on-line.” Eighteen- to 35-year-old males spend more time online than they do watching TV. Seventy-eight percent of company CEOs report that they go online every day before they leave the house for work. They also report that the Internet (along with glossy industry magazines) ranks at the top as their favorite research source.

While the economy continues the slow growth of the past year, by most measures similar sectors have doubled and tripled their growth online. Retail sales consistently increased in the low single digits, while revenue from online buying went up a whopping 20% to 50%, depending on the category.

By 2004, it is very clear that the Internet is here to stay, and as only a teenager, it will likely change our lives in many more ways still unknown. So, what does this have to do with knowledge and indexing it? Plenty.

The early days of the Internet were not nearly as sexy as today (this is true both figuratively and literally). The initial years saw a text only document transferring system, used primarily by university researchers who found it an ideal way to collaborate and get the very latest information in their field. Because the Internet was not indexed in any way, users had to know the URL (Uniform Resource Locator) addresses of the computer where the document was stored. The URL address is similar to your house address, but can also address the room, cabinet, drawer, and file. Without the ability to surf the web, most of us found little use for it. Who could remember the long address of each and every document?
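The house-address comparison can be made concrete with a short Python sketch. The address below is hypothetical, invented purely for illustration; the sketch simply splits it into the parts that correspond to the house, room, cabinet, drawer, and file.

```python
from urllib.parse import urlparse

# Hypothetical address, invented for illustration only.
url = "http://www.example.edu/physics/papers/1994/results.txt"

parts = urlparse(url)

# The host name plays the role of the house address.
host = parts.netloc

# Each path segment narrows the location further:
# room, cabinet, drawer, and finally the file itself.
segments = [s for s in parts.path.split("/") if s]

print(parts.scheme)  # the protocol used to fetch the document
print(host)
print(segments)
```

Every document needs its own complete address of this kind, which is why remembering them all was impractical without some form of index.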

But the natural progression, with important research papers becoming accessible to all, spurred several visionaries to see the need for an organized way to find information across many university computers and a growing number of other private Internet computers. Yahoo developed the first successful and well-known search mechanism: a way to search indexed pages on the Internet.

The number of documents was not particularly impressive in the mid-1990s compared with libraries, but the beast was unleashed. In the early days, search became a prestige item for professors, who checked how many times their papers were listed under their name. The rest of us began to aimlessly look all over the world at museums, libraries, art collections, restaurant reviews, recipes, and every manner of topic imaginable.

But we were infatuated with the technology, the instant access and the snooping. We almost gleefully followed a subject heading that looked interesting, even if it had nothing to do with the phrases we typed into the search engine. These phrases have come to be known as “key words.” And, as many users began to use the Internet as a way to find information, they gravitated to the many experimental search engines to see if they could find better, more relevant information. The network of documents was decentralized in each university computer (on the Internet they are called servers), and a huge number of other Internet servers. Because no one controlled all the servers, anyone could place whatever they wanted on their own server, and make it available to the total network or Internet.

This created a kind of knowledge and data anarchy: people with their own interests and opinions providing instant access to millions of documents from their own computers, with no standards or organization. Dewey and Poole are turning over in their graves!

Several of the hundreds of search engine schemes tried to order the new universe similar to the way the old one was ordered, by categories of knowledge. Yahoo took the lead by viewing all websites to be indexed with human eyes. But, with millions of web pages turning into billions of web pages, this method created long lines of new web pages wanting to also be found on the web. Many other search engines were glad to take up the slack. Like a gold rush, all the search engines in the late twentieth century scrambled to find and index the most pages.

The land rush to knowledge was on, and available to all with access to the Internet. As a society, we are still just beginning to understand what this means. Could all knowledge become one giant library, museum, or document repository with instant access to scholars? Just like the human teenager, the teenage Internet is gangly and uncoordinated for now.

Attempting to categorize knowledge by only one subject, title, or author has its limits. How do you know that a book on the Southern US contains a reference to Kentucky, or Nashville, unless you walk to the bookshelf, pull the book, and look in the index, assuming the index is really complete? If the book is missing, you’ll never know until it is checked in again. This new machine (the Internet) could index an almost unlimited number of characteristics about a document, and it was accessible all the time. So, why not try it?
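Indexing every characteristic of a document, rather than one subject heading, can be sketched as a word-by-word index. This is only a minimal illustration in Python: the three tiny "documents," their names, and the helper functions build_index and lookup are all hypothetical, and real search engines are far more elaborate.

```python
from collections import defaultdict
import re

# Hypothetical miniature "library": document name -> text.
documents = {
    "southern_us_guide": "Travel notes on Kentucky, Tennessee and Nashville.",
    "citrus_history": "Oranges made Redlands famous across the country.",
    "music_cities": "Nashville and Memphis shaped American music.",
}

def build_index(docs):
    """Map every lowercase word to the set of documents that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index

def lookup(index, word):
    """Return the sorted names of the documents that contain the word."""
    return sorted(index.get(word.lower(), set()))

index = build_index(documents)
print(lookup(index, "Nashville"))
print(lookup(index, "Kentucky"))
```

With such an index, the question of whether the Southern US book mentions Nashville is answered without pulling anything from the shelf.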

Harry Jackson Jr. of the St. Louis Post-Dispatch states, “The volume of information is incomprehensible. Observers now speak of the amount of information on the Internet in terabytes, 1 trillion bits of information. Google’s engine searches 3.4 billion Web pages to bring back information. For example, all recorded knowledge could be stored in 100 terabytes.” Many now expect all recorded knowledge to be stored somewhere accessible to the Internet one day. But how will we find it?

Young programmers invented all manner of ways to do this. They invented software robots, crawlers, and other indexing devices and databases to attempt the monumental job of indexing the knowledge on the web. But the problem of context can be tough to decipher. Does a phrase like Banana Republic refer to a South American country, or the American sportswear store? Does Redlands Orange mean our famous oranges or the street?

Determining the context of information is not the only problem. If you find a relevant page on a medical condition, how do you know if it is authoritative? We trust publishers and librarians to print and stock books of some repute. But when anyone can publish to the whole world for the cost of a single book, how do we know the information can be trusted?

In researching the historical library section of this paper, I found a surprising difference of “fact” about what the first library was. The key phrase “oldest ancient library” yielded hundreds of documents. But the oldest library described was different in the first three documents! It seems we have the same original source issue that plagues all researchers and historians. What is to be believed and what is just an opinion?

States Clay Shirky, in “Fame vs. Fortune: Micropayments and Free Content,” “Creators are not publishers, and putting the power to publish directly into their hands does not make them publishers. It makes them artists with printing presses. This matters because creative people crave attention in a way publishers do not. Prior to the Internet, this did not make much difference. The expense of publishing and distributing printed material is too great for it to be given away freely and in unlimited quantities — even vanity press books come with a price tag. Now, however, a single individual can serve an audience in the hundreds of thousands, as a hobby, with nary a publisher in sight.”

It took Gutenberg decades to stir the intellectual community, but in less than 11 years, we are on our third knowledge revolution on the Internet. A small company started by two students at Stanford University challenged the strength of Yahoo and the money of Microsoft by completely re-thinking the notion of important knowledge. In just three years, Google has become the most used search engine in the world. Last year it made $300 million on revenue of one billion dollars! The founders, Larry Page and Sergey Brin, were in college just 5 years ago.

Startup Google, g-o-o-g-l-e (a play on “googol,” the mathematical term for a 1 followed by one hundred zeros), could overthrow billion-dollar companies if it delivered superior search results: results that returned important and authoritative pages on the search key phrase used. If the Internet is to become the worldwide resource for knowledge, it will have to solve these issues first.

Page and Brin reasoned that the millions of web authors, who dealt in the subject matter where they were experts, made hyperlinks from one of their pages to another respected page. Their research revolved around B. F. Skinner’s work with pigeons. Skinner’s work, “. . . relies primarily on the superior trainability of the domestic pigeon (Columba livia) and its unique capacity to recognize objects regardless of spatial orientation.” The trademarked Pigeon Rank is now used by Google to rank pages. They reasoned that if these millions of pigeons (web authors) linked to another site, it was because they thought it was relevant and authoritative. So, by using the Pigeon Rank of the original page that links to another page (called a back link), the second page thus gets its rank modified according to how many links that page has.

In other words, the sum of the importance of all pointing (back) links becomes the rank of the target page. The logic works like this: if Dr. Larry Burgess says Don McCue knows something about the Smileys, most informed members would take that as a hotter lead than if Ron Burgess said the same about McCue.

In a similar way this idea was tested to predict the next Nobel Prize winner. Those in the know about such things, and who respect the future winner, will provide a link to his work. They are themselves “hot” and so pass this along to the winner’s page. So, the sum of all the hot pages is the score of the target page itself. Thus, when the subject matter is determined, the Pigeon Rank determines which page is first, second, and so on.
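The summing of scores described above can be sketched in a few lines of Python. This is only a minimal sketch, loosely following the link-analysis method that Page and Brin published under the name PageRank; the page names, the link graph, the damping value of 0.85, and the function name rank_pages are all assumptions made for illustration, not Google's actual system.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages repeatedly; each page shares its score among the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_scores = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A "hot" page passes more score along each of its links.
                share = damping * scores[page] / len(targets)
                for t in targets:
                    new_scores[t] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for t in pages:
                    new_scores[t] += share
        scores = new_scores
    return scores

# Hypothetical link graph: page -> pages it links to.
links = {
    "university": ["foundation", "journal"],
    "journal": ["foundation"],
    "hobby_page": ["foundation", "university"],
    "foundation": ["journal"],
}

scores = rank_pages(links)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph nobody links to the hobby page, so it ends up last, while the page that several others point to collects the most score.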

The subject of the key phrase is determined by the text in the web page, the title, and the instructions given to the web robot by savvy webmasters. So the key words typed into the search engine are compared to the text on the page. Then the matching pages are ordered first to last by their Pigeon Ranking. While this is a rudimentary explanation of a complex mathematical algorithm, you can see that the attempt is being made to fulfill the request of the Google user: to bring relevant, authoritative information to the top of the list. While a little practice and skill at “Googling” can improve your results, you can see that the motivation is simple and pure: to provide knowledge at your fingertips.
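The two steps just described, matching the key words and then ordering the matches by rank, can be sketched as follows. The pages, their texts, and their authority scores are hypothetical values chosen for illustration, and the search function is an assumed name; a real engine weighs many more signals.

```python
# Hypothetical pages: name -> (text, precomputed authority score).
pages = {
    "shrine_home": ("Lincoln Shrine museum in Redlands", 0.9),
    "car_fan": ("My Lincoln Mark IV is my own Lincoln shrine", 0.2),
    "city_guide": ("Redlands restaurants and parks", 0.6),
}

def search(query, pages):
    """Return names of pages containing every query word, highest score first."""
    words = query.lower().split()
    matches = []
    for name, (text, score) in pages.items():
        page_words = set(text.lower().split())
        # Step 1: keep only pages whose text contains all of the key words.
        if all(w in page_words for w in words):
            matches.append((score, name))
    # Step 2: order the matching pages by their authority score.
    matches.sort(reverse=True)
    return [name for score, name in matches]

print(search("Lincoln Shrine", pages))
print(search("Lincoln Shrine Redlands", pages))
```

Adding a key word narrows the matches, while the authority score alone decides the order in which the remaining pages appear.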

This new way of searching for information on the Internet, has already killed the older category look-up function modeled doubtless on the Dewey Decimal System and the phone book. Has-been Yahoo has just changed its old system to one closer to Google’s. Why? Because typing exactly what you want is easier than knowing a classification system. Additionally it is not nested or hierarchical, requiring multiple clicks from broad to specific category. But does it work? Will we really be able to find authoritative information in this way?

When “Lincoln Shrine” is typed into Google, the Lincoln Shrine website has the first two positions, ranked against over 74,000 pages found. Not bad! The third is a personal page where a 20-year-old Lincoln Mark IV is the author’s own “Lincoln Shrine.” Other Lincoln Shrine references follow, with an article about the Lincoln Shrine Boy Scout Pilgrimage. RedlandsWeb is listed about 10th, as it contains links and information on the Shrine. But if you type “Lincoln Shrine Redlands”, the Shrine still gets top billing, but RedlandsWeb is now fourth, as it is authoritative and considered a “Hot” Google page for the key word Redlands. This is because of the quantity of links, and the high ranking of the websites that point to it, such as the University of Redlands, ESRI, The San Bernardino County website, and over 100 others. This list includes a reference to RedlandsWeb by our own Al Reid, in his January 2002 paper entitled Internet Search Engines, help or overload.

A small divergence here, as it may illustrate the point. Mr. Reid’s paper was delivered before I was a member, and because I have not yet read all previous papers on-line, I was unaware of his inclusion of the RedlandsWeb address in his paper. Because he included the full syntax, it was not only indexed by Google, along with the rest of his paper (and of course every other paper on-line), but the hyper-link was also considered a link to RedlandsWeb. Just as the one I just read to you will one day point to the actual website. By the way, I was relieved to find, following my own read of Reid’s well researched paper, that our subject matter was not entirely overlapping, as at the moment of this writing I didn’t have time to start a new topic!

Back to the subject.

So, with a few examples we can see that Google at least is pretty good at coming up with the subject matter.

But, what about more important issues such as health? Type in “Rheumatoid Arthritis.” The first three results are from the Arthritis Foundation, a university and the US Library of Medicine, a government site. Are these sites authoritative? Perhaps only an authority knows, but chances are that authoritative sites have pointed to these sites.

Looking only at the top site, The Arthritis Foundation, 601 other websites have links to this site. The top back-links include Johns Hopkins University, The University of Buffalo, three other medical directories specifically about arthritis, and naturally some sites with arthritis cures, which undoubtedly want to associate with legitimate and authoritative arthritis sites if possible.

The Google logic here is that websites run by experts in a field have selected authoritative websites to link to; therefore a measure of how many of these experts have selected a website to link to would result in authoritative and relevant knowledge.


As successful as Google is commercially and with the general web surfing public, Internet search has a long way to go to be accepted by academics, librarians, researchers, and even Internet geeks as a replacement for books and original source work.

Here is a recent excerpt from Internet Search writer Andy Beal, on the subject of relevance.

The Future of Search Engine Technology
By Andy Beal – January 28, 2004
Overcoming The Lack Of Relevant Search Results

Even today, conducting a search on any of the major search engines can be classified as an “enter your query and hope for the best” experience. Google’s “I’m Feeling Lucky” button, while designed to take you directly to the number one results, could ironically be a truism for its entire search results (process?). Enter your desired search words into any of the search engines and you often end up crossing your fingers and hoping that they display the type of results you were looking for. Since the recent updates of “Florida” and “Austin”, complaints that Google, in particular, is displaying less relevant results have escalated (although mostly by those who lost important positioning that they had assumed was their right to maintain).

There is, of course, evidence that the search engines are trying to enhance their search results- so that they can better anticipate the intentions of the searcher. Search for “pizza Chicago” at Yahoo, and you’ll see that the top results include names, addresses, telephone numbers and even directions to pizza restaurants in Chicago, a great improvement on previous results. Even when you take everyone’s favorite search term example, “windows”, you can see that the search engines are at least trying to determine your intent. While Yahoo and Google still display search results saturated with links discussing Microsoft’s pervasive operating system, enter your search over at Ask Jeeves and the chirpy English butler will ask you if you meant “Microsoft Windows” or “Windows made out of glass”.

Searching the Internet is not perfect; it still requires the human element of understanding sources and being on the lookout for fakes. Google seems to be working toward ways to indicate the ranking of material. Perhaps someday universities will point to each other’s articles that they respect, in an attempt to increase the effectiveness of understanding authority on the Internet. In fact, this is already done in some important areas. The prestigious and authoritative “New England Journal of Medicine” has a very extensive website of articles. The articles themselves contain abstracts that indicate which articles have linked to the article in question. The main website also links to Institutions and Libraries, adding a coveted Google Pigeon Rank, I am sure.

The vernacular of search still needs refinement. Specialty directory sites still exist because large global issues do not translate to local requirements. RedlandsWeb has one of the highest traffic counts about Redlands, because it is concerned only with local websites. Even Yahoo and Google cannot find and catalog as many sites and as much local information as we can locally. If you want to buy a major appliance locally, you can type the key words Whirlpool and Redlands into a large search engine, and it will yield stores from throughout the county. This is because websites misrepresent themselves by listing hundreds of cities in their pages.

The battle for creating better ways of accessing knowledge is still heating up. As Yahoo dominated earlier directories on the Internet, and Google eclipsed Yahoo, Microsoft and others (including Yahoo) are betting big-time on new improvements to search which will bring the golden fleece (advertising dollars) to their door step. . . well front page anyway.

States Andy Beal again, “Smaller search engines continue to improve the user experience such as Grokker an interface that groups search results graphically. Eurekster, combines the social networking elements that are used by sites such as Friendster, and provides results that can be filtered based upon what members of your group are searching. While all of these are interesting and provide a glimpse of the future of search, it will not be the small companies that change the way we search. With Google about to get an influx of cash from its upcoming IPO, Yahoo re-vamping Inktomi and Overture, and Microsoft finally jumping into the search arena, it will be these search engine powerhouses that enhance our search experience and take search engine technology to the next level.”

The next level represents a complete transformation of how we access knowledge, by ironing out simple quirks in usefulness, and educating ourselves about how to find and test resources of information. While the battle lines are forming again on the search engine front, driven by huge financial payoffs, the educational world seems to be moving in a similar direction. WASC (the school accreditation body) now requires “resource centers” not libraries for its accreditation of secondary schools. Locally, Arrowhead Christian Academy is spending $50,000 on technology for one room to create a Resource Center where the library once was. It is an amount that may seem large for a single room, but represents only 1500 or so books. Combined with other paid, secure on-line resources the little school in Redlands will have access to libraries and databases worldwide.

This new Resource Center, armed with two large LCD monitors will allow instruction on Internet resource gathering, while the student can follow along on the computer of his or her own desk. This brings rich multimedia and authoritative materials to the classroom or personal computer at home.

Whether or not authoritative acceptance in each subject area is fully embraced by the intelligentsia, it is clear that the knowledge of knowledge will never be the same. The Internet is as important as the Gutenberg Bible in bringing information to the masses. If knowledge is power, the power will be transferred to whoever wants it, and it will transform all media that we have known in the past.

Bill Gates, founder of Microsoft, said some years ago that the Internet would take the friction out of the transaction. This is certainly true of the gathering of knowledge as well. While I attempted to balance my research between periodicals, books, and the Internet, I found myself reliant on just one more search to check something out (a function that is done in literally seconds). The trip to the card file (even if it is a computer database) and then back to the stacks, the stooping and stretching to find only a few selections on a subject, then to look in the index for a reference, has a lot of friction. We’ll have more time for thinking about the subject when we have to research less to get the relevant information.

Amazon, the huge on-line book retailer, just released a new database that literally scans all pages of every book for sale. Your next search will likely list the contents of a book online. Knowing that a book contains the exact subject you need is a powerful reason to buy it. But the text is not on-line for free . . . yet. The motivation, of course, is to separate us from some of our money. But the ability to place all knowledge on-line is a reality.

The transformation next to come in some ways may be slower but more dramatic. Will the warehouses of future knowledge be libraries and museums or server farms? Have the Carnegie Libraries and even A.K. Smiley Public Library seen their best years? How will schools cope with the basics? Will we integrate all school learning with the available treasures online? Not likely. But a change is coming. Knowledge, as we have indexed and catalogued it, is changing and changing fast. We are at the threshold of the largest and most comprehensive change in how we access knowledge since the library of Alexandria was organized by Demetrius, and the invention of the printing press by Gutenberg.

It’s really exciting. . . but somehow what I still look forward to is a quiet afternoon, in our own beautiful reading room, enjoying the garden outside our window, with a great book! Try that on the Internet!

End Notes

  1. Scott Lee, Antelope Valley College.
  2. The Perseus Digital Library, Tufts University.
  3. Scott Lee, Antelope Valley College.
  4. The Story of English, Robert McCrum, William Cran, and Robert MacNeil. Viking Press, 1986, pp. 74-76.
  5. A Brief History of Oxford.
  6. DotPrint, on-line printing industry news organization. Before Gutenberg’s innovation, most books were produced by and for the Church using the process of wood engraving. This required the craftsman to cut away the background, leaving the area to be printed raised. This process applied to both text and illustrations and was extremely time-consuming. When a page was complete, often comprising a number of blocks joined together, it would be inked and a sheet of paper was then pressed over it for an imprint. The susceptibility of wood to the elements gave such blocks a limited lifespan.
  7. Mary Bellis.
  8. In 1517, Martin Luther was to do something, albeit by accident, that was to change the face of the world as it was then known in Western Europe, and introduce the German Reformation.
  9. Heidi Lee Hoerman. Ms. Hoerman graduated from Bates College in Lewiston, Maine with a BA in English and from Indiana University with an MLS. Heidi is also ABD in Library and Information Science from Indiana University. She has done additional graduate work at Indiana University, Columbia University and Montana State University.
  10. History of Libraries in the Western World Lecture, Scott Lee, Antelope Valley College.
  11. “The Smileys” by Larry E. Burgess, Commemorative Edition, reprinted 1991, Moore Historical Foundation.
  12. The History of The Internet.
  13. Fame vs. Fortune: Micropayments and Free Content.
  14. “Search and Destroy,” Lev Grossman, Time magazine, December 2003, referenced here and elsewhere in this paper.