This class is about electronic access to information: how it is structured, accessed, and manipulated.
| Type | Resource | Access Points |
| Indexing Only | | Author, Title, Subject |
| Indexing Only | Poole's Index to Periodical Literature (1802-1906) (print) [example] | Subject (requires separate name index and subject index to use effectively) |
| Indexing Only | C19 (online) | Author, Title, Source Title, Subject, Limit by Date, Keyword |
| Indexing and Abstracting | Biological Abstracts (print) [example] | Author, Title, Subject |
| Indexing and Abstracting | Agricola (online) | Author, Title, Source Title, Subject, Descriptors, Abstract, Notes, Keyword, and many others |
| Indexing and Abstracting | Bibliography of the History of Art (online) | Author, Title, Source Title, Abstract, Classification, Notes, Keyword, and many others |
| Indexing and Abstracting | Biological Abstracts (online) | Author, Title, Source Title, "Topic", Controlled Vocabulary |
| Indexing and Abstracting plus Full Text Searching | Academic Search Complete (online) | Author, Title, Subject, Source Title, Abstract, Full Text (optional search) |
| Indexing and Abstracting plus Full Text Searching | U.S. Congressional Serial Set (1817-1980) (online) | Author, Title, Subject, Bibliographic Numbers, Personal Names, Geographic Location, Publication Category, Full Text |
| Indexing and Abstracting plus Full Text Searching | Early English Books Online (1475-1700) (online) | Author, Title, Subject, Full Text |
| Indexing and Abstracting plus Full Text Searching | Emerald Library (online) | Author, Title, Subject, Source Title, Abstract, Full Text |
Why would you ever want to search just surrogate records rather than full text?
Full text scanning is not always so great. Look at this example.
Download it.
Data pulls - citation managers
Bibliographic citation software
RefWorks
ProCite
EndNote
Reference Manager
Papyrus
Library Master
Biblioscape
Nota Bene
Citation Machine
Scholar's Aid
Citation
WebClarity
Zotero
Mendeley
My theories of typical user behavior
1. Lack of awareness.
Most Web users have no idea what is happening to them when they surf the Web.
2. Lack of discernment.
Most users don't apply critical thinking and evaluative principles to the content they encounter on the Web.
3. Path of least resistance.
Users give up if they don't find full text right away. If they encounter citations only, they simply give up. They prefer the less credible resource if it is available now in full text over the more credible resource if it is not readily available.
In the early days of the book it was nearly possible to have read all extant books and for the human mind to remember everything that was read. Quick flip to today when a search engine provides the “memory” and nearly all extant publications are retrievable with an instant search. What is in the middle is the history of indexing – and a complex history it is.
Walter Ong, Orality and Literacy.
Walter Ong has noted the differences between the ways of managing knowledge in oral cultures versus the ways of managing knowledge in literate cultures. Indexes are essentially lists. Lists did not exist in oral cultures – there was no need. When writing came about, lists eventually became necessary. Alphabetic indexes developed first for manuscripts, but the obvious problem was how to refer to a location within a manuscript. When printing came about, page numbers enabled indexing to refer to places throughout all copies of the same imprint (see Ong, p. 123ff.).
In oral cultures memory was king.
In literate cultures, the index was king.
In our Internet culture the search engine is king.

Modern periodical indexing began at the beginning of the 19th century. William Frederick Poole began indexing reviews and periodicals during the 1840s, and by 1853 he had published his Index to Periodical Literature. This evolved into Poole's Index to Periodical Literature, with coverage extending back to 1802. It was a remarkable work for its time. The interesting “Chronological Conspectus of the Serials Indexed” gives year-by-year and title-by-title coverage of each volume indexed.

As data increased, methods of accessing that data became necessary. Two sciences arose to make this possible: classification and indexing (Kuhr 1993). The difference between these two is extremely significant. Classification is a hierarchical system, often based on numbers, that groups items into broad divisions and subdivisions. Indexing is sometimes based on the language of the original author (such as in a book index); at other times it is based on a controlled vocabulary (such as in a periodical index).
The beginnings of the computer age saw many early experimental applications of computer technologies. From the Census Bureau's use of Hollerith punch-card machines to count people in 1890, through early computers like ENIAC and the evolution of the analog computer, to the development and growth of digital computing, computers have made their mark on information storage, access, and retrieval.
Not surprisingly, paralleling the development of computational power was the explosion of publishing. More publications meant more indexing.
My favorite indexing humor.

Osborn estimated that in 1980 there were 500,000 serials in the world (p. 45).
Gale Directory of Publications and Broadcast Media (141st ed., 2006) covers approx. 52,000 newspapers, magazines, journals, and other periodicals.
| Year | E-Journals and Newsletters |
| 1991 | 110 |
| 1992 | 133 |
| 1993 | 240 |
| 1994 | 443 |
| 1995 | 675 |
| 1996 | 1,689 |
From http://db.arl.org/dsej/2000/mogge.html
In 1992 it was estimated that over thirty publications could be considered scholarly journal publications.
Sasse, Margo and B. Jean Winkler. "Electronic Journals: A Formidable Challenge for Libraries." Advances in Librarianship 17 (1993): 149-173.
Unique electronic journals to which Penrose Library subscribes:
44,647 as of 1/1/07
59,308 as of 1/15/08
85,166 as of 11/25/08
100,657 as of 1/5/12
129,615 as of 3/23/13
So let's take a look at various kinds of indexes throughout the ages.
Alphabetic Index. The most basic and earliest indexes were alphabetic. The early idea of a list of ideas, people, or places evolved into nested indexes.
Book Index. We are all familiar with the structure of the modern book, with a title page in the front, followed by the table of contents, then the main body of the work, and lastly the index.
Periodical Index. Serial publications generally have articles authored by a variety of authors, and from early on periodicals often issued indexes to the contents of their publications, sometimes annually, sometimes less frequently.
Cumulative Indexes. So that users need not check many separate indexes (such as annual ones), a cumulative index gathers larger periods of time together, saving the time of the user.
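The idea behind a cumulative index can be sketched in a few lines of Python. This is only an illustration: the periodical, headings, years, and page numbers are hypothetical, and the function name `cumulate` is my own. It merges several annual indexes into one alphabetized index so that each heading needs to be looked up only once.

```python
from collections import defaultdict

def cumulate(annual_indexes):
    """Merge {year: {heading: [pages]}} into {heading: [(year, page), ...]}."""
    cumulative = defaultdict(list)
    for year in sorted(annual_indexes):
        for heading, pages in annual_indexes[year].items():
            for page in pages:
                cumulative[heading].append((year, page))
    # Alphabetize the headings, as a printed index would.
    return dict(sorted(cumulative.items()))

# Hypothetical annual indexes for an imaginary periodical.
annual = {
    1901: {"Railroads": [12, 340], "Libraries": [77]},
    1902: {"Libraries": [5, 210], "Telegraph": [98]},
}

for heading, refs in cumulate(annual).items():
    print(heading, refs)
```

Without the cumulation, a user looking for "Libraries" would have to consult the 1901 index and the 1902 index separately.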
See: http://www2.sims.berkeley.edu/courses/is245/s03/verbal.html for KWIC, KWAC, KWOC
If it Quacks Like a Duck: KWIC, KWAC, and KWOC
KWIC means key word in context. Under this scheme each word that is not a stop-word is an index entry. Let's take a title as an example: A handbook for road repair crews
[insert scan from KWIC Index: A Bibliography of Computer Management]
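To make the KWIC scheme concrete, here is a minimal Python sketch using the example title. The stop-word list is an assumption of mine (each real index used its own), and the rotated-title layout is a simplification: printed KWIC indexes usually lined the keyword up in a fixed column with context on both sides.

```python
# Hypothetical stop-word list; real KWIC indexes used their own lists.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}

def kwic_entries(title, stop_words=STOP_WORDS):
    """Return sorted (keyword, rotated_title) pairs, one per non-stop word."""
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        if word.lower() in stop_words:
            continue  # stop-words never become index entries
        # Rotate the title so the keyword leads; "/" marks the original start.
        rotated = " ".join(words[i:] + ["/"] + words[:i])
        entries.append((word.lower(), rotated))
    return sorted(entries)

for keyword, line in kwic_entries("A handbook for road repair crews"):
    print(f"{keyword:<10} {line}")
```

The six-word title yields four entries (crews, handbook, repair, road), because "A" and "for" are treated as stop-words.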
The justifications for these indexes were the speed with which they could be produced and the low cost of production, all motivated, of course, by the fact that the technology made them possible. These indexes began to appear in the mid- to late 1950s.
H.W. Wilson; print indexes; early computing
Key Dates:
1889: Halsey Wilson and Henry Morris start a Minnesota bookstore.
1898: Wilson buys out Morris and begins publishing the Cumulative Book Index.
1901: Readers' Guide to Periodical Literature is first published.
1903: H.W. Wilson is incorporated.
1913: Wilson sells the bookstore and moves to White Plains, New York.
1917: The company is relocated to The Bronx.
1954: Halsey Wilson dies.
1985: The company's first electronic product, a version of the Readers' Guide, debuts.
1997: The WilsonWeb web site is launched.
2011: Merged with EBSCO Publishing.
From: http://www.fundinguniverse.com/company-histories/The-HW-Wilson-Company-Company-History.html
History of Indexing – see: http://www.asindexing.org/site/history.shtml
The Natural History of Pliny – contained an index in volume 1 See: http://mirlyn.lib.umich.edu:80/F/?func=direct&doc_number=001769341&local_base=MIU01_PUB
1st encyclopedia in alphabetical order – ca. 900 A.D.
Ancient Indexing
Early Modern Indexing
Early Computer Indexing
Modern Print Indexing
Future of Computer Indexing
Kuhr, Patricia S. 1993. Abstracting and indexing. In World encyclopedia of library and information services. Third ed., 1-5. Chicago: American Library Association.
We have libraries filled with hundreds of thousands to millions of books. Yet students continue to approach the academic reference desk saying, “Why doesn't your library have anything on x?” No wonder they say this: our access points are deficient.
Students seem to be a bit happier when they search for journal articles. Why the difference? I call this problem “the information access anomaly.” This problem can be seen when we look at the size, structure, and extent of the surrogate bibliographic record for each respective information type compared with the full text of the item.
“Information wants to be found.”
| | Book (average) | Journal Article (average) |
| Typical Length - full text (FT) | 200 pages x 400¹ = 80,000 words | 15 pages x 400¹ = 6,000 words |
| Surrogate Record (SR) | 50-100 words (75 ave.) | 300-500 words (400 ave.) |
| SR to FT ratio | 1 to 1,067 | 1 to 15 |
¹ Ave. 400 words per page ( http://www.writersservices.com/wps/p_word_count.htm )
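The ratios can be recomputed from the table's own inputs. The short Python sketch below does just that; the function name `sr_to_ft_ratio` is mine, and the figures are the averages given in the table (400 as the per-page multiplier, 75 and 400 words for the surrogate records).

```python
WORDS_PER_PAGE = 400  # the multiplier used in the table

def sr_to_ft_ratio(pages, surrogate_words, words_per_page=WORDS_PER_PAGE):
    """Return how many full-text words stand behind each surrogate-record word."""
    full_text_words = pages * words_per_page
    return full_text_words / surrogate_words

book = sr_to_ft_ratio(pages=200, surrogate_words=75)
article = sr_to_ft_ratio(pages=15, surrogate_words=400)

print(f"Book: 1 to {book:,.0f}")
print(f"Journal article: 1 to {article:,.0f}")
```

With these inputs the book works out to roughly 1 to 1,067 and the journal article to 1 to 15. In other words, each word in a book's catalog record has to stand in for about seventy times as much text as each word in an article's record.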
How do the differences in ratios affect search strategies?
It's all about what you are searching and how you are searching it.
We go to Google and type in election statistics for colorado and we get 223,000 results. We type the same words in an academic library's online catalog and we get 10 results. We conclude that libraries are not helpful.
Let's say that we analyze the search and conclude that it was flawed. The user should have typed election statistics AND colorado. Now we get 14 results. Not much better. Explain the ratio of full text to words in the bibliographic record.
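A toy Python sketch shows why the same words behave so differently in the two settings. Everything here is hypothetical (the record, the full-text snippet, and the function name `matches_all`), and real catalogs and search engines do far more than this. The point is only that a Boolean AND search against a short surrogate record fails unless every term happens to appear in that record, while the same search against the full text succeeds.

```python
def matches_all(query, text):
    """Boolean AND: every query term must appear as a word in the text."""
    words = set(text.lower().split())
    return all(term in words for term in query.lower().split())

# Hypothetical item: a short surrogate record, and a snippet standing in
# for the full text of the same book.
surrogate = "Colorado votes : a history of state politics"
full_text = ("Chapter 4 presents election statistics for every Colorado "
             "county, with turnout tables from 1900 onward.")

query = "election statistics colorado"
print("surrogate record:", matches_all(query, surrogate))
print("full text:", matches_all(query, full_text))
```

The book is relevant either way, but the catalog search misses it because the words "election" and "statistics" never made it into the record.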
Who handles the syntax?
Who handles the classification? Pre-coordination, post-coordination
Who handles the logical operators? Boolean