| | 20th Century | 21st Century |
| Books | | |
| Serials | | |
| Newspapers | American Newspapers, 1821-1936: A Union List of Files Available in the United States and Canada | |
| Libraries | | |
| Bibliographic Networks | | OCLC (WorldCat); COPAC [Britain]; NACSIS [Japan], etc. |
General vs. Specialized (Subject) Encyclopedias
Searching databases is an integral part of reference librarianship. We used to offer LIS 4011 (Information Access & Retrieval), but that course is no longer offered; a new version of it will be offered as an elective in Spring 2014. Because this topic is so important, I will take a little time in this class to discuss databases: their history, their many varieties, how they are structured, and how to search them.
Database Fields: Definition (MIT Libraries)
A field is a container, a place where information is stored. That container may have rules: dates only, numbers only, 4-digit year only, control number only, any text, any text up to 255 characters, any text following strict entry rules, etc. Fields may also iterate (repeat). A record may have multiple authors, and a database needs to handle this. Some databases dump all author data into a single field; a better structure puts each author into a separate (iterating) field. Subjects also iterate.
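To make the idea of field rules and iterating fields concrete, here is a minimal Python sketch. The `BibRecord` class, its field names, and the sample values are all hypothetical and chosen only for illustration; they do not reflect any particular database's schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BibRecord:
    """Hypothetical record layout; field names are illustrative only."""
    title: str                                          # non-iterating field
    year: int                                           # rule: 4-digit year only
    authors: List[str] = field(default_factory=list)    # iterating field
    subjects: List[str] = field(default_factory=list)   # iterating field

    def __post_init__(self) -> None:
        # Enforce the "4-digit year only" rule on the year container.
        if not 1000 <= self.year <= 9999:
            raise ValueError(f"year must be four digits, got {self.year}")
        # Enforce the "any text up to 255 characters" rule on the title.
        if len(self.title) > 255:
            raise ValueError("title longer than 255 characters")


record = BibRecord(
    title="An Invented Title for Illustration",
    year=1999,
    authors=["Author, First", "Author, Second"],
    subjects=["Databases", "Reference services (Libraries)"],
)

# Each author sits in its own occurrence of the field, so a search
# can match one author exactly without splitting a combined string.
has_second = "Author, Second" in record.authors
```

If all authors were dumped into one text field instead, the same lookup would need string splitting and would depend on how consistently the names were punctuated.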
MARC 21 Concise Format for Bibliographic Data
OCLC Bibliographic Formats and Standards (3rd ed.)
Fixed Fields: OCLC
Fixed fields, by definition, contain fixed data in specific formats for the purpose of economy of record size. These letters and numerals contain volumes of information about the bibliographic record. Take a look at the 3-letter MARC Language Codes and you can see how a lot of information can be encoded in three letters.
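As a rough illustration of how much a fixed field packs into a few characters, the sketch below decodes part of a MARC 21 008 field by character position. The sample value is invented, the `decode_008` helper and the small `LANGUAGES` lookup are hypothetical, and only a handful of positions are decoded; consult the MARC documentation listed above for the full layout.

```python
# Build a sample 40-character 008 string piece by piece so the
# character positions are easy to see. All values are invented.
sample_008 = (
    "850101"      # 00-05 date entered on file (yymmdd)
    + "s"         # 06    type of date (s = single date)
    + "1985"      # 07-10 date 1
    + "    "      # 11-14 date 2 (blank)
    + "nyu"       # 15-17 place of publication
    + " " * 17    # 18-34 material-specific positions, left blank here
    + "eng"       # 35-37 language code
    + " "         # 38    modified record
    + "d"         # 39    cataloging source
)

# A tiny, incomplete lookup table used only for this example.
LANGUAGES = {"eng": "English", "fre": "French", "ger": "German", "spa": "Spanish"}


def decode_008(value: str) -> dict:
    """Pull a few values out of an 008 string by position."""
    if len(value) != 40:
        raise ValueError(f"expected 40 characters, got {len(value)}")
    code = value[35:38]
    return {
        "date1": value[7:11],
        "place": value[15:18],
        "language_code": code,
        "language": LANGUAGES.get(code, "unknown"),
    }


decoded = decode_008(sample_008)
```

Nothing in the string labels its own parts: the meaning of each character comes entirely from its position, which is what keeps the record small.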
Variable-length Fields: Webopedia
Iterating Fields - why would you ever want a field more than once?
Non-iterating fields - why would you ever NOT want a field more than once?
The MARC record is an ingenious invention from the 1960s that fakes a relational database structure.
Misc. Database Issues
- Data loads (dumps) - dirty data. Examples of bad data loads from EbscoHost: Art Abstracts | Library Literature (these problems were fixed after I reported them twice over the course of a year).
- OCR problems: Take a look at this search from Google Patents.
- Mapping / Conversion differences. Take a look at the same Medline record from three different vendors: PubMed [live PubMed record is here], OCLC FirstSearch, CSA. Here is an explanation of PubMed fields.
- Historic database issues. Look at these differing records for the same document from the Serial Set: LexisNexis Congressional | Readex Digital Serial Set. The LN Congressional records differ because indexing was done from title lists, not from examining the piece itself. See this article for further discussion.
This class is about electronic access to information - how it is structured, accessed, and manipulated.
Ordering of Indexed Information
Alphabetical Order: Alpha by author, alpha by title, etc.
Chronological Order: Peak keyword default
Classified Order: i.e. by call number (Dewey, Library of Congress, Superintendent of Documents)
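The three orderings above can be sketched as three different sort keys over the same records. The records below are invented, the newest-first direction for chronological order is an assumption, and treating a Dewey number as a plain decimal is a simplification that ignores the rest of a real call number.

```python
# Invented sample records; "dewey" holds only the class number.
records = [
    {"author": "Smith, A.", "title": "Title C", "year": 2001, "dewey": "025.04"},
    {"author": "Adams, B.", "title": "Title A", "year": 2010, "dewey": "020.7"},
    {"author": "Jones, C.", "title": "Title B", "year": 1995, "dewey": "011.3"},
]

# Alphabetical order: alpha by author (case-insensitive).
by_author = sorted(records, key=lambda r: r["author"].lower())

# Chronological order: newest first (direction is an assumption).
newest_first = sorted(records, key=lambda r: r["year"], reverse=True)

# Classified order: by class number, so related subjects sit together.
by_class = sorted(records, key=lambda r: float(r["dewey"]))
```

The records never change; only the key used to arrange them does, which is why one database can offer several display orders.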
How is an index different from a catalog?
Types of Indexes
Classified Indexes: EconLit; MLA International Bibliography; UNCRD Publications (bibliography and index I created)
Cumulative Indexes: Not relevant in online world, but important in print world
Monthly catalogue, United States public documents (note that this record has a "cumulative index note" that says: "Subject index, 1900-1971. (Includes index to former and later titles.) 15 v.")
Concordances: "An alphabetical arrangement of the principal words contained in a book, with citations of the passages in which they occur." - OED
The Harvard concordance to Shakespeare
The New Strong's exhaustive concordance of the Bible
A critical Greek and English concordance of the New Testament (online)
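A concordance in the OED sense quoted above can be sketched in a few lines of Python. The `build_concordance` function and the short stopword list are hypothetical, and the published concordances listed here were compiled with far more editorial care than this toy version.

```python
import re
from collections import defaultdict

# Words treated as not "principal"; this short list is an assumption.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "that"}


def build_concordance(lines):
    """Map each principal word to the numbered lines where it occurs."""
    concordance = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        for word in re.findall(r"[a-z']+", line.lower()):
            if word not in STOPWORDS:
                concordance[word].append((number, line))
    # Return the words in alphabetical arrangement.
    return dict(sorted(concordance.items()))


passage = [
    "To be, or not to be, that is the question:",
    "Whether 'tis nobler in the mind to suffer",
]

concordance = build_concordance(passage)

# Print each word with the line numbers of its citations.
for word, citations in concordance.items():
    print(word, [number for number, _ in citations])
```

A word that occurs twice in one line is cited twice, so "be" gets two citations for line 1.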
First-line, Last-line Indexes: Columbia Granger's poetry indexes index first and last lines of poetry. Example of an online first-line index.
String Indexes: From the early days of computers. A KWIC index is a type of string index. KWIC stands for key word in context. See Wikipedia entry.
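The following is a minimal sketch of how a KWIC index could be generated from a list of titles. The `kwic` function, the stopword list, and the sample titles are all hypothetical; historical KWIC programs differed in their details.

```python
# Words skipped as keywords; this short list is an assumption.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to", "for"}


def kwic(titles, width=30):
    """Return one line per keyword, with the keyword column aligned."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            keyword = word.strip(".,:;").lower()
            if keyword in STOPWORDS:
                continue
            left = " ".join(words[:i])    # context before the keyword
            right = " ".join(words[i:])   # keyword plus context after it
            entries.append((keyword, left, right))
    # Sort alphabetically by keyword so related titles fall together.
    entries.sort(key=lambda entry: entry[0])
    return [
        f"{left[-width:]:>{width}}  {right[:width]}"
        for _, left, right in entries
    ]


titles = [
    "Searching Databases in the Library",
    "The History of Databases",
]

lines = kwic(titles)
for line in lines:
    print(line)
```

Every significant word in a title becomes an entry point, and the words on either side give the searcher enough context to judge relevance without opening the record.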
Exporting Data from Databases in a "tagged" format: This enables importing into bibliographic management software such as RefWorks and EndNote. We will demo in class.
An abstract differs from an annotation and an executive summary.
Descriptive Abstracts
Informative Abstracts
Critical Abstracts
Author Abstracts: ex. Dissertation Abstracts
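Returning to the "tagged" export format mentioned earlier: one widely used tagged format is RIS, which both RefWorks and EndNote can import. The notes do not say which format a given database uses, so RIS is an assumption here, and the `to_ris` and `parse_ris` helpers and the sample record are hypothetical. The sketch writes one record as tagged lines and reads it back.

```python
def to_ris(record: dict) -> str:
    """Write one record as RIS lines: two-letter tag, two spaces, hyphen, space."""
    lines = [f"TY  - {record['type']}"]                   # reference type comes first
    lines += [f"AU  - {a}" for a in record["authors"]]    # AU repeats, one per author
    lines.append(f"TI  - {record['title']}")
    lines.append(f"PY  - {record['year']}")
    lines += [f"KW  - {k}" for k in record["keywords"]]   # KW repeats as well
    lines.append("ER  - ")                                # end-of-record marker
    return "\n".join(lines)


def parse_ris(text: str) -> dict:
    """Read tagged lines back into a dict of tag -> list of values."""
    fields = {}
    for line in text.splitlines():
        tag, sep, value = line.partition("  - ")
        if not sep or tag == "ER":
            continue
        fields.setdefault(tag, []).append(value)
    return fields


# Invented sample record.
sample = {
    "type": "JOUR",
    "authors": ["Author, First", "Author, Second"],
    "title": "An Invented Article Title",
    "year": 1999,
    "keywords": ["Databases", "Indexing"],
}

ris_text = to_ris(sample)
parsed = parse_ris(ris_text)
print(ris_text)
```

The repeated AU and KW tags are the iterating fields from the database discussion, carried over into the export file.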