Manuscripts

 

Manuscripts, libraries & archives

Espen S. Ore, 22.4.02

Digitizing a library means creating both a digital catalog and digital data, and the two are often intermixed, for instance when metadata are stored in the data objects themselves. The digital world also brings data that have only ever existed in digital form: how should libraries and other archival institutions catalog and keep copies of such data?

Manuscripts are traditionally found in three different kinds of institutions: libraries, museums and archives. These three groups have their own methods and theories behind their retrieval systems, and this shows in the way they store their catalog information.

Catalog data are still very much influenced by printed catalogs and card archives. One of the aims for the future is to liberate cataloging systems and the use of metadata from this legacy.

Libraries have a tradition of cataloging for the specialist: a librarian would often function as an intermediary between the card catalog and the book or manuscript collection. When catalog information, and ideally copies of the books and manuscripts, are made available to the general public, it has to be done in a way that is useful to people other than library professionals; if not, it will go nowhere. This is an important area for further development.

Digital libraries

Marilyn Deegan, 19 April 2002

There are two issues here, of which I propose to deal with only one. They are: what is happening to libraries as institutions, which is probably not our concern, and what is happening in digital library research. Digital library research includes a great deal of methodological work which is of interest to the humanities computing community. Conversely, some of the formalisms of humanities computing have a great deal to offer the digital library world. The whole issue of complex digital resources for the humanities straddles the two fields.

Where are we now?

Digital library research sits at the conjunction of computer science and information science, and its application has a great deal of impact (of course) upon practical digital developments in libraries. It includes:

  • Complex networking applications and protocols
  • Information retrieval
  • Complex linking of library objects, for instance, SFX ‘a unique and revolutionary tool for navigation and discovery, delivering powerful linking services in the scholarly information environment’ www.sfxit.com
  • New work on information locators: URN, OpenURL, DOI, etc
  • Work in preservation such as OAIS (Open Archival Information System)
  • Complex metadata research (TEI, EAD, METS, MOA II)
  • Virtual reunification (papyri, Arnamagnæan collection)
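
As an illustration of the information locators mentioned in the list above, the following is a minimal sketch of how an OpenURL-style link might be assembled: a citation is encoded as key/value pairs and appended to the address of a link resolver. The resolver address, the helper name build_openurl and the citation values are all invented for this example; the field names (genre, issn, volume, spage, date) follow the early OpenURL key/value convention, and a real resolver may expect others.

```python
from urllib.parse import urlencode

# Hypothetical resolver address (an assumption for this sketch);
# in practice each library runs its own link resolver.
RESOLVER_BASE = "http://resolver.example.org/menu"


def build_openurl(base, **citation):
    """Return the resolver address followed by the citation as a query string."""
    # Drop empty fields so the resolver only receives what is actually known.
    fields = {key: value for key, value in citation.items() if value}
    return base + "?" + urlencode(fields)


# Made-up citation data, for illustration only.
link = build_openurl(
    RESOLVER_BASE,
    genre="article",
    issn="1234-5679",
    volume="12",
    spage="101",
    date="2002",
)
print(link)
```

The point of the design is that the link carries a description of the item rather than its location, so the reader's own library can decide where the link should lead.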

 

In the area of digital resources for the humanities, a critical mass of materials is being made available, and there are many projects and collections that one could cite. The following are of particular interest:

  • Gutenberg bibles which are available from Keio University, Cambridge University, the British Library, Goettingen
  • The Bibliotheca Universalis project: an EU-funded cultural heritage project to produce digital images of all kinds of library materials, especially early/rare/unique materials
  • Early English Books Online: images, text and catalogue records of books from 1473-1700
  • 3-D object management systems, using complex structural metadata, virtual reality techniques and complex interfaces

 

In codicology, electronic editing and presentation projects there are some exciting developments:

  • The British Library’s ‘Turning the Pages’ systems for the Sherborne Missal and other key early works. These also incorporate music and commentary for a rich multimedia presentation
  • The British Library Newspaper Pilot
  • The e-carel project at Vassar College, which offers a new mode of interaction with library materials

 

Where are we going?

  1. Automated methods of generating metadata and searchability in complex texts, e.g. Olive Software’s Active Paper Archive, used in the British Library Newspaper Pilot, and the MetaE (Metadata Engine) project, funded by the EU.
  2. Work on digital preservation, which is giving us new understandings of the complexity of digital data and the integration of this with systems and programs. In particular, work on migration and emulation.
  3. FEDORA (Flexible Extensible Digital Object Repository Architecture), led by UVA, which is taking complex digital resources and creating preservable, reusable digital objects, and also deriving the structures and formalisms for the description, use and long-term archiving of many kinds of resources.
  4. Information retrieval, including automatic keywording from bodies of digital data, then creation and refinement of thesauri, use of lemmatized dictionaries for searching, creation of topic maps, etc.
  5. Multilinguality is still an issue, as it has always been. UNICODE is a solution, but we are not yet there with fonts.
  6. Query by image content for visual materials.
  7. New kinds of projects, integrating documents, databases and mapping tools, such as GIS.
  8. E-books: new modes of publication, which give us new corpora of data that don’t need to be rekeyed.
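
To make item 4 above more concrete, here is a minimal sketch of automatic keywording combined with a lemmatized dictionary for searching. Everything in it is a toy assumption: the lemma dictionary, the stopword list, the sample text and the function names are invented for illustration, and a real system would use far larger linguistic resources and better weighting than raw frequency.

```python
import re
from collections import Counter

# Toy lemma dictionary (assumed); a real one would cover a whole language.
LEMMAS = {
    "manuscripts": "manuscript",
    "libraries": "library",
    "catalogues": "catalogue",
    "catalogued": "catalogue",
}

# Small illustrative stopword list.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "are", "is", "by", "each"}


def lemmatize(word):
    """Map an inflected form to its dictionary headword, if known."""
    return LEMMAS.get(word, word)


def lemmas_of(text):
    """Lower-case, tokenize, drop stopwords and lemmatize the rest."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [lemmatize(t) for t in tokens if t not in STOPWORDS]


def keywords(text, n=3):
    """Return the n most frequent lemmas as candidate keywords."""
    return [word for word, _ in Counter(lemmas_of(text)).most_common(n)]


def matches(query, text):
    """True if the query word and the text share a lemma."""
    return lemmatize(query.lower()) in set(lemmas_of(text))


sample = ("The manuscripts in the library are catalogued by the library. "
          "Each manuscript is described in the catalogues of the libraries.")
print(keywords(sample))
print(matches("libraries", "A library of early books"))
```

Because both the text and the query are reduced to lemmas, a search for "libraries" finds a text that only contains "library"; the keyword candidates could then feed the creation and refinement of a thesaurus.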

What should our agenda be?

  1. Investigate and evaluate some of the commercial tools on the market.
  2. Work on multilinguality and UNICODE—oriental and ancient languages are still not as well covered as we would wish.
  3. New and more complex multimedia linkages.
  4. Look at automated metadata extraction projects.
  5. Make closer links with the digital library research world.
  6. Think about worldwide digital library development issues, including standards, interoperability, preservation, copyright, micropayments and licensing issues etc.
  7. Look at the users/readers in libraries: how are they changing? What is the impact of lifelong learning? How do we train scholars and students for the digital world?
  8. Legal deposit of electronic materials and future access to these.

Susan Hockey

The Internet will be the library in the future world of scholarship.

At present many libraries are putting their content on the Internet in a form that mimics print: images with conventional metadata and cataloguing systems.

IR systems provide retrieval mechanisms.

In the humanities we study primary sources in detail with analysis and interpretation as well as retrieval.

The TEI is a major contribution to digital libraries from computing in the humanities (chum). The TEI is used a lot in digital libraries, but usually as TEI Lite, i.e. in a simple way.

In chum we have always had a lot of individual projects. There is now a lot of emphasis on reusable resources, which is what the library is all about.

The Internet makes access much better but what happens at the point of access once you have found the item? Chum can really contribute here because of expertise in the functionality of software.

The agenda is putting the cultural heritage on the Internet with tools to help people of all ages to work with it. More research is needed on what those tools might do. We need generalisable systems with reusable resources. There is also a lot of political pressure to reach out to wider communities and the rest of society.

TEI in XML is a starting point for this. I am interested to see what happens when more sophisticated linking (e.g. XLink) is implemented. What is the impact on the user?
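
As a rough illustration of what such linking could look like to software, the sketch below reads a small TEI-like XML fragment in which links are expressed with XLink attributes and lists each link with its anchor text. The fragment, the example.org addresses and the function name extract_links are invented for this example, and placing XLink attributes on a ref element is an assumption made for illustration, not something the TEI prescribes.

```python
import xml.etree.ElementTree as ET

# The XLink namespace as defined by the W3C.
XLINK = "http://www.w3.org/1999/xlink"

# Invented TEI-like fragment; the addresses are placeholders.
SAMPLE = """<text xmlns:xlink="http://www.w3.org/1999/xlink">
  <p>See the <ref xlink:type="simple"
       xlink:href="http://example.org/ms/folio1r">first folio</ref>
  and the <ref xlink:type="simple"
       xlink:href="http://example.org/ms/folio1v">verso</ref>.</p>
</text>"""


def extract_links(xml_text):
    """Return (anchor text, target) pairs for every element with xlink:href."""
    root = ET.fromstring(xml_text)
    links = []
    for element in root.iter():
        # ElementTree spells namespaced attributes as {namespace}name.
        href = element.get(f"{{{XLINK}}}href")
        if href is not None:
            links.append((element.text or "", href))
    return links


for label, target in extract_links(SAMPLE):
    print(label, "->", target)
```

Once the links are available as data in this way, a reading tool can decide how to present them, which is where the question of impact on the user arises.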