Your gateway to millions of microforms and digital reproductions of books

Your Data in EROMM Web Search

EROMM Web Search serves as a fallback for data that does not fit into EROMM Classic because the description of the reformatted items is not detailed enough (see Your Data in EROMM Classic for details).

EROMM Web Search acquires data in two ways:

  • harvesting OAI-sources
  • extracting text from webpages (the classic search engine approach)

We prefer to get the data via OAI-PMH if possible, because it allows better display and retrieval. Extracting text from webpages should be seen as a last resort for inclusion.

The following sections describe how these two methods work and show how your data will look in EROMM Web Search. To submit your source(s), please use the form below.

OAI-PMH

EROMM Web Search harvests metadata only in Dublin Core (“oai_dc”), the format that every OAI-PMH source is required to support (for other formats see Your Data in EROMM Classic). It uses a selection of the DC elements for search and display, and the dc:identifier element for linking. Except for dc:title, all listed elements are optional, and their inclusion depends on the data in the OAI-source.
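To illustrate what an “oai_dc” record offers, here is a minimal Python sketch that reads the Dublin Core elements from one record and checks for the required dc:title. It is not EROMM's actual code; the sample record and the function names `parse_oai_dc` and `is_usable` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core elements used inside oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample record; real records come from the OAI-source.
SAMPLE_RECORD = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example title</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:date>1890</dc:date>
  <dc:identifier>http://example.org/item/1</dc:identifier>
</oai_dc:dc>
"""


def parse_oai_dc(xml_text):
    """Return a dict mapping DC element names to lists of values."""
    root = ET.fromstring(xml_text)
    record = {}
    for child in root:
        # ElementTree tags look like "{namespace}localname".
        if child.tag.startswith("{" + DC_NS + "}"):
            name = child.tag.split("}", 1)[1]
            value = (child.text or "").strip()
            if value:
                # DC elements are repeatable, so collect values in a list.
                record.setdefault(name, []).append(value)
    return record


def is_usable(record):
    """dc:title is the only element that must be present."""
    return bool(record.get("title"))
```

Because every Dublin Core element is repeatable, the sketch stores a list of values per element rather than a single string.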

The following elements may be used for the search-index:

  • dc:title
  • dc:creator
  • dc:contributor
  • dc:date
  • dc:publisher
  • dc:relation
  • dc:description
  • dc:source
  • dc:subject
  • dc:identifier

EROMM Web Search combines these elements into one index-field which is used for retrieval. A query on a single element is not possible.
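The effect of a single combined index field can be sketched in Python as follows. This is an illustration under assumptions, not the real indexer: the record layout (a dict of value lists), the function names, and the simple "all terms must occur" matching are hypothetical.

```python
# Elements that may contribute to the search index (from the list above).
INDEX_ELEMENTS = [
    "title", "creator", "contributor", "date", "publisher",
    "relation", "description", "source", "subject", "identifier",
]


def build_index_field(record):
    """Concatenate the values of all indexed elements into one string."""
    parts = []
    for name in INDEX_ELEMENTS:
        parts.extend(record.get(name, []))
    return " ".join(parts)


def matches(index_field, query):
    """Assumed matching rule: every query term occurs somewhere in the field."""
    haystack = index_field.lower()
    return all(term in haystack for term in query.lower().split())


# Hypothetical record; dc:format is present but not part of the index.
record = {"title": ["Faust"], "creator": ["Goethe"], "format": ["image/jpeg"]}
field = build_index_field(record)
```

Since all values end up in one string, a query for "Goethe" finds the record whether the name appears in dc:creator or in dc:title, which is why a search restricted to one element is not possible.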

Display

A smaller selection of the above elements can also be used to create the display of the results:

  • dc:title
  • dc:creator
  • dc:contributor
  • dc:date
  • dc:publisher
  • dc:identifier

If the dc:format element holds a MIME type, it will be used to create an icon indicating the format of the described reformatted item. Other content of dc:format is ignored.
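A rough Python sketch of this rule is shown below. The regular expression, the icon names, and the function `icon_for_format` are assumptions made for illustration; the actual mapping used by EROMM Web Search is not documented here.

```python
import re

# Loose check for "type/subtype" with a registered top-level media type.
MIME_RE = re.compile(
    r"^(application|audio|font|image|message|model|multipart|text|video)"
    r"/[\w.+-]+$"
)

# Hypothetical icon names, keyed by full MIME type or top-level type.
ICONS = {
    "application/pdf": "icon-pdf",
    "image": "icon-image",
    "text": "icon-text",
}


def icon_for_format(format_values):
    """Return an icon name for the first MIME type found, else None."""
    for value in format_values:
        candidate = value.strip().lower()
        if not MIME_RE.match(candidate):
            continue  # any other content of dc:format is ignored
        top_level = candidate.split("/", 1)[0]
        return ICONS.get(candidate) or ICONS.get(top_level) or "icon-generic"
    return None
```

A record with dc:format values such as "24 cm" and "image/jpeg" would get the image icon, while a record whose only value is "microfilm" would get no icon.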

Linking

EROMM Web Search's indexer evaluates the dc:identifier element to create links. Usually OAI-sources hold a direct link or a persistent identifier (e.g. DOI, URN, Handle) in this element. EROMM Web Search can handle both, but will always prefer a persistent identifier (if an OAI-source offers both, only the persistent identifier will be used).

If a source does not offer any link in the dc:identifier field, or a record for some reason doesn't have a link, the result in EROMM Web Search will point to the “start / search page” of the source.

EROMM Web Search can also handle multiple links for a record. They will all be presented to the user after clicking on the result.
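The linking rules in this section can be summarised in a short Python sketch. The patterns used to recognise persistent identifiers and the function names are assumptions for illustration only; the sketch also leaves out the step of turning an identifier such as a URN into a clickable resolver link.

```python
import re

# Assumed patterns for recognising persistent identifiers.
PERSISTENT_PATTERNS = [
    re.compile(r"^(doi:|https?://(dx\.)?doi\.org/)", re.IGNORECASE),  # DOI
    re.compile(r"^urn:", re.IGNORECASE),                              # URN
    re.compile(r"^(hdl:|https?://hdl\.handle\.net/)", re.IGNORECASE), # Handle
]


def is_persistent(identifier):
    return any(p.match(identifier) for p in PERSISTENT_PATTERNS)


def is_direct_link(identifier):
    return identifier.lower().startswith(("http://", "https://"))


def select_links(identifiers, start_page):
    """Pick the link targets for one record from its dc:identifier values."""
    persistent = [i for i in identifiers if is_persistent(i)]
    if persistent:
        return persistent          # persistent identifiers always win
    direct = [i for i in identifiers if is_direct_link(i)]
    if direct:
        return direct              # possibly several links per record
    return [start_page]            # fallback: start / search page of the source
```

The function returns a list because a record may carry several links, all of which are offered to the user.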

OAI-Sets

EROMM Web Search harvests records on a set basis. Ideally there is one set (or the whole source) which includes all items in the scope of EROMM. Please tell us which sets you want to have included in EROMM Web Search when you submit your source.
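In OAI-PMH terms, set-based harvesting means one ListRecords request per set, with the set given in the `set` parameter. The Python sketch below only builds the request URLs; the base URL and set names are placeholders, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def list_records_urls(base_url, sets=None):
    """Build one OAI-PMH ListRecords URL per set (or one for the whole source)."""
    common = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if not sets:
        # No sets named: harvest the whole source.
        return [base_url + "?" + urlencode(common)]
    return [
        base_url + "?" + urlencode({**common, "set": set_spec})
        for set_spec in sets
    ]


# Placeholder values for illustration.
urls = list_records_urls("http://example.org/oai", ["microforms", "digitized"])
```

A real harvester would additionally follow the resumptionToken that a source returns when a result list is split over several responses.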

Websites

EROMM Web Search extracts text from a website and uses it for search and display. It can crawl through a site, following only links that match certain patterns. These patterns are set manually for each source.
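Pattern-based link following might look like the Python sketch below. The class and function names, the example URLs, and the use of regular expressions as the pattern language are assumptions; the text above does not say how the patterns are expressed.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's URL.
                self.links.append(urljoin(self.base_url, href))


def links_to_follow(html, base_url, patterns):
    """Return only those links on the page that match one of the patterns."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    compiled = [re.compile(p) for p in patterns]
    return [
        link for link in collector.links
        if any(p.search(link) for p in compiled)
    ]
```

For a catalogue whose item pages all look like `/catalog/item?id=123`, a single pattern such as `/catalog/item\?id=\d+$` would keep the crawler on item pages and away from navigation or help pages.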

All the text of a webpage goes into the search index. In the result display EROMM Web Search presents a snippet from the text around the search term(s).
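A snippet of this kind can be produced with a few lines of Python. This is only a sketch of the idea: the function name, the `radius` parameter, and the "..." markers are hypothetical, and the real display may differ.

```python
def snippet(text, term, radius=40):
    """Return the text surrounding the first occurrence of term, or None."""
    pos = text.lower().find(term.lower())
    if pos == -1:
        return None
    start = max(0, pos - radius)
    end = min(len(text), pos + len(term) + radius)
    # Mark the places where the page text was cut off.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix
```

Because the snippet is cut from the page text itself, pages with clear, descriptive wording near the relevant terms produce more useful results.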

Please note that EROMM Web Search can only extract text from HTML code: text generated by scripts or embedded in Flash, images, and similar “objects” cannot be used. Furthermore, EROMM Web Search respects the Robots Exclusion Standard and will not crawl excluded pages.
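Both restrictions can be demonstrated with Python's standard library. The sketch below is an illustration under assumptions, not the actual crawler: the class and function names and the user agent "ExampleBot" are made up.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class TextExtractor(HTMLParser):
    """Collect the text found in the HTML itself, skipping script and style."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Scripts are never executed, so text they would generate is lost.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def may_crawl(robots_txt, user_agent, url):
    """Check a URL against the content of a robots.txt file."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)
```

In this sketch, text inside images or plug-in objects never reaches `handle_data`, so it cannot be indexed; the same holds for anything a script would write into the page at display time.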

Tell Us About a Source

Please use the form below to submit your source. For an OAI-source, we'd appreciate it if you could tell us which sets to harvest.

If you want us to contact you once your data is in, leave your email address.

In case of an OAI-source, please list the sets you want us to harvest here; otherwise we will harvest the whole source.

Last modified: 2012-02-07, 9:00