Projekt Dyabola

Projekt Dyabola (the Dyabola project) is software for creating and browsing bibliographic data and image collections, aimed specifically at the humanities community. The program is built and maintained by the German company Biering & Brinkmann, and a web version is available by subscription. [1] The service is available in six languages.

Databases

Currently, 15 databases are available in the online system. Each database is delineated by the organization that owns the original collection being indexed. [2] Each database contains individual records that describe published books and articles, authors, ancient objects, images, etc. The descriptive records, or metadata, are connected in various ways, such as by linking authors with their publications, and publications with their subject descriptors. In this way, it functions similarly to many integrated library systems and citation indexing services.
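The linking of records described above can be illustrated with a minimal sketch. It does not reflect how Dyabola itself is implemented; the `Catalog` class, its method names, and the sample titles and authors are all hypothetical and chosen only to show publications connected to authors and subject descriptors.

```python
from collections import defaultdict


class Catalog:
    """Toy bibliographic index linking publications, authors and subjects.

    Illustrative only: not the Dyabola data model.
    """

    def __init__(self):
        self.titles = {}                         # publication id -> title
        self.pubs_by_author = defaultdict(set)   # author -> publication ids
        self.pubs_by_subject = defaultdict(set)  # subject -> publication ids
        self.subjects_of_pub = defaultdict(set)  # publication id -> subjects

    def add_publication(self, pub_id, title, authors, subjects):
        """Store a publication and link it to its authors and subjects."""
        self.titles[pub_id] = title
        for author in authors:
            self.pubs_by_author[author].add(pub_id)
        for subject in subjects:
            self.pubs_by_subject[subject].add(pub_id)
            self.subjects_of_pub[pub_id].add(subject)

    def titles_by_author(self, author):
        """Follow the author -> publication link."""
        return sorted(self.titles[p] for p in self.pubs_by_author.get(author, ()))

    def titles_on_subject(self, subject):
        """Follow the subject -> publication link."""
        return sorted(self.titles[p] for p in self.pubs_by_subject.get(subject, ()))

    def subjects_for_author(self, author):
        """Chain two links: author -> publications -> subjects."""
        subjects = set()
        for pub_id in self.pubs_by_author.get(author, ()):
            subjects |= self.subjects_of_pub[pub_id]
        return sorted(subjects)


# Invented sample records, not taken from any real database.
catalog = Catalog()
catalog.add_publication("p1", "Roman portraits", ["A. Example"], ["Sculpture"])
catalog.add_publication(
    "p2", "Harbours of Ostia", ["A. Example", "B. Sample"], ["Topography"]
)

print(catalog.titles_by_author("A. Example"))
print(catalog.subjects_for_author("B. Sample"))
```

Because each link is stored in both directions, a user can move from an author to publications and on to subjects without scanning every record, which is the kind of navigation an interconnected index makes possible.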

Example: subject catalog of the German Archaeological Institute in Rome

This database, called the Realkatalog in German, contains an index of all the academic publications owned by the German Archaeological Institute in Rome. These publications generally cover topics in classical studies, i.e. the history and archaeology of the region around the Mediterranean Sea from about 800 BCE to 476 CE. Data entry for this collection began in 1990, and currently all publications from 1956 to February 2009 have been indexed. This includes about half a million titles of books and articles, one hundred thousand authors, and one million subject words describing them. [3]

To find a record, a user can run a keyword search either across all metadata fields or restricted to a single field. It is also possible to browse the hierarchical tree of subject terms to find a particular term of interest. Finally, a user who knows the local unique identifier for a source can enter it directly. Search results are automatically saved for future browsing.
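The two main lookup modes, keyword search and browsing the subject tree, can be sketched as follows. This is an assumption-laden illustration, not Dyabola's search code: the functions `keyword_search` and `find_term_path`, the record fields, and the sample data are all hypothetical.

```python
def keyword_search(records, keyword, field=None):
    """Return records containing the keyword, ignoring case.

    With field=None every metadata field is searched; otherwise only
    the named field is checked.
    """
    needle = keyword.casefold()
    hits = []
    for record in records:
        names = [field] if field else record.keys()
        if any(needle in str(record.get(name, "")).casefold() for name in names):
            hits.append(record)
    return hits


def find_term_path(tree, term, path=()):
    """Depth-first walk of a nested-dict subject tree.

    Returns the tuple of terms leading to `term`, or None if it is absent.
    """
    for name, children in tree.items():
        current = path + (name,)
        if name == term:
            return current
        found = find_term_path(children, term, current)
        if found:
            return found
    return None


# Invented sample data for illustration only.
records = [
    {"id": "1", "title": "Roman portraits", "author": "A. Example", "subject": "Sculpture"},
    {"id": "2", "title": "Harbours of Ostia", "author": "B. Sample", "subject": "Topography"},
    {"id": "3", "title": "Sculpture workshops", "author": "C. Roman", "subject": "Sculpture"},
]
subject_tree = {
    "Topography": {"Italy": {"Rome": {}, "Ostia": {}}},
    "Sculpture": {"Portraits": {}},
}

all_fields = keyword_search(records, "roman")
title_only = keyword_search(records, "roman", field="title")
print([r["id"] for r in all_fields], [r["id"] for r in title_only])
print(find_term_path(subject_tree, "Ostia"))
```

Searching all fields matches the keyword in a title or an author name alike, while restricting the search to one field narrows the result set. Looking up a record by its unique identifier would be a direct dictionary access and is omitted here.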

A record about an article would contain the title, publication date, and page numbers, as well as a link to the author, the journal, any reviews of the work, and a list of subject headings. From a particular journal issue, it is possible to navigate to each of the individual articles and vice versa.

The system lacks several pieces of information that would help the user judge the relevance of a source. For example, aside from the title and a few subject headings, no information is provided about the actual content of the source. Book summaries and article abstracts are not included.

Maintenance and access issues

Currently, the large amount of data that has been manually entered and interconnected represents a value-added aspect of this software. However, the cost of this manual work restricts access to the service to those who can purchase a subscription. New models have become available over the last few years to remedy these issues. Full-text searching and automated citation interconnection, such as that begun in the Google Books and Google Scholar projects, will likely eventually make this type of manually maintained database obsolete. [4] In addition, free online citation indexing programs such as Citeulike and Connotea allow the global community to collaboratively index and catalog academic sources, using a model not unlike Wikipedia.

User interface issues

Since Projekt Dyabola's interface is about 10 years old, a number of weaknesses are now becoming apparent, especially when compared to similar, more modern systems. The interface is unintuitive and relies on outdated technologies and paradigms of interaction. For example, once the database is opened in a separate browser pop-up window, there is no description to indicate the contents of the dataset being searched. These screens also lack contextual help that would aid the user in understanding the available functionality. Another issue is that there are points where a user may want to return to a previous screen, but this option is not made available and the browser's back button is purposely disabled. [5] In addition, the software automatically saves previous searches and makes them available on the main search page, but it does not provide a function for the user to delete these old searches as they begin to clutter the screen. Finally, when viewing an individual record, a user can also view its subject terms, which are arranged in a hierarchical tree structure, but it is not possible to navigate this tree directly. These examples provide only a brief sample of the usability issues of the software. The weaknesses of the system can be fully enumerated by comparison with a newer software package that serves a similar function. An example would be the Endeca faceted-browsing system, as implemented for the library catalog of the Triangle Research Libraries Network (TRLN) in North Carolina.

References

  1. Biering & Brinkmann (2001). "General User Manual for the DYA v5.5 Retrieval Programme 2nd Version". Retrieved 13 March 2009.
  2. Biering & Brinkmann (2001). "PROJEKT DYABOLA Database Projects". Retrieved 13 March 2009.
  3. Biering & Brinkmann (2001). "The Subject catalogue of the DAI in Rome 1956–2003". Retrieved 13 March 2009.
  4. Noruzi, A. (2005). "Google Scholar: The new generation of citation indexes". Libri: International Journal of Libraries & Information Services. 55 (4): 170–180. CiteSeerX 10.1.1.101.4149. doi:10.1515/LIBR.2005.170. S2CID 18669150.
  5. Kim, Hyun Hee; Kim, Yong Ho (2008). "Usability study of digital institutional repositories". The Electronic Library. 26 (6): 863–881. doi:10.1108/02640470810921637.