A digital library (also called an online library, an internet library, a digital repository, a library without walls, or a digital collection) is an online database of digital objects that can include text, still images, audio, video, digital documents, or other digital media formats, or a library accessible through the internet. Objects can consist of digitized content like print or photographs, as well as originally produced digital content like word processor files or social media posts. In addition to storing content, digital libraries provide means for organizing, searching, and retrieving the content contained in the collection. Digital libraries can vary immensely in size and scope, and can be maintained by individuals or organizations. The digital content may be stored locally, or accessed remotely via computer networks. These information retrieval systems are able to exchange information with each other through interoperability and sustainability.
120-502: The SAO/NASA Astrophysics Data System ( ADS ) is a digital library portal for researchers on astronomy and physics , operated for NASA by the Smithsonian Astrophysical Observatory . ADS maintains three bibliographic collections containing over 15 million records, including all arXiv e-prints. Abstracts and full-text of major astronomy and physics publications are indexed and searchable through
240-581: A subscription to have access to the CAD library 3D models. Generative AI CAD libraries are being developed using linked open data of schematics and diagrams. CAD libraries can have assets such as 3D models, materials/textures, bump maps, trees/plants, HDRIs, and different computer graphics lighting sources to be rendered. A 2D graphics repository/library holds vector graphics or raster graphics images/icons that can be free to use or proprietary. The advantages of digital libraries as
360-476: A tooltip . This style makes citing easier and improves the reader's experience. Citation styles can be broadly divided into styles common to the humanities and the sciences, though there is considerable overlap. Some style guides, such as the Chicago Manual of Style , are quite flexible and cover both parenthetical and note citation systems. Others, such as MLA and APA styles, specify formats within
480-511: A bit-stream environment, the digital library contains a built-in proxy server and search engine so the digital materials can be accessed using an Internet browser . Also, the materials are not preserved for the future. The eGranary is intended for use in places or situations where Internet connectivity is very slow, non-existent, unreliable, unsuitable or too expensive. In the past few years, procedures for digitizing books at high speed and comparatively low cost have improved considerably with
600-414: A citation is actually supplementary material, or suggestions for further reading. Parenthetical referencing, also known as Harvard referencing, has full or partial, in-text, citations enclosed in circular brackets and embedded in the paragraph. An example of a parenthetical reference: Depending on the choice of style, fully cited parenthetical references may require no end section. Other styles include
720-441: A citation on Wikipedia "could be considered a public parallel to scholarly citation". A scientific publication being "cited in a Wikipedia article is considered an indicator of some form of impact for this publication" and it may be possible to detect certain publications through changes to Wikipedia articles. Wikimedia Research's Cite-o-Meter tool showed a league table of which academic publishers are most cited on Wikipedia as does
840-408: A combined result consisting of the most relevant found items. Searching over previously harvested metadata involves searching a locally stored index of information that has previously been collected from the libraries in the federation. When a search is performed, the search mechanism does not need to make connections with the digital libraries it is searching—it already has a local representation of
960-611: A court victory on proceeding with their book-scanning project that was halted by the Authors' Guild. This helped open the road for libraries to work with Google to better reach patrons who are accustomed to computerized information. According to Larry Lannom, Director of Information Management Technology at the nonprofit Corporation for National Research Initiatives (CNRI), "all the problems associated with digital libraries are wrapped up in archiving". He goes on to state, "If in 100 years people can still read your article, we'll have solved
1080-655: A database of education citations, abstracts and texts that was created in 1964 and made available online through DIALOG in 1969. In 1994, digital libraries became widely visible in the research community due to a $24.4 million NSF managed program supported jointly by DARPA's Intelligent Integration of Information (I3) program, NASA, and NSF itself. Successful research proposals came from six U.S. universities. The universities included Carnegie Mellon University, University of California-Berkeley, University of Michigan, University of Illinois, University of California-Santa Barbara, and Stanford University. Articles from
1200-666: A digital library can be much lower than that of a traditional library. A physical library must spend large sums of money paying for staff, book maintenance, rent, and additional books. Digital libraries may reduce or, in some instances, do away with these fees. Both types of library require cataloging input to allow users to locate and retrieve material. Digital libraries may be more willing to adopt innovations in technology providing users with improvements in electronic and audio book technology as well as presenting new forms of communication such as wikis and blogs; conventional libraries may consider that providing online access to their OPAC catalog
1320-590: A keyboard. He named this the "Memex". This way individuals would be able to access stored books and files at a rapid speed. In 1956, the Ford Foundation funded Licklider to analyze how libraries could be improved with technology. Almost a decade later, his book entitled "Libraries of the Future" included his vision. He wanted to create a system that would use computers and networks so human knowledge would be accessible for human needs and feedback would be automatic for machine purposes. This system contained three components,
1440-518: A library's content. Popular open-source solutions include DSpace, Greenstone Digital Library (GSDL), EPrints, Digital Commons, and the Fedora Commons-based systems Islandora and Samvera. Legal deposit is often covered by copyright legislation and sometimes by laws specific to legal deposit, and requires that one or more copies of all material published in a country should be submitted for preservation in an institution, typically
1560-477: A license to lend their resources. This may involve the restriction of lending out only one copy at a time for each license, and applying a system of digital rights management for this purpose. The Digital Millennium Copyright Act of 1998 was an act created in the United States to attempt to deal with the introduction of digital works. This Act incorporates two treaties from the year 1996. It criminalizes
1680-403: A list of the citations, with complete bibliographical references, in an end section, sorted alphabetically by author. This section is often called "References", "Bibliography", "Works cited" or "Works consulted". In-text references for online publications may differ from conventional parenthetical referencing. A full reference can be hidden, only displayed when wanted by the reader, in the form of
1800-469: A list of the references given at the end of the article is easily extracted. For scanned articles, reference extraction relies on OCR. The reference database can then be "inverted" to list the citations for each paper in the database. Citation lists have been used in the past to identify popular articles missing from the database; mostly these were from before 1975 and have now been added to the system. The database now contains over fifteen million articles. In
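The "inversion" of a reference database described here can be sketched in a few lines of Python. This is only an illustration under assumed data: the paper identifiers and the `invert_references` and `missing_papers` helpers are hypothetical, not part of ADS.

```python
from collections import defaultdict

def invert_references(references):
    """Turn {paper: [papers it cites]} into {paper: [papers citing it]}."""
    cited_by = defaultdict(list)
    for paper, refs in references.items():
        for ref in refs:
            cited_by[ref].append(paper)
    return dict(cited_by)

def missing_papers(references):
    """Cited papers that have no record of their own, most cited first."""
    cited_by = invert_references(references)
    absent = [p for p in cited_by if p not in references]
    return sorted(absent, key=lambda p: len(cited_by[p]), reverse=True)

# Hypothetical reference lists keyed by made-up identifiers.
references = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": [],
    "D": ["C", "X"],
}
citations = invert_references(references)
gaps = missing_papers(references)
```

Here `citations["C"]` lists every paper that cites C, and `gaps` contains identifiers that are cited but absent from the database, which is the kind of signal the text says was used to find missing articles.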
1920-402: A means of easily and rapidly accessing books, archives and images of various types are now widely recognized by commercial interests and public bodies alike. Traditional libraries are limited by storage space; digital libraries have the potential to store much more information, simply because digital information requires very little physical space to contain it. As such, the cost of maintaining
2040-405: A network of libraries, but public access is only available in the reading rooms in the libraries. The Australian National edeposit system has the same features, but also allows for remote access by the general public for most of the content. Physical archives differ from physical libraries in several ways. Traditionally, archives are defined as: The technology used to create digital libraries
2160-471: A number of different guides exist. Individual publishers often have their own in-house variations as well, and some works are so long-established as to have their own citation methods too: Stephanus pagination for Plato ; Bekker numbers for Aristotle ; citing the Bible by book, chapter and verse; or Shakespeare notation by play. The Citation Style Language (CSL) is an open XML-based language to describe
2280-426: A page by the "Academic Journals WikiProject". Research indicates a large share of academic citations on the platform are paywalled and hence inaccessible to many readers. "[citation needed]" is a tag added by Wikipedia editors to unsourced statements in articles requesting citations to be added. The phrase is reflective of the policies of verifiability and no original research on Wikipedia and has become
2400-441: A rare term which is a synonym of a much more common term (such as ' dateline ' rather than ' date ') can be searched for specifically. The search engine allows selection logic both within fields and between fields. Search terms in each field can be combined with OR, AND, simple logic or Boolean logic , and the user can specify which fields must be matched in the search results. This allows complex searches to be built; for example,
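A minimal Python sketch of this kind of fielded search follows. It is not the ADS implementation: the record layout, the `field_matches` and `search` helpers, and the sample records are all assumptions made for illustration.

```python
def field_matches(text, terms, mode="OR"):
    """Check one field: 'AND' needs every term, 'OR' needs at least one."""
    words = set(text.lower().split())
    found = [term.lower() in words for term in terms]
    return all(found) if mode == "AND" else any(found)

def search(records, criteria, required=()):
    """criteria maps field -> (terms, mode); 'required' fields must match."""
    results = []
    for record in records:
        matched = {
            field: field_matches(record.get(field, ""), terms, mode)
            for field, (terms, mode) in criteria.items()
        }
        # Keep a record if some field matched and no required field failed.
        if any(matched.values()) and all(matched.get(f, False) for f in required):
            results.append(record)
    return results

# Hypothetical records; real bibliographic records carry many more fields.
records = [
    {"author": "smith jones", "title": "spectra of t tauri stars"},
    {"author": "smith", "title": "galaxy clusters"},
    {"author": "lee", "title": "t tauri disks"},
]
criteria = {"author": (["smith"], "OR"), "title": (["tauri", "stars"], "AND")}
strict = search(records, criteria, required=("author", "title"))
loose = search(records, criteria, required=("author",))
```

Requiring both fields returns only the first record, while requiring only the author field also returns the second, which shows how logic within a field and logic between fields combine.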
2520-725: A search interface which allows resources to be found. These resources are typically deep web (or invisible web) resources since they frequently cannot be located by search engine crawlers. Some digital libraries create special pages or sitemaps to allow search engines to find all their resources. Digital libraries frequently use the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to expose their metadata to other digital libraries, and search engines like Google Scholar, Yahoo! and Scirus can also use OAI-PMH to find these deep web resources. As with physical libraries, very little
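As a rough illustration of metadata harvesting, the Python sketch below builds an OAI-PMH `ListRecords` request URL and pulls Dublin Core titles out of a response. The base URL `https://repository.example.org/oai`, the helper names, and the abbreviated sample response are assumptions for illustration; a real response carries additional elements such as headers and resumption tokens.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"

def extract_titles(xml_text):
    """Return every dc:title found in a harvested response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iterfind(".//dc:title", NS)]

# Abbreviated, hand-written sample; not a complete OAI-PMH response.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Libraries of the Future</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("https://repository.example.org/oai")  # hypothetical endpoint
titles = extract_titles(SAMPLE)
```

A harvester would fetch `url` over HTTP and feed the response body to `extract_titles`; the network call is left out here so the sketch stays self-contained.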
2640-511: A system that was written specifically for the ADS, allowing for extensive customization for astronomical needs that would not have been possible with general purpose database software. The scripts are designed to be as platform independent as possible, given the need to facilitate mirroring on different systems around the world, although the growing use of Linux as the operating system of choice within astronomy has led to increasing optimization of
2760-460: A wider spread of online editions of journal publications, abstracts would start to instead be loaded into ADS directly. Papers are indexed within the database by their bibliographic record which contains the details of the journal they were published in, and various associated metadata , such as author lists, references and citations . Originally this data was stored in ASCII format but eventually
2880-413: A work if its format becomes obsolete. Copyright issues persist. As such, proposals have been put forward suggesting that digital libraries be exempt from copyright law. Although this would be very beneficial to the public, it may have a negative economic effect and authors may be less inclined to create new works. Another issue that complicates matters is the desire of some publishing houses to restrict
3000-434: Is a conflict of interest between libraries and the publishers who may wish to create online versions of their acquired content for commercial purposes. In 2010, it was estimated that twenty-three percent of books in existence were created before 1923 and thus out of copyright. Of those printed after this date, only five percent were still in print as of 2010. Thus, approximately seventy-two percent of books were not available to
3120-504: Is a reference to a book, article , web page , or other published item. Citations should supply sufficient detail to identify the item uniquely. Different citation systems and styles are used in scientific citation , legal citation , prior art , the arts , and the humanities . Regarding the use of citations in the scientific literature, some scholars also put forward "the right to refuse unwanted citations" in certain situations deemed inappropriate. Citation content can vary depending on
3240-608: Is a type of semantic digital library. Keyword-based and semantic search are the two main types of search. The semantic search provides a tool that creates a group for augmenting and refining keyword-based search. Conceptual knowledge used in DjDL is centered around two forms: the subject ontology and the set of concept search patterns based on the ontology. The three types of ontologies associated with this search are bibliographic ontologies, community-aware ontologies, and subject ontologies. In traditional libraries,
3360-623: Is almost universally used as a research tool among astronomers, and there are several studies that have estimated quantitatively how much more efficient ADS has made astronomy; one estimated that ADS increased the efficiency of astronomical research by 333 full-time equivalent research years per year, and another found that in 2002 its effect was equivalent to 736 full-time researchers, or all the astronomical research done in France. ADS has allowed literature searches that would previously have taken days or weeks to carry out to be completed in seconds, and it
3480-604: Is an example of such a database, built in response to scientific communication needs in light of the pandemic. Beyond academia, digital collections have also recently been developed to appeal to a more general audience, as is the case with the Selected General Audience Content of the Internet-First University Press developed by Cornell University. This general-audience database contains specialized research information but
3600-512: Is by far the most advanced and its use accounts for about 85% of the total ADS usage. Articles are assigned to the different databases according to the subject rather than the journal they are published in, so that articles from any one journal might appear in all three subject databases. The separation of the databases allows searching in each discipline to be tailored, so that words can automatically be given different weight functions in different database searches, depending on how common they are in
3720-418: Is citation errors, which often occur due to carelessness on either the researcher or journal editor's part in the publication procedure. For example, a study that analyzed 1,200 randomly selected citations from three major business ethics journals concluded that an average article contains at least three plagiarized citations when authors copy and paste a citation entry from another publication without consulting
Astrophysics Data System - Misplaced Pages Continue
3840-413: Is converted to "Hercules", but her is ignored. Once search terms have been preprocessed, the database is queried with the revised search term, as well as synonyms for it. As well as simple synonym replacement such as searching for both plural and singular forms, ADS also searches for a large number of specifically astronomical synonyms. For example, spectrograph and spectroscope have basically
3960-408: Is digitally organized for accessibility. The establishment of these archives has facilitated specialized forms of digital recordkeeping to fulfill various niches in online, research-based communication. Citation A citation is a reference to a source. More precisely, a citation is an abbreviated alphanumeric expression embedded in the body of an intellectual work that denotes an entry in
4080-402: Is distributed around the world. Most users access the system from institutes of higher education, whose IP address can easily be used to determine the user's geographical location. Studies reveal that the highest per-capita users of ADS are France and Netherlands-based astronomers, and while more developed countries (measured by GDP per capita ) use the system more than less developed countries;
4200-533: Is estimated at between US$4,000 million and US$5,000 million, so the value of ADS to astronomy would be about 200–250 million USD annually. Its operating budget is a small fraction of this amount. The great importance of ADS to astronomers has been recognized by the United Nations, the General Assembly of which has commended ADS on its work and success, particularly noting its importance to astronomers in
4320-405: Is estimated that ADS has increased the readership and use of the astronomical literature by a factor of about three since its inception. In monetary terms, this increase in efficiency represents a considerable amount. There are about 12,000 active astronomical researchers worldwide, so ADS is the equivalent of about 5% of the working population of astronomers. The global astronomical research budget
4440-465: Is even more revolutionary for archives since it breaks down the second and third of these general rules. In other words, "digital archives" or "online archives" will still generally contain primary sources, but they are likely to be described individually rather than (or in addition to) in groups or collections. Further, because they are digital, their contents are easily reproducible and may indeed have been reproduced from elsewhere. The Oxford Text Archive
4560-526: Is generally considered to be the oldest digital archive of academic physical primary source materials. Archives differ from libraries in the nature of the materials held. Libraries collect individual published books and serials, or bounded sets of individual items. The books and journals held by libraries are not unique, since multiple copies exist and any given copy will generally prove as satisfactory as any other copy. The material in archives and manuscript libraries is "the unique records of corporate bodies and
4680-732: Is just one of these purposes. Linguistic analysis of citation-practices has indicated that they also serve critical roles in orchestrating the state of knowledge on a particular topic, identifying gaps in the existing knowledge that should be filled or describing areas where inquiries should be continued or replicated. Citation has also been identified as a critical means by which researchers establish stance: aligning themselves with or against subgroups of fellow researchers working on similar projects and staking out opportunities for creating new knowledge. Conventions of citation (e.g., placement of dates within parentheses, superscripted endnotes vs. footnotes , colons or commas for page numbers, etc.) vary by
4800-412: Is known about how users actually select books. There are two general strategies for searching a federation of digital libraries: distributed searching and searching previously harvested metadata . Distributed searching typically involves a client sending multiple search requests in parallel to a number of servers in the federation. The results are gathered, duplicates are eliminated or clustered, and
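Distributed searching as described here (parallel requests, gathering, duplicate elimination, then a ranked combined result) might look like the following Python sketch. The server callables, the hit fields `id`, `title` and `score`, and the rule of keeping the highest-scoring duplicate are all assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def search_federation(query, servers):
    """Query every server in parallel, drop duplicates, rank by score."""
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        futures = [pool.submit(search_fn, query) for search_fn in servers.values()]
        hits = [hit for future in futures for hit in future.result()]
    best = {}
    for hit in hits:
        # Assumed rule: when two servers return the same item, keep the
        # copy with the higher relevance score.
        current = best.get(hit["id"])
        if current is None or hit["score"] > current["score"]:
            best[hit["id"]] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)

# Stand-ins for remote digital libraries; real ones would be network calls.
def library_a(query):
    return [
        {"id": "doc1", "title": "Memex", "score": 0.9},
        {"id": "doc2", "title": "Card catalogs", "score": 0.4},
    ]

def library_b(query):
    return [
        {"id": "doc1", "title": "Memex", "score": 0.7},
        {"id": "doc3", "title": "OPAC", "score": 0.8},
    ]

results = search_federation("memex", {"a": library_a, "b": library_b})
```

The alternative strategy, searching previously harvested metadata, would replace the parallel calls with a lookup in one local index, trading freshness for speed.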
4920-405: Is not well documented, but several key thinkers are connected to the emergence of the concept. Predecessors include Paul Otlet and Henri La Fontaine 's Mundaneum , an attempt begun in 1895 to gather and systematically catalogue the world's knowledge, with the hope of bringing about world peace. The visions of the digital library were largely realized a century later during the great expansion of
5040-723: Is one of ADS's most powerful tools. The system uses data from the SIMBAD , the NASA/IPAC Extragalactic Database , the International Astronomical Union Circulars and the Lunar and Planetary Institute to identify papers referring to a given object, and can also search by object position, listing papers which concern objects within a 10 arcminute radius of a given Right Ascension and Declination . These databases combine
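A position search of this kind reduces to an angular-separation test. The Python sketch below uses the haversine formula and a 10 arcminute radius; the object names and coordinates are invented, and the helper functions are illustrative rather than anything ADS exposes.

```python
import math

def angular_separation_arcmin(ra1, dec1, ra2, dec2):
    """Angular distance between two sky positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 60

def objects_within(objects, ra, dec, radius_arcmin=10.0):
    """Names of objects inside the search radius around (ra, dec)."""
    return [
        name for name, (obj_ra, obj_dec) in objects.items()
        if angular_separation_arcmin(ra, dec, obj_ra, obj_dec) <= radius_arcmin
    ]

# Hypothetical catalogue: name -> (right ascension, declination) in degrees.
catalogue = {"obj_near": (10.0, 0.1), "obj_far": (10.0, 0.5)}
nearby = objects_within(catalogue, ra=10.0, dec=0.0)
```

With these made-up positions, `obj_near` lies 6 arcminutes from the search centre and is returned, while `obj_far` at 30 arcminutes is not.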
5160-407: Is simplified and a user input of M45, M 45 or M-45 all result in the same query being executed; similarly, NGC designations and common search terms such as Shoemaker Levy and T Tauri are stripped of spaces. Unimportant words such as AT, OR and TO are stripped out, although in some cases case sensitivity is maintained, so that while and is ignored, And is converted to "Andromeda", and Her
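The preprocessing rules described here can be approximated with a short Python sketch. It covers only the Messier designations, the ignored words, and the two case-sensitive abbreviations named in the text; the `normalize_query` helper and the exact word lists are assumptions, and the real system handles many more cases.

```python
import re

# Words dropped from queries (lower-cased for comparison).
IGNORED_WORDS = {"at", "or", "to", "and", "her"}
# Capitalised forms that are treated as constellation abbreviations.
CASE_SENSITIVE = {"And": "Andromeda", "Her": "Hercules"}

def normalize_query(query):
    """Return the cleaned list of search terms for a raw query string."""
    # "M45", "M 45" and "M-45" all become "M45".
    query = re.sub(r"\bM[\s-]?(\d+)\b", r"M\1", query)
    terms = []
    for word in query.split():
        if word in CASE_SENSITIVE:
            terms.append(CASE_SENSITIVE[word])  # exact-case match wins
        elif word.lower() in IGNORED_WORDS:
            continue  # unimportant word, stripped out
        else:
            terms.append(word)
    return terms

terms = normalize_query("M 45 And her stars")
```

Checking the case-sensitive table before the ignored-word list is what lets `And` survive as "Andromeda" while `and` is discarded.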
5280-861: Is sometimes used for libraries that have both physical collections and electronic collections. For example, American Memory is a digital library within the Library of Congress . Some important digital libraries also serve as long term archives, such as arXiv and the Internet Archive . Others, such as the Digital Public Library of America , seek to make digital information from various institutions widely accessible online. Many academic libraries are actively involved in building repositories of their institution's books, papers, theses, and other works that can be digitized or were 'born digital'. Many of these repositories are made available to
5400-690: Is sufficient. An important advantage to digital conversion is increased accessibility to users. They also increase availability to individuals who may not be traditional patrons of a library, due to geographic location or organizational affiliation. Digital libraries offer a variety of software packages, including those tailored for kids' educational games . Institutional repository software, which focuses primarily on ingest, preservation and access of locally produced documents, particularly locally produced academic outputs, can be found in Institutional repository software . This software may be proprietary, as
5520-420: Is that harvesting and indexing systems are more resource-intensive and therefore expensive. Digital preservation aims to ensure that digital media and information systems are still interpretable into the indefinite future. Each necessary component of this must be migrated, preserved or emulated . Typically lower levels of systems ( floppy disks for example) are emulated, bit-streams (the actual files stored in
5640-472: Is the case with the Library of Congress which uses Digiboard and CTS to manage digital content. The design and implementation in digital libraries are constructed so computer systems and software can make use of the information when it is exchanged. These are referred to as semantic digital libraries. Semantic libraries are also used to socialize with different communities from a mass of social networks. DjDL
5760-571: The COVID-19 pandemic , libraries and higher education institutions have launched digital archiving projects to document life during the pandemic, thus creating a digital, cultural record of collective memories from the period. Researchers have also utilized digital archiving to create specialized research databases . These databases compile digital records for use on international and interdisciplinary levels. COVID CORPUS, launched in October 2020,
5880-585: The Million Book Project , and Internet Archive . With continued improvements in book handling and presentation technologies such as optical character recognition and development of alternative depositories and business models, digital libraries are rapidly growing in popularity. Just as libraries have ventured into audio and video collections, so have digital libraries such as the Internet Archive. In 2016, Google Books project received
6000-660: The national library . Since the advent of electronic documents , legislation has had to be amended to cover the new formats, such as the 2016 amendment to the Copyright Act 1968 in Australia. Since then various types of electronic depositories have been built. The British Library 's Publisher Submission Portal and the German model at the Deutsche Nationalbibliothek have one deposit point for
6120-566: The 1980s, the success of these endeavors resulted in OPAC replacing the traditional card catalog in many academic, public and special libraries. This permitted libraries to undertake additional rewarding co-operative efforts to support resource sharing and expand access to library materials beyond an individual library. An early example of a digital library is the Education Resources Information Center (ERIC),
6240-624: The 5S model to define a digital archive as a specific case of digital library able to take into consideration the peculiar features of archives. A computer-aided design library or CAD library is a cloud-based repository of 3D models or parts for computer-aided design (CAD), computer-aided engineering (CAE), computer-aided manufacturing (CAM), or Building information modeling (BIM). Examples of CAD libraries are GrabCAD, Sketchup 3D Warehouse, Sketchfab, McMaster-Carr, TurboSquid, Chaos Cosmos, and Thingiverse. The models can be free and open source or proprietary and have to pay
6360-399: The ADS bibliographic record. The ADS service is distributed worldwide with twelve mirror sites in twelve countries and with the database synchronized by weekly updates using rsync , a mirroring utility which allows updates to only the portions of the database which have changed. All updates are triggered centrally, but they initiate scripts at the mirror sites which "pull" updated data from
6480-484: The Internet. Vannevar Bush and J.C.R. Licklider are two contributors that advanced this idea into then current technology. Bush had supported research that led to the bomb that was dropped on Hiroshima . After seeing the disaster, he wanted to create a machine that would show how technology can lead to understanding instead of destruction. This machine would include a desk with two screens, switches and buttons, and
6600-463: The United Kingdom, and Ukraine. ADS currently (2005) receives abstracts or tables of contents from almost two hundred journal sources. The service may receive data referring to the same article from multiple sources, and creates one bibliographic reference based on the most accurate data from each source. The common use of TeX and LaTeX by almost all scientific journals greatly facilitates
6720-672: The ability to find works of interest is directly related to how well they were cataloged. While cataloging electronic works digitized from a library's existing holding may be as simple as copying or moving a record from the print to the electronic form, complex and born-digital works require substantially more effort. To handle the growing volume of electronic publications, new tools and technologies have to be designed to allow effective automated semantic classification and searching. While full-text search can be used for some items, there are many common catalog searches which cannot be performed using full text, including: Most digital libraries provide
6840-404: The article. In this way, an ADS user can determine which papers are of most interest to astronomers who are interested in the subject of a given paper. Also returned are links to the SIMBAD and/or NASA Extragalactic Database object name databases, via which a user can quickly find out basic observational data about the objects analyzed in a paper, and find further papers on those objects. ADS
6960-541: The associated reference(s). There also has been analysis of citations of science information on Wikipedia or of scientific citations on the site, e.g. enabling listing the most relevant or most-cited scientific journals and categories and dominant domains. Since 2015, the altmetrics platform Altmetric.com also shows citing English Wikipedia articles for a given study, later adding other language editions. The Wikimedia platform under development Scholia also shows "Wikipedia mentions" of scientific works. A study suggests
7080-408: The attempt to circumvent measures which limit access to copyrighted materials. It also criminalizes the act of attempting to circumvent access control. This act provides an exemption for nonprofit libraries and archives which allows up to three copies to be made, one of which may be digital. This may not be made public or distributed on the web, however. Further, it allows libraries and archives to copy
7200-537: The availability of the computer networks the information resources are expected to stay distributed and accessed as needed, whereas in Vannevar Bush 's essay As We May Think (1945) they were to be collected and kept within the researcher's Memex . The term virtual library was initially used interchangeably with digital library, but is now primarily used for libraries that are virtual in other senses (such as libraries which aggregate distributed content). In
7320-628: The bibliographic references section of the work for the purpose of acknowledging the relevance of the works of others to the topic of discussion at the spot where the citation appears. Generally, the combination of both the in-body citation and the bibliographic entry constitutes what is commonly thought of as a citation (whereas bibliographic entries by themselves are not). Citations have several important purposes. While their uses for upholding intellectual honesty and bolstering claims are typically foregrounded in teaching materials and style guides, correct attribution of insights to previous sources
7440-680: The cases of the major journals of astronomy (Astrophysical Journal, Astronomical Journal, Astronomy and Astrophysics, Publications of the Astronomical Society of the Pacific and the Monthly Notices of the Royal Astronomical Society), coverage is complete, with all issues indexed from number 1 to the present. These journals account for about two-thirds of the papers in the database, with
7560-412: The citation-system used (e.g., Oxford , Harvard , MLA , NLM , American Sociological Association (ASA), American Psychological Association (APA), etc.). Each system is associated with different academic disciplines , and academic journals associated with these disciplines maintain the relevant citational style by recommending and adhering to the relevant style guides . A bibliographic citation
7680-579: The context by means of the archival bond . Archival descriptions are the fundamental means to describe, understand, retrieve and access archival material. At the digital level, archival descriptions are usually encoded by means of the Encoded Archival Description XML format. The EAD is a standardized electronic representation of archival description which makes it possible to provide union access to detailed archival descriptions and resources in repositories distributed throughout
7800-468: The context of a single citation system. These may be referred to as citation formats as well as citation styles. The various guides thus specify order of appearance, for example, of publication date, title, and page numbers following the author name, in addition to conventions of punctuation, use of italics, emphasis, parenthesis, quotation marks, etc., particular to their style. A number of organizations have created styles to fit their needs; consequently,
7920-471: The corpus of knowledge, the question, and the answer. Licklider called it a procognitive system. In 1980 the role of the library in an electronic society was the focus of a clinic on library applications of data processing. Participants included Frederick Wilfrid Lancaster, Derek De Solla Price, Gerard Salton, and Michael Gorman. Early projects centered on the creation of an electronic card catalogue known as Online Public Access Catalog (OPAC). By
8040-827: The current claim. The digitization of patent data and increasing computing power have led to a community of practice that uses these citation data to measure innovation attributes, trace knowledge flows, and map innovation networks. Modern scientists are sometimes judged by the number of times their work is cited by others—this is actually a key indicator of the relative importance of a work in science. Accordingly, individual scientists are motivated to have their own work cited early and often and as widely as possible, but all other scientists are motivated to eliminate unnecessary citations so as not to devalue this means of judgment . A formal citation index tracks which referred and reviewed papers have referred which other such papers. Baruch Lev and other advocates of accounting reform consider
8160-1169: The desire of a digital library to become expanded to include best sellers, but publisher licensing may hinder the process. Many digital libraries offer recommender systems to reduce information overload and help their users discover relevant literature. Some examples of digital libraries offering recommender systems are IEEE Xplore, Europeana, and GESIS Sowiport. The recommender systems work mostly based on content-based filtering, but other approaches such as collaborative filtering and citation-based recommendations are also used. Beel et al. report that there are more than 90 different recommendation approaches for digital libraries, presented in more than 200 research articles. Typically, digital libraries develop and maintain their own recommender systems based on existing search and recommendation frameworks such as Apache Lucene or Apache Mahout. Digital libraries, or at least their digital collections, have also brought their own problems and challenges in areas such as: There are many large scale digitisation projects that perpetuate these problems. Large scale digitization projects are underway at Google,
8280-734: The developing world, in reports of the United Nations Committee on the Peaceful Uses of Outer Space . A 2002 report by a visiting committee to the Center for Astrophysics, meanwhile, said that the service had "revolutionized the use of the astronomical literature", and was "probably the most valuable single contribution to astronomy research that the CfA has made in its lifetime". Because it is used almost universally by astronomers, ADS can reveal much about how astronomical research
8400-487: The disks) are preserved and operating systems are emulated as a virtual machine. Only where the meaning and content of digital media and information systems are well understood is migration possible, as is the case for office documents. However, at least one organization, the WiderNet Project, has created an offline digital library, the eGranary, by reproducing materials on a 6 TB hard drive. Instead of
8520-455: The documents. A typical aim would be to identify the most important documents in a collection. A classic example is that of the citations between academic articles and books. For another example, judges of law support their judgements by referring back to judgements made in earlier cases (see citation analysis in a legal context ). An additional example is provided by patents which contain prior art , citation of earlier patents relevant to
8640-440: The early days of digital libraries, there was discussion of the similarities and differences among the terms digital , virtual , and electronic . A distinction is often made between content that was created in a digital format, known as born-digital , and information that has been converted from a physical medium, e.g. paper, through digitization . Not all electronic content is in digital data format. The term hybrid library
8760-415: The field of communication, Michael Bugeja and Daniela V. Dimitrova have found that citations to online sources have a rate of decay (as cited pages are taken down), which they call a "half-life", that renders footnotes in those journals less useful for scholarship over time. Other experts have found that published replications do not have as many citations as original publications. Another important issue
8880-428: The final search results. The system indexes author names by surname and initials, and accounts for the possible variations in spelling of names using a list of variations. This is common in the case of names including accents such as umlauts and transliterations from Arabic or Cyrillic script . An example of an entry in the author synonym list is: The capability to search for papers on specific astronomical objects
9000-478: The formatting of citations and bibliographies. In some areas of the humanities, footnotes are used exclusively for references, and their use for conventional footnotes (explanations or examples) is avoided. In these areas, the term footnote is actually used as a synonym for reference , and care must be taken by editors and typesetters to ensure that they understand how the term is being used by their authors. In their research on footnotes in scholarly journals in
9120-420: The general public with few restrictions, in accordance with the goals of open access , in contrast to the publication of research in commercial journals, where the publishers usually limit access rights. Irrespective of access rights, institutional, truly free, and corporate repositories can be referred to as digital libraries. Institutional repository software is designed for archiving, organizing, and searching
9240-460: The impact; while in sociology the number of references, the article length, and title length are among the factors. Studies of methodological quality and reliability have found that "reliability of published research works in several fields may be decreasing with increasing journal rank". Nature Index recognizes that citations remain a controversial and yet important metric for academics. They report five ways to increase citation counts: (1) watch
9360-519: The importance attached to astronomical research. The amount of basic research carried out in a country is found to be proportional to the number of astronomers in that country multiplied by its GDP per capita, with considerable scatter. ADS has also been used to show that the fraction of single-author astronomy papers has decreased substantially since 1975 and that astronomical papers with more than 50 authors have become more common since 1990. Digital library The early history of digital libraries
9480-407: The incorporation of bibliographic data into the system in a standardized format, and importing HTML -coded web-based articles is also simple. ADS utilizes Python and Perl scripts for importing, processing and standardizing bibliographic data. The apparently mundane task of converting author names into a standard Surname , Initial format is actually one of the more difficult to automate, due to
9600-494: The information. This approach requires the creation of an indexing and harvesting mechanism which operates regularly, connecting to all the digital libraries and querying the whole collection in order to discover new and updated resources. OAI-PMH is frequently used by digital libraries for allowing metadata to be harvested. A benefit to this approach is that the search mechanism has full control over indexing and ranking algorithms, possibly allowing more consistent results. A drawback
9720-404: The limit is reached, the library can repurchase access rights at a lower cost than the original price." While from a publishing perspective, this sounds like a good balance of library lending and protecting themselves from a feared decrease in book sales, libraries are not set up to monitor their collections as such. They acknowledge the increased demand of digital materials available to patrons and
9840-758: The limitations of this encouraged the database maintainers to migrate all records to an XML (Extensible Markup Language) format in 2000. Bibliographic records are now stored as an XML element with sub-elements for the various metadata. Scanned articles are stored in TIFF format at both medium and high resolution . The TIFF files are converted on demand into GIF files, for on-screen viewing, and PDF or PostScript files for printing. The generated files are then cached to eliminate needlessly frequent regenerations for popular articles. As of 2000, ADS contained 250 GB of scans, which consisted of 1,128,955 article pages comprising 138,789 articles. By 2005 this had grown to 650 GB and
9960-413: The main ADS servers. At first, the journal articles available via ADS were exclusively scanned bitmaps created from the paper journals and the abstracts created using optical character recognition software. Some of these scanned articles up to around 1995 are available for free by agreement with the journal publishers, with some dating from as far back as the early 19th century. Eventually, because of
10080-622: The many catalogue designations an object might have, so that a search for the Pleiades will also find papers which list the famous open cluster in Taurus under any of its other catalog designations or popular names, such as M45, the Seven Sisters or Melotte 22. The search engine first filters search terms in several ways. An M followed by a space or hyphen has the space or hyphen removed, so that searching for Messier catalogue objects
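The term filtering and catalogue-name matching described here can be illustrated with a minimal sketch. It assumes a small hand-written alias table (`ALIASES`) and hypothetical function names. The real ADS object lookup is not specified in this text, so this is only an illustration of the idea.

```python
import re

# Hypothetical alias table: every name under which one object may be listed.
ALIASES = {
    "M45": {"M45", "Pleiades", "Seven Sisters", "Melotte 22"},
}

def normalize_messier(term: str) -> str:
    """Remove the space or hyphen after a leading 'M', so 'M 45' -> 'M45'."""
    return re.sub(r"\b[Mm][ -](\d+)\b", lambda m: "M" + m.group(1), term)

def expand_object(term: str) -> set:
    """Return all known designations for the object named by `term`."""
    canonical = normalize_messier(term.strip())
    for names in ALIASES.values():
        if canonical.lower() in {n.lower() for n in names}:
            return set(names)
    # Unknown object: search for the (normalized) term as given.
    return {canonical}
```

With this table, a query for "M-45" or "Pleiades" expands to the same set of designations, so papers using any of those names can be matched.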
10200-485: The most relevant papers. The database can be queried for author names, astronomical object names, title words, and words in the abstract text, and results can be filtered according to a number of criteria. It works by first gathering synonyms and simplifying search terms as described above, and then generating an "inverted file", which is a list of all the documents matching each search term. The user-selected logic and filters are then applied to this inverted list to generate
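A toy version of the "inverted file" idea can make the mechanism concrete. The sketch below is an assumption-laden illustration, not the ADS implementation: the function names, the whitespace tokenizer, and the OR / AND NOT interface are all hypothetical.

```python
from collections import defaultdict

def build_inverted_file(docs):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, any_of=(), none_of=()):
    """Documents matching (w1 OR w2 ...) AND NOT (x1 OR x2 ...)."""
    hits = set()
    for word in any_of:
        hits |= index.get(word.lower(), set())
    for word in none_of:
        hits -= index.get(word.lower(), set())
    return hits

docs = {
    1: "radius of a planetary nebula",
    2: "velocity and temperature of the halo",
    3: "abundance survey",
}
index = build_inverted_file(docs)
result = search(index, any_of=["radius", "velocity"],
                none_of=["abundance", "temperature"])
```

The lookup itself is a dictionary access per term. The user-selected logic is then applied as set operations on the resulting document lists.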
10320-680: The number of astronomers and astronomical publications grew, bibliographical efforts became institutional tasks, first at the Observatoire Royal de Belgique , where the Bibliography of Astronomy was published from 1881 to 1898, and then at the Astronomischer Rechen-Institut in Heidelberg, where the yearly Astronomischer Jahresbericht was published from 1899 to 1968. After 1968, this was replaced by
10440-581: The number of times a patent is cited to be a significant metric of its quality, and thus of innovation . Reviews often replace citations to primary studies. Two metascientists reported that in a growing scientific field , citations disproportionately cite already well-cited papers, possibly slowing and inhibiting canonical progress to some degree in some cases. They find that "structures fostering disruptive scholarship and focusing attention on novel ideas" could be important. Recommendation systems sometimes also use citations to find similar studies to
10560-424: The one the user is currently reading or that the user may be interested in and may find useful. Better availability of integrable open citation information could be useful in addressing the "overwhelming amount of scientific literature". Knowledge agents may use citations to find studies that are relevant to the user's query, in particular citation statements are used by scite.ai to answer a question, also providing
10680-478: The original source. Experts have found that simple precautions, such as consulting the author of a cited source about proper citations, reduce the likelihood of citation errors and thus increase the quality of research. Another study noted that approximately 25% of citations do not support the claims made, a finding that affects many disciplines, including history. Research suggests the impact of an article can be partly explained by superficial factors and not only by
10800-414: The papers of individuals and families". A fundamental characteristic of archives is that they have to keep the context in which their records have been created and the network of relationships between them in order to preserve their informative content and provide understandable and useful information over time. The fundamental characteristic of archives resides in their hierarchical organization expressing
10920-567: The portal. Johann Friedrich Weidler published the first comprehensive history of astronomy in 1741 and the first astronomical bibliography in 1755. This was an effort to archive and classify earlier astronomical knowledge and works. This effort was continued by Jérôme de La Lande, who published his Bibliographie astronomique in 1803, a work that covered the period from 480 BCE to the year of publication. The Bibliographie générale de l’astronomie, Volume I and Volume II, published by J.C. Houzeau and A. Lancaster, followed from 1882 to 1889. As
11040-878: The problem." Daniel Akst, author of The Webster Chronicle, proposes that "the future of libraries—and of information—is digital". Peter Lyman and Hal Varian, information scientists at the University of California, Berkeley, estimate that "the world's total yearly production of print, film, optical, and magnetic content would require roughly 1.5 billion gigabytes of storage". Therefore, they believe that "soon it will be technologically possible for an average person to access virtually all recorded information". Digital archives are an evolving medium and they develop under various circumstances. Alongside large scale repositories, other digital archiving projects have also evolved in response to needs in research and research communication on various institutional levels. For example, during
11160-615: The projects summarized their progress at their halfway point in May 1996. Stanford research, by Sergey Brin and Larry Page , led to the founding of Google . Early attempts at creating a model for digital libraries included the DELOS Digital Library Reference Model and the 5S Framework. The term digital library was first popularized by the NSF / DARPA / NASA Digital Libraries Initiative in 1994. With
11280-446: The public. There is a dilution of responsibility that occurs as a result of the distributed nature of digital resources. Complex intellectual property matters may become involved since digital material is not always owned by a library. The content is, in many cases, public domain or self-generated content only. Some digital libraries, such as Project Gutenberg , work to digitize out-of-copyright works and make them freely available to
11400-542: The public. An estimate has been made of the number of distinct books still existent in library catalogues from 2000 BC to 1960. The Fair Use Provisions (17 USC § 107) under the Copyright Act of 1976 provide specific guidelines under which circumstances libraries are allowed to copy digital resources. Four factors that constitute fair use are "Purpose of the use, Nature of the work, Amount or substantiality used and Market impact". Some digital libraries acquire
11520-496: The relationship between GDP per capita and ADS use is not linear. The range of ADS usage per capita far exceeds the range of GDP per capita, and basic research carried out in a country, as measured by ADS usage, has been found to be proportional to the square of the country's GDP divided by its population. Statistics also imply that there are about three times as many astronomers in countries of European culture as in countries of Asian cultures , perhaps suggesting cultural differences in
11640-434: The relevant field. Data in the preprint archive is updated daily from arXiv which is the dominant repository of physics and astronomy preprints. The advent of preprint servers has, like ADS, had a significant impact on the rate of astronomical research, as papers are often made available from preprint servers weeks or months before they are published in the journals. The incorporation of preprints from arXiv into ADS means that
11760-456: The remaining items are sorted and presented back to the client. Protocols like Z39.50 are frequently used in distributed searching. A benefit to this approach is that the resource-intensive tasks of indexing and storage are left to the respective servers in the federation. A drawback to this approach is that the search mechanism is limited by the different indexing and ranking capabilities of each database; therefore, making it difficult to assemble
11880-637: The rest consisting of papers published in over 100 other journals from around the world, as well as in conference proceedings. While the database contains the complete contents of all the major journals and many minor ones as well, its coverage of references and citations is much less complete. References in and citations of articles in the major journals are fairly complete, but references such as "private communication", "in press" or "in preparation" cannot be matched, and author errors in reference listings also introduce potential errors. Astronomical papers may cite and be cited by articles in journals which fall outside
12000-469: The result that it is now possible to digitize millions of books per year. The Google book-scanning project is also working with libraries to offer digitized books, pushing the field of digitized books forward. Digital libraries are hampered by copyright law because, unlike with traditional printed works, the laws of digital copyright are still being formed. The republication of material on the web by libraries may require permission from rights holders, and there
12120-616: The same meaning, and in an astronomical context metallicity and abundance are also synonymous. ADS's synonym list was created manually, by grouping the list of words in the database according to similar meanings. As well as English language synonyms, ADS also searches for English translations of foreign search terms and vice versa, so that a search for the French word soleil retrieves references to Sun , and papers in languages other than English can be returned by English search terms. Synonym replacement can be disabled if required, so that
12240-436: The same subject. There is research about citations and development of related tools and systems, mainly relating to scientific citations. Citation analysis is a method widely used in metascience . Citation analysis is the examination of the frequency, patterns, and graphs of citations in documents. It uses the directed graph of citations — links from one document to another document — to reveal properties of
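The directed graph of citations can be represented very simply as a list of (citing, cited) pairs. The following sketch, with hypothetical names and made-up paper ids, counts incoming links per document. This is one basic way to rank documents by how often they are cited; real citation analysis uses richer measures.

```python
from collections import Counter

def citation_counts(edges):
    """Count incoming citations for each document.

    `edges` is an iterable of (citing, cited) pairs, i.e. the directed
    links from one document to another.
    """
    return Counter(cited for _citing, cited in edges)

# Illustrative data only: A and B both cite C, and C cites D.
edges = [("A", "C"), ("B", "C"), ("C", "D")]
counts = citation_counts(edges)
most_cited = counts.most_common(1)[0]
```

Here `most_cited` is the pair `("C", 2)`. Documents that are never cited, such as A and B, simply do not appear in the counter.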
12360-410: The scientific merits of an article. Field-dependent factors are usually listed as an issue to be tackled not only when comparisons across disciplines are made, but also when different fields of research of one discipline are being compared. For example, in medicine, among other factors, the number of authors, the number of references, the article length, and the presence of a colon in the title influence
12480-440: The scope of ADS, such as chemistry , mathematics or biology journals. Since its inception, the ADS has developed a highly complex search engine to query the abstract and object databases . The search engine is tailor-made for searching astronomical abstracts, and the engine and its user interface assume that the user is well-versed in astronomy and able to interpret search results which are designed to return more than just
12600-614: The scripts for installation on that platform. The main ADS server is located at the Center for Astrophysics | Harvard & Smithsonian in Cambridge, Massachusetts , and is a dual 64-bit X86 Intel server with two quad-core 3.0 GHz CPUs and 32 GB of RAM , running the CentOS 5.4 Linux distribution. As of 2022, there are mirrors located in China, Chile, France, Germany, Japan, Russia,
12720-432: The search engine can return the most current research available, with the caveat that preprints may not have been peer-reviewed or proofread to the required standard for publication in the main journals. The database of ADS links preprints with subsequently published articles wherever possible, so that citation and reference searches will return links to the journal article where the preprint was cited. The software runs on
12840-484: The search. Although it was conceived as a means of accessing abstracts and papers, ADS provides a substantial amount of ancillary information along with search results. For each abstract returned, links are provided to other papers in the database which are referenced, and which cite the paper, and a link is provided to a preprint, where one exists. The system also generates a link to "also-read" articles – that is, those which have been most commonly accessed by those reading
12960-499: The text of a paper using a notes system without a full bibliography could look like: The note, located either at the foot of the page (footnote) or at the end of the paper (endnote) would look like this: In a paper with a full bibliography, the shortened note might look like: The bibliography entry, which is required with a shortened note, would look like this: In the humanities, many authors also use footnotes or endnotes to supply anecdotal information. In this way, what looks like
13080-536: The text, either bracketed or superscript or both. The numbers refer to either footnotes (notes at the end of the page) or endnotes (notes on a page at the end of the paper) that provide source detail. The notes system may or may not require a full bibliography, depending on whether the writer has used a full-note form or a shortened-note form. The organizational logic of the bibliography is that sources are listed in their order of appearance in-text, rather than alphabetically by author last name. For example, an excerpt from
13200-429: The title length and punctuation; (2) release the results early as preprints; (3) avoid referring to a country in the title, abstract, or keywords; (4) link the article to supporting data in a repository; and (5) avoid hyphens in the titles of research articles. Citation patterns are also known to be affected by unethical behavior of both the authors and journal staff. Such behavior is called impact factor boosting and
13320-585: The type of source and may include: Along with information such as authors, date of publication, title and page numbers, citations may also include unique identifiers depending on the type of work being referred to. Broadly speaking, there are two types of citation systems, the Vancouver system and parenthetical referencing. However, the Council of Science Editors (CSE) adds a third, the citation-name system . The Vancouver system uses sequential numbers in
13440-458: The use of digital materials such as e-books purchased by libraries. Whereas with printed books, the library owns the book until it can no longer be circulated, publishers want to limit the number of times an e-book can be checked out before the library would need to repurchase that book. "[HarperCollins] began licensing use of each e-book copy for a maximum of 26 loans. This affects only the most popular titles and has no practical effect on others. After
13560-555: The user could search for papers concerning NGC 6543 OR NGC 7009, with the paper titles containing (radius OR velocity) AND NOT (abundance OR temperature). Search results can be filtered according to a number of criteria, including a range of years such as "1945 to 1975", "2000 to the present day" or "before 1900", and the type of publication the article appears in. Non-peer-reviewed articles such as conference proceedings can be excluded or specifically searched for, and specific journals can be included in or excluded from
13680-408: The wide variety of naming conventions around the world and the possibility that a given name such as Davis could be a first name , middle name or surname. The accurate conversion of names requires a detailed knowledge of the names of authors active in astronomy, and ADS maintains an extensive database of author names, which is also used in searching the database (see below). For electronic articles,
13800-431: The world. Given the importance of archives, a dedicated formal model, called NEsted SeTs for Object Hierarchies (NESTOR), built around their peculiar constituents, has been defined. NESTOR is based on the idea of expressing the hierarchical relationships between objects through the inclusion property between sets, in contrast to the binary relation between nodes exploited by the tree. NESTOR has been used to formally extend
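The contrast between parent-child links and set inclusion can be illustrated with a small sketch. It is not the formal NESTOR definition, only an assumed toy encoding in which each object is represented by the set containing itself and everything below it, so that "is an ancestor of" becomes "is a proper superset of". All names and the sample hierarchy are hypothetical.

```python
def tree_to_nested_sets(children, root):
    """Represent each node by the set of itself plus all its descendants.

    `children` maps a node to the list of its direct children (the usual
    tree view); the result maps each node to a frozenset (the set view).
    """
    sets = {}

    def collect(node):
        members = {node}
        for child in children.get(node, []):
            members |= collect(child)
        sets[node] = frozenset(members)
        return members

    collect(root)
    return sets

def is_ancestor(sets, a, b):
    """True if b lies strictly below a, tested purely by set inclusion."""
    return sets[b] < sets[a]

# Illustrative archival hierarchy: a fonds with two series, one with a file.
children = {"fonds": ["series1", "series2"], "series1": ["file1"]}
sets = tree_to_nested_sets(children, "fonds")
```

With this encoding, checking whether `file1` belongs under `fonds` needs no traversal of links, only a comparison of two sets.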
13920-625: The yearly Astronomy and Astrophysics Abstracts book series, which continued until the end of the 20th century. The first suggestion of a digital database of journal paper abstracts was made at a conference on Astronomy from Large Data-Bases held in Garching bei München in 1987. An initial version of ADS, with a database consisting of 40 papers, was created as a proof of concept in 1988. The ADS Abstract Service became available for general use via proprietary network software in April 1993, and it
14040-440: Was connected to SIMBAD a few months later. In early 1994 the ADS web-based service was launched, which effectively quadrupled the number of active users in the five weeks following its introduction. In 2011 the ADS launched ADS Labs Streamlined Search which introduced facets for query refinement and selection. In 2013, ADS Labs 2.0 started featuring a new search engine, full-text search functionality, scalable facets, and an API
14160-445: Was expected to grow further to about 900 GB by 2007. No further information has been published (2005). The database initially contained only astronomical references, but has now grown to incorporate three databases, covering astronomy references (including planetary sciences and solar physics), physics references (including instrumentation and geosciences), as well as preprints of scientific papers from arXiv . The astronomy database
14280-479: Was introduced. In 2015, the new ADS, code-named Bumblebee, was released as ADS-beta. The ADS-beta system features a micro-services API and client-side dynamic page loading served on a cloud platform. In May 2018 the beta label was dropped and Bumblebee became the default ADS interface, with some legacy features (ADS Classic) remaining available. Development continues to the present day, with an extensible API available, enabling users to build their own utilities on top of
14400-539: Was reported to involve even the top-tier journals. Specifically the high-ranking journals of medical science, including The Lancet , JAMA and The New England Journal of Medicine , are thought to be associated with such behavior, with up to 30% of citations to these journals being generated by commissioned opinion articles. On the other hand, the phenomenon of citation cartels is rising. Citation cartels are defined as groups of authors that cite each other disproportionately more than they do other groups of authors who work on