Current research information as part of digital libraries and the heterogeneity problem. Integrated searches in the context of databases with different content analyses. CRIS2002, Kassel. Jürgen Krause, University of Koblenz-Landau and Social Science Information Centre (IZ-Bonn)
Lennéstr. 30, 53113 Bonn, Germany
“It doesn’t matter what you want to know, there are people in the Internet who already have this knowledge and want to help you”
(Hahn, 1999: 107)
Out of the estimated 800 million pages on around three million servers, only 6% relate to the fields of science and education (by comparison: 1.5% relate to pornography).
NEC 2000: thousand million
When used for specialist information retrieval (IR), general WWW search engines run counter to nearly every criterion which actually permits a successful search based on IR knowledge. This involves all the main components of an IR system: the database and its selection, the use of research logic and user expectations. Based on his/her knowledge of these aspects, the user should develop the best possible research strategy, something which is impossible with WWW search engines.
Nevertheless, WWW search engines have one advantage compared with current specialist databases: embedded in an enormous volume of irrelevant data is data which is not found in specialist databases and which may be of value to experts. This means that it is simply not possible to return to the recommendation to narrow down the search to the original specialist databases. New ways have to be found to make research, including WWW sources, more satisfactory than is the case at present using general WWW search engines.
In addition to technological integration:
Descriptor A in one such system: wide range of meanings
ViBSoz “Social Science Virtual Library”, Virtual Library Project of the German Research Association (DFG)
CARMEN “Content Analysis, Retrieval and Metadata: Effective Networking”, special support program of the German Ministry of Education and Research (BMBF)
ELVIRA “Electronic Retrieval and Analysis System for Industrial Associations”, funded by the German Federal Ministry of Economics and Technology
ETB “The European Schools Treasury Browser”, funded by the European Commission
U.S. Bureau of the Census: Integrated Information solutions – The future of census bureau data access and dissemination, Sept. 1999. Working paper
“Recent surveys of Census Bureau customers show that two out of three use multiple data sets. ... If we continue to saddle data users with the burden of putting data from disparate sources into digestible forms, we do it at the risk of our own peril.” (p.2)
“Solutions of these issues ... will revolve around the further development of standards, metadata ...” (p.3)
“IIS will help minimize data user burden, data uncertainty and maximize data quality and usefulness through the use of metadata” (p.2)
“Strategie für die Standardisierung der Informations- und Kommunikationstechnik (ICT)” [Strategy for the Standardization of Information and Communication Technology (ICT)] (DIN Berlin 2002, draft)
... It is ... necessary to find a new concept relating to the still existing demand for consistency retention and interoperability. This concept can be described by means of the following premise: standardization must be considered in terms of the remaining heterogeneity. Only joint interaction between intellectual and automatic processes for the treatment of heterogeneity and standardization will produce a solution strategy which also ensures, under present-day marginal conditions, usable consistency and interoperability conditions.
(translation from German)
extract metadata from various document formats algorithmically
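As a rough illustration of what algorithmic metadata extraction can look like for one document format, the following minimal sketch pulls the title and any `<meta name="..." content="...">` pairs out of an HTML page using only the Python standard library. The class and function names (`MetaExtractor`, `extract_metadata`) are hypothetical and chosen for this example; they are not taken from any of the projects named in this presentation, and real extraction for heterogeneous formats would need considerably more heuristics.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            # Lower-case the field name so "DC.Creator" and "dc.creator" match.
            self.metadata[attrs["name"].lower()] = attrs["content"].strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def extract_metadata(html):
    """Return a dict of metadata fields found in an HTML string (illustrative only)."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    title = "".join(parser._title_parts).strip()
    # Fall back to the <title> element only if no explicit "title" meta field exists.
    if title and "title" not in parser.metadata:
        parser.metadata["title"] = title
    return parser.metadata
```

Calling `extract_metadata(page_source)` on a page returns a plain dictionary, which could then be mapped onto whatever metadata schema the integrating system uses.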
PACS 62.30.+d Mechanical and elastic waves; vibrations (Mechanische und elastische Wellen, Schwingungslehre)
MSC 74S15 Boundary element methods (Randelementmethode)
PACS 62. Not connected
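Dominanz (dominance), Messen (measuring), Mongolei (Mongolia), Nichtregierungsorganisation (non-governmental organization), Flugzeug (aircraft), Datenaustausch (data exchange), Kommunikationsraum (communication space), Kommunikationstechnologie (communication technology), Medienpädagogik (media education)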
Number of additional relevant hits
Share of additional relevant hits among the additional hits
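The two evaluation measures named here, the number of additional relevant hits and their share among all additional hits, can be computed as in the following minimal sketch. It assumes that "additional" means documents returned by the expanded (transferred) query but not by the original query, and that relevance judgments are available as a set of document IDs; both the function name and this reading of the measures are assumptions made for illustration.

```python
def additional_hit_stats(baseline_hits, expanded_hits, relevant):
    """Return (number of additional relevant hits, their share of all additional hits).

    baseline_hits: document IDs found by the original query
    expanded_hits: document IDs found after term transfer / query expansion
    relevant:      document IDs judged relevant (assumed to be given)
    """
    additional = set(expanded_hits) - set(baseline_hits)
    additional_relevant = additional & set(relevant)
    # Avoid division by zero when the expansion adds no new documents.
    share = len(additional_relevant) / len(additional) if additional else 0.0
    return len(additional_relevant), share
```

For example, if the expansion adds four new documents and two of them are relevant, the function returns `(2, 0.5)`.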
non-differentiated handling of vagueness
document term sets
V1: Handling of vagueness between questions and terms
document term sets
Bilateral handling of vagueness
LSI and Transformation network x Statistical methods
Fig. 3: Transformation network USB Thesaurus to IZ Thesaurus (Fig. 7-12 from Mandl 2000:206)
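To make the idea of a statistical transfer between two indexing vocabularies more concrete, here is a minimal sketch based on simple co-occurrence counts. It is not the transformation network from Mandl (2000) and not LSI; it is a plain conditional-probability stand-in. It assumes a set of documents that have been indexed with both vocabularies, and the names `build_transfer_table`, `translate_query` and the threshold `min_prob` are hypothetical.

```python
from collections import Counter, defaultdict


def build_transfer_table(doubly_indexed_docs, min_prob=0.3):
    """Estimate P(target term | source term) from documents indexed in both vocabularies.

    doubly_indexed_docs: iterable of (source_terms, target_terms) pairs, one per document.
    Returns {source_term: [(target_term, probability), ...]}, sorted by descending probability.
    """
    source_counts = Counter()
    pair_counts = defaultdict(Counter)
    for source_terms, target_terms in doubly_indexed_docs:
        for s in set(source_terms):
            source_counts[s] += 1
            for t in set(target_terms):
                pair_counts[s][t] += 1

    table = {}
    for s, targets in pair_counts.items():
        scored = [(t, n / source_counts[s]) for t, n in targets.items()]
        # Keep only target terms that co-occur often enough with the source term.
        scored = [(t, p) for t, p in scored if p >= min_prob]
        table[s] = sorted(scored, key=lambda tp: (-tp[1], tp[0]))
    return table


def translate_query(query_terms, table):
    """Map query terms from the source vocabulary to target-vocabulary terms."""
    translated = []
    for s in query_terms:
        for t, _ in table.get(s, []):
            if t not in translated:
                translated.append(t)
    return translated
```

One such table would be built per pair of vocabularies, which mirrors the bilateral character of the transfer modules: each module handles the vagueness between exactly two content analysis schemes rather than between the query and all document sets at once.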
Today's search engines do not adequately solve the problem of a worldwide search for relevant documents and data within a specialized scientific community. They represent only an incomplete, albeit valuable, first step. Users want to interlink literature and research project databases with the catalogues of virtual libraries, the WWW homepages of scientific institutions and fact sources, e.g. data archives with their survey data. Integration should therefore not be performed only on a technical level or solely through intellectually created links, as is the case at present. A key role is played by automatic transfer between the different content analysis methods and standardizations of the document sets to be integrated. Based on the initial empirical results of several IZ projects, the proposed strategy appears highly promising: vagueness is not treated non-specifically as a single transfer between all documents and the query, but is handled in a cognitively plausible way by individual bilateral modules.