

Visualization and Search Approaches for Time-Oriented Scientific Primary Data

A WGL-TIB Project

Jürgen Bernard

Technische Universität Darmstadt

Fachgebiet Graphisch-Interaktive Systeme (GRIS)

Visual Analysis Group

Fraunhoferstraße 5

64283 Darmstadt

Germany

Tel.: +49 (6151) 155 – 666

Fax: +49 (6151) 155 – 669

Email: [email protected]

http://www.gris.tu-darmstadt.de/home/members/bernard/index.de.htm


Outline

  • Motivation

  • Practical Approach

  • Outlook

27.08.2014 | Interactive-Graphics Systems | Visual Search and Analysis Group | Jürgen Bernard | 2


1 Motivation

  • Trends

  • Content-based search and presentation available for many document types

    • Text documents

    • Digital image, video, audio, etc.

  • Repositories for new kinds of non-textual documents

    • Such as scientific primary data

    • PANGAEA, PsychData, Dryad, ELIXIR, KoLaWiss

  • Information overload: need for visual retrieval and data exploration


1 Motivation

  • Time-oriented scientific primary data

  • Massive amounts of time-oriented scientific primary data that may be valuable for future research

  • Heterogeneity of standards across research disciplines

  • Exploration of scientific primary data repositories currently restricted to “meta-search”

PANGAEA PanPlot Tool. (http://doi.pangaea.de/10.1594/PANGAEA.330147)


1 Motivation

  • Development of a Visual Catalog for time series data

  • Interactive-graphic access to huge amounts of scientific primary data

  • Combination of content-based and metadata retrieval

  • Explorative data analysis by searching, browsing and zooming techniques

  • User-adaptive search methods

  • Establish higher data transparency and deeper user comprehension


1 Motivation

  • Possible Use Case Scenario

  • Natural scientist detects interesting curve progression

  • Hypothesis: curve progression indicates future event

  • Search for similar curve progressions in related data sets

  • Visual overview of the most similar data elements

  • Filter result set, adapt the reference example


2.1 Overview

Figure: system overview, divided into Back End and Front End

Bernard, J. and Brase, J. and Fellner, D. and Koepler, O. and Kohlhammer, J. and Ruppert, T. and Schreck, T. and Sens, I.: A Visual Digital Library Approach for Time-Oriented Scientific Primary Data. Accepted at the 14th European Conference on Research and Advanced Technology for Digital Libraries (ECDL), 2010


2.2 Data Import

  • Parse primary data

    • Initially: use PANGAEA data files

    • In principle: inclusion of additional repositories by customized data parsers

  • Define a generic time series data structure

    • Feature based approach

    • Store metadata as a "bag of words"

    • Special attributes: time stamp, parameter/unit

  • Defined database schema

    • MySQL database

    • Efficient data management
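
The generic time series structure described above could look roughly like the following minimal sketch. The class name `TimeSeries`, its field names and the helper `bag_of_words` are assumptions made for illustration; the slides do not specify the actual structure or the database schema.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


def bag_of_words(text: str) -> Counter:
    """Turn free-text metadata into word counts, ignoring case and punctuation."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return Counter(cleaned.split())


@dataclass
class TimeSeries:
    """Hypothetical generic container for one measured parameter."""
    parameter: str                 # special attribute, e.g. "temperature"
    unit: str                      # special attribute, e.g. "degC"
    timestamps: List[datetime]     # special attribute: one time stamp per value
    values: List[float]
    meta_words: Counter = field(default_factory=Counter)  # remaining metadata

    def __post_init__(self) -> None:
        if len(self.timestamps) != len(self.values):
            raise ValueError("timestamps and values must have equal length")


# Illustrative record; the values and metadata text are made up.
series = TimeSeries(
    parameter="temperature",
    unit="degC",
    timestamps=[datetime(2010, 1, 1, 0), datetime(2010, 1, 1, 6)],
    values=[-1.5, 0.25],
    meta_words=bag_of_words("Air temperature, station A (example)"),
)
```

Keeping time stamp, parameter and unit as typed fields while everything else goes into the word counts mirrors the split between special attributes and the bag of words on the slide.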


2.3 Preprocessing

  • Problem: time series primary data may not be directly suitable for clustering; consider:

  • Outliers, noise

  • Missing values

  • Format of time stamps in data files

  • Inhomogeneous time quantization, time stamp compatibility

  • Possible approaches

  • Aggregation (binning)

  • Transformation

  • Interpolation
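
Two of the approaches listed above, aggregation (binning) and interpolation, can be sketched as follows. This is an illustrative sketch under assumptions not stated in the slides: time stamps are already converted to numbers (for example seconds), bins have equal width, and gaps are filled linearly. The function names are hypothetical.

```python
from typing import List, Optional, Sequence


def bin_means(times: Sequence[float], values: Sequence[float],
              start: float, width: float, n_bins: int) -> List[Optional[float]]:
    """Average all values that fall into each of n_bins equal-width time bins.

    Bins without any observation are returned as None (missing value).
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, v in zip(times, values):
        idx = int((t - start) // width)
        if 0 <= idx < n_bins:          # ignore samples outside the binned range
            sums[idx] += v
            counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def interpolate_gaps(binned: Sequence[Optional[float]]) -> List[float]:
    """Fill None entries linearly between neighbours; hold the nearest value at the edges."""
    known = [i for i, v in enumerate(binned) if v is not None]
    if not known:
        raise ValueError("no observations to interpolate from")
    out: List[float] = []
    for i, v in enumerate(binned):
        if v is not None:
            out.append(v)
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out.append(binned[right])
        elif right is None:
            out.append(binned[left])
        else:
            frac = (i - left) / (right - left)
            out.append(binned[left] + frac * (binned[right] - binned[left]))
    return out


# Irregularly sampled input: the third bin receives no sample.
binned = bin_means([0.0, 0.4, 1.2, 3.1, 3.9], [1.0, 3.0, 4.0, 10.0, 12.0],
                   start=0.0, width=1.0, n_bins=4)
filled = interpolate_gaps(binned)   # [2.0, 4.0, 7.5, 11.0]
```

Binning first gives every series the same time quantization, so that series from different files become comparable before clustering.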


2.4 Feature Extraction

  • Descriptors: representations of time series in a reduced dimensionality space

  • Basis for similarity measures (clustering, indexing, search)

  • Problem: full sequence vs. sub sequence search

  • Many descriptors have been published; suitable descriptor approaches include:

    • Binning (current approach), Discrete Fourier Transformation (DFT), Discrete Wavelet Transformation (DWT), SAX descriptor (symbolic representation)

Lin, J. and Keogh, E. and Lonardi, S. and Chiu, B.: A symbolic representation of time series, with implications for streaming algorithms. Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, 2003
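
As an illustration of a symbolic descriptor, the sketch below follows the general SAX idea from Lin et al.: z-normalise the series, reduce it with piecewise aggregate approximation (PAA), and map each segment mean to a letter using breakpoints that cut the standard normal distribution into equally probable regions. It is a simplified sketch, not the project's implementation; the function names, the default alphabet and the requirement that the length be divisible by the segment count are assumptions.

```python
from statistics import NormalDist, mean, pstdev
from typing import List, Sequence


def paa(values: Sequence[float], segments: int) -> List[float]:
    """Piecewise Aggregate Approximation: mean of each of `segments` equal chunks."""
    if len(values) % segments != 0:
        raise ValueError("length must be divisible by the number of segments")
    size = len(values) // segments
    return [mean(values[i * size:(i + 1) * size]) for i in range(segments)]


def sax(values: Sequence[float], segments: int, alphabet: str = "abcd") -> str:
    """Symbolic descriptor: z-normalise, reduce with PAA, map means to letters."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        normalised = [0.0] * len(values)   # constant series: no shape information
    else:
        normalised = [(v - mu) / sigma for v in values]
    k = len(alphabet)
    # k - 1 breakpoints split the standard normal into k equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    word = []
    for segment_mean in paa(normalised, segments):
        index = sum(1 for b in breakpoints if segment_mean >= b)
        word.append(alphabet[index])
    return "".join(word)


word = sax([1, 1, 2, 2, 5, 5, 8, 8], segments=4)   # "aacd"
```

The resulting word has a fixed, short length regardless of the original sampling, which is what makes such descriptors usable as a basis for similarity measures.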


2.5 Visual Catalog

  • Visualization of clustering results (global and local)

  • Self-organizing map (SOM) as a basis for smart visualization of huge amounts of time series

  • Provide a single grid view for details (zooming)

  • We also explore other layout algorithms
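
How a self-organizing map arranges descriptors on a grid can be sketched in a few lines. The sketch below is a minimal, generic SOM in pure Python and not the project's implementation: the grid size, the linearly decaying learning rate, the Gaussian neighbourhood and all function names are assumptions chosen for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = List[float]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(grid: List[List[Vector]],
                       sample: Sequence[float]) -> Tuple[int, int]:
    """Return (row, col) of the grid cell whose prototype is closest to sample."""
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    return min(cells, key=lambda rc: squared_distance(grid[rc[0]][rc[1]], sample))


def train_som(data: Sequence[Sequence[float]], rows: int, cols: int,
              epochs: int = 50, seed: int = 0) -> List[List[Vector]]:
    """Train a small rectangular SOM with a Gaussian neighbourhood."""
    rng = random.Random(seed)
    # Initialise every prototype with a randomly chosen input vector.
    grid = [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]
    steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for sample in rng.sample(list(data), len(data)):   # shuffled pass
            progress = step / steps
            rate = 0.5 * (1.0 - progress)                   # learning rate decays
            radius = max(rows, cols) / 2.0 * (1.0 - progress) + 0.5  # shrinks
            br, bc = best_matching_unit(grid, sample)
            for r in range(rows):
                for c in range(cols):
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    influence = math.exp(-grid_dist2 / (2.0 * radius ** 2))
                    proto = grid[r][c]
                    for i, x in enumerate(sample):
                        proto[i] += rate * influence * (x - proto[i])
            step += 1
    return grid


# Two well-separated groups of toy descriptors.
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
som = train_som(data, rows=2, cols=2)
cell_a = best_matching_unit(som, [0.0, 0.0])
cell_b = best_matching_unit(som, [5.0, 5.0])
```

Each grid cell ends up holding a prototype series, so drawing the prototypes in their cells gives the kind of overview the catalog aims for, with similar series placed in neighbouring cells.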


2.5 Visual Catalog

  • Visual-interactive time series search

  • Query by example, query by sketch

  • Problem: full sequence vs. subsequence search

  • Colormaps for the indication of similarity

  • List-based visualizations for result sets

  • Save session, export results
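
Query by example and the color coding of similarity can be illustrated with a small sketch. It assumes that every series is already represented by a fixed-length descriptor, uses Euclidean distance as the similarity measure, and compares full sequences only; subsequence search is not covered. The function names and the green-to-red color scale are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_by_example(query: Sequence[float],
                     catalog: Dict[str, Sequence[float]],
                     top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank catalog entries by descriptor distance to the query (smallest first)."""
    ranked = sorted(((name, euclidean(query, desc)) for name, desc in catalog.items()),
                    key=lambda item: item[1])
    return ranked[:top_k]


def similarity_color(distance: float, max_distance: float) -> Tuple[int, int, int]:
    """Map a distance to an RGB colour: green for similar, red for dissimilar."""
    t = min(max(distance / max_distance, 0.0), 1.0) if max_distance > 0 else 0.0
    return (round(255 * t), round(255 * (1.0 - t)), 0)


# Toy descriptors; the names are placeholders.
catalog = {"A": [0, 0, 1, 1], "B": [1, 1, 0, 0], "C": [0, 0, 1, 2]}
hits = query_by_example([0, 0, 1, 1], catalog, top_k=2)   # [("A", 0.0), ("C", 1.0)]
```

A sketched query works the same way once the drawn curve has been converted into the same descriptor as the stored series.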


2.5 Visual Catalog

  • Metadata search

  • Additional search functionality, combination with content-based search

  • Search by "bag of words"

  • Search by special attributes

    • Physical unit, example: temperature, humidity, etc.

    • Use time stamp: define time interval and quantization

  • Combine multiple search operations: filtering
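
Combining several search operations by filtering can be sketched as a chain of predicates that a record must all satisfy. The `Record` layout and the filter helpers below are assumptions for illustration, not the project's actual interface.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata record of one time series."""
    name: str
    unit: str
    start: datetime
    end: datetime
    words: FrozenSet[str]


Filter = Callable[[Record], bool]


def has_words(*wanted: str) -> Filter:
    """Keep records whose bag of words contains every wanted term."""
    terms = {w.lower() for w in wanted}
    return lambda rec: terms <= rec.words


def has_unit(unit: str) -> Filter:
    """Keep records measured in the given physical unit."""
    return lambda rec: rec.unit == unit


def overlaps(start: datetime, end: datetime) -> Filter:
    """Keep records whose time range intersects [start, end]."""
    return lambda rec: rec.start <= end and rec.end >= start


def apply_filters(records: Iterable[Record], *filters: Filter) -> List[Record]:
    """Combine search operations: a record must pass all filters."""
    return [rec for rec in records if all(f(rec) for f in filters)]


# Made-up example records.
records = [
    Record("r1", "degC", datetime(2009, 1, 1), datetime(2009, 12, 31),
           frozenset({"arctic", "air"})),
    Record("r2", "degC", datetime(2011, 1, 1), datetime(2011, 12, 31),
           frozenset({"arctic", "sea"})),
    Record("r3", "%", datetime(2009, 6, 1), datetime(2009, 7, 1),
           frozenset({"arctic", "humidity"})),
]
result = apply_filters(records, has_words("Arctic"), has_unit("degC"),
                       overlaps(datetime(2009, 3, 1), datetime(2009, 4, 1)))
```

Because each filter is an independent predicate, a content-based similarity filter could be added to the same chain alongside the metadata filters.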


3 Outlook

  • Summary

  • WGL-TIB project: visual access to scientific primary data

  • Project start: 01/2010, project duration: 3 years

  • Focus on time series data

  • Feature-based descriptor approach

  • Interfaces for visualization, browsing and search


3 Outlook

  • Requirements, use-cases and evaluation model (TIB Hannover)

  • Test visual catalog prototype with real world problems

  • Collaboration with scientific users to capture user view

  • User-in-the-loop approach

  • Technical future work (GRIS, FhG)

  • Establish an application prototype

  • Similarity search operations, based on evaluation results

  • Special user interface


Thank you for your attention

Comments very welcome

Do you have any questions?

Acknowledgements


Related Work

  • PANGAEA: Publishing Network for Geoscientific & Environmental Data. (http://www.pangaea.de/)

  • PsychData: National Repository for Psychological Research Data. (http://psychdata.zpid.de/ (in German))

  • Dryad: Digital Repository for Data Underlying Published Works. (http://www.datadryad.org/)

  • ELIXIR: European Life Sciences Infrastructure for Biological Information. (http://www.elixir-europe.org/)

  • KoLaWiss: Society for Scientific Data Processing Goettingen: Cooperative long-term preservation for research centers (in German). Project Report (2009)

  • PANGAEA PanPlot Tool. (http://doi.pangaea.de/10.1594/PANGAEA.330147)

  • Bernard, J. and Brase, J. and Fellner, D. and Koepler, O. and Kohlhammer, J. and Ruppert, T. and Schreck, T. and Sens, I.: A Visual Digital Library Approach for Time-Oriented Scientific Primary Data. Accepted at the 14th European Conference on Research and Advanced Technology for Digital Libraries (ECDL), 2010

  • Lin, J. and Keogh, E. and Lonardi, S. and Chiu, B.: A symbolic representation of time series, with implications for streaming algorithms. Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, 2003
