Information life-cycle and visualization and check-in for project definitions. Peter Fox Xinformatics – ITEC 6961/CSCI 6960/ERTH-6963-01 Week 9/10, April 13, 2010. Contents. Review of last class, reading Information life-cycle Information visualization Checking in for project definitions


Information life-cycle and visualization and check-in for project definitions

Peter Fox

Xinformatics – ITEC 6961/CSCI 6960/ERTH-6963-01

Week 9/10, April 13, 2010


Contents

  • Review of last class, reading

  • Information life-cycle

  • Information visualization

  • Checking in for project definitions

  • Discussion of reading

  • Next class


And yet only one part of the life cycle of data


Definitions

  • Life-cycle elements

    • Acquisition: Process of recording or generating a concrete artefact from the concept (see transduction)

    • Curation: The activity of managing the use of data from its point of creation to ensure it is available for discovery and re-use in the future (http://www.dcc.ac.uk/FAQs/data-curator)

    • Preservation: Process of retaining usability of data in some source form for intended and unintended use

    • Stewardship: Process of maintaining integrity across acquisition, curation and preservation


Definitions ctd.

  • Management: Process of arranging for discovery, access and use of data, information and all related elements. Also oversees or effects control of processes for acquisition, curation, preservation and stewardship. Involves fiscal and intellectual responsibility.


The nature of the challenge

  • To architect information systems today

    • You may play many roles

    • You may not get all the metadata or information you need even if you get the data

    • You will need skills that you were not taught

  • To work with end-users today

    • You may have lots of technical experience

    • You will need new skills in addressing the changing use of data and information

    • One ‘size’ does not fit all


Many views of the information life-cycle


Acquisition

  • Learn / read what you can about the developer of the means of acquisition

    • Documents may not be easy to find

    • Remember bias!!!

  • Document things as you go

  • Have a checklist (the Management list) and review it often


Curation (partial)

  • Consider the organization and presentation of the data

  • Document what has been (and has not been) done

  • Consider and address the provenance to date; you are now THE next person

  • Be as technology-neutral as possible

  • Look to add metainformation


Preservation

  • Usually refers to the full life cycle

  • Archiving is a component

  • Stewardship is the act of preservation

  • Intent is that ‘you can open it any time in the future’ and that ‘it will be there’

  • This involves steps that may not be conventionally thought of

  • Think 10, 20, 50, 200 years… Looking historically gives some guide to future considerations


Remember

  • The life cycle applies within and before and after your use case…

  • So, let’s look in a little more detail


How the information is created

  • Systemic

  • Environmental

  • Trial-and-error (or ad-hoc)


How the information is delivered

  • One-to-many presentation:

    • White paper

    • Web site FAQ

    • Web site informational

    • Web site directed to a specific Web site (link sent by e-mail, and so on)

    • Application-based delivery via managed expert system

  • One-to-one presentation:

    • Word of mouth

    • Ad-hoc communication


How the information is managed

  • Complexity of the information

  • Complexity of the creation process

  • Complexity of the management system

  • Financial impact of IP/IC creation


Type of information created

  • Tacit (created and stored informally):

    • Human memory

    • Local hard drive of the computer

    • Expert system (moving tacit information into a formalized structure)

  • Explicit (created and stored formally):

    • Network share

    • Network Web site/intranet

    • Informal knowledge-management system

    • Document-management system

    • Formal KM system

  • Value of the source

  • Age of the information

  • Proximity of the information to the consumer

  • Source of the information, and previous interactions with that specific source


Value of the source

  • Age of the information

  • Proximity of the information to the consumer

  • Source of the information, and previous interactions with that specific source


Mostly Technical Issues

  • Data Preservation

    • Bit-level integrity

    • Data readability

  • Documentation

  • Metadata

  • Semantics

  • Persistent Identifiers

  • Virtual Data Products

  • Lineage Persistence

  • Required ancillary data

  • Applicable standards
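
Bit-level integrity is commonly checked by recording a checksum when a file is archived and recomputing it later. The following is a minimal sketch of that idea, assuming SHA-256 as the checksum and a hypothetical file name; neither choice comes from the slides.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_digest):
    """True if the file's current digest matches the one recorded earlier."""
    return sha256_of_file(path) == recorded_digest


def demo():
    """Write a small file, record its digest, then flip one bit and re-check."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "observations.dat")  # hypothetical file name
        with open(path, "wb") as handle:
            handle.write(b"temperature,12.5\n")
        recorded = sha256_of_file(path)
        before = is_unchanged(path, recorded)
        with open(path, "r+b") as handle:
            first = handle.read(1)
            handle.seek(0)
            handle.write(bytes([first[0] ^ 0x01]))  # simulate a single flipped bit
        after = is_unchanged(path, recorded)
    return before, after
```

A matching digest only shows that the bits are intact. Data readability is a separate question: the format still has to be understood by whatever software exists at the time.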


Mostly Non-Technical Issues

  • Policy (constrained by money…)

    • Front end of the lifecycle

      • Long-term planning, data formats, documentation...

    • Governance and policy

    • Legal requirements

    • Archive to archive transitions

  • Money (intertwined with policy)

    • Cost-benefit trades

    • Long-term needs of programs

    • User input

      • Identifying likely users

    • Levels of service

    • Funding source and mechanism


Life cycle is a complex issue

  • Must be managed

  • Documented

  • As part of the use case, but also outside it


Information Visualization

  • Questions to keep in mind

    • What is the improvement in the understanding as compared to the situation without visualization?

    • Which visualization techniques are suitable for one's data/information?


Why visualization?

  • Reducing amount of data, quantization

  • Patterns

  • Features

  • Events

  • Trends

  • Irregularities

  • Exit points for analysis

  • Leading to presentation of data

  • Recall – cognitive science and the mental representation??!!??
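
As an illustration of the first item, reducing the amount of data by quantization, here is a minimal sketch that maps continuous values onto a small number of evenly spaced levels. The function name and the choice of uniform bins are assumptions made for the example.

```python
def quantize(values, levels):
    """Map each value to one of `levels` evenly spaced bin indices (0 .. levels - 1)."""
    low, high = min(values), max(values)
    if high == low:
        return [0 for _ in values]
    width = (high - low) / levels
    # The maximum value would land in bin `levels`, so clamp it into the top bin.
    return [min(int((value - low) / width), levels - 1) for value in values]
```

With two levels, the series 0, 1, 2, 3, 4 collapses to the bin indices 0, 0, 1, 1, 1. The detail within each bin is lost, which is the price of the reduction.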


Types of visualization

  • Color coding (including false color)

  • Classification of techniques is based on

    • Dimensionality

    • Information being sought, i.e. purpose

  • Line plots

  • Contours

  • Surface rendering techniques

  • Volume rendering techniques

  • Animation techniques

  • Non-realistic, including ‘cartoon/artist’ style
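
False color coding can be sketched as a function from a scalar value to a color. The example below assumes a simple linear blue-to-red ramp; the ramp and the function name are illustrative choices, not something prescribed by the slides.

```python
def false_color(value, low, high):
    """Map a scalar to an (R, G, B) tuple on a linear blue-to-red ramp."""
    if high <= low:
        raise ValueError("high must be greater than low")
    fraction = (value - low) / (high - low)
    fraction = max(0.0, min(1.0, fraction))  # clamp out-of-range values
    red = round(255 * fraction)
    blue = 255 - red
    return (red, 0, blue)


def color_grid(grid, low, high):
    """Apply the ramp to a 2D list of values, giving a 2D list of RGB tuples."""
    return [[false_color(value, low, high) for value in row] for row in grid]
```

The colors carry no physical meaning of their own, so the value range and the ramp used should be recorded alongside the resulting image.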


Image (aka Raster) file formats

  • CGM, the Computer Graphics Metafile, has been an ISO standard since 1987. It has the capability to encompass both graphical and image data.

  • PostScript, or more specifically the Encapsulated PostScript Format (EPSF), is a page description language with sophisticated text facilities. For graphics, it tends to be more expensive than CGM in terms of storage.


Image file formats

  • TIFF, the Tagged Image File Format, encompasses a range of different formats, originally designed for interchange between electronic publishing packages.

  • GIF, the Graphics Interchange Format, is quite widespread and can encode a number of separate images of different sizes and colors.

  • PNG, the Portable Network Graphics format


Image file formats

  • RGB, the Red Green Blue format of Silicon Graphics, is used by most visualization software packages as the internal image format. The format consists of a header containing the dimensions of the image, followed by the actual image data.

  • The image data is stored as a 2D array of tuples. Each tuple is a vector with 3 components: R, G, and B. The RGB components determine the color of every pixel (picture element) in the image.


Image file formats

  • PPM, the Portable Pixmap Format (24 bits per pixel); PGM, the Portable Graymap Format (8 bits per pixel); and PBM, the Portable Bitmap Format (1 bit per pixel) are pixel-based formats distributed with the X-Window system (version 11.4).
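
To make the three pixel depths concrete, here is a minimal sketch that encodes one small image, held as a 2D list of (R, G, B) tuples, in the binary variants of each format. The grey conversion (a plain average of the three channels) and the black/white threshold are assumptions chosen for simplicity.

```python
def ppm_bytes(pixels):
    """Encode a 2D list of (R, G, B) tuples as binary PPM (P6), 24 bits per pixel."""
    height, width = len(pixels), len(pixels[0])
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    body = bytes(channel for row in pixels for pixel in row for channel in pixel)
    return header + body


def pgm_bytes(pixels):
    """Encode the image as binary PGM (P5), 8 bits per pixel (grey = channel average)."""
    height, width = len(pixels), len(pixels[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    body = bytes(sum(pixel) // 3 for row in pixels for pixel in row)
    return header + body


def pbm_bytes(pixels, threshold=128):
    """Encode the image as binary PBM (P4), 1 bit per pixel, where 1 means black."""
    height, width = len(pixels), len(pixels[0])
    header = f"P4\n{width} {height}\n".encode("ascii")
    body = bytearray()
    for row in pixels:
        bits = [1 if sum(pixel) // 3 < threshold else 0 for pixel in row]
        bits += [0] * (-len(bits) % 8)  # each row is padded to a whole byte
        for start in range(0, len(bits), 8):
            byte = 0
            for bit in bits[start:start + 8]:
                byte = (byte << 1) | bit  # most significant bit first
            body.append(byte)
    return header + bytes(body)
```

In all three cases a short text header gives the dimensions and the image data follows, so the files are easy to generate and to inspect by hand.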


Image file formats

  • XBM is the X-Window one Bit image file format, which has been standardized by the MIT X-consortium.

  • A major constraint on the use of images is the large data volume which has to be dealt with.

  • Large sets of image data can have severe implications for storage, memory, and transmission costs.

  • Therefore, compression techniques are very important.

  • Compression methods fall into two categories, based on whether or not it is possible to reconstruct the initial picture after compression.


Compression (any format)

  • Lossless compression methods are methods for which the original, uncompressed data can be recovered exactly. Examples of this category are Run-Length Encoding (RLE) and the Lempel-Ziv-Welch (LZW) algorithm.

  • Lossy compression methods, in contrast, are methods for which the original data cannot be recovered exactly after compression. An example of this category is the Color Cell Compression method.

  • Lossy compression techniques can reach reduction rates of 0.9, whereas lossless compression techniques normally have a maximum reduction rate of 0.5.
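
Run-length encoding is simple enough to sketch in full, which also shows what "recovered exactly" means in practice. The byte layout below (pairs of count and value, with counts capped at 255) is one of several possible layouts, and the definition of reduction rate as the fraction of the original size that is removed is an assumption, since the slides do not define the term.

```python
def rle_encode(data):
    """Run-length encode bytes as (count, value) pairs, with counts capped at 255."""
    encoded = bytearray()
    index = 0
    while index < len(data):
        value = data[index]
        run = 1
        while index + run < len(data) and data[index + run] == value and run < 255:
            run += 1
        encoded += bytes([run, value])
        index += run
    return bytes(encoded)


def rle_decode(encoded):
    """Invert rle_encode, expanding each (count, value) pair."""
    decoded = bytearray()
    for index in range(0, len(encoded), 2):
        count, value = encoded[index], encoded[index + 1]
        decoded += bytes([value]) * count
    return bytes(decoded)


def reduction_rate(original, compressed):
    """Fraction of the original size removed: 1 - compressed / original (assumed definition)."""
    return 1 - len(compressed) / len(original)
```

On data with long runs, such as 90 zero bytes followed by 10 bytes of 255, the encoding shrinks 100 bytes to 4. On data with no runs it doubles the size, giving a negative reduction rate, so the rate achieved by a lossless method depends heavily on the content.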


Vector formats

  • PostScript

  • PDF

  • SVG

  • ‘Shape files’

  • CGM (also)


Animation formats

  • MPEG

  • AVI

  • QT (QuickTime)

  • WMV

  • Animated GIF


Remember - metadata

  • Many of these formats already contain metadata or fields for metadata; use them!
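
PNG is one format with such fields: it allows textual key/value pairs to be stored in tEXt chunks next to the image data. The sketch below builds a 1x1 PNG by hand with a few text fields and reads them back, using only the standard library. The field values are hypothetical, and a real workflow would more likely use an imaging library than assemble chunks manually.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(kind, data):
    """Build one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(kind + data)
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png_with_text(fields):
    """Return a 1x1 red PNG carrying one tEXt chunk per (keyword, text) pair."""
    header = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)  # 1x1, 8-bit RGB
    scanline = b"\x00" + bytes([255, 0, 0])  # filter type 0, then one red pixel
    parts = [PNG_SIGNATURE, chunk(b"IHDR", header)]
    for keyword, text in fields.items():
        payload = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
        parts.append(chunk(b"tEXt", payload))
    parts.append(chunk(b"IDAT", zlib.compress(scanline)))
    parts.append(chunk(b"IEND", b""))
    return b"".join(parts)


def read_text_chunks(png):
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    offset = len(PNG_SIGNATURE)
    while offset < len(png):
        (length,) = struct.unpack(">I", png[offset:offset + 4])
        kind = png[offset + 4:offset + 8]
        data = png[offset + 8:offset + 8 + length]
        if kind == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        offset += 12 + length  # 4 length + 4 type + data + 4 CRC
    return found
```

Because the text travels inside the file, it survives copying and renaming, which a separate notes file does not guarantee.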


Tools

  • Conversion

    • Imtools

    • GraphicConverter

    • Gnu convert

    • Many more

  • Combination/Visualization

    • IDV

    • Gnuplot

    • http://disc.sci.gsfc.nasa.gov/giovanni


New modes

  • http://www.actoncopenhagen.decc.gov.uk/content/en/embeds/flash/4-degrees-large-map-final

  • http://www.smashingmagazine.com/2007/08/02/data-visualization-modern-approaches/

  • Many modes:

    • http://www.siggraph.org/education/materials/HyperVis/domik/folien.html


Visualization


Managing visualization products

  • The importance of a ‘self-describing’ product

  • Visualization products are not just consumed by people

  • How many image and graphics files do you have on your computer for which the origin, purpose and use are still known?

  • How are these logically organized?
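
Where a format has no room for metadata, one way to approximate a self-describing product is a small machine-readable record stored beside the file. This is a minimal sketch of such a sidecar record; the field names and the use of JSON are assumptions for illustration, not a standard.

```python
import hashlib
import json


def describe_product(image_bytes, origin, purpose, created):
    """Build a metadata record that can travel with a visualization file."""
    return {
        "origin": origin,
        "purpose": purpose,
        "created": created,
        "sha256": hashlib.sha256(image_bytes).hexdigest(),  # ties the record to the file
        "size_bytes": len(image_bytes),
    }


def to_sidecar_json(record):
    """Serialize the record as stable, human-readable JSON."""
    return json.dumps(record, indent=2, sort_keys=True)
```

A record like this can be read by software as well as by people, which matters because visualization products are not only consumed by people.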


Discovery of visualizations

  • When represented as images:

    • Image-based type free text search?

    • Referred to in publications (articles, books, web pages)

  • Vector graphics:

    • PostScript or PDF

    • SVG

    • Others?

  • What makes this easy or hard or impossible?
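
One thing that makes discovery easier for some vector formats is that their text is stored as text. SVG is XML, so titles, descriptions and labels can be pulled out and indexed with ordinary tools, as this minimal sketch shows. The example document is hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

EXAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">
  <title>Monthly rainfall</title>
  <desc>Line plot of a hypothetical rainfall series</desc>
  <polyline points="0,80 50,40 100,60 150,20" fill="none" stroke="black"/>
  <text x="10" y="95">Month</text>
</svg>"""


def searchable_text(svg_source):
    """Collect the text of <title>, <desc> and <text> elements, in document order."""
    root = ET.fromstring(svg_source)
    wanted = {SVG_NS + "title", SVG_NS + "desc", SVG_NS + "text"}
    return [
        element.text.strip()
        for element in root.iter()
        if element.tag in wanted and element.text and element.text.strip()
    ]
```

A raster image of the same plot holds only pixels, so the same words would have to come from embedded metadata, surrounding publications, or image analysis.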


Discussion

  • About life-cycle in general?

  • Visualization?


Reading for this week

  • Is retrospective


Check in for Project Assignment

  • Analysis of existing information system content and architecture, critique, redesign and prototype redeployment


What is next

  • Week 11 – Information and Workflow Management

  • Week 12 – Information Discovery, Information Integration

