I502 Information Management

I502 Information Management. Lecture 1 January 13, 2004. Outline. Growth in information Basic information units and baselines Dimensions of information management. Information Growth. From 1.8 million to 26 million. Information Growth.

Presentation Transcript


  1. I502 Information Management Lecture 1 January 13, 2004

  2. Outline • Growth in information • Basic information units and baselines • Dimensions of information management

  3. Information Growth From 1.8 million to 26 million

  4. Information Growth • In 1951 there were 10,000 journals; now there are about 140,000 • Estimate: printed/conventional information doubles every eight years • How much new information per person? According to the Population Reference Bureau, the world population is 6.3 billion; almost 800 MB of recorded information is produced per person each year. It would take about 30 feet of books to store the equivalent of 800 MB of information on paper * Source: Lyman, Peter and Hal R. Varian, "How Much Information", 2003. Retrieved from http://www.sims.berkeley.edu/how-much-info-2003 on Jan. 10, 2004
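  The per-person figure on slide 4 can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes decimal units (1 MB = 10^6 bytes, 1 EB = 10^18 bytes), and the function names are made up for this example. It multiplies the slide's population by the per-person volume, and converts the "doubles every eight years" estimate into an equivalent yearly growth rate.

```python
# Figures taken from slide 4; decimal units are an assumption of this sketch.
WORLD_POPULATION = 6.3e9        # people
BYTES_PER_PERSON = 800e6        # 800 MB of new recorded information per year
EXABYTE = 10**18                # bytes in one exabyte (decimal)


def total_new_information_eb(population, bytes_per_person):
    """Total new information per year, in exabytes."""
    return population * bytes_per_person / EXABYTE


def annual_growth_rate(doubling_years):
    """Yearly growth rate implied by a fixed doubling time."""
    return 2 ** (1 / doubling_years) - 1


total = total_new_information_eb(WORLD_POPULATION, BYTES_PER_PERSON)
rate = annual_growth_rate(8)
print(f"New information per year: about {total:.2f} EB")
print(f"Doubling every 8 years is about {rate:.1%} growth per year")
```

  The total comes out at roughly 5 EB per year, and a doubling time of eight years corresponds to roughly 9% growth per year.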

  5. Information Growth • In 1999 there were 800 million web pages; now there are at least 3 billion pages (as of this morning!) • Total volume of web content: • Surface web: 167 TB • Deep web: 91,850 TB * Source: Lyman, Peter and Hal R. Varian, "How Much Information", 2003. Retrieved from http://www.sims.berkeley.edu/how-much-info-2003 on Jan. 10, 2004.

  6. A Major Factor Behind Growth • Shifts in major economies in the world • From Agro-Industrial -> Service • Information-driven businesses such as banking, entertainment, computing, publishing dominate

  7. Information Processing -> Competitive Advantage • In an information-based economy, the capability to store, organize, process, and learn from information is critical to surviving in the marketplace

  8. Info growth in relation to Info Technology (timeline figure; years on the axis: 1950, 1960, 1969/70, 1973, 1984, 1991, 1993, 1995, 1998, 1999, 2003) • Random access files introduced; disk drives available • 1969: ARPANET commissioned by DoD, 4 nodes • 1970: Relational Model introduced • Intel 8080 microprocessor: entire CPU on a chip; cost then $400, now $1 or less • DNS introduced; hosts now 1,000; symbolics.com registration ('85) • WWW developed by TBL (Tim Berners-Lee); NSFNET backbone now T3; traffic 1 trillion bytes/month • Feb.: Mosaic introduced by NCSA; by Oct., 500 servers; Nov.: Mosaic for Mac and Wintel • Netscape goes public; Java launched • Internet hosts reach 30 million; WWW sites reach 2,200,000; 320 million web pages; XML becomes a W3C standard • 4.3 million web servers; 800 million web pages • 3 billion web pages indexed by Google

  9. Influence of Info Tech on Growth of Info • Computers make production, manipulation and distribution of data easier … leading to more info • With popularity of computers, data is becoming digital …& there is more of it …

  10. Basic Units of Information • Digital Units of Data • 0 or 1 = 1 bit • 8 bits = 1 byte • 1000 bytes = 1 kilobyte (1 KB): 3 zeros • 1000 KB = 1 megabyte (1 MB): 6 zeros • 1000 MB = 1 gigabyte (1 GB): 9 zeros • 1000 GB = 1 terabyte (1 TB): 12 zeros • 1000 TB = 1 petabyte (1 PB): 15 zeros • 1000 PB = 1 exabyte (1 EB): 18 zeros
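  The unit ladder on slide 10 can be sketched as a pair of small conversion helpers. This is a minimal illustration, not part of the lecture: it follows the slide's decimal convention (each step is a factor of 1000), and the names `UNITS`, `to_bytes`, and `humanize` are chosen here for the example. Many systems instead use binary steps of 1024, which this sketch does not cover.

```python
# Decimal unit ladder from slide 10: each step up multiplies by 1000.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def to_bytes(value, unit):
    """Convert a value in the given unit (e.g. 17, 'GB') to bytes."""
    exponent = 3 * UNITS.index(unit)   # KB -> 3 zeros, MB -> 6 zeros, ...
    return value * 10 ** exponent


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(to_bytes(17, "GB"))        # a DVD, in bytes
print(humanize(650_000_000))     # a CD-ROM
print(humanize(5 * 10**18))      # new information produced in 2002
```

  Applied to the baseline figures from the lecture, the helpers report a CD-ROM as 650 MB and the 2002 total as 5 EB.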

  11. Baselines
      Unit        Amount (bytes)               Example
      Byte        1                            one character
      Kilobyte    30,000                       image of a book page
                  500,000                      a typical novel in text format
      Megabyte    1,400,000                    a 3.5 inch disk
                  10,000,000                   a Mozart symphony, compressed
                  20,000,000                   a digitized scanned book
                  650,000,000                  a CD-ROM disk
      Gigabyte    10,000,000,000               a digitized movie, compressed
                  17,000,000,000               a DVD disk
      Terabyte    10,000,000,000,000           LC Print Collection
      Petabyte    2,000,000,000,000,000        content of all US research libraries
      Exabyte     5,000,000,000,000,000,000    new information produced in 2002 (92% on magnetic media, mostly hard disks)

  12. Growth Numbers to Ponder • In 2001, Walmart's data warehouse (DW) was roughly half the size of the world's largest library (11 TB)

  13. Growth Numbers to Ponder • A high-resolution astronomical camera can generate data equal to about half the size of Walmart's DW in about eight hours!

  14. Possible Consequences of Digitization • Convergence in industry … motivated by multi-purposing of content (diagram by Nicholas Negroponte, MIT; labels: Broadcast, Mass Media, Publishing, Computing, Communication)

  15. Possible Consequences of Growth • Data Glut • Data ≠ knowledge • Data ≠ decision • Interesting book -> Data Smog: Surviving the Information Glut by David Shenk

  16. Possible Consequences of Growth • Computers are double-edged swords -> they help produce more data, but if used properly they can help manage and transform data

  17. Transformation of Data • Data -> Knowledge • Requires a two-pronged approach • IM Macro level - Broad understanding of info management technologies • IM Micro level - Deep understanding of data modeling, organization, retrieval, and analysis

  18. Dimensions of IM Macro Level • Data and collection building • Architecture • Networked access • Users and social impact

  19. Different Types of Data • Need to store and serve text, operational/transactional data, statistics, image, audio, video • Many primary formats, e.g., ASCII, proprietary, PostScript, LaTeX, GIF, JPEG, AIFF, QuickTime ... • Many secondary formats, e.g., PKZIP, uuencode, tar, UNIX compress ...
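  One practical consequence of having many formats is that a system often has to work out what kind of data it has been handed. A common approach, sketched below under stated assumptions, is to inspect the first bytes of a file for a known signature ("magic number"). The function name `sniff_format` and the small signature table are illustrative choices for this example; they cover only some of the formats named on slide 19, and real tools check many more cases.

```python
# Leading-byte signatures for a few of the formats named on slide 19.
SIGNATURES = [
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"%!PS", "PostScript"),
    (b"PK\x03\x04", "PKZIP"),
    (b"\x1f\x9d", "UNIX compress"),
]


def sniff_format(header: bytes) -> str:
    """Guess a format from the first bytes of a file (illustrative only)."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    # AIFF is an IFF container: "FORM", a 4-byte size, then "AIFF".
    if header[:4] == b"FORM" and header[8:12] == b"AIFF":
        return "AIFF"
    # POSIX tar archives carry "ustar" at byte offset 257.
    if header[257:262] == b"ustar":
        return "tar"
    # Fallback: every byte in the 7-bit range suggests plain ASCII text.
    if header and all(byte < 128 for byte in header):
        return "ASCII text"
    return "unknown"


print(sniff_format(b"GIF89a\x01\x00"))
print(sniff_format(b"Plain old text"))
```

  Checking specific signatures before the ASCII fallback matters, because formats such as PostScript are themselves made of ASCII characters.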

  20. Aggregating Data • Databases = structured data • Digital Libraries = both structured and unstructured data • Data Warehouses = extracted, filtered, classified, integrated, and summarized data • Primary data must be accurate; DW data must be “curated”
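  The data-warehouse steps listed on slide 20 (extract, filter, classify, integrate, summarize) can be illustrated with a toy example. The records, field names, and the `summarize` function below are hypothetical and invented for this sketch; a real warehouse would use dedicated loading and query tools rather than a single Python function.

```python
from collections import defaultdict

# Hypothetical operational records, as they might be extracted from two stores.
RAW_RECORDS = [
    {"store": "A", "item": "pen", "amount": "2.50"},
    {"store": "A", "item": "book", "amount": "12.00"},
    {"store": "B", "item": "pen", "amount": "bad"},    # unusable value
    {"store": "B", "item": "book", "amount": "8.00"},
]


def summarize(records):
    """Filter out unusable records, classify by store, and total the amounts."""
    totals = defaultdict(float)
    for record in records:
        try:
            store = record["store"]
            amount = float(record["amount"])
        except (KeyError, ValueError):
            continue                 # filter: skip records that cannot be used
        totals[store] += amount      # classify by store and integrate
    return dict(totals)              # summary: one total per store


print(summarize(RAW_RECORDS))
```

  The record with the unusable amount is dropped rather than guessed at, which is one small instance of the curation the slide calls for.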

  21. Info Architecture • The structure or organization of information can directly influence interaction • IM systems are designed with close attention to navigation, search, and means for access to information • The user interface (UI) is designed with specific attention to user needs and their background • UI provides immediate feedback about organization and supports multiple means of accessing information

  22. Networked Access • Resources may be at different locations • Distributed access is supported, meaning users can get at data from any network-accessible device • Data is also generally available at any time, which implies disintermediation: the absence of a human intermediary between user and data

  23. Embedded in WWW (architecture diagram; labels: Browser, Script, Web Server, Backend systems, Internet, Local Files)

  24. Users and social impact • Remember: technological change is revolutionary, but human change is evolutionary; seek balance • Paper is highly portable and culturally supported • Not everything can be digitized • Sensitivity to human and social issues is needed (HCI and legal issues can be critical)

  25. IM Micro level • Data modeling • Data normalization and relational model (discrete data) • Handling full-text data and non-text data • WWW based IMS • User Interface design
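  As a small preview of the normalization and relational-model topics on slide 25, the sketch below stores each author once and lets books refer to the author by key, then recombines the two tables with a join. It uses Python's built-in `sqlite3` module with an in-memory database. The table and column names, the `build_catalog` function, and the second book title are placeholders invented for this example; only "Data Smog" by David Shenk comes from the lecture.

```python
import sqlite3


def build_catalog():
    """Create two normalized tables in memory and join them back together."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE author (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE book (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            author_id INTEGER NOT NULL REFERENCES author(id)
        );
        """
    )
    # The author's name is stored once; each book points to it by key.
    conn.execute("INSERT INTO author (id, name) VALUES (?, ?)", (1, "David Shenk"))
    conn.executemany(
        "INSERT INTO book (title, author_id) VALUES (?, ?)",
        [("Data Smog", 1), ("Placeholder Title", 1)],  # second title is made up
    )
    rows = conn.execute(
        """
        SELECT book.title, author.name
        FROM book JOIN author ON author.id = book.author_id
        ORDER BY book.title
        """
    ).fetchall()
    conn.close()
    return rows


for title, name in build_catalog():
    print(f"{title} by {name}")
```

  Because the name lives in one row of `author`, correcting it there would update every joined result, which is the main benefit normalization is meant to provide.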
