
Emerging Research Directions in DBs/ISs


jamesmeyer

Presentation Transcript


  1. Emerging Research Directions in DBs/ISs

  2. Outline • Mobile Databases • Multimedia Databases • Geographic Information Systems • Bioinformatics • XML • Data Mining • Data Warehousing • Introduction to ASIS Lab

  3. Mobile Databases • Recent advances in portable and wireless technology have led to mobile computing, a new dimension in data communication and processing. • Portable computing devices coupled with wireless communications allow clients to access data from virtually anywhere and at any time. • A number of hardware and software problems must be resolved before the capabilities of mobile computing can be fully utilized. • Some of the software problems, which may involve data management, transaction management, and database recovery, have their origins in distributed database systems.

  4. Mobile Databases(2) • In mobile computing, these problems are more difficult, mainly because of: • The limited and intermittent connectivity afforded by wireless communications. • The limited life of the power supply (battery). • The changing topology of the network. • In addition, mobile computing introduces new architectural possibilities and challenges.

  5. Mobile Computing Architecture

  6. Mobile Computing Architecture(2) • It is a distributed architecture in which a number of computers, generally referred to as Fixed Hosts and Base Stations, are interconnected through a high-speed wired network. • Fixed hosts are general-purpose computers configured to manage mobile units. • Base stations function as gateways to the fixed network for the Mobile Units.

  7. Data Management Issues • From a data management standpoint, mobile computing may be considered a variation of distributed computing. Mobile databases can be distributed under two possible scenarios: • Scenario 1: The entire database is distributed mainly among the wired components, possibly with full or partial replication. A base station or fixed host manages its own database with DBMS-like functionality, plus additional functionality for locating mobile units and additional query and transaction management features to meet the requirements of mobile environments. • Scenario 2: The database is distributed among wired and wireless components. Data management responsibility is shared among base stations or fixed hosts and mobile units.
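
A minimal sketch of the scenario in which data management responsibility is shared with mobile units: the mobile unit keeps a partial local replica and queues its updates while disconnected. The class and method names (`BaseStation`, `MobileUnit`, `sync`, `reconnect`) are hypothetical and do not come from the slides, and conflict resolution between concurrent writers is deliberately left out.

```python
class BaseStation:
    """Hypothetical wired-side store that a mobile unit synchronizes with."""

    def __init__(self):
        self.data = {}

    def apply(self, updates):
        # Apply queued (key, value) updates in the order they were made.
        for key, value in updates:
            self.data[key] = value


class MobileUnit:
    """Hypothetical mobile client holding a partial local replica."""

    def __init__(self, station):
        self.station = station
        self.replica = {}       # locally cached copy of some items
        self.pending = []       # updates made but not yet propagated
        self.connected = False  # wireless links are intermittent

    def write(self, key, value):
        # Always update the local replica; propagate only when connected.
        self.replica[key] = value
        self.pending.append((key, value))
        if self.connected:
            self.sync()

    def sync(self):
        # Push queued updates to the base station, then clear the queue.
        self.station.apply(self.pending)
        self.pending = []

    def reconnect(self):
        self.connected = True
        self.sync()


station = BaseStation()
unit = MobileUnit(station)
unit.write("order-17", "shipped")  # made while disconnected
offline_view = dict(station.data)  # the base station has not seen it yet
unit.reconnect()                   # connectivity returns, queue is flushed
```

Queuing writes locally is one simple way to cope with intermittent connectivity; a real system would also need a policy for updates that conflict at the base station.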

  8. Data Management Issues(2) • Data management issues as they apply to mobile databases: • Data distribution and replication • Transaction models • Query processing • Recovery and fault tolerance • Mobile database design • Location-based service • Division of labor • Security • M-Commerce

  9. Outline • Mobile Databases • Multimedia Databases • Geographic Information Systems • Bioinformatics • XML • Data Mining • Data Warehousing • Introduction to ASIS Lab

  10. Multimedia Databases • In the years ahead, multimedia information systems are expected to dominate our daily lives. • Our houses will be wired for bandwidth to handle interactive multimedia applications. • Our high-definition TV/computer workstations will have access to a large number of databases, including digital libraries, image and video databases that will distribute vast amounts of multisource multimedia content.

  11. Multimedia Databases (2) • DBMSs have been constantly adding to the types of data they support. • Today many types of multimedia data are available in current systems.

  12. Multimedia Databases(3) • Types of multimedia data available in current systems: • Text: May be formatted or unformatted. For ease of parsing structured documents, standards like SGML and variations such as HTML are being used. • Graphics: Examples include drawings and illustrations that are encoded using descriptive standards (e.g., CGM, PICT, PostScript).

  13. Multimedia Databases(4) • Types of multimedia data available in current systems (contd.): • Images: Includes drawings, photographs, and so forth, encoded in standard formats such as bitmap, JPEG, and MPEG. Compression is built into JPEG and MPEG. • These images are not subdivided into components. Hence querying them by content (e.g., find all images containing circles) is nontrivial. • Animations: Temporal sequences of image or graphic data.

  14. Multimedia Databases(5) • Types of multimedia data available in current systems (contd.): • Video: A set of temporally sequenced photographic data for presentation at specified rates, for example, 30 frames per second. • Structured audio: A sequence of audio components comprising note, tone, duration, and so forth.

  15. Multimedia Databases(6) • Types of multimedia data available in current systems (contd.): • Audio: Sample data generated from aural recordings in a string of bits in digitized form. Analog recordings are typically converted into digital form before storage.

  16. Multimedia Databases(7) • Types of multimedia data available in current systems (contd.): • Composite or mixed multimedia data: A combination of multimedia data types, such as audio and video, which may be physically mixed to yield a new storage format or logically mixed while retaining original types and formats. Composite data also contains additional control information describing how the information should be rendered.

  17. Data Management Issues • Multimedia applications dealing with thousands of images, documents, audio and video segments, and free text data depend critically on: • Appropriate modeling of the structure and content of data • Designing appropriate database schemas for storing and retrieving multimedia information.

  18. Outline • Mobile Databases • Multimedia Databases • Geographic Information Systems • Bioinformatics • XML • Data Mining • Data Warehousing • Introduction to ASIS Lab • Revision

  19. Geographic Information Systems • Geographic information systems (GIS) are used to collect, model, and analyze information describing physical properties of the geographical world.

  20. Geographic Information Systems(2) • The scope of GIS broadly encompasses two types of data: • Spatial data, originating from maps, digital images, administrative and political boundaries, roads, transportation networks, and physical data such as rivers, soil characteristics, climatic regions, and land elevations. • Non-spatial data, such as socio-economic data (like census counts), economic data, and sales or marketing information. • GIS is a rapidly developing domain that offers highly innovative approaches to meet some challenging technical demands.

  21. Geographic Information Systems(3)

  22. Spatial data

  23. GIS Applications • It is possible to divide GISs into three categories: • Cartographic applications • Digital terrain modeling applications • Geographic objects applications

  24. GIS Applications(2) • Cartographic: Irrigation; Crop yield analysis; Land evaluation; Planning and facilities management; Landscape studies; Traffic pattern analysis • Digital Terrain Modeling Applications: Earth science; Civil engineering and military evaluation; Soil surveys; Air and water pollution studies; Flood control; Water resource management • Geographic Objects Applications: Car navigation systems; Geographic market analysis; Utility distribution and consumption; Consumer product and services – economic analysis

  25. Data Management Requirements of GIS • The functional requirements of the GIS applications above translate into the following database requirements.

  26. Data Management Requirements of GIS (2) • Data Modeling and Representation: • GIS data can be broadly represented in two formats: • Vector data represents geometric objects such as points, lines, and polygons.

  27. Data Management Requirements of GIS (3) • Data Modeling and Representation (contd.): • Raster data is characterized as an array of points, where each point represents the value of an attribute for a real-world location. • Informally, raster images are n-dimensional arrays where each entry is a unit of the image and represents an attribute. Two-dimensional units are called pixels, while three-dimensional units are called voxels. • Three-dimensional elevation data is stored in a raster-based digital elevation model (DEM) format.
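
The two representations can be sketched with plain Python structures. This is only an illustration under simple assumptions: the coordinates and elevation values are made up, and the helper names `polygon_area` and `max_elevation` are hypothetical. Real GIS software uses dedicated geometry and raster types.

```python
# Vector data: geometric objects described by coordinates.
point = (106.66, 10.77)                                       # one (x, y) location
line = [(0.0, 0.0), (2.0, 1.0), (4.0, 1.5)]                   # ordered points
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]    # closed ring


def polygon_area(ring):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Raster data: a 2-D array in which each cell (pixel) holds an attribute
# value, here elevation in metres as in a tiny digital elevation model.
dem = [
    [12, 14, 15],
    [11, 13, 17],
    [10, 12, 20],
]


def max_elevation(grid):
    """Highest cell value together with its (row, column) position."""
    return max(
        (value, (r, c))
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
    )


area = polygon_area(polygon)
peak = max_elevation(dem)
```

The vector form stores only the vertices of each object, while the raster form stores a value for every cell, which is why converting between them is a separate integration step.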

  28. Data Management Requirements of GIS (4) • Data Integration: • GISs must integrate both vector and raster data from a variety of sources. • Sometimes edges and regions are inferred from a raster image to form a vector model, or conversely, raster images such as aerial photographs are used to update vector models. • Several coordinate systems such as Universal Transverse Mercator (UTM), latitude/longitude, and local cadastral systems are used to identify locations. • Data originating from different coordinate systems requires appropriate transformations.

  29. Specific GIS Data Operations • GIS applications are conducted through the use of special operators such as the following: • Interpolation • Interpretation • Proximity analysis • Raster image processing • Analysis of networks
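
As one illustration of the operators listed above, proximity analysis can be sketched as "find every site within a given distance of a location". The sketch assumes latitude/longitude coordinates and uses the haversine great-circle distance on a spherical Earth; the site names, coordinates, and the function `within_radius` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(center, sites, radius_km):
    """Names of all sites no farther than radius_km from center."""
    lat0, lon0 = center
    return [
        name
        for name, (lat, lon) in sites.items()
        if haversine_km(lat0, lon0, lat, lon) <= radius_km
    ]


# Made-up sites given as (latitude, longitude) in degrees.
sites = {
    "a": (0.0, 0.5),
    "b": (0.0, 2.0),
    "c": (0.3, 0.0),
}
nearby = within_radius((0.0, 0.0), sites, radius_km=100.0)
```

This linear scan checks every site; a spatial index would avoid that for large datasets.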

  30. Specific GIS Data Operations(2) • The functionality of a GIS database is also subject to other considerations: • Extensibility • Data quality control • Visualization • Such requirements clearly illustrate that standard RDBMSs or ODBMSs do not meet the special needs of GIS. • Therefore it is necessary to design systems that support the vector and raster representations and the spatial functionality as well as the required DBMS features.

  31. Outline • Mobile Databases • Multimedia Databases • Geographic Information Systems • Bioinformatics • XML • Data Mining • Data Warehousing • Introduction to ASIS Lab • Revision

  32. Bioinformatics • The study of genetics can be divided into three branches: • Mendelian genetics is the study of the transmission of traits between generations. • Molecular genetics is the study of the chemical structure and function of genes at the molecular level. • Population genetics is the study of how genetic information varies across populations of organisms. • Bioinformatics addresses information management of genetic information, with special emphasis on DNA sequence analysis. • It is an interdisciplinary research field.
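
To make "DNA sequence analysis" concrete, here is a small sketch of three elementary operations on a DNA string: GC content, reverse complement, and motif search. The sequence is made up and the function names are hypothetical. Production work would normally rely on an established bioinformatics library.

```python
# Watson-Crick base pairing used to build the complementary strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Complementary strand, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif(seq, motif):
    """0-based start positions of every (possibly overlapping) match."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


sequence = "ATGCGCATATGC"  # made-up example sequence
gc = gc_content(sequence)
rc = reverse_complement(sequence)
hits = find_motif(sequence, "ATG")
```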

  33. Outline • Mobile Databases • Multimedia Databases • Geographic Information Systems • Bioinformatics • XML • Data Mining • Data Warehousing • Introduction to ASIS Lab • Revision

  34. XML: Extensible Markup Language • Although HTML is widely used for formatting and structuring Web documents, it is not suitable for specifying structured data that is extracted from databases. • A new language, XML (eXtensible Markup Language), has emerged as the standard for structuring and exchanging data over the Web. • XML can be used to provide more information about the structure and meaning of the data in Web pages, rather than just specifying how the Web pages are formatted for display on the screen. • The formatting aspects are specified separately, for example, by using a formatting language such as XSL (eXtensible Stylesheet Language).

  35. XML (2) • Example 1: • Example 2:

  36. XML (3) • The basic object in XML is the XML document. • There are two main structuring concepts that are used to construct an XML document: • Elements • Attributes • Attributes in XML provide additional information that describes elements.

  37. XML(4) • As in HTML, elements are identified in a document by their start tag and end tag. • The tag names are enclosed between angle brackets <…>, and end tags are further identified by a slash </…>. • Complex elements are constructed from other elements hierarchically, whereas simple elements contain data values. • It is straightforward to see the correspondence between the XML textual representation and the tree structure. • In the tree representation, internal nodes represent complex elements, whereas leaf nodes represent simple elements. • That is why the XML model is called a tree model or a hierarchical model.
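
The correspondence between the text and the tree can be seen by parsing a small document with Python's standard `xml.etree.ElementTree` module. The document below is made up for illustration (the examples on the original slides are not in this transcript), and `describe` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# A made-up document: <project> and <worker> are complex elements,
# <name>, <ssn> and <hours> are simple elements, and id/role are attributes.
document = """
<project id="P1">
  <name>ProductX</name>
  <worker role="lead">
    <ssn>123456789</ssn>
    <hours>32.5</hours>
  </worker>
</project>
"""

root = ET.fromstring(document)


def describe(element, depth=0):
    """Walk the tree, labelling each node as complex or simple."""
    children = list(element)
    kind = "complex" if children else "simple"
    text = (element.text or "").strip()
    lines = [
        f"{'  ' * depth}{element.tag} [{kind}] "
        f"attrs={element.attrib} value={text!r}"
    ]
    for child in children:
        lines.extend(describe(child, depth + 1))
    return lines


outline = describe(root)
# Leaf nodes of the tree are exactly the simple elements.
leaf_tags = [e.tag for e in root.iter() if len(e) == 0]
```

Printing `"\n".join(outline)` shows the indentation following the nesting of start and end tags, which is the tree structure the slide refers to.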

  38. Outline • Mobile Databases • Multimedia Databases • Geographic Information Systems • Bioinformatics • XML • Data Mining • Data Warehousing • Introduction to ASIS Lab • Revision

  39. Definitions of Data Mining • The discovery of new information in terms of patterns or rules from vast amounts of data. • The process of finding interesting structure in data. • The process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data.

  40. Knowledge Discovery in Databases (KDD) • Data mining is actually one step of a larger process known as knowledge discovery in databases (KDD). • The KDD process model comprises six phases: • Data selection • Data cleansing • Enrichment • Data transformation or encoding • Data mining • Reporting and displaying discovered knowledge
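
The six phases can be sketched as a tiny pipeline over made-up shopping records. Everything here is an assumption for illustration: the records, the `REGION` lookup used for enrichment, the support threshold, and the choice of frequent item pairs as the mining step.

```python
from collections import Counter
from itertools import combinations

raw = [
    {"customer": "c1", "items": "milk, bread", "store": "north"},
    {"customer": "c2", "items": "milk, bread, juice", "store": "north"},
    {"customer": "c3", "items": "", "store": "south"},  # empty basket
    {"customer": "c4", "items": "Bread, Milk", "store": "south"},
    {"customer": "c5", "items": "juice, cookies", "store": "north"},
]
REGION = {"north": "urban", "south": "rural"}  # hypothetical external source

# 1. Data selection: keep only the fields relevant to the analysis.
selected = [{"items": r["items"], "store": r["store"]} for r in raw]

# 2. Data cleansing: drop records with no items.
cleaned = [r for r in selected if r["items"].strip()]

# 3. Enrichment: add information from another source.
enriched = [dict(r, region=REGION[r["store"]]) for r in cleaned]

# 4. Data transformation or encoding: normalize items into sets.
encoded = [
    (r["region"], frozenset(i.strip().lower() for i in r["items"].split(",")))
    for r in enriched
]

# 5. Data mining: count how often each pair of items occurs together.
pair_counts = Counter(
    pair
    for _, basket in encoded
    for pair in combinations(sorted(basket), 2)
)
min_support = 3  # assumed threshold
frequent = {pair: n for pair, n in pair_counts.items() if n >= min_support}

# 6. Reporting and displaying discovered knowledge.
report = [f"{a} + {b}: {n} baskets" for (a, b), n in sorted(frequent.items())]
```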

  41. Outline • Mobile Databases • Multimedia Databases • Geographic Information Systems • Bioinformatics • XML • Data Mining • Data Warehousing • Introduction to ASIS Lab • Revision

  42. Data Warehousing • The data warehouse is a historical database designed for decision support. • Data mining can be applied to the data in a warehouse to help with certain types of decisions. • Proper construction of a data warehouse is fundamental to the successful use of data mining. • W. H. Inmon characterized a data warehouse as: • “A subject-oriented, integrated, nonvolatile, time-variant collection of data in support of management’s decisions.”

  43. Data Warehousing (2) • Purpose of Data Warehousing: • Traditional databases are not optimized for data access alone; they have to balance the requirement of data access with the need to ensure the integrity of data. • Most of the time, data warehouse users need only read access, but they need that access to be fast over a large volume of data. • Most of the data required for data warehouse analysis comes from multiple databases, and these analyses are recurrent and predictable enough that specific software can be designed to meet the requirements. • There is a great need for tools that provide decision makers with information to make decisions quickly and reliably based on historical data. • The above functionality is achieved by data warehousing and online analytical processing (OLAP).

  44. Data Warehousing (3) • Applications that a data warehouse supports: • OLAP (Online Analytical Processing) is a term used to describe the analysis of complex data from the data warehouse. • DSS (Decision Support Systems), also known as EIS (Executive Information Systems), support an organization’s leading decision makers in making complex and important decisions. • Data mining is used for knowledge discovery, the process of searching data for unanticipated new knowledge.

  45. Conceptual Structure of Data Warehouse • Data warehouse processing involves: • Cleaning and reformatting of data • OLAP • Data mining

  46. Comparison with Traditional Databases • Data warehouses are mainly optimized for appropriate data access. • Traditional databases are transactional and are optimized for both access mechanisms and integrity assurance measures. • Data warehouses place more emphasis on historical data, as their main purpose is to support time-series and trend analysis. • Compared with transactional databases, data warehouses are nonvolatile. • In transactional databases, the transaction is the mechanism of change to the database. By contrast, information in a data warehouse is relatively coarse-grained, and the refresh policy is carefully chosen, usually incremental.
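
The read-mostly, trend-oriented access pattern described above can be sketched with Python's built-in `sqlite3` module. SQLite is not a data warehouse; it stands in here only to show an OLAP-style roll-up over historical data. The table name `sales_fact`, its columns, and the figures are made up.

```python
import sqlite3

# An in-memory database standing in for a warehouse fact table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_fact (year INTEGER, region TEXT, amount REAL)"
)

# Historical data is loaded in bulk (a periodic refresh), not updated
# row by row through short transactions.
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [
        (2004, "north", 100.0), (2004, "south", 80.0),
        (2005, "north", 120.0), (2005, "south", 90.0),
        (2006, "north", 150.0), (2006, "south", 95.0),
    ],
)
conn.commit()

# Roll-up: aggregate away the region dimension to get yearly totals.
yearly = conn.execute(
    "SELECT year, SUM(amount) FROM sales_fact GROUP BY year ORDER BY year"
).fetchall()

# Trend analysis: year-over-year change in total sales.
growth = [
    (y2, round(t2 - t1, 2))
    for (y1, t1), (y2, t2) in zip(yearly, yearly[1:])
]
```

After the load, every query is a read over coarse-grained aggregates, which is the access pattern a warehouse is tuned for.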

  47. Outline • Mobile Databases • Multimedia Databases • Geographic Information Systems • Bioinformatics • XML • Data Mining • Data Warehousing • Introduction to ASIS Lab • Revision

  48. Introduction to ASIS Lab • Advances in Security & Information Systems Lab (www.cse.hcmut.edu.vn/~asis) • Research Directions (2006-2010) • Information Systems Security: • Database Security • Security Issues in E-/M-Commerce • Security and Privacy in Location-Based Applications • Security Issues in Outsourced Database Services • DBs/ISs Security Visualization • E-Learning Systems Security • Digital Watermarking and Steganography • Privacy and Identity Management

  49. Introduction to ASIS Lab(2) • Research Directions (2006-2010) (cont.) • Advanced Information Systems: • E-/M-Commerce • SOA-Based Modern Information Systems • Large Database Systems • Web Information Systems • Modern Information Retrieval Systems • Stream Data Management* • Bioinformatics*

  50. Outline • Mobile Databases • Multimedia Databases • Geographic Information Systems • Bioinformatics • XML • Data Mining • Data Warehousing • Introduction to ASIS Lab
