Data Management Information Management Knowledge Management for Network Centric Operations

Dr. Bhavani Thuraisingham, The University of Texas at Dallas. October 2005.


Presentation Transcript


  1. Data Management Information Management Knowledge Management for Network Centric Operations
     Dr. Bhavani Thuraisingham, The University of Texas at Dallas
     October 2005

  2. Data, Information and Knowledge Management: Definitions
     • Knowledge Management: acquiring knowledge; collaboration and sharing; managing the processes; disseminating the knowledge; taking action
     • Information Management: extracting information from the data; visualizing the data
     • Data Management: data administration; database management

  3. What is data management?
     • One proposal: Data Management = Database System Management + Data Administration
     • Includes data analysis, data administration, database administration, auditing, data modeling, database system development, and database application development

  4. Data Administration
     • Identifying the data
       • Data may be in files, paper, databases, etc.
     • Analyzing the data
       • Is the data of good quality?
       • Is the data complete?
     • Data standardization
       • Should one standardize all the data elements and metadata?
       • Repositories for handling semantic heterogeneity?
     • Data security
       • How should data be secured?
     • Data modeling
       • Structure the data, model the data and the processes

  5. Data Administration (Continued)
     • Data quality provides some measure for determining the accuracy of the data
       • Is the data current? Can we trust the source?
     • Data quality parameters can be passed from source to source
       • E.g., Trust A 50% and Trust B 30%
     • Data may have different semantics
       • E.g., Bank A may send out statements on the 20th day of each month and Bank B may send out statements on the 5th day of each month
       • A fighter jet and a passenger plane may be considered to be one and the same

  6. Data Administration (Concluded)
     • Data standards
       • Standards for data semantics and administration
       • E.g., XML (eXtensible Markup Language) for document interchange
     • Data security includes data confidentiality and integrity
       • Confidentiality is about preventing unauthorized access to the data
       • Integrity is about preventing malicious corruption of the data

  7. An Example Database System

  8. Metadata
     • Metadata describes the data in the database
       • Example: Database D consists of a relation EMP with attributes SS#, Name, and Salary
     • Metadatabase stores the metadata
       • Could be physically stored with the database
     • Metadatabase may also store constraints and administrative information
     • Metadata is also referred to as the schema or data dictionary
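
The EMP example can be tried out with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in database; the column types and the choice of SQLite are assumptions made for illustration, not part of the slide. SQLite keeps its own catalog (`sqlite_master`), which plays the role of the metadatabase here.

```python
import sqlite3

# In-memory database D holding the relation EMP from the slide.
# "SS#" is quoted because '#' is not allowed in a bare SQL identifier.
# The column types are assumed; the slide only names the attributes.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE EMP ("SS#" TEXT PRIMARY KEY, Name TEXT NOT NULL, Salary INTEGER)'
)

# The catalog is itself queryable: list the relations in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(EMP)")]

print(tables)
print(columns)
```

The constraints (`PRIMARY KEY`, `NOT NULL`) are stored in the same catalog, which matches the point that the metadatabase may also hold constraints.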

  9. Three-level Schema Architecture: Details (diagram, top to bottom)
     • Users A1, A2 and A3 use External Model A (External Schema A); Users B1 and B2 use External Model B (External Schema B)
     • External/Conceptual Mapping A and External/Conceptual Mapping B
     • Conceptual Model (Conceptual Schema)
     • Conceptual/Internal Mapping
     • Internal Model (Internal Schema)
     • Stored Database
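
One way to get a feel for the three levels is to model them with SQL views, sketched here with Python's `sqlite3`. This is an illustrative mapping under stated assumptions: the table, the view names, and the sample rows are hypothetical, with views standing in for external schemas, the base table for the conceptual schema, and an index for an internal-level choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: the full logical relation (hypothetical columns).
conn.execute(
    "CREATE TABLE EMP (ssn TEXT PRIMARY KEY, name TEXT, salary INTEGER, dept TEXT)"
)
conn.executemany("INSERT INTO EMP VALUES (?, ?, ?, ?)", [
    ("111", "Ann", 50000, "D1"),
    ("222", "Bob", 60000, "D2"),
])

# External schema A: a directory view that hides salaries.
conn.execute("CREATE VIEW EMP_DIRECTORY AS SELECT name, dept FROM EMP")
# External schema B: a payroll view that exposes only ssn and salary.
conn.execute("CREATE VIEW EMP_PAYROLL AS SELECT ssn, salary FROM EMP")

# Internal-level change: adding an index alters storage and access paths,
# but neither the conceptual table definition nor the views change.
conn.execute("CREATE INDEX emp_dept_idx ON EMP (dept)")

directory = conn.execute(
    "SELECT name, dept FROM EMP_DIRECTORY ORDER BY name").fetchall()
payroll = conn.execute(
    "SELECT ssn, salary FROM EMP_PAYROLL ORDER BY ssn").fetchall()
print(directory)
print(payroll)
```

Each view definition acts as an external/conceptual mapping: users of one view never see the columns reserved for the other.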

  10. Functional Architecture (diagram)
      • Data Management: User Interface Manager; Schema (Data Dictionary) Manager (metadata); Security/Integrity Manager; Query Manager; Transaction Manager
      • Storage Management: File Manager; Disk Manager

  11. Types of Database Systems
      • Relational Database Systems
      • Distributed and Federated Database Systems
      • Object Database Systems
      • Deductive Database Systems
      • Other
        • Real-time, Secure, Parallel, Scientific, Temporal, Wireless, Functional, Entity-Relationship, Sensor/Stream Database Systems, etc.

  12. Relational Database: Example

      Relation S:
      S#  SNAME  STATUS  CITY
      S1  Smith  20      London
      S2  Jones  10      Paris
      S3  Blake  30      Paris
      S4  Clark  20      London
      S5  Adams  30      Athens

      Relation P:
      P#  PNAME  COLOR  WEIGHT  CITY
      P1  Nut    Red    12      London
      P2  Bolt   Green  17      Paris
      P3  Screw  Blue   17      Rome
      P4  Screw  Red    14      London
      P5  Cam    Blue   12      Paris
      P6  Cog    Red    19      London

      Relation SP:
      S#  P#  QTY
      S1  P1  300
      S1  P2  200
      S1  P3  400
      S1  P4  200
      S1  P5  100
      S1  P6  100
      S2  P1  300
      S2  P2  400
      S3  P2  200
      S4  P2  200
      S4  P4  300
      S4  P5  400
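
The three relations can be loaded into a relational engine and queried. The sketch below uses Python's `sqlite3`; the lowercase column names (`sno`, `pno`, and so on) and the two queries are illustrative choices, while the data is taken from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S  (sno TEXT PRIMARY KEY, sname TEXT, status INTEGER, city TEXT);
CREATE TABLE P  (pno TEXT PRIMARY KEY, pname TEXT, color TEXT, weight INTEGER, city TEXT);
CREATE TABLE SP (sno TEXT REFERENCES S(sno), pno TEXT REFERENCES P(pno), qty INTEGER,
                 PRIMARY KEY (sno, pno));
""")
conn.executemany("INSERT INTO S VALUES (?, ?, ?, ?)", [
    ("S1", "Smith", 20, "London"), ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"), ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
])
conn.executemany("INSERT INTO P VALUES (?, ?, ?, ?, ?)", [
    ("P1", "Nut", "Red", 12, "London"), ("P2", "Bolt", "Green", 17, "Paris"),
    ("P3", "Screw", "Blue", 17, "Rome"), ("P4", "Screw", "Red", 14, "London"),
    ("P5", "Cam", "Blue", 12, "Paris"), ("P6", "Cog", "Red", 19, "London"),
])
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", [
    ("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400), ("S1", "P4", 200),
    ("S1", "P5", 100), ("S1", "P6", 100), ("S2", "P1", 300), ("S2", "P2", 400),
    ("S3", "P2", 200), ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400),
])

# Join S and SP: names of suppliers who supply part P2.
suppliers_of_p2 = [row[0] for row in conn.execute(
    "SELECT S.sname FROM S JOIN SP ON S.sno = SP.sno "
    "WHERE SP.pno = 'P2' ORDER BY S.sno")]

# Aggregate over SP: total quantity shipped by each supplier.
totals = conn.execute(
    "SELECT sno, SUM(qty) FROM SP GROUP BY sno ORDER BY sno").fetchall()

print(suppliers_of_p2)  # ['Smith', 'Jones', 'Blake', 'Clark']
print(totals)           # [('S1', 1300), ('S2', 700), ('S3', 200), ('S4', 900)]
```

Supplier S5 (Adams) has no rows in SP, so it appears in neither result.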

  13. Example Object (diagram)
      • Composite Document Object
      • Component objects shown: Section 1 Object, Section 2 Object, Paragraph 1 Object, Paragraph 2 Object

  14. Distributed Database System (diagram)
      • Site 1: Distributed Processor 1, DBMS 1, Database 1
      • Site 2: Distributed Processor 2, DBMS 2, Database 2
      • Site 3: Distributed Processor 3, DBMS 3, Database 3
      • The sites are connected by a Communication Network

  15. Query Processing Example
      • DQP (Distributed Query Processor): one DQP sits in front of each of DBMS 1, DBMS 2 and DBMS 3; the DQPs are connected by the network
      • Fragments shown in the diagram (sizes in parentheses): EMP1 (20), EMP3 (50), DEPT3 (30), EMP2 (30), DEPT2 (20), EMP1 (20)
      • Query at site 1: Join EMP and DEPT on D#
        • Move EMP2 to site 3; merge EMP1, EMP2, EMP3 to form EMP
        • Move DEPT2 to site 3; merge DEPT2 and DEPT3 to form DEPT
        • Join EMP and DEPT; move result to site 1
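
The execution plan on this slide can be simulated in a few lines of Python. This is a rough sketch under several assumptions: fragment sizes follow the slide, but the tuple contents are invented; EMP1 is assumed to be replicated at sites 1 and 3 (the transcript lists EMP1 (20) twice, which is consistent with that, but the diagram itself is not available); and the `move` and `hash_join` helpers are hypothetical names, not part of any real DQP.

```python
# Fragment sizes follow the slide; tuple contents are invented for illustration.
def make_emp(start, count, n_depts=50):
    """Build `count` employee tuples whose D# (dno) falls in 0..n_depts-1."""
    return [{"eno": i, "dno": i % n_depts} for i in range(start, start + count)]

def make_dept(start, count):
    """Build `count` department tuples with consecutive D# values."""
    return [{"dno": d, "dname": f"Dept{d}"} for d in range(start, start + count)]

emp1 = make_emp(0, 20)
sites = {
    1: {"EMP1": emp1},
    2: {"EMP2": make_emp(20, 30), "DEPT2": make_dept(0, 20)},
    # Assumption: EMP1 is replicated at site 3, so only EMP2 has to move.
    3: {"EMP1": list(emp1), "EMP3": make_emp(50, 50), "DEPT3": make_dept(20, 30)},
}
stats = {"tuples_shipped": 0}

def move(name, src, dst):
    """Ship a fragment (or result) between sites and count the tuples sent."""
    rows = sites[src].pop(name)
    sites[dst][name] = rows
    stats["tuples_shipped"] += len(rows)

def hash_join(emp, dept):
    """Join EMP and DEPT on dno using an in-memory hash table built on DEPT."""
    by_dno = {d["dno"]: d for d in dept}
    return [{**e, "dname": by_dno[e["dno"]]["dname"]}
            for e in emp if e["dno"] in by_dno]

# Move EMP2 to site 3; merge EMP1, EMP2, EMP3 to form EMP.
move("EMP2", 2, 3)
emp = sites[3]["EMP1"] + sites[3]["EMP2"] + sites[3]["EMP3"]

# Move DEPT2 to site 3; merge DEPT2 and DEPT3 to form DEPT.
move("DEPT2", 2, 3)
dept = sites[3]["DEPT2"] + sites[3]["DEPT3"]

# Join EMP and DEPT at site 3; move the result to site 1.
sites[3]["RESULT"] = hash_join(emp, dept)
move("RESULT", 3, 1)
result = sites[1]["RESULT"]

print(len(emp), len(dept), len(result), stats["tuples_shipped"])  # 100 50 100 150
```

Counting shipped tuples makes the cost of the plan visible: the join runs at site 3 because the largest fragments already live there, and with this invented data the result shipped back to site 1 is the largest transfer.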

  16. Transaction Processing Example
      • DTM (Distributed Transaction Manager) is responsible for executing the distributed transaction
      • Issues: concurrency control, recovery, data replication
      • Diagram: Site 1 is the coordinator for transaction Tj; Sites 2, 3 and 4 are participants, running subtransactions Tj2, Tj3 and Tj4
      • Two-phase commit:
        • Coordinator queries participants whether they are ready to commit
        • If all participants agree, then coordinator sends request for the participants to commit
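
The two phases can be sketched as a toy, single-process simulation. The `Participant` class and `two_phase_commit` function are hypothetical names for illustration. The slide describes only the commit path; aborting everywhere when any participant votes "no" is the standard complement and is included here. Logging, timeouts and crash recovery, which a real DTM needs, are left out.

```python
class Participant:
    """A site running one subtransaction of the distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote on whether this site is ready to commit.
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    # Phase 1: ask every participant whether it is ready to commit.
    votes = [p.prepare() for p in participants]

    # Phase 2: send the global decision to every participant.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


# Sites 2, 3 and 4 run subtransactions Tj2, Tj3 and Tj4.
ok_sites = [Participant("site2"), Participant("site3"), Participant("site4")]
print(two_phase_commit(ok_sites))   # committed

bad_sites = [Participant("site2"), Participant("site3", can_commit=False),
             Participant("site4")]
print(two_phase_commit(bad_sites))  # aborted
```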

  17. Interoperability of Heterogeneous Database Systems
      • Diagram: Database System A (Relational), Database System B (Object-Oriented) and Database System C (Legacy), connected by a network
      • Transparent access to heterogeneous databases for both users and application programs
      • Query and transaction processing

  18. Technical Issues on the Interoperability of Heterogeneous Database Systems
      • Heterogeneity with respect to data models, schema, query processing, query languages, transaction management, semantics, integrity, and security policies
      • Interoperability based on client-server architectures
      • Federated database management
        • Collection of cooperating, autonomous, and possibly heterogeneous component database systems, each belonging to one or more federations

  19. Different Data Models
      • Diagram: Nodes A, B, C and D are connected by a network; each node has its own database, and the four databases use the Network, Object-Oriented, Relational and Hierarchical models
      • Developments: Tools for interoperability; commercial products
      • Challenges: Global data model

  20. Schema Integration and Transformation: An Approach
      • Diagram, top to bottom:
        • External Schemas I, II and III
        • Global Schema: integrate the generic schemas
        • Generic schemas describing the relational, network, hierarchical and object-oriented databases
        • Schemas describing the relational, network, hierarchical and object-oriented databases
      • Challenges: Selecting appropriate generic representation; maintaining consistency during transformations

  21. Semantic Heterogeneity
      • Semantic heterogeneity occurs when there is disagreement about the meaning or interpretation of the same data, i.e., the same data is interpreted differently
      • Diagram: Nodes A and B each hold a database containing Object O; one interprets Object O as a passenger ship, the other as a submarine
      • Challenges: Standard definitions; repositories

  22. Federated Database Management
      • Diagram: Database Systems A, B and C, grouped into Federations F1 and F2
      • Cooperating database systems, yet maintaining some degree of autonomy

  23. Autonomy
      • Diagram: Components A, B and C
        • Component A receives a local request and a request from another component; component A honors the local request first
        • Communication is through the federation; component A does not communicate with component C
      • Challenges: Adapt techniques to handle autonomy, e.g., transaction processing, schema integration; transition research to products

  24. Federated Data and Policy Management
      • Diagram: three components hold the Data/Policy for Agency A, Agency B and Agency C
      • Each component exports Data/Policy to the Data/Policy for the Federation

  25. What is Information Management?
      • Information management essentially analyzes the data and makes sense out of the data
      • Several technologies have to work together for effective information management
        • Data Warehousing: Extracting relevant data and putting this data into a repository for analysis
        • Data Mining: Extracting previously unknown information from the data
        • Multimedia: Managing different media including text, images, video and audio
        • Web: Managing the databases and libraries on the web

  26. Data Warehouse
      • Sources: Oracle DBMS for Employees; Sybase DBMS for Projects; Informix DBMS for Medical
      • Data Warehouse: Data correlating employees with medical benefits and projects
      • The warehouse could be any DBMS; usually based on the relational data model
      • Users query the warehouse
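
As a minimal sketch of the idea, the three operational sources can be represented as plain Python lists and merged into one structure keyed on the employee. All field names, the `build_warehouse` helper and the sample rows are hypothetical; a real warehouse would extract from the source DBMSs and store the result in its own database.

```python
# Hypothetical extracts from the three source systems.
employees = [{"emp_id": 1, "name": "Ann"}, {"emp_id": 2, "name": "Bob"}]
projects = [
    {"emp_id": 1, "project": "Alpha"},
    {"emp_id": 2, "project": "Beta"},
    {"emp_id": 1, "project": "Gamma"},
]
medical = [{"emp_id": 1, "plan": "PPO"}, {"emp_id": 2, "plan": "HMO"}]


def build_warehouse(employees, projects, medical):
    """Correlate employees with their projects and medical benefits."""
    warehouse = {
        e["emp_id"]: {"name": e["name"], "projects": [], "plan": None}
        for e in employees
    }
    for p in projects:
        if p["emp_id"] in warehouse:
            warehouse[p["emp_id"]]["projects"].append(p["project"])
    for m in medical:
        if m["emp_id"] in warehouse:
            warehouse[m["emp_id"]]["plan"] = m["plan"]
    return warehouse


warehouse = build_warehouse(employees, projects, medical)

# A user query against the warehouse: projects staffed by PPO members.
ppo_projects = sorted(
    project
    for row in warehouse.values() if row["plan"] == "PPO"
    for project in row["projects"]
)
print(ppo_projects)  # ['Alpha', 'Gamma']
```

The query touches only the warehouse, never the three source systems, which is the point of extracting the data into a separate repository.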

  27. What is Data Mining?
      • Related terms: Information Harvesting, Knowledge Mining, Knowledge Discovery in Databases, Data Dredging, Data Archaeology, Data Pattern Processing, Database Mining, Knowledge Extraction, Siftware
      • The process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data, often previously unknown, using pattern recognition technologies and statistical and mathematical techniques (Thuraisingham 1998)

  28. Steps to Data Mining (flow diagram)
      • Data sources
      • Integrate data sources
      • Clean/modify data sources
      • Mine the data
      • Examine results/prune results
      • Report final results/take actions

  29. Data Mining Needs for Counterterrorism: Non-real-time Data Mining
      • Gather data from multiple sources
        • Information on terrorist attacks: who, what, where, when, how
        • Personal and business data: place of birth, ethnic origin, religion, education, work history, finances, criminal record, relatives, friends and associates, travel history, ...
        • Unstructured data: newspaper articles, video clips, speeches, emails, phone records, ...
      • Integrate the data, build warehouses and federations
      • Develop profiles of terrorists, activities/threats
      • Mine the data to extract patterns of potential terrorists and predict future activities and targets
      • Find the “needle in the haystack” - suspicious needles?
      • Data integrity is important
      • Techniques have to SCALE

  30. Data Mining Needs for Counterterrorism: Real-time Data Mining
      • Nature of data
        • Data arriving from sensors and other devices
          • Continuous data streams
        • Breaking news, video releases, satellite images
        • Some critical data may also reside in caches
      • Rapidly sift through the data and set aside unwanted data for later use and analysis (non-real-time data mining)
      • Data mining techniques need to meet timing constraints
      • Quality of service (QoS) tradeoffs among timeliness, precision and accuracy
      • Presentation of results, visualization, real-time alerts and triggers

  31. Data Mining as a Threat to Privacy
      • Data mining gives us “facts” that are not obvious to human analysts of the data
      • Can general trends across individuals be determined without revealing information about individuals?
      • Possible threats:
        • Combine collections of data and infer information that is private
          • Disease information from prescription data
          • Military action from pizza deliveries to the Pentagon
      • Need to protect the associations and correlations between the data that are sensitive or private

  32. Privacy Preserving Data Mining (architecture diagram)
      • User Interface Manager
      • Constraint Manager: privacy constraints
      • Database Design Tool: structures the database
      • Data Miner: makes correlations; ensures privacy
      • Query Processor: constraints during query and release operations
      • DBMS and Database

  33. Current Status, Challenges and Directions
      • Status
        • Data mining is now a technology
        • Several prototypes and tools exist; many or almost all of them work on relational databases
      • Challenges
        • Mining large quantities of data; dealing with noise and uncertainty; reasoning with incomplete data; eliminating false positives and false negatives
      • Directions
        • Mining multimedia and text databases, Web mining (structure, usage and content), mining metadata, real-time data mining, privacy

  34. Semantic Web: Overview
      • According to Tim Berners-Lee, the Semantic Web supports
        • Machine readable and understandable web pages
        • Enterprise application integration
        • Nodes and links that essentially form a very large database
      • Premise:
        • Semantic Web Applications: Web Database Management + Web Services + Information Integration + ...
        • Semantic Web Technologies: XML, RDF, Ontologies, Rules-ML

  35. Layered Architecture for Dependable Semantic Web
      • Layers, top to bottom: Logic, Proof and Trust; Rules/Query; RDF, Ontologies; XML, XML Schemas; URI, UNICODE
      • Shown alongside the layers: Trust, Privacy, Other Services
      • Adapted from Tim Berners-Lee’s description of the Semantic Web
      • Some challenges: Interoperability between layers; security and privacy cut across all layers; integration of services; composability

  36. What is XML all about?
      • XML is needed due to the limitations of HTML and complexities of SGML
      • It is an extensible markup language specified by the W3C (World Wide Web Consortium)
      • Designed to make the interchange of structured documents over the web easier
      • Key to XML are Document Type Definitions (DTDs) and XML Schemas
      • Allows users to bring multiple files together to form compound documents
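
To make the idea of interchanging structured documents concrete, here is a small sketch that parses an XML document with Python's standard `xml.etree.ElementTree` module. The `employees` document and its element names are invented for illustration. ElementTree checks only that the document is well formed; it does not validate against a DTD or XML Schema, so validation would need a separate tool.

```python
import xml.etree.ElementTree as ET

# A hypothetical structured document, as one system might send it to another.
doc = """<employees>
  <employee ssn="111"><name>Ann</name><salary>50000</salary></employee>
  <employee ssn="222"><name>Bob</name><salary>60000</salary></employee>
</employees>"""

# Parse the text into an element tree; malformed XML raises ET.ParseError.
root = ET.fromstring(doc)

# Walk the tree: attributes come from .get(), child text from .findtext().
records = [
    (e.get("ssn"), e.findtext("name"), int(e.findtext("salary")))
    for e in root.findall("employee")
]
print(records)  # [('111', 'Ann', 50000), ('222', 'Bob', 60000)]
```

Because the tags carry the structure, the receiver can recover typed records without knowing how the sender stores them.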

  37. What is Knowledge Management?
      • Knowledge management, or KM, is the process through which organizations generate value from their intellectual property and knowledge-based assets
      • Gartner Group: KM is a discipline that promotes an integrated approach to identifying and sharing all of an enterprise's information assets, including databases, documents, policies and procedures as well as unarticulated expertise and experience resident in individual workers
      • Peter Senge: Knowledge is the capacity for effective action; this distinguishes knowledge from data and information. KM is just another term in the ongoing continuum of business management evolution

  38. Knowledge Management Components
      • Components of Knowledge Management: Components, Cycle and Technologies
        • Components: Strategies, Processes, Metrics
        • Cycle: Knowledge Creation, Sharing, Measurement and Improvement
        • Technologies: Expert systems, Collaboration, Training, Web

  39. KM: Strategy, Process and Metrics
      • Strategy
        • Motivation for KM and how to structure a KM program
      • Process
        • Use of KM to make existing practice more effective
      • Metrics
        • Measure the impact of KM on an organization

  40. Strategy: Building Learning Organizations
      • Adaptive learning and generative learning
        • Need to adapt to the changing environment
        • Total quality movement (TQM) in Japan has migrated to a generative learning model
        • Look at the world in a new way
      • Changing roles of the leader
        • Migrating from decision makers to designers, teachers and stewards
      • Building a shared vision
        • Encouraging ideas, requesting support, moving beyond blame, effective communication
      • Learning tools
        • Learning laboratory

  41. Knowledge Management in Process Management
      • Types of processes
        • Simple processes: Low-level operations
        • Complex and nonadaptive processes: Systems that use the same rules
        • Complex and adaptive processes: Agents carrying out the processes are intelligent and adaptive
      • Linking knowledge management with processes
        • Knowledge management is needed for all processes; critical for complex and adaptive processes
        • Learn from experience and use the experience in unknown situations

  42. Metrics: The Balanced Scorecard
      • Employee capabilities: measuring the following
        • Employee satisfaction
        • Employee retention
        • Employee productivity
      • Information system capabilities: measuring the following
        • Whether each employee segment has information to carry out its operations
      • Motivation and empowerment: measuring the following
        • Suggestions made and implemented
        • Improvement
        • Team performance

  43. Knowledge Management Architecture (diagram)
      • Knowledge Creation and Acquisition Manager
      • Knowledge Representation Manager
      • Knowledge Dissemination and Sharing Manager
      • Knowledge Manipulation Manager

  44. Secure Knowledge Management
      • Protecting the intellectual property of an organization
      • Access control including role-based access control
      • Security for process/activity management and workflow
        • Users must have certain credentials to carry out an activity
      • Composing multiple security policies across organizations
      • Security for knowledge management strategies and processes
      • Risk management and economic tradeoffs
      • Digital rights management and trust negotiation
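
Role-based access control, mentioned above, can be illustrated with a very small sketch: permissions attach to roles, and users acquire permissions only through the roles they hold. The role names, users, permissions and the `is_allowed` helper are all hypothetical; role hierarchies, sessions and constraints found in fuller RBAC models are omitted.

```python
# Permissions are granted to roles, never directly to users.
ROLE_PERMISSIONS = {
    "analyst": {"read_report"},
    "manager": {"read_report", "approve_report"},
}

# Users are assigned roles; a user may hold several.
USER_ROLES = {
    "alice": {"analyst"},
    "bob": {"manager"},
}


def is_allowed(user, permission):
    """Return True if any role held by `user` grants `permission`."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_allowed("alice", "read_report"))     # True
print(is_allowed("alice", "approve_report"))  # False
print(is_allowed("bob", "approve_report"))    # True
print(is_allowed("carol", "read_report"))     # False (unknown user, no roles)
```

The same check can gate a workflow step: an activity is carried out only if the requesting user holds a role with the required permission.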

  45. Status and Directions
      • Knowledge management has exploded due to the web
      • Knowledge management has different dimensions
        • Technology, business
      • Goal is to take advantage of knowledge in a corporation for reuse
      • Tools are emerging
      • Need effective partnerships between business leaders, technologists and policy makers
      • Knowledge management may subsume information management and data management
        • Vague boundaries

  46. Other Ideas and Directions?
      • Prof. Bhavani Thuraisingham
        • Director, Cyber Security Center
        • Department of Computer Science
        • Erik Jonsson School of Engineering and Computer Science
        • The University of Texas at Dallas
        • Richardson, Texas
        • bhavani.thuraisingham@utdallas.edu
        • http://www.utdallas.edu/~bxt043000/
      • President, Dr-Bhavani Security Consulting, Dallas, TX
        • www.dr-bhavani.org
