
Introduction to XML




  1. Introduction to XML Instructor: Joseph DiVerdi, Ph.D., MBA

  2. HTML Explained • Document Structuring Language • Hyperlink Specification Language • Not a Document-Layout Language • Defines Syntax & Placement of Markup Tags • Special, Embedded Directions • That Are Not Displayed by a Browser • Tells Browser How the Document Is Structured • Including Text, Images, & Other Media • Tells How to Make Documents Interactive • Through Special Hypertext Links • Which Connect Many Documents Together

  3. Hypertext Explained • A Method of Presenting Information • Selected Word or Words in the Text • Can Be Expanded at Any Time • To Provide Additional Information About the Words • These Words Are Links to Other Documents • May Contain Text, Images, & Other Media • Hypertext Arose Simultaneously With the Mouse • We Often Think Hypertext & Point-and-click Are Synonymous • But They Are Not • There Are Other Means of Selection • They Just Aren't Well Developed

  4. Markup Explained • Information Added to a Document • Intended to Enhance Its Meaning • Identifies Parts & How They Relate to Each Other • Usually Encoded As Symbols • Markup Language • Set of Symbols Placed in a Document • To Demarcate & Label Parts of the Document • Let's Look At An Example

  5. TEXTWITHOUTMARKUP GOODSENSEISOFALLTHINGSINTHEWORLDTHEMOSTEQUALLYDISTRIBUTEDFOREVERYONETHINKSHIMSELFSOABUNDANTLYPROVIDEDWITHITTHATEVENTHOSEMOSTDIFFICULTTOPLEASEINALLOTHERMATTERSDONOTCOMMONLYDESIREMOREOFITTHANTHEYALREADYPOSSESSRENÉDESCARTES15961650FRENCHPHILOSOPHER

  6. TEXT WITH SPACES GOOD SENSE IS OF ALL THINGS IN THE WORLD THE MOST EQUALLY DISTRIBUTED FOR EVERYONE THINKS HIMSELF SO ABUNDANTLY PROVIDED WITH IT THAT EVEN THOSE MOST DIFFICULT TO PLEASE IN ALL OTHER MATTERS DO NOT COMMONLY DESIRE MORE OF IT THAN THEY ALREADY POSSESS RENÉ DESCARTES 1596 1650 FRENCH PHILOSOPHER

  7. TEXT WITH PUNCTUATION GOOD SENSE IS OF ALL THINGS IN THE WORLD THE MOST EQUALLY DISTRIBUTED, FOR EVERYONE THINKS HIMSELF SO ABUNDANTLY PROVIDED WITH IT, THAT EVEN THOSE MOST DIFFICULT TO PLEASE IN ALL OTHER MATTERS DO NOT COMMONLY DESIRE MORE OF IT THAN THEY ALREADY POSSESS. RENÉ DESCARTES (1596-1650) FRENCH PHILOSOPHER

  8. Text With Case Good sense is of all things in the world the most equally distributed, for everyone thinks himself so abundantly provided with it, that even those most difficult to please in all other matters do not commonly desire more of it than they already possess. René Descartes (1596–1650) French Philosopher

  9. TEXTWITHOUTMARKUP GOODSENSEISOFALLTHINGSINTHEWORLDTHEMOSTEQUALLYDISTRIBUTEDFOREVERYONETHINKSHIMSELFSOABUNDANTLYPROVIDEDWITHITTHATEVENTHOSEMOSTDIFFICULTTOPLEASEINALLOTHERMATTERSDONOTCOMMONLYDESIREMOREOFITTHANTHEYALREADYPOSSESSRENÉDESCARTES15961650FRENCHPHILOSOPHER

  10. Unmarked Italian BENVENUTITUTTIALLAMIANUOVAPAGINAWEBLHOMIGLIORATAEDAGGIORNATASPEROCHEANCHEAVOIPIACCIACOMESEMPRESEAVETEQUALSIASICONSIGLIODOMANDAOSUGGERIMENTOSCRIVETEMIPERPIACEREMIPIACEREBBEASCOLTARLITUTTIMILLEGRAZIE

  11. Marked Italian Benvenuti tutti alla mia nuova pagina web! L'ho migliorata ed aggiornata. Spero che anche a voi piaccia. Come sempre, se avete qualsiasi consiglio, domanda o suggerimento, scrivetemi per piacere: mi piacerebbe ascoltarli tutti. Mille grazie!

  12. Marked English Welcome everyone to my updated web site! I hope you appreciate the changes and, as always, please send me any suggestions or comments you have on my site as I would love to hear them. Thanks!

  13. Markup Needed • Important in Electronic Documents • Processed by Computer Programs • Humans Are Relatively Smart • When Compared to Computers • Robotics Example • Computer Hand & Arm Given a Simple Task • Pick up Three Wooden Cubes • One Red, One Green, & One Blue • Create a Stack With the Red Cube on Top, Blue on Bottom • What Would You Do? • Document Processing Is Much More Difficult

  14. HTML Explained Again • HTML Is for Marking Up a Document • For Transmission Over the Internet • For Rendering or Viewing With a Browser • Sequence of Words • Partitioned Into Paragraphs, Sections, & Chapters • Comprising a Human-readable Record • Book, Article, or Essay • With Included Images And/or Other Media • HTML Has Proven Enormously Successful • For Certain Types of Documents • There Are Many Other Document Types

  15. Document Type Examples • Structured Story or Article • Business Financial Data • Database Structure • Abstract Molecular Structure • Supermarket Inventory Stock Item • Genealogical Relationship Tree Structure • Structured Human Resources Data • Musical Score • Structure of an Equation

  16. Document Structures • The Term Document • Goes Far Beyond a Sequence of Words • Sharing Text Documents • Using HTML & The Internet • Has Been Sooooo Successful • Let's Share Other Kinds of Documents • Other Document Structures • Fit Into HTML Structure Rules • With Varying Degrees of Success • Some Fit Better Than Others • Many Are a Poor Fit

  17. Document Structure • Human Data Users • Are an Annoying Lot • As Soon As We See Some Interesting Data • We Want to Manipulate It in Some Natural Way

  18. Molecular Structure "Can I rotate this molecule to see it from other views?"

  19. Musical Structure "How Would That Score Sound If It Were Transposed Into G#?"

  20. Financial Data "How Would the Forecast Change If We Doubled the Marketing Budget?"

  21. Interoperability • Today, Those Questions Are Answerable • On a Local Computer • Purchase an Application • Install It on Your Desktop Computer • Have Fun • Applications Are in Infancy With Regard to Answering These Questions on The Web

  22. Interoperability • One of the Critical Issues to Success • Development of Data Package Formats • Suitable for Internet Transmission • Suitable for Storage • Able to Store & Organize Any Form of Information • Based on an Open Standard • Not Tied to the Fortunes of Any Single Company • Not Married to Any Particular Software • Easily Combined With Style Sheets • To Create Formatted Documents

  23. What To Do? • How Can We Package Data to Permit • Broad Interoperability • Platform Neutrality • Extensibility

  24. Solution: Just Extend HTML • HTML Is Already Overburdened With Dozens of Interesting But Incompatible Inventions From Different Manufacturers, Because It Provides Only One Way of Describing Information • HTML Is at the Limit of Its Usefulness As a Way of Describing Information, & While It Will Continue to Play an Important Role for the Content It Currently Represents, Many New Applications Require a More Robust & Flexible Infrastructure

  25. Solution: Just Use "Word" • Information on a Network Which Connects Many Different Types of Computer Has to Be Usable on All of Them • It Is Also Helpful for Such Information to Be in a Form That Can Be Reused in Many Different Ways • Minimize Wasted Time & Effort

  26. Solution: Just Use "Word" • Public Information Cannot Afford to Be Restricted to One Make or Model or Manufacturer • Or to Cede Control of Its Data Format to Private Hands • Proprietary Data Formats, No Matter How Well Documented or Publicized, Are Simply Not an Option • Their Control Still Resides in Private Hands • They Can Be Changed or Withdrawn • Arbitrarily & Without Notice

  27. Write New Markup Languages • Permit the Various Communities of Users • To Create Their Own Markup Languages • Suitable for Packaging Their Particular Data

  28. XML Explained • A Protocol to Define Markup Languages • Use XML to Create a Markup Language • You & Your Friends Can Create a Markup Language • For Your Particular Kind of Shared Data • DickML & JaneML

  29. XML Explained • Create Documents Using the Markup Language • Create Applications to Process Marked Up Docs • Create Applications to Display Document Content • Create Applications to Emit Content As HTML • Because There Are So Many Web Browsers in Use
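As a rough illustration of an application that reads a document in a custom markup language and emits its content as HTML, the sketch below uses Python's standard `xml.etree.ElementTree` module. The `quotation`, `text`, and `author` element names, the `born` and `died` attributes, and the `to_html` function are all invented for this example and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in a made-up markup language (all names are assumptions).
SOURCE = """<quotation>
  <text>Good sense is of all things in the world the most equally distributed.</text>
  <author born="1596" died="1650">René Descartes</author>
</quotation>"""


def to_html(xml_text: str) -> str:
    """Convert one <quotation> document into a small HTML fragment."""
    quotation = ET.fromstring(xml_text)
    text = quotation.findtext("text", default="")
    author = quotation.find("author")

    # Build the HTML output as a separate element tree.
    block = ET.Element("blockquote")
    paragraph = ET.SubElement(block, "p")
    paragraph.text = text
    if author is not None:
        cite = ET.SubElement(block, "cite")
        cite.text = f"{author.text} ({author.get('born')}-{author.get('died')})"
    return ET.tostring(block, encoding="unicode", method="html")


html = to_html(SOURCE)
print(html)
```

The source document carries only content and structure; the choice of `blockquote` and `cite` lives entirely in the converter, so a different converter could re-purpose the same document.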

  30. XML Goals • Application Specific Markup Languages • Different Kinds of Documents May Require Different Kinds of Markup • Unambiguous Document Structure • No Two Ways to Interpret • Names, Order & Hierarchy of Document Elements • Presentation Stored Separately • Markup Names Precisely Reflect Items' Purpose • Presentation Stored As Style Sheet • Re-Purpose Document Using Different Style Sheets • Document Content Isn't Cluttered With Style Vocabulary

  31. XML Goals • Keep It Simple • Authoring a Document Shouldn't Be Hard • Process Programs Are Also Made Easy • Maximal Error Checking • Adherence to Syntax Standards Is Required • Well-formed Is the Minimum Standard • Element Names Spelled Correctly • Element Boundaries Closed Correctly • Durability & Utility of Document Is Assured

  32. Creating XML Documents • As With HTML, Available Tools Range From • Plain Text Editor • U-Type-It • Smart Text Editor • Permits Tag Customizing • XML Editor • Performs Structure Checking • High End XML Editor • Spiffy Graphical Interface

  33. Viewing XML • Specialized Rendering Programs • Chemical Markup Language - Jumbo • Using HTML Browser • Not Generally Useful • Using HTML Browser with Style Sheet • Cascading Style Sheet • Extensible Style Sheet Language • CSS on Steroids

  34. XML Validation • Several Levels of Validation • Well-Formed • "My uncle is pregnant." is Well-Formed • Broken Markup • Valid • Adherence to Stated Document Type Definition • Adherence to Stated Document Model • Contextual Mistakes • Mostly Available as Downloadable Applications • On-Line Validators Exist Now
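The first level, well-formedness, can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the `is_well_formed` helper and the `claim`, `who`, and `state` element names are assumptions made up for illustration. This parser is non-validating, so it says nothing about adherence to a DTD; that second level would need a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Syntactically fine, even though the statement itself is nonsense.
nonsense = "<claim><who>My uncle</who> is <state>pregnant</state>.</claim>"
# Broken markup: <who> is never closed before </claim>.
broken = "<claim><who>My uncle is pregnant.</claim>"

print(is_well_formed(nonsense))
print(is_well_formed(broken))
```

The parser accepts the first string because well-formedness is purely about syntax; catching the contextual mistake would take a document model, and possibly a human reader.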

  35. Ground Rules • XML Rules Differ From HTML Rules • Generally More Stringent • Case Is Significant • Attribute Values Must Be Enclosed in Quotes • Whitespace Is Not Collapsed Automatically • Containers Must Always Be Closed
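Each ground rule can be probed by handing a small snippet to a parser. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `parses` helper and the `note` and `p` element names are hypothetical.

```python
import xml.etree.ElementTree as ET


def parses(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Case is significant: <Note> cannot be closed by </note>.
case_mismatch = parses("<Note>hello</note>")
# Attribute values must be quoted.
unquoted_attribute = parses("<note priority=high>hello</note>")
# Containers must be closed: <p> is left open here.
unclosed_container = parses("<note><p>hello</note>")
# Whitespace inside an element reaches the application unchanged.
spaced_text = ET.fromstring("<note>  two   words  </note>").text

print(case_mismatch, unquoted_attribute, unclosed_container, repr(spaced_text))
```

An HTML browser would typically tolerate all three of the rejected snippets, which is the sense in which the XML rules are more stringent.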

  36. Outline • Anatomy of an XML-compliant Document • Prolog • Elements • Elements • Attributes • Namespace • Entities • Character • Mixed-Content • Well-Formed & Valid

  37. Document Anatomy • XML-Compliant Document • A Reservoir of Information • Structured Data • Two Separate Components • Prolog • Provides Declarations to XML Software Applications • Elements • Contains Marked-up Data

  38. Document Prolog • Contents • XML Declaration - Optional in XML 1.0 but Recommended • Document Type Declaration - Optional • Not Document Type Definition • Simple XML Prolog <?xml version="1.0"?> • Only Contains XML Declaration • Note Unique Delimiters

  39. XML Declaration • Several Properties Available • Version version = "1.0" • Identifies Rules of Engagement • Only v1.0 Currently in Existence • Encoding encoding = "iso-8859-1" • Identifies Character Set Used in Document • Standalone standalone = "yes" • Indicates Whether or Not Other DTD Files Are Involved

  40. XML Declaration • Examples <?xml version="1.0"?> <?xml version="1.0" encoding="iso-8859-1"?> <?xml version="1.0" encoding="iso-8859-8"?> <?xml version="1.0" encoding="SHIFT_JIS"?> <?xml version="1.0" encoding="SHIFT_JIS" standalone="yes"?> • Version Is Always Present • Property Order Is Fixed: version, encoding, standalone • Property Names Are Lower Case • Values Are Quoted • Properties Are Space-Delimited
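To see the encoding property at work, the following sketch encodes a tiny document as ISO-8859-1 bytes and lets Python's standard `xml.etree.ElementTree` decode it from the declaration. The `author` element name is an assumption for the example.

```python
import xml.etree.ElementTree as ET

# The declaration says the bytes that follow are ISO-8859-1 (Latin-1).
declaration = '<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>'
body = "<author>René Descartes</author>"

# Encode the whole document as Latin-1 so the bytes match the declaration.
document_bytes = (declaration + body).encode("iso-8859-1")

# The parser reads the declaration and decodes the content accordingly.
author = ET.fromstring(document_bytes)
print(author.text)
```

In Latin-1 the character é is the single byte 0xE9, which is not valid UTF-8 on its own; without the encoding property the parser would be expected to assume UTF-8 and reject these bytes.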

  41. Content • Content Consists of Marked up Text • Markup Tags Are Defined by Users • As Simple or Complicated As Necessary <bah> humbug <boo type="foo"> ooga-booga </boo> <boo type="bar"> oinga-boinga </boo> </bah>
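The user-defined markup above can be read back with a few lines of code. This sketch parses the slide's `bah` and `boo` example with Python's standard `xml.etree.ElementTree`; the variable names are arbitrary, and the indentation in the string is added only for readability.

```python
import xml.etree.ElementTree as ET

# The markup from the slide, laid out with indentation.
DOCUMENT = """<bah>
  humbug
  <boo type="foo"> ooga-booga </boo>
  <boo type="bar"> oinga-boinga </boo>
</bah>"""

root = ET.fromstring(DOCUMENT)

# Text directly inside <bah>, before its first child element.
leading_text = root.text.strip()

# Map each <boo> element's type attribute to its trimmed text content.
boo_by_type = {boo.get("type"): boo.text.strip() for boo in root.findall("boo")}

print(root.tag, leading_text, boo_by_type)
```

Nothing in the parser knows what `bah` or `boo` mean; the names only acquire meaning in the application that processes them.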

  42. Document Type Declaration • Describes Root Element • Top-Level Document Container • Designates DTD • Document Type Definition • For Precisely Defining Document Structure • Names DTD Using Public Identifier • Locates DTD Using System Identifier • Defines Internal Subset of DTD

  43. Example of External DTD <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/2000/REC-xhtml1-20000126/DTD/xhtml1-strict.dtd"> • Note Root Element Name • Note Public Identifier • Names DTD Using Public Name • Note System URI • Follows the Public Identifier Directly, Without a SYSTEM Keyword • Tells Software Application Where to Find It
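The three parts of a document type declaration (root element name, public identifier, system identifier) can be read back from a parsed document. The sketch below uses Python's standard `xml.dom.minidom` with the XHTML 1.0 Strict identifiers; the tiny document body is an assumption added so there is something to parse. In this sketch the system identifier follows the public identifier directly, which is the form the parser accepts. The standard-library parser is non-validating and is not expected to download the DTD.

```python
from xml.dom import minidom

# A minimal document using the XHTML 1.0 Strict document type declaration.
# The body of the document is a made-up placeholder.
DOCUMENT = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/2000/REC-xhtml1-20000126/DTD/xhtml1-strict.dtd">'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head><body/></html>"
)

dom = minidom.parseString(DOCUMENT)
doctype = dom.doctype

print(doctype.name)      # root element name
print(doctype.publicId)  # public identifier
print(doctype.systemId)  # system identifier (URI)
```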

  44. Example of Internal DTD <?xml version="1.0"?> <!DOCTYPE PARENT [ <!ELEMENT PARENT (CHILD*)> <!ELEMENT CHILD (IDENTIFIER?,NAME+)> <!ELEMENT IDENTIFIER EMPTY> <!ELEMENT NAME (LASTNAME+,FIRSTNAME+)+> <!ELEMENT LASTNAME (#PCDATA)> <!ELEMENT FIRSTNAME (#PCDATA)> <!ATTLIST IDENTIFIER NUMBER ID #REQUIRED TYPE (natural|adopted|testube) "natural"> <!ENTITY my_string "Resistance is Futile"> ]>
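One visible effect of an internal DTD subset is entity expansion. In the sketch below, only the `my_string` entity is taken from the example above; the `MESSAGE` root element and its content model are assumptions chosen to keep the document short. Python's standard `xml.etree.ElementTree` reads the internal subset and expands the entity, but it does not validate the document against the element declarations.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the MESSAGE element is an assumption;
# only the my_string entity comes from the slide's example.
DOCUMENT = (
    '<?xml version="1.0"?>\n'
    "<!DOCTYPE MESSAGE [\n"
    "<!ELEMENT MESSAGE (#PCDATA)>\n"
    '<!ENTITY my_string "Resistance is Futile">\n'
    "]>\n"
    "<MESSAGE>&my_string;!</MESSAGE>"
)

# The parser replaces &my_string; with the text declared in the internal subset.
message = ET.fromstring(DOCUMENT)
print(message.text)
```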

  45. Background & Context • HTML Follows the Rules of Formal Electronic Document Markup Design & Implementation • Born Out of the Need to • Assemble Text, Graphics, & Other Digital Content • For Transmission Over the Internet • HTML V4.01 Standard Is Defined Using • Standard Generalized Markup Language • SGML • Adequate for Formalizing HTML • Too Complex for Extending HTML

  46. Background & Context • Extensible Markup Language (XML) • Based on Simpler Features of SGML • Kinder, Gentler, & More Flexible • Well-suited for Orderly Development of New Markup Languages • HTML Is Even Being Reborn As XHTML

  47. Background & Context • With XML There Exists a Standardized Means for Defining Markup Languages • That Are Customized for Different Needs • Rather Than Relying Upon HTML Extensions • Mathematicians Express Mathematical Notations • Musicians Present Musical Scores • Physicians Exchange Medical Records • Accountants Share Financial Information • All Groups Need an Acceptable, Resilient Way to Express These Different Kinds of Information, So Software Can Be Developed to Process & Display These Diverse Data

  48. Background & Context • XML Provides a Solution • Each Content Sector • Business Group, Trade Association, Consortium, etc. • Can Define a Markup Language • For Information Exchange & Processing Over the Web • Programmers Can Develop Parsers • XML-Compliant Processes • That Read New Language Definitions & • Permit a Server to Process Documents in Those Languages • Permit a Client to Retrieve & Display Those Documents

  49. Background on SGML • Standard Generalized Markup Language • SGML • International Standard (ISO 8879) • Published in 1986 • SGML Prescribes a Standard Format for Embedding Descriptive Markup in a Document • SGML Also Specifies a Standard Method for Describing the Structure of a Document • More Important & Crucial to Its Power

  50. SGML Background • SGML Allows an Author to Set up Hierarchical Models for Each Type of Document Produced • SGML Forces Each Element in the Structure • Labeled With Descriptive Markup Such As Chapter, Title & Paragraph • To Fit in the Logical, Predictable Structure of the Document
