General Dennis J. Reimer Training and Doctrine Digital Library Manifest File Specification

Presentation Transcript

  1. General Dennis J. Reimer Training and Doctrine Digital Library Manifest File Specification • U.S. Army Training Support Center, Fort Eustis, VA • Mike Fore, SAIC, forem@atsc.army.mil • Tom Dunn, Remtech Services, Inc., dunnt@atsc.army.mil

  2. Reimer Digital Library (RDL) Overview • Designed to make Army training information available on the World Wide Web • Homepage: <http://www.adtdl.army.mil> • Gives soldiers, contractors, and others access to training information from a single aperture, with a single interface • RDL is logically centralized, but physically distributed

  3. RDL Overview (cont) • Content includes: • Doctrinal Publications • Task-based publications for individual, collective, and drill tasks • Technical Manuals • Multimedia Courseware • Within RDL, the term “Document” is used to refer to any kind of content

  4. RDL Overview (cont) • Content storage: • Some in HTML on RDL servers • Some in HTML on remote servers -- RDL retrieves data based on user requests • Some in a relational database -- upon request, customized HTML is generated on-the-fly • Some in other non-relational database formats -- HTML is generated on-the-fly

  5. RDL Overview (cont) • An RDL-compatible “Query Engine” provides data to RDL in an RDL-understandable format • Wherever possible, industry-standard protocols are used -- FTP, HTTP, and HTTPS are used to communicate with Query Engines • Most Query Engines are commercial off-the-shelf (COTS) products, not custom to RDL

  6. RDL Overview (cont) So how does the RDL decide which Query Engine to use?

  7. RDL Overview (cont) • RDL maintains a “Metadata Database” (known as the “Card Catalog”) of all available resources -- database includes: • Access permissions on each “Document” • Location of all resources, i.e., which Query Engine contains each resource • Cross-referencing information for resolving references -- RDL uses a set of Server Side Include tags for inter-document references • When a request for a resource is received, RDL looks up the resource in the metadata database to determine which Query Engine to use
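
The lookup described on this slide can be pictured as a small in-memory catalog. The following is only a sketch in Python, not the RDL's implementation: the `CatalogEntry` fields, the Query Engine names, and the sample resource URIs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One hypothetical Card Catalog row for a resource."""
    query_engine: str        # which Query Engine holds the resource
    distribution_code: str   # access restriction on the owning Document


# Hypothetical catalog keyed by resource URI (all values are made up).
CARD_CATALOG = {
    "fm/1-100/toc.pdf": CatalogEntry("local_files", "a"),
    "tasks/example-task": CatalogEntry("task_query_engine", "b"),
}


def select_query_engine(resource_uri, catalog=CARD_CATALOG):
    """Return the name of the Query Engine that holds resource_uri."""
    entry = catalog.get(resource_uri)
    if entry is None:
        raise KeyError(f"resource not in metadata database: {resource_uri}")
    return entry.query_engine


print(select_query_engine("fm/1-100/toc.pdf"))  # prints "local_files"
```

A real catalog would presumably also consult the access permissions before dispatching the request; that step is left out here.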

  8. Current RDL Data Flow Diagram [Diagram not reproduced. Labels: Internet; WWW Server; CGI; RDL Server; Query Engines; Public Documents; Local Files; Task Query Engine; Task Data; Data Warehouse; FTP; HTTP; RDL Metadata Database; Inventory, Access, & User Info; Search Engine; Fulltext Keyword Collection]

  9. RDL Inventory • Maintaining metadata database is crucial to RDL -- currently data is manually input: • GUI program prompts Librarian for document information (Title, School, Date, Table of Contents URL, etc…) • Runs a WWW Spider to collect all resources that make up the “Document” and populates the metadata database • This approach is time consuming • Librarian must be an expert on author’s intent • Works well for static documents, but Spider cannot handle dynamic documents that generate resource URIs at runtime.

  10. RDL Inventory (cont) • As documents get more and more active (i.e., Frames, Layers, Java Applets, JavaScript, Java Server Pages, etc…), cannot rely only on spider to index documents • Only author knows all “Document” components -- if author is using an authoring system, authoring system can provide a manifest of document contents • As authoring is distributed and decentralized, content loading must require less and less human interaction

  11. RDL Inventory (cont) • RDL needed a way to inventory contents of all its “Query Engines” • Inventory process needed to discover documents that had been added, changed, or deleted • Inventory process needed to function in a “push” or “pull” mode • Inventory process needed to work with all RDL content -- includes everything from doctrine publications to multimedia courseware
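
The requirement to discover added, changed, and deleted documents amounts to comparing two snapshots of a Query Engine's contents. A minimal Python sketch, assuming each snapshot is a mapping from resource URI to some fingerprint (a checksum or modification time, for instance); the slides do not say how the RDL actually detects changes, so the fingerprint is an assumption.

```python
def diff_inventory(previous, current):
    """Compare two {uri: fingerprint} snapshots of a Query Engine.

    Returns (added, changed, deleted) as sorted lists of URIs.
    The fingerprint values are hypothetical stand-ins for whatever
    change indicator a real inventory process would record.
    """
    added = sorted(uri for uri in current if uri not in previous)
    deleted = sorted(uri for uri in previous if uri not in current)
    changed = sorted(
        uri for uri in current
        if uri in previous and current[uri] != previous[uri]
    )
    return added, changed, deleted


previous = {"toc.pdf": "v1", "preface.pdf": "v1", "old.pdf": "v1"}
current = {"toc.pdf": "v1", "preface.pdf": "v2", "futdoc.pdf": "v1"}
print(diff_inventory(previous, current))
# prints (['futdoc.pdf'], ['preface.pdf'], ['old.pdf'])
```

The same comparison works in either mode: in "pull" mode the inventory process builds `current` itself, while in "push" mode the content provider supplies it.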

  12. [Diagram not reproduced. Labels: Internet; WWW Server; CGI; RDL Server; Query Engines; Public Documents; Local Files; Task Query Engine; Task Data; Data Warehouse; FTP; HTTP; RDL Inventory Server; RDL Metadata Database; Inventory, Access, & User Info; Search Engine; Fulltext Keyword Collection. Annotation: “Custom program that inventories Query Engines”]

  13. RDL Manifest • “Manifest File” fills need • Manifest File must be provided by “Document” authors, along with “Document” contents to be loaded into RDL • Manifest File must be machine readable • Manifest File will be processed by an “Inventory Server,” which will load content onto correct query engine, populate Metadata Database, and keep full-text index synchronized

  14. RDL Manifest (cont) • Manifest File contains hierarchical data • Manifest File is an XML file • Must conform to RDL Manifest File DTD • Must conform to RDL Manifest Specification • Manifest File Spec and DTD at: <http://www.adtdl.army.mil/help/manifest>

  15. RDL Manifest (cont) <manifest> <document> <distribution_restriction> </distribution_restriction> <identification> </identification> <resource> </resource> … <format> <section> … </section> </format> </document> </manifest>

  16. RDL Manifest (cont) • <manifest> element. • Rationale: Top level element of the manifest file. • Must have one <manifest> element per manifest file.

  17. RDL Manifest (cont) • <document> element. • Rationale: This element contains all the information about one document. • Must have one <document> element per manifest file. • Must include a status attribute. Current values are approved, draft, obsolete, or proposed.
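
The top-level rules stated so far (one `<manifest>`, one `<document>`, a `status` attribute with one of four values) are simple enough to check mechanically. A sketch using Python's standard `xml.etree.ElementTree`; the function name and the wording of the problem messages are illustrative, and a real loader would presumably validate against the RDL DTD instead.

```python
import xml.etree.ElementTree as ET

VALID_STATUSES = {"approved", "draft", "obsolete", "proposed"}


def check_top_level(manifest_xml):
    """Return a list of problems with the <manifest>/<document> level."""
    root = ET.fromstring(manifest_xml)
    problems = []
    if root.tag != "manifest":
        problems.append(f"root element is <{root.tag}>, expected <manifest>")
    documents = root.findall("document")
    if len(documents) != 1:
        problems.append(f"expected 1 <document>, found {len(documents)}")
    for document in documents:
        status = document.get("status")  # None when the attribute is absent
        if status not in VALID_STATUSES:
            problems.append(f"invalid status: {status!r}")
    return problems


print(check_top_level('<manifest><document status="approved"/></manifest>'))
# prints []
```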

  18. RDL Manifest (cont) • <distribution_restriction> element. • Rationale: All unclassified US Army documents carry a distribution restriction, which indicates who can access the document, and how it should be handled. • Must have one <distribution_restriction> element per <document> element. • Must include a code attribute. Current values are the 6 valid US Army distribution codes: a, b, c, d, e, or f.

  19. RDL Manifest (cont) • <distribution_restriction> codes (Continued): • code a: "Approved for public release; distribution is unlimited". • Most doctrinal publications fall into this category. • code b: "Distribution authorized to U.S. Government agencies only". • code c: "Distribution authorized to U.S. Government agencies and their contractors only". • code d: "Distribution authorized to the DOD and DOD contractors only". • code e: "Distribution authorized to DOD components only". • code f: "Further dissemination only as directed by proponent". • Most exams taken for credit fall into this category. Authorized to enrolled students only.
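
The six codes map naturally onto a lookup table. The sketch below copies the statements from the slide and reads the `code` attribute from a `<document>` element; the function name and error messages are illustrative, not part of the specification.

```python
import xml.etree.ElementTree as ET

# Statements as given on the slide, keyed by distribution code.
DISTRIBUTION_STATEMENTS = {
    "a": "Approved for public release; distribution is unlimited",
    "b": "Distribution authorized to U.S. Government agencies only",
    "c": "Distribution authorized to U.S. Government agencies "
         "and their contractors only",
    "d": "Distribution authorized to the DOD and DOD contractors only",
    "e": "Distribution authorized to DOD components only",
    "f": "Further dissemination only as directed by proponent",
}


def distribution_statement(document):
    """Return the statement for a <document> element's distribution code."""
    restrictions = document.findall("distribution_restriction")
    if len(restrictions) != 1:
        raise ValueError("expected exactly one <distribution_restriction>")
    code = restrictions[0].get("code")
    if code not in DISTRIBUTION_STATEMENTS:
        raise ValueError(f"invalid distribution code: {code!r}")
    return DISTRIBUTION_STATEMENTS[code]


doc = ET.fromstring(
    '<document status="approved">'
    '<distribution_restriction code="a"></distribution_restriction>'
    "</document>"
)
print(distribution_statement(doc))
# prints "Approved for public release; distribution is unlimited"
```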

  20. RDL Manifest (cont) • <identification> element. • Rationale: This element contains information about how the document is known (Title, Number, etc…) to different communities. • Each document must have at least one <identification> element. • Within the US Military, many documents are multiservice (used by more than one of the Air Force, Army, Marines, and Navy). Within each service, some documents carry more than one identification.

  21. RDL Manifest (cont) • <identification> element (Continued). • Data within the <identification> element is displayed to RDL users in response to searches. • <identification> element contains: • Document Title • Document Number • Document Type • Approval • School (if any) • Version (if any) • Proponent (if any)

  22. RDL Manifest (cont) • <document_title> element. • Rationale: This element must contain the title of the document as displayed to the user. • The title of a document is one of the key fields displayed to the user in response to queries. • The US Department of Defense (DoD) Data Dictionary has requirements for the size and makeup of a document title to ensure that the data can be exchanged between systems.

  23. RDL Manifest (cont) • <document_number> element. • Rationale: Each DA publication must be assigned a number according to Army Regulation 25-30. • Document numbers are the most used fields for searching for a document. • The number of a document is one of the key fields displayed to the user in response to queries.

  24. RDL Manifest (cont) • <document_type> element. • Rationale: The document type identifies the type of information that will be found in the document. For example FM (Field Manuals) contain doctrinal information, while TM (Technical Manuals) contain detailed technical procedures. • The Document Type is one of the fields most used by military users to locate the information they need. • The document type is typically the publication medium portion of the document number.

  25. RDL Manifest (cont) • <approval> element. • Rationale: This element contains information about the approval of a document. Typically this is the date the document was approved for printing.

  26. RDL Manifest (cont) • <school> element. • Rationale: The US Army Training and Doctrine Command (TRADOC) is divided into 26 schoolhouses. Each school is responsible for the training material of some aspect of warfighting. • This element contains an element for the school responsible for the training material. • This element is optional. Non-TRADOC material does not have a school.

  27. RDL Manifest (cont) • <proponent> element. • Rationale: The proponent is the office responsible for maintaining a document. Any change recommendations should be sent to the proponent. • The proponent must include a US Army office symbol, and a mailing address. • The proponent may optionally include phone numbers, E-mail addresses, or other contact information. • This is an optional element. Not all documents have proponent information available.

  28. RDL Manifest (cont) • <document_version> element. • Rationale: Some documents contain additional information. This element gives the user information about any changes. • If the document supersedes another document, the superseded document’s number is listed here (for example “Supersedes FM 3-3”). • If the original document has been amended by any changes, the change number is indicated (for example “with change 1, 17 Nov. 1998”). • This is an optional element. Not all documents have a version.

  29. RDL Manifest (cont) • <resource> element. • Rationale: A resource is an item that is requested individually from the RDL’s WWW Server. Every HTML, GIF, PDF, etc… file is represented by a resource. • Every resource must contain a <resource_uri> element. This must have the Partial (Relative) URI of the resource. • URIs are always relative to the manifest file. • All the resources of a “Document” must be in the same directory as the manifest file, or in a sub-directory of it. • Each document must have at least one <resource> element.
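
The two URI rules on this slide (relative to the manifest file, and never outside its directory) can be checked with standard URL resolution. A sketch using Python's `urllib.parse`; the function name and the `example.invalid` URLs are hypothetical, and how the RDL itself enforces the rule is not stated in the slides.

```python
from urllib.parse import urljoin, urlsplit


def resolve_resource_uri(manifest_url, resource_uri):
    """Resolve a relative <resource_uri> against the manifest's own URL.

    Raises ValueError if the URI is not relative or if it resolves to a
    location outside the manifest's directory tree.
    """
    if urlsplit(resource_uri).scheme or resource_uri.startswith("/"):
        raise ValueError(f"resource URI must be relative: {resource_uri}")
    base_dir = urljoin(manifest_url, ".")      # directory holding the manifest
    resolved = urljoin(manifest_url, resource_uri)
    if not resolved.startswith(base_dir):
        raise ValueError(f"resource escapes manifest directory: {resource_uri}")
    return resolved


manifest_url = "http://example.invalid/docs/fm1-100/manifest.xml"
print(resolve_resource_uri(manifest_url, "toc.pdf"))
# prints "http://example.invalid/docs/fm1-100/toc.pdf"
```

A URI such as `../other/file.pdf` resolves to a sibling directory and is rejected, which matches the same-directory-or-sub-directory rule.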

  30. RDL Manifest (cont) • <resource> element (Continued). • A resource may have a reference. • A reference is a text string that can be used by other documents to refer to the resource. • Other “Documents” can use the reference text string within a Server Side Include (SSI) tag to link to the resource without knowing its URL. • The RDL resolves the SSI tags to a hyperlink if a matching resource is found. • If multiple resources match the reference, the user is given a list to make a selection from.
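
The reference mechanism boils down to an index from reference strings to resources, with three outcomes: no match, one match (a hyperlink), or several matches (a list for the user). The sketch below models only that lookup in Python; it deliberately does not show the SSI tag syntax, which the slides do not give, and the function names and outcome labels are hypothetical.

```python
from collections import defaultdict


def build_reference_index(resources):
    """Map each reference string to the URIs of resources that declare it.

    resources is an iterable of (resource_uri, reference_or_None) pairs.
    """
    index = defaultdict(list)
    for uri, reference in resources:
        if reference is not None:
            index[reference].append(uri)
    return index


def resolve_reference(index, reference):
    """Return ("none", []), ("link", [uri]), or ("choose", [uri, ...])."""
    matches = list(index.get(reference, []))
    if not matches:
        return "none", []          # no matching resource: leave unresolved
    if len(matches) == 1:
        return "link", matches     # exactly one match: becomes a hyperlink
    return "choose", matches       # several matches: user picks from a list


index = build_reference_index([
    ("fm1-100/toc.pdf", "FM 1-100"),
    ("fm1-100/preface.pdf", None),
])
print(resolve_reference(index, "FM 1-100"))
# prints ('link', ['fm1-100/toc.pdf'])
```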

  31. RDL Manifest (cont) • <format> element. • Rationale: A document may have more than one format available. Many of the RDL “Documents” have an online format (HTML) and a print-on-demand format (PDF). • Each document must have at least one <format> element. • Some documents have a subset of their components which have special bandwidth/delivery requirements.

  32. RDL Manifest (cont) • <format> element (Continued). • Each format must have a type associated with it: • HTML • PDF • Real Audio • Each format must have a delivery type associated with it: • HTTP • HTTPS • RTSP

  33. RDL Manifest (cont) • <section> element. • Rationale: • Each format must have at least one <section> element. • Sections can be nested. A format may have only one section available; however, that section may contain sub-sections. • Each section must contain a <section_title>. • The <section_title> must be the same as the primary resource that makes up the section. • Every HTML file must be its own section.

  34. RDL Manifest (cont) • <section> element (Continued). • Each section must have at least one <resource_usage> element. • Every resource a section uses must be listed. For example, a section composed of an HTML page with 3 images would use 4 resources. • One and only one resource used by a section must be marked as the primary resource.
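
The section rules (a title, at least one `<resource_usage>`, exactly one primary, nesting allowed) lend themselves to a recursive check. A Python sketch, assuming the `primary="y"` attribute spelling used in the manifest example elsewhere in this deck; how a non-primary usage is marked is not shown in the slides, so the sketch simply treats anything other than `primary="y"` as non-primary. The `res1`..`res4` ids are made up.

```python
import xml.etree.ElementTree as ET


def check_section(section):
    """Return a list of problems found in one <section> and its sub-sections."""
    problems = []
    if section.find("section_title") is None:
        problems.append("missing <section_title>")
    usages = section.findall("resource_usage")
    if not usages:
        problems.append("no <resource_usage> elements")
    else:
        primaries = [u for u in usages if u.get("primary") == "y"]
        if len(primaries) != 1:
            problems.append(
                f"expected 1 primary resource, found {len(primaries)}"
            )
    # Sections can be nested, so sub-sections are checked the same way.
    for child in section.findall("section"):
        problems.extend(check_section(child))
    return problems


# An HTML page with 3 images: 4 resources, one of them primary.
section = ET.fromstring(
    "<section><section_title>Chapter 1</section_title>"
    '<resource_usage id="res1" primary="y" />'
    '<resource_usage id="res2" />'
    '<resource_usage id="res3" />'
    '<resource_usage id="res4" />'
    "</section>"
)
print(check_section(section))  # prints []
```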

  35. RDL Manifest Example <manifest> <document status="approved"> <distribution_restriction code="a"></distribution_restriction> <identification armed_service="army"> <document_title>ARMY AVIATION OPERATIONS</document_title> <document_number>FM 1-100</document_number> <document_type><fm/></document_type> <school><aviation/></school> <approval> <date> <february/> <day>21</day> <year>1997</year> </date> </approval>

  36. RDL Manifest Example (cont) <proponent> <proponent_name>US Army Aviation Center and Fort Rucker </proponent_name> <office_symbol>ATZQ-TDS-D</office_symbol> <address> <city>Fort Rucker</city> <state>AL</state> <zip> <zip_code>36362</zip_code> <zip_plus_four>5000</zip_plus_four> </zip> </address> </proponent> </identification>

  37. RDL Manifest Example (cont) <format primary="y"> <format_type> <pdf/> </format_type> <delivery> <http/> </delivery> <section> <section_title>FM 1-100 Table of Contents</section_title> <resource_usage id="res51" primary="y" /> </section> <section> <section_title>FM 1-100 Preface</section_title> <resource_usage id="res52" primary="y" /> </section> … </format>

  38. RDL Manifest Example (cont) <resource id="res51"> <resource_uri>toc.pdf</resource_uri> <reference>FM 1-100</reference> </resource> <resource id="res52"> <resource_uri>preface.pdf</resource_uri> </resource> <resource id="res53"> <resource_uri>futdoc.pdf</resource_uri> </resource> … </manifest>
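
To see how the pieces of the example fit together, the sketch below assembles a trimmed copy of the FM 1-100 manifest from the example slides and reads it with Python's standard `xml.etree.ElementTree`. Several things are assumptions: the `<approval>` and `<proponent>` blocks and everything behind the "…" elisions are left out, a closing `</document>` tag is placed before `</manifest>`, and the `id` on `<resource_usage>` is taken to point at the `<resource>` with the same `id`, which the example suggests but the slides do not state. The `summarize` function is illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example; elided parts of the original are omitted.
MANIFEST = """\
<manifest>
  <document status="approved">
    <distribution_restriction code="a"></distribution_restriction>
    <identification armed_service="army">
      <document_title>ARMY AVIATION OPERATIONS</document_title>
      <document_number>FM 1-100</document_number>
      <document_type><fm/></document_type>
      <school><aviation/></school>
    </identification>
    <format primary="y">
      <format_type><pdf/></format_type>
      <delivery><http/></delivery>
      <section>
        <section_title>FM 1-100 Table of Contents</section_title>
        <resource_usage id="res51" primary="y" />
      </section>
      <section>
        <section_title>FM 1-100 Preface</section_title>
        <resource_usage id="res52" primary="y" />
      </section>
    </format>
    <resource id="res51">
      <resource_uri>toc.pdf</resource_uri>
      <reference>FM 1-100</reference>
    </resource>
    <resource id="res52">
      <resource_uri>preface.pdf</resource_uri>
    </resource>
  </document>
</manifest>
"""


def summarize(manifest_xml):
    """Collect the identification fields and each section's primary URI."""
    document = ET.fromstring(manifest_xml).find("document")
    ident = document.find("identification")
    # Assumed link: <resource_usage id=...> names a <resource id=...>.
    uris = {
        r.get("id"): r.findtext("resource_uri")
        for r in document.findall("resource")
    }
    sections = [
        (s.findtext("section_title"), uris[u.get("id")])
        for s in document.iter("section")
        for u in s.findall("resource_usage")
        if u.get("primary") == "y"
    ]
    return {
        "status": document.get("status"),
        "number": ident.findtext("document_number"),
        "title": ident.findtext("document_title"),
        "sections": sections,
    }


summary = summarize(MANIFEST)
print(summary["number"], "-", summary["title"])
# prints "FM 1-100 - ARMY AVIATION OPERATIONS"
```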

  39. Reimer Digital Library (RDL) <http://www.adtdl.army.mil> • U.S. Army Training Support Center, Fort Eustis, VA • Mike Fore, SAIC, forem@atsc.army.mil • Tom Dunn, Remtech Services, Inc., dunnt@atsc.army.mil