Taxonomy Governance

Ron Daniel, Jr. & Joseph A. Busch

Taxonomy Strategies LLC


Agenda

  • 1:30 Welcome & Introductions

  • 1:45 Exercise: Taxonomy Revisions

  • 2:15 Fundamental Processes

  • 2:30 Governance Team Roles and Structures

  • 3:00 Tools

  • 3:05 Break

  • 3:15 Exercise: Organizational Self-Assessment

  • 3:30 Maturity Model

  • 3:40 Designing and Building Maintainable Taxonomies & Metadata

  • 4:00 Additional Processes

  • 4:20 Q&A

  • 4:30 Adjourn


Who we are: Joseph Busch

  • Over 25 years in the business of organized information

    • Founder, Taxonomy Strategies

    • Director, Solutions Architecture, Interwoven

    • VP, Infoware, Metacode Technologies

    • Program Manager, Getty Foundation

    • Manager, Pricewaterhouse

  • Metadata and taxonomies community leadership

    • President, American Society for Information Science & Technology

    • Director, Dublin Core Metadata Initiative

    • Adviser, National Research Council Computer Science and Telecommunications Board

    • Reviewer, National Science Foundation Division of Information and Intelligent Systems

    • Founder, Networked Knowledge Organization Systems/Services


Who we are: Ron Daniel, Jr.

  • Over 15 years in the business of metadata & automatic classification

    • Principal, Taxonomy Strategies

    • Standards Architect, Interwoven

    • Senior Information Scientist, Metacode Technologies

    • Technical Staff Member, Los Alamos National Laboratory

  • Metadata and taxonomies community leadership

    • Chair, PRISM (Publishers Requirements for Industry Standard Metadata) working group

    • Acting chair: XML Linking working group

    • Member: RDF working groups

    • Co-editor: PRISM, XPointer, 3 IETF RFCs, and Dublin Core 1 & 2 reports.


Recent & current projects

  • Government

    • Commodity Futures Trading Commission

    • Defense Intelligence Agency

    • ERIC

    • Federal Aviation Administration

    • Federal Reserve Bank of Atlanta

    • Forest Service

    • GSA Office of Citizen Services (www.firstgov.gov)

    • Head Start

    • Infocomm Development Authority of Singapore

    • NASA (nasataxonomy.jpl.nasa.gov)

    • Small Business Administration

    • Social Security Administration

    • USDA Economic Research Service

    • USDA e-Government Program (www.usda.gov)

  • Commercial

    • Allstate Insurance

    • Blue Shield of California

    • Debevoise & Plimpton

    • Halliburton

    • Hewlett Packard

    • Motorola

    • PeopleSoft

    • PricewaterhouseCoopers

    • Siderean Software

    • Sprint

    • Time Inc.

  • Commercial subcontracts

    • Agency.com – Top financial services

    • Critical Mass – Fortune 50 retailer

    • Deloitte Consulting – Big credit card

    • Gistics/OTB – Direct selling giant

  • NGOs

    • CEN

    • IDEAlliance

    • IMF

    • OCLC


Participant Introductions

  • Who are you?

  • What do you do?

  • What brings you here today?



Taxonomy Governance Overview

  • Is “Taxonomy Governance” synonymous with “Taxonomy Maintenance”?

  • What kinds of changes can be made, and what are their costs?

  • What kinds of information are needed to determine the changes?

  • What kind of group should maintain the taxonomy?

  • What kinds of rules should the group follow to decide on changes?

  • What should the group do beyond maintaining the taxonomy?


Exercise: Taxonomy Modifications

  • Divide into small groups

  • Review assigned sample taxonomy

  • Discuss changes you would make

  • In 10 minutes, a spokesperson will speak for the group and briefly:

    • Tell us something good about the taxonomy

    • Characterize the short-term changes your group would make

    • Characterize the questions your group would like answered before making other changes


Exercise Notes

  • Team Members:

  • Something good about the taxonomy:

  • Short-term changes:

  • Questions for other changes:



Group 2 Sample Taxonomy

Top Level

Random Samples of Detailed Categories

Business / Accounting / Firms / Directories

Business / Biotechnology & Pharmaceuticals / Education & Training

Business / Employment / By Industry

Business / Healthcare / Employment / Regional

Business / Small Business / Finance / Accounting

Reference / Education / Colleges & Universities / North America / United States / Maryland / Columbia Union College / Athletics

Reference / Education / K-12 / Home Schooling / Unschooling / Chats and Forums

Regional / Europe / Ireland / Business & Economy / Employment / Health & Medical

Science / Math / Academic Departments / South America / Colombia

Science / Social Sciences / Linguistics / Translation / Associations

Society / People / Women / Science & Technology / Mathematics


Group 3 Sample Taxonomy

Top Level

Detail in Auto Products Category

Source: http://householdproducts.nlm.nih.gov/products.htm


Predictions

  • Short-term changes will center on rules of style – ‘&’ vs. ‘and’, capitalization, plurals

  • Faceted subdivision will only be suggested by experienced practitioners, by groups given low-level details of a taxonomy, or both.

  • People will critique the UI presentation

  • Questions for long-term changes will focus, in decreasing order, on:

    • Who are the users and what are they doing?

    • What is the content and how much is in the various categories?

    • What kind of money depends on the taxonomy, and what kind of maintenance expenses are justified?

  • Slide callouts: Editorial Rules; Metadata Specification, Design for maintainability; How to put it into action?; User Characterization; Content and Metadata Maintenance; ROI

  • Anything else people want to cover?



Fundamental Processes

  • What are the two fundamental processes every organization should implement to maintain its metadata and taxonomies?

    • Query log / Click trail examination

    • Tagging Error Correction

  • What are the key outlooks a taxonomist should try to instill in their organization?


Fundamental Process #1 – Query Log Examination

  • How can we characterize users and what they are looking for?

  • Query Log & Click Trail Examination

    • Sophisticated software available, but don’t wait.

    • 80/20 Rule – 80% of value from 20% of possible reports.

  • Greatest value comes from:

    • Identifying a person as responsible for search quality

    • Starting a “Measure & Improve” mindset

  • Greatest challenge:

    • Getting a person assigned (≥ 10% of their time)

    • Getting logs turned back on

    • What to do after the obvious fixes have been made

  • UltraSeek Reporting

    • Top queries

    • Queries with no results

    • Queries with no click-through

    • Most requested documents

    • Query trend analysis

    • Complete server usage summary

  • Click Trail Packages

    • iWebTrack

    • NetTracker

    • OptimalIQ

    • SiteCatalyst

    • Visitorville

    • WebTrends
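
The first few reports listed above do not need a dedicated package. The following is a minimal sketch, assuming the search log can be exported as records of (query, result count, click count); that record layout, the sample data, and the function names are illustrative assumptions, not the format of UltraSeek or any other product.

```python
from collections import Counter

# Hypothetical log records: (query text, number of results, number of clicks).
SAMPLE_LOG = [
    ("travel policy", 12, 3),
    ("Travel Policy", 12, 1),
    ("401k", 0, 0),
    ("org chart", 40, 0),
    ("401k", 0, 0),
    ("travel policy", 12, 2),
]

def normalize(query):
    """Fold case and collapse whitespace so trivially different queries count together."""
    return " ".join(query.lower().split())

def query_reports(log, top_n=10):
    """Build three basic reports: top queries, no-result queries, no-click queries."""
    totals, no_results, no_clicks = Counter(), Counter(), Counter()
    for query, result_count, click_count in log:
        q = normalize(query)
        totals[q] += 1
        if result_count == 0:
            no_results[q] += 1
        elif click_count == 0:
            # Results were shown but nothing was clicked.
            no_clicks[q] += 1
    return {
        "top_queries": totals.most_common(top_n),
        "no_results": no_results.most_common(top_n),
        "no_click_through": no_clicks.most_common(top_n),
    }

reports = query_reports(SAMPLE_LOG)
for name, rows in reports.items():
    print(name, rows)
```

Queries with no results are candidates for synonyms or missing terms; queries with no click-through suggest that the labels or result ranking deserve a look.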


Fundamental Process #2 – Tagging Error Correction

  • For the Taxonomy to be used, its values must be associated with content.

    • We will refer to this as “Tagging”.

  • Errors will happen, and some will be found. What are you going to do about them?

  • Define an error correction process.

    • Process will accommodate questions like:

      • Is it an error? What is the cost to correct or not correct? Does the correction need to be scheduled? etc.

    • Once an error is corrected, NEVER lose that fact.

      • Manually reviewed pages are vital for training automatic classifiers.

      • Has implications for metadata specification and review procedures.

    • Over time, multiple error detection methods will be defined.

      • e.g. Statistical sampling of newly added pages

      • Gradually, additional error correction processes may be defined to deal with particular types of errors.
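
Two points from this slide can be sketched in code: corrections are recorded rather than overwritten, and newly added pages are sampled for review. This is a minimal illustration under assumed names (`TagRecord`, `sample_for_review`, `training_set`) and an assumed in-memory store; a real system would persist the history in the CMS or metadata database.

```python
import random
from dataclasses import dataclass, field

@dataclass
class TagRecord:
    """Tags for one page, plus an append-only history of manual corrections."""
    url: str
    tags: set
    manually_reviewed: bool = False
    history: list = field(default_factory=list)

    def correct(self, wrong, right, reviewer):
        """Apply a correction and record it so the fact is never lost."""
        self.tags.discard(wrong)
        self.tags.add(right)
        self.manually_reviewed = True
        self.history.append({"removed": wrong, "added": right, "by": reviewer})

def sample_for_review(records, k, seed=None):
    """Draw a random sample of not-yet-reviewed pages for manual QA."""
    pending = [r for r in records if not r.manually_reviewed]
    rng = random.Random(seed)
    return rng.sample(pending, min(k, len(pending)))

def training_set(records):
    """Manually reviewed pages are the candidates for classifier training data."""
    return [(r.url, sorted(r.tags)) for r in records if r.manually_reviewed]

# Hypothetical data: 20 pages, one of which has been corrected by an editor.
records = [TagRecord(f"/page/{i}", {"Benefits"}) for i in range(20)]
records[0].correct("Benefits", "Payroll", reviewer="editor1")
to_check = sample_for_review(records, k=5, seed=42)
print(len(to_check), training_set(records))
```

Keeping the `manually_reviewed` flag and the history on the record itself is one way to satisfy the implication for the metadata specification: the schema needs fields that say who checked a value and when.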


Fundamental Outlooks

  • How are we going to build and maintain metadata structures and controlled vocabularies?

    • The taxonomy problem

  • How are we going to populate metadata elements with complete and consistent values?

    • The tagging problem

  • How are we then going to use metadata in applications and demonstrate benefits?

    • The ROI problem

  • Must know this to address other problems!

  • Taxonomy Governance is a standards process.

    • Take tips from other standards efforts

      • Team, with comment-handling responsibilities and an appeals process

      • Issue Logs

      • Announcements

      • Release Schedule

  • Foster a “Measure & Improve” Mindset




Taxonomy Business Processes

  • Taxonomies must change, gradually, over time if they are to remain relevant

  • Maintenance processes need to be specified so that the changes are based on rational cost/benefit decisions

  • A team will need to maintain the taxonomy on a part-time basis

  • Taxonomy team reports to some other steering committee


Definitions about the Controlled Vocabulary Governance Environment

[Diagram: Controlled Vocabulary Governance Environment. Labels: Syndicated Terminologies; ISO 3166-1; Other External; Vocabulary Management System; CVs; Other Controlled Items; Custodians; Change Requests & Responses; Notifications; Published CVs and STs; Consuming Applications; Web CMS; Archives; Intranet; Search; ERMS; Intranet Nav.; ERP; DAM; Other Internal]

  • 1: Syndicated Terminologies change on their own schedule

  • 2: CV Team decides when to update CVs

  • 3: Team adds value via mappings, translations, synonyms, training materials, etc.

  • 4: Updated versions of CVs published to consuming applications


Other Controlled Items

  • Taxonomy Team will have additional items to manage:

    • Charter, Goals, Performance Measures

    • Editorial rules

    • Team processes

    • Tagger training materials (manual and automatic)

    • Outreach & ROI

      • Communication plan

      • Website

      • Presentations

      • Announcements

    • Roadmap


Taxonomy governance | Generic team charter

  • Taxonomy Team is responsible for maintaining:

    • The Taxonomy, a multi-faceted classification scheme

    • Associated taxonomy materials, such as:

      • Editorial Style Guide

      • Taxonomy Training Materials

      • Metadata Standard

      • Team rules and procedures (subject to CIO review)

  • Team evaluates costs and benefits of suggested changes

  • Taxonomy Team will:

    • Manage relationship between providers of source vocabularies and consumers of the Taxonomy

    • Identify new opportunities for use of the Taxonomy across the Enterprise to improve information management practices

    • Promote awareness and use of the Taxonomy


Editorial Rules

  • To ensure consistent style, rules are needed

  • Issues commonly addressed in the rules:

    • Sources of Terms

    • Abbreviations

    • Ampersands

    • Capitalization

    • Continuations (More… or Other…)

    • Duplicate Terms

    • Hierarchy and Polyhierarchy

    • Languages and Character Sets

    • Length Limits

    • “Other” – Allowed or Forbidden?

    • Plural vs. Singular Forms

    • Relation Types and Limits

    • Scope Notes

    • Serial Comma

    • Spaces

    • Synonyms and Acronyms

    • Term Order (Alphabetic or …)

    • Term Label Order (Direct vs. Inverted)

  • Must also address issue of what to do when rules conflict – which are more important?
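
Several of the issues above are mechanical enough to check automatically once the team has decided them. The sketch below is only an illustration: the specific decisions it encodes (spell out ampersands, forbid "Other" terms, a 40-character limit) are assumptions chosen for the example, not recommendations from the slide, and the function names are hypothetical.

```python
import re

MAX_LENGTH = 40  # assumed limit; the real value belongs in the style guide

def check_label(label):
    """Return a list of style-rule violations for one term label."""
    problems = []
    if "&" in label:
        problems.append("ampersand: spell out 'and'")
    if len(label) > MAX_LENGTH:
        problems.append(f"length: over {MAX_LENGTH} characters")
    if label != label.strip() or re.search(r"\s{2,}", label):
        problems.append("spaces: leading, trailing, or doubled")
    if label.strip().lower().startswith("other"):
        problems.append("'Other' terms are forbidden")
    return problems

def check_vocabulary(labels):
    """Check every label and also flag duplicates that differ only by case."""
    report = {label: check_label(label) for label in labels}
    seen = {}
    for label in labels:
        key = label.strip().lower()
        if key in seen:
            report[label].append(f"duplicate of {seen[key]!r}")
        else:
            seen[key] = label
    # Keep only the labels that have at least one problem.
    return {label: probs for label, probs in report.items() if probs}

violations = check_vocabulary(
    ["Science & Technology", "Employment", "employment", "Other Resources", "Health"]
)
for label, probs in violations.items():
    print(f"{label}: {'; '.join(probs)}")
```

A checker like this reports every violated rule rather than picking a winner, which leaves the question of rule precedence to the team, as the last bullet requires.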


Roles in Two Taxonomy Governance Teams

  • Executive Sponsor

    • Advocate for the taxonomy team

  • Business Lead

    • Keeps team on track with larger business objectives

    • Balances cost/benefit issues to decide appropriate levels of effort

      • Specialists help in estimating costs

    • Obtains needed resources if those in team can’t accomplish a particular task

  • Technical Specialist

    • Estimates costs of proposed changes in terms of amount of data to be retagged, additional storage and processing burden, software changes, etc.

    • Helps obtain data from various systems

  • Content Specialist

    • Team’s liaison to content creators

    • Estimates costs of proposed changes in terms of editorial process changes, additional or reduced workload, etc.

    • Small-scale Metadata QA Responsibility

  • Taxonomy Specialist

    • Suggests potential taxonomy changes based on analysis of query logs, indexer feedback

    • Makes edits to taxonomy, installs into system with aid of IT specialist

  • Content Owner

    • Reality check on process change suggestions

Team structure at a different org.

  • Business Lead

  • Custodians

    • Responsible for content in a specific CV.

  • Training Representative

    • Develops communications plan, training materials

  • Work Practices Representative

    • Develops processes, monitors adherence

  • IT Representative

    • Backups, admin of CV Tool

  • Info. Mgmt. Representative

    • Provides CV expertise, tie-in with larger IM effort in the organization.


Taxonomy governance | Where changes come from

[Diagram labels: Firewall; Application; Application Logic; UI; Tagging; Tagging UI; Tagging Logic; Content; Taxonomy; End User; Tagging Staff; Staff notes; Query log analysis; missing concepts; Taxonomy Editor; experience; Taxonomy Team; Requests from other parts of the organization]

  • Recommendations by Editor

    • Small taxonomy changes (labels, synonyms)

    • Large taxonomy changes (retagging, application changes)

    • New “best bets” content

  • Team considerations

    • Business goals

    • Changes in user experience

    • Retagging cost


Processes

  • Different organizations will need to consider their own change processes.

    • Organization 1: A custodian is responsible for the content, but checks facts with department heads before making changes.

    • Organization 2: Analysts suggest changes, editors approve, copyeditors verify consistency.

  • Change process MUST also consider cost of implementing the change

    • Retagging data

    • Reconfiguring auto-classifier

    • Retraining staff

    • Changes in user expectations

  • Taxonomy Change Cases

    • Case 1. Renaming a term

    • Case 2. Adding a new leaf term

    • Case 3. Inserting a new term

    • Case 4. Splitting a term

    • Case 5. Deleting a leaf term or subtree

    • Case 6. Deleting a term

    • Case 7. Moving a subtree

    • Case 8. Merging terms

    • Case 9. Adding a CV

    • Case 10. Deleting a CV
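
The retagging cost differs sharply between change cases. The sketch below gives a rough estimate of how many tagged items a person would have to look at for a few of the cases. It assumes content is tagged with stable term IDs rather than labels, so renames, moves, and merges can be remapped automatically; that assumption, the sample taxonomy, the item counts, and the minutes-per-item figure are all hypothetical.

```python
# Hypothetical data: term ID -> (label, parent ID), and term ID -> number of tagged items.
TERMS = {
    "t1": ("Business", None),
    "t2": ("Employment", "t1"),
    "t3": ("Accounting", "t1"),
    "t4": ("Small Business", "t1"),
}
TAGGED_ITEMS = {"t1": 50, "t2": 400, "t3": 120, "t4": 30}

def subtree(term_id):
    """Return term_id plus all of its descendants."""
    found = [term_id]
    for child, (_, parent) in TERMS.items():
        if parent == term_id:
            found.extend(subtree(child))
    return found

def items_to_review(change, term_id):
    """Estimate how many tagged items need manual review for a given change case."""
    if change in ("rename", "move_subtree", "merge"):
        # Assumption: content points at stable IDs, so these are remapped automatically.
        return 0
    if change in ("split", "delete_term"):
        # Every item on the term itself needs a new home chosen by a person.
        return TAGGED_ITEMS.get(term_id, 0)
    if change == "delete_subtree":
        return sum(TAGGED_ITEMS.get(t, 0) for t in subtree(term_id))
    raise ValueError(f"unknown change case: {change}")

MINUTES_PER_ITEM = 2  # assumed review effort per item
for change, term in [("rename", "t2"), ("split", "t2"), ("delete_subtree", "t1")]:
    n = items_to_review(change, term)
    hours = n * MINUTES_PER_ITEM / 60
    print(f"{change:15} {TERMS[term][0]:12} {n:4d} items, about {hours:.1f} hours")
```

Retagging is only one of the four costs listed; reconfiguring the auto-classifier, retraining staff, and changed user expectations still have to be estimated separately.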


Taxonomy governance | Taxonomy maintenance workflow

[Workflow diagram. Roles: Analyst, Editor, Copywriter, Sys Admin. System: Taxonomy Tool. Steps: Suggest new name/category; Review new name; Problem? (Yes/No); Copy edit new name; Problem? (Yes/No); Add to enterprise Taxonomy]




Taxonomy editing tools vendors

  • Most popular taxonomy editor? MS Excel

  • Immature industry – no vendors in upper-right quadrant!

[Quadrant chart. Axes: Ability to Execute (low to high) and Completeness of Vision. Quadrant labels: Niche Players, Visionaries. Annotations: “High functionality, high cost ($100k!)”; “Widely used, cheap, single-user”]


Sample Taxonomy Editor Functionality

  • Standard and Custom Fields

  • Standard and Custom Relations

    • Data Typing, Restrictions, and Inference

  • Flexible Reporting

  • Flexible Importing

  • Multiple Vocabulary Support

  • Inter-Vocabulary Relations

  • Unique IDs

    • ISO Codes not sufficient

  • Workflow

    • Voting

    • Change Request Management

  • Programmability

Term Editing

Hierarchy Browser


Where do I put the metadata?

  • Where can I store metadata?

    • In the content – HTML Headers, File properties, etc.

    • In a centralized repository – Search index, MDDB, etc.

    • In multiple systems – Common case

  • Where should I store metadata?

    • Consultant’s answer – “It depends.”

    • If you are moving files through a process, putting it in the file keeps it from getting dropped at system borders.

    • If you are doing search across multiple documents, it has to be at least copied out of the files.

    • If you make copies of files and modify them, consistent in-file metadata will be impossible.

  • Real question is not where to STORE the metadata, it is how to MAINTAIN the metadata.

    • Web CMS as an example.

    • Central Metadata Database is a very advanced practice.
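
The point that metadata stored in files must at least be copied out for cross-document search can be sketched as follows. This is a minimal example using Python's standard `html.parser`; the `DC.`-prefixed meta names follow a common convention for embedding Dublin Core in HTML headers, while the sample page, the URL key, and the dictionary standing in for a search index or metadata database are assumptions.

```python
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    """Collect <meta name=... content=...> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content is not None:
            # A field may repeat (e.g. several subjects), so keep a list.
            self.fields.setdefault(name.lower(), []).append(content)

def extract_metadata(html_text):
    """Return the in-file metadata of one page as a dict of lists."""
    parser = MetaExtractor()
    parser.feed(html_text)
    return parser.fields

# Stand-in for a search index or metadata database, keyed by URL.
central_index = {}

page = """<html><head>
<meta name="DC.title" content="Travel Policy">
<meta name="DC.subject" content="Travel">
<meta name="DC.subject" content="Expenses">
</head><body>...</body></html>"""

central_index["/hr/travel.html"] = extract_metadata(page)
print(central_index)
```

Once values live in both places, the maintenance question raised above becomes concrete: a correction made in the index has to be written back to the file, or the next extraction will undo it.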



What Processes Should I Try to Institute?

  • Processes will vary from one organization to another.

  • Assessing the Organization’s state is the first step.

  • Determining the ROI and potential resources follows.

  • Plan on instituting processes over time, beginning with basic ones.


Search and Metadata Self-Assessment Form

  • Background

    • Rate your organization’s search & metadata maturity from 1 to 10.

    • What was the most recent change to your organization’s search & metadata processes?

    • What is the next step for your organization’s search & metadata processes?

  • Basic

    • Is there a process in place to examine query logs?

    • Is there an organization-wide metadata standard, such as an extension of the Dublin Core, for use by search tools, multiple repositories, etc.?

  • Intermediate

    • Is there an ongoing data cleansing procedure to look for ROT (Redundant, Obsolete, Trivial content)? If so, describe briefly.

    • Does the search engine index more than 4 repositories around the organization?

    • Are system features and metadata fields added based on cost/benefit analysis, rather than things that are easy to do with the current tools?

    • Are tools only acquired after requirements have been analyzed, or are major purchases sometimes made to use up year-end money?

    • Are there hiring and training practices especially for metadata and taxonomy positions? If so, describe briefly.

  • Advanced

    • Are there established qualitative and quantitative measures of metadata quality? If so, describe briefly.

    • Can the CEO explain the ROI for search and metadata?

  • Optional

    • Your name:

    • Organization:

    • E-mail:

Contact information will not be used for marketing purposes. It will only be used to follow up and clarify issues around the survey.




Metadata Maturity Model

  • Taxonomy governance processes must fit the organization

  • As consultants, we notice different levels of maturity in the business processes around Content Management, Taxonomy, and Metadata

  • Honestly assess your organization’s metadata maturity in order to design appropriate governance processes

  • We are starting to define a maturity model, similar to the CMMI model in the software world.


Metadata Maturity Model

  • Shameless Plug: Tomorrow Morning at 9:45

  • Call for Data: Leave Self-Assessments with us


Purpose of Maturity Model

  • Estimating the maturity of an organization’s information management processes tells us:

    • How involved the taxonomy development and maintenance process should be

      • Overly sophisticated processes will fail

    • What to recommend as next steps

  • Maturity is not a goal, it is a characterization of an organization’s methods for achieving particular goals.

  • Mature processes have expenses which must be justified by consequent cost savings or revenue gains.

  • Metadata Maturity may not be core to your business.




Overview of Best Practices in Metadata and Taxonomy

  • Avoid monolithic ‘subject’ taxonomies

    • May have a browsing taxonomy constructed from combined facets.

  • Use (or map to) Dublin Core for basic information.

  • Extend with custom elements for specific facts.

  • Use pre-existing, standard, vocabularies as much as possible.

    • Validate author names with LDAP directory

    • ISO country codes for locations

    • Product & service info from ERP system

  • Designate a team to manage the taxonomies and related materials

    • Taxonomy Editorial Rules, Processes, Training materials, Outreach & ROI

  • Design a Metadata QC Process

    • Start with an error-correction process, then get more formal on error detection.

    • In the future, large-scale ontologies like CYC may be valuable in automated error detection.
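
A small sketch of what "use pre-existing vocabularies" can look like at the point where a record is saved. It assumes records are dictionaries keyed by Dublin Core element names; the short code set is only an excerpt standing in for the full ISO 3166-1 alpha-2 list, the name set stands in for an LDAP directory lookup, and `x:product` is a hypothetical custom extension element.

```python
# Small excerpts standing in for the real sources (assumptions for illustration).
ISO_3166_ALPHA2 = {"US", "GB", "SG", "IE", "CO"}     # excerpt of ISO 3166-1 alpha-2 codes
DIRECTORY_NAMES = {"Ron Daniel", "Joseph Busch"}     # stand-in for an LDAP lookup

def validate_record(record):
    """Return a list of problems found in one Dublin Core style record."""
    problems = []
    if not record.get("dc:title"):
        problems.append("dc:title is required")
    for name in record.get("dc:creator", []):
        if name not in DIRECTORY_NAMES:
            problems.append(f"dc:creator {name!r} not found in directory")
    for code in record.get("dc:coverage", []):
        if code not in ISO_3166_ALPHA2:
            problems.append(f"dc:coverage {code!r} is not an ISO 3166-1 code")
    return problems

record = {
    "dc:title": "Taxonomy Governance",
    "dc:creator": ["Ron Daniel", "J. Busch"],
    "dc:coverage": ["US", "UK"],
    "x:product": ["Widget 9000"],   # custom extension element, name is hypothetical
}
problems = validate_record(record)
for p in problems:
    print(p)
```

Checks like these are a first step toward the Metadata QC process mentioned above: they catch values that cannot be right, while sampling and review are still needed for values that are valid but wrong.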


Factor “Subject” into smaller facets

  • Size

    • DMOZ tries to organize all web content, has more than 600k categories!

    • Difficulty in navigating, maintaining

    • Hidden facet structure

  • “Classification Schemes” vs. “Taxonomies”



Facet Principles

  • Basic facets with identified items – people, places, projects, instruments, missions, organizations, … Note that these are not subjective “subjects”, they are objective “objects”.

  • Subjective views can be laid on top of the objective facts, but should be in a different namespace so they are clearly distinguishable.

    • For example, labels like “Anarchist” or “Prime Minister” can be applied to the same person at different times (e.g. Nelson Mandela).
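
One way to keep subjective views distinguishable from objective facts is to carry the namespace on every tag. The sketch below assumes two namespace names, `fact` and `view`, and uses placeholder identifiers; the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tag:
    namespace: str   # "fact" for objective facets, "view" for subjective labels
    facet: str       # e.g. person, place, organization, or label
    value: str

# Hypothetical tags on one document; the identifiers are placeholders.
tags = [
    Tag("fact", "person", "person-123"),
    Tag("fact", "organization", "org-456"),
    Tag("view", "label", "Anarchist"),
    Tag("view", "label", "Prime Minister"),
]

def in_namespace(tags, namespace):
    """Select only the tags from one namespace."""
    return [t for t in tags if t.namespace == namespace]

def facet_values(tags, facet):
    """Values of one objective facet, ignoring subjective views entirely."""
    return [t.value for t in in_namespace(tags, "fact") if t.facet == facet]

print(facet_values(tags, "person"))
print([t.value for t in in_namespace(tags, "view")])
```

Because the subjective labels sit in their own namespace, they can be revised or withdrawn without touching the identified people, places, and organizations they were laid on top of.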


Iterative Development Vision (More participants and tagged content at each iteration)

  • Stages: Plan & Prototype; Alpha Dev & Test; Beta D&T; Final D&T

  • Participants

    • Plan & Prototype: Project Team

    • Alpha Dev & Test: Stakeholders and SMEs

    • Beta D&T: Friendly Users

    • Final D&T: Audiences

  • 1 Identify Objectives

    • Plan & Prototype: Interview core team and stakeholders

    • Alpha Dev & Test: Review tagged samples, default procedures

    • Beta D&T: Interview alpha users

    • Final D&T: Interview beta users

  • 2 Inventory Content

    • Plan & Prototype: ID sources, spider assets & extract metadata

    • Alpha Dev & Test: Gather additional sources, if any

    • Beta D&T: Gather additional sources, if any

  • 3 Specify Metadata

    • Plan & Prototype: Define fields & purpose

    • Alpha Dev & Test: Revise if needed, bake into alpha CMS

    • Beta D&T: Modify CMS for beta

    • Final D&T: Modify for 1.0

  • 4 Model Content

    • Plan & Prototype: Define content chunks & XML DTDs

    • Alpha Dev & Test: Revise if needed, bake into alpha CMS

    • Beta D&T: Modify CMS for beta

    • Final D&T: Modify for 1.0

  • 5 Specify Vocabularies

    • Plan & Prototype: Compile controlled vocabularies

    • Alpha Dev & Test: Revise, use in alpha CMS

    • Beta D&T: Revise, use in beta CMS

    • Final D&T: Revise using team procedure

  • 6 Specify Procedures

    • Plan & Prototype: Start with UI sketches, off-the-shelf rules.

    • Alpha Dev & Test: Tailor the default materials, alpha workflows in CMS

    • Beta D&T: Modify & extend workflows

    • Final D&T: Finalize procedure materials

  • 7 Train Staff

    • Plan & Prototype: Manually tag small sample

    • Alpha Dev & Test: Use alpha CMS to tag larger sample

    • Beta D&T: Use beta CMS to tag larger sample

    • Final D&T: Finalize training materials & train staff


Planning for Taxonomy Changes

  • Error Correction – What to do when end-users and tagging staff notice problems?

    • Provide for it in the Error Correction Process

    • Add Query Log Analysis to help detect user problems

    • How to answer questions re. things to add, delete, or rearrange in the taxonomy?

      • Keep a visible issue log

      • Discuss with SMEs, tag samples, use other testing methods

  • Per-facet changes:

    • Corporate reorganizations, Product lineup changes, Country splits & merges, … will happen. Prepare for them when deploying those facets

  • Long-term – what facets to create, when, and why

    • See Taxonomy Roadmap section


Agenda

  • 1:30 Welcome & Introductions

  • 1:45 Exercise: Taxonomy Revisions

  • 2:15 Fundamental Processes

  • 2:30 Governance Team Roles and Structures

  • 3:00 Tools

  • 3:05 Break

  • 3:15 Exercise: Organizational Self-Assessment

  • 3:30 Maturity Model

  • 3:40 Designing and Building Maintainable Taxonomies & Metadata

  • 4:00 Additional Processes

    • Brief remarks on Measurements, ROI, Training, Roadmap

  • 4:20 Q&A

  • 4:30 Adjourn


Measuring Metadata and Taxonomy Quality

  • Taxonomy development is an iterative process

  • Develop an organizational idea, then test it by tagging sample content

  • Elicit feedback via walk-throughs and card sorting exercises

  • Use both qualitative and quantitative methods

    • Time, budget, and availability of tagged data will determine what methods are possible.


Taxonomy testing | Qualitative methods

Include sample pages in walkthroughs, not just the hierarchy.


Tagged Samples

  • The Taxonomy must fit the content.

  • How to verify this? Tag samples!

    • Spreadsheets are a convenient tool for this. URLs, drop-down choosers, text notes all allowed.

  • Team can review tagged samples when reviewing taxonomy

    • More sophisticated teams may test inter-cataloger agreement

  • Samples should appear in training materials for tagging staff

    • Show typical and unusual cases.

  • Samples are used to define training sets for automatic classifiers.
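
Inter-cataloger agreement can be measured in several ways. The sketch below uses Cohen's kappa, which is one common choice and not necessarily the measure the presenters had in mind. It assumes two catalogers each assigned exactly one category to the same list of items; the category names and assignments are invented.

```python
from collections import Counter

def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa for two catalogers who each assigned one category per item."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("need two equal-length, non-empty lists")
    n = len(tags_a)
    observed = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    freq_a, freq_b = Counter(tags_a), Counter(tags_b)
    # Agreement expected by chance, from each cataloger's own category frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical assignments for eight sample pages.
cataloger_1 = ["Policy", "Policy", "Form", "News", "Form", "Policy", "News", "Form"]
cataloger_2 = ["Policy", "Form", "Form", "News", "Form", "Policy", "News", "Policy"]
kappa = cohen_kappa(cataloger_1, cataloger_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement two people would reach by chance, so a value near 0 means the categories are not being applied consistently even if many individual tags happen to match.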


Quantitative Method | How evenly does it divide the content?

  • Background:

    • Documents do not distribute uniformly across categories

    • Zipf (1/x) distribution is expected behavior

    • 80/20 rule in action (actually 70/20 rule)

  • Methodology:

    • Part of alpha test of ‘content type’ for corporate intranet

    • 115 URLs selected at random from search index were manually categorized. Inaccessible files and ‘junk’ were removed

  • Results:

    • Results were slightly more uniform than the Zipf distribution, which is better than expected
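
The comparison with a Zipf distribution can be sketched as follows. The total of 115 mirrors the sample size on the slide, but the split across ten categories is invented, as are the function names; the sketch compares the share of documents in the largest 20% of categories with what a 1/rank distribution would predict.

```python
def zipf_expected(total, n_categories):
    """Expected count per rank if counts follow a 1/rank (Zipf) distribution."""
    harmonic = sum(1 / r for r in range(1, n_categories + 1))
    return [total / (r * harmonic) for r in range(1, n_categories + 1)]

def top_share(counts, fraction=0.2):
    """Share of all documents that falls in the largest `fraction` of categories."""
    ranked = sorted(counts, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical tallies per content type (not the figures from the slide).
observed = [30, 20, 15, 12, 10, 8, 7, 6, 4, 3]
expected = zipf_expected(sum(observed), len(observed))

observed_share = top_share(observed)
expected_share = top_share(expected)
more_uniform_than_zipf = observed_share < expected_share

print(f"top 20% of categories hold {observed_share:.0%} of documents")
print(f"a Zipf distribution would put {expected_share:.0%} there")
```

A smaller top share than the Zipf prediction means the categories divide the content more evenly than expected, which is the outcome reported above.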


Quantitative Method | How intuitive (repeatable) are the categorizations?

  • Methodology: Closed Card Sort

    • For alpha test of a grocery site

    • 15 Testers put each of 100 best-selling products into one of 10 pre-defined categories

    • Products that fewer than 14 of 15 testers put into the same category were flagged

  • Results:

    • “Cocoa Drinks – Powder” is best categorized in both “Beverages” and “Grocery”.

    • In the trade, “Corn Tortillas” are a Dairy item!
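
The flagging rule in this methodology is easy to automate once the card sort results are recorded. In the sketch below the products come from the slide, but the vote counts and the “Bakery” category are invented, and the function name is hypothetical.

```python
from collections import Counter

def flag_low_agreement(sorts, threshold):
    """sorts maps product -> list of categories chosen, one per tester.
    Flag products whose most popular category got fewer than `threshold` votes."""
    flagged = {}
    for product, choices in sorts.items():
        votes = Counter(choices)
        top_category, top_votes = votes.most_common(1)[0]
        if top_votes < threshold:
            flagged[product] = dict(votes)
    return flagged

# Hypothetical results from 15 testers (vote counts are invented).
sorts = {
    "Whole Milk": ["Dairy"] * 15,
    "Cocoa Drinks – Powder": ["Beverages"] * 8 + ["Grocery"] * 7,
    "Corn Tortillas": ["Bakery"] * 9 + ["Grocery"] * 5 + ["Dairy"] * 1,
}
flagged = flag_low_agreement(sorts, threshold=14)
for product, votes in flagged.items():
    print(product, votes)
```

The vote breakdown for each flagged product shows whether testers split between two plausible homes, which argues for listing the product in both, or scattered widely, which argues for renaming or redefining the categories.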


Quantitative Method | How does taxonomy “shape” match that of content?

  • Background:

    • Hierarchical taxonomies allow comparison of “fit” between content and taxonomy areas

  • Methodology:

    • 25,380 resources tagged with taxonomy of 179 terms. (Avg. of 2 terms per resource)

    • Counts of terms and documents summed within taxonomy hierarchy

  • Results:

    • Roughly Zipf distributed (top 20 terms: 79%; top 30 terms: 87%)

    • Mismatches between term% and document% flagged

Source: Courtesy Keith Stubbs, U.S. Dept. of Education
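
The roll-up and mismatch check can be sketched as follows. The taxonomy, the document counts, and the 15-point tolerance are invented for illustration and are unrelated to the Department of Education data; the sketch sums terms and tagged documents per top-level branch and flags branches where the two percentages diverge.

```python
# Hypothetical taxonomy: term -> parent (None marks a top-level term).
PARENT = {
    "Teaching": None, "Curriculum": "Teaching", "Assessment": "Teaching",
    "Funding": None, "Grants": "Funding",
    "Policy": None,
}
# Hypothetical number of resources tagged directly with each term.
DIRECT_DOCS = {
    "Teaching": 10, "Curriculum": 500, "Assessment": 90,
    "Funding": 40, "Grants": 20, "Policy": 130,
}

def top_level(term):
    """Walk up the hierarchy to the top-level ancestor of a term."""
    while PARENT[term] is not None:
        term = PARENT[term]
    return term

def shape_report(tolerance=0.15):
    """Compare each branch's share of terms with its share of tagged documents."""
    term_counts, doc_counts = {}, {}
    for term in PARENT:
        root = top_level(term)
        term_counts[root] = term_counts.get(root, 0) + 1
        doc_counts[root] = doc_counts.get(root, 0) + DIRECT_DOCS.get(term, 0)
    total_terms, total_docs = sum(term_counts.values()), sum(doc_counts.values())
    report = {}
    for root in term_counts:
        term_pct = term_counts[root] / total_terms
        doc_pct = doc_counts[root] / total_docs
        # Third element is True when the branch should be flagged for review.
        report[root] = (term_pct, doc_pct, abs(term_pct - doc_pct) > tolerance)
    return report

report = shape_report()
for root, (term_pct, doc_pct, flagged) in report.items():
    print(f"{root:10} terms {term_pct:.0%}  docs {doc_pct:.0%}  {'FLAG' if flagged else ''}")
```

A branch with many documents but few terms may need subdividing, while a branch with many terms and little content may be more detailed than the collection justifies.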


Taxonomy ROI

  • What level of effort in taxonomy creation and maintenance is justified?


Fundamentals of Taxonomy ROI

  • Building and maintaining a taxonomy, and tagging data with it, are costs, not benefits.

  • There is no benefit without exposing the tagged data to users in some way that cuts costs or improves revenues.

  • Putting a new taxonomy into operation requires UI changes and/or backend system changes.

  • You need to determine those changes, and their costs, as part of the taxonomy ROI.
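
The points above can be turned into simple arithmetic. This is a sketch under stated assumptions: an undiscounted multi-year ROI, with every figure hypothetical. The one point it is meant to show is that UI and backend integration costs belong in the denominator alongside taxonomy and tagging costs.

```python
def taxonomy_roi(annual_benefit, build_cost, annual_maintenance,
                 annual_tagging, integration_cost, years=3):
    """Undiscounted ROI over a period: (total benefit - total cost) / total cost."""
    total_cost = (build_cost + integration_cost
                  + years * (annual_maintenance + annual_tagging))
    total_benefit = years * annual_benefit
    return (total_benefit - total_cost) / total_cost

# Hypothetical figures for a three-year horizon
roi = taxonomy_roi(annual_benefit=200_000, build_cost=100_000,
                   annual_maintenance=30_000, annual_tagging=50_000,
                   integration_cost=60_000, years=3)
print(f"ROI: {roi:.0%}")

# Tagged data that is never exposed to users yields no benefit at all
roi_unexposed = taxonomy_roi(0, 100_000, 30_000, 50_000, 0, years=3)
print(f"ROI without exposure: {roi_unexposed:.0%}")
```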


Common Taxonomy ROI Scenarios

  • Catalog site - ROI based on increased sales through improved

    • product findability

    • product cross-sells and up-sells

    • customer loyalty

  • Call center - ROI based on cutting costs through

    • fewer customer calls due to improved website self-service

    • faster, more accurate CSR responses through better information access

  • Knowledge worker productivity - ROI based on cutting costs through

    • less time searching for things

    • less time recreating existing materials, with knock-on benefits of less confusion and reduced storage and backup costs

  • Executive mandate

    • No ROI at the start, just someone with a vision and the budget to make it happen.


Tagging and Training

  • How are we going to populate metadata elements with complete and consistent values?

    • The tagging problem

  • How are we going to get people (and/or software) to assign consistent and accurate metadata to the content?

    • The tagger training problem


Taxonomy governance: Workflow-driven metadata tagging

[Flowchart. Boxes: Compose in Template; Submit to CMS; Automatically fill-in metadata; Problem? (Yes/No); Review content; Copy Edit content; Approve/Edit metadata; Problem? (Yes/No). Other labels: Web site; Hard Copy. Roles and tools: Analyst; Tagging Tool; Editor; Copywriter; Sys Admin. Callout: “Tagging Process Doesn’t Stop Here!”]


Training Taxonomy Editors and Tagging Staff

[Image label: Indexing UI]

  • Staff will require training on

    • The structure of the taxonomy

    • The UI they use to tag the content

    • The rules to follow when deciding what codes to apply

    • The end-effect of the codes they apply – have a running prototype or QA environment.

  • Tagging examples come from samples tagged during taxonomy development.

  • Hardcopies of the taxonomy, and yellow highlighters, are helpful during training.


Tagging tool example: Interwoven MetaTagger

  • Manual form fill-in w/ check boxes, pull-down lists, etc.

  • Parse & lookup (recognize names)

  • Rules & pattern matching

  • Auto keyword & summarization

  • Auto-categorization


Taxonomy Roadmap

  • How to plan for long-term taxonomy development projects?


Taxonomy Roadmap

  • Most organizations require a phased implementation of an Enterprise Taxonomy

  • A Taxonomy Roadmap defines the facets to be developed, their timing, and the reasons why

  • Factors to consider in prioritizing the facets include:

    • Immediacy of application – how will the taxonomy be put into use? A Search Engine? Portal Navigation? Other? How long will that take?

    • Impact – How many users will a facet help? How big of a help will it be?

    • Ease of development – does the vocabulary exist, can it be bought, or must it be developed? How big and complex will it be? How often will it change? Are there tools to help manage taxonomy changes or must those be acquired too?

    • What data must be tagged for that? What are the requirements on the metadata’s density and accuracy? Can those be met with automatic methods, or will more extensive human involvement be needed?

    • Staff expertise and Team experience.
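
One way to combine these factors into a ranking is a weighted score per facet. The sketch below is illustrative only: the 1-to-5 scores, the weights, and the choice of three factors are assumptions, not the authors' method.

```python
def prioritize(facets, weights):
    """Rank facet names by the weighted sum of their factor scores, highest first."""
    def score(item):
        _, scores = item
        return sum(weights[k] * scores[k] for k in weights)
    return [name for name, _ in sorted(facets.items(), key=score, reverse=True)]

# Hypothetical weights and 1-5 scores (5 = most favorable)
weights = {"immediacy": 0.4, "impact": 0.4, "ease": 0.2}
facets = {
    "Content Type": {"immediacy": 5, "impact": 4, "ease": 4},
    "Subject": {"immediacy": 3, "impact": 5, "ease": 1},
    "Language": {"immediacy": 2, "impact": 2, "ease": 5},
}
ranking = prioritize(facets, weights)
print(ranking)
```

The scores matter less than the discussion needed to agree on them; the ranking is a starting point for the Roadmap, not a substitute for judgment about staff expertise and project fit.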


Roadmap: Dependencies

  • A Roadmap requires that an organization plan its projects well in advance, so that upcoming projects can be influenced by the taxonomy

    • Consequently, this is an advanced practice

  • Roadmap prioritizes vocabularies according to benefit, cost, and fit with projects.

  • Governance Team is responsible for maintaining the Roadmap and the necessary outreach.


Roadmap: Facet Prioritization Matrix

* Facets already in existence in client’s Intranet


Roadmap: Timeline

  • Timeline lists the facets to be developed, and when those development efforts start and end.

  • Timeline shows what projects will make use of the facet, and how long that should take.

  • Intermediate and related projects are also shown.

[Timeline chart. Facets: Language; Access Control; Format; Content Type; Role; Organization; Location (Region); Subject; Location (Country); Products/Services. Project labels: Search; Search?; CM?; Index; Search & Index; Search & Org Chart UI; Search & Portal Nav; Auto-Classification Tool; Taxonomy Tool Projects. Time axis: FY04Q2 through FY05Q4.]


Contact Info

Ron Daniel, Jr.

925-368-8371

[email protected]

Joseph Busch

415-377-7912

[email protected]

