Computer-Assisted Coding for Surgical Pathology

This outline introduces computer-assisted coding for surgical pathology: the structure of surgical pathology reports, vocabulary mining, the lexicon, CPT specimen codes, rule generation for code assignment, and recent work on diagnosis coding with EM-based alignment.

1. Computer-assisted coding for surgical pathology

Outline
• What is surgical pathology?
• A typical pathology report
• Structure of pathology reports
• Vocabulary mining
• Structure of lexicon
• Suffixes
• Specimen codes
• Generating rules
• Recent work: diagnosis coding
• Alignment EM for non-math geeks

2. Pathology subspecialties

Surgical pathology (a.k.a. “histopathology”):
• Pathologist examines surgical specimens, makes diagnosis
• Transcriptionist types up verbal report
• Computer assigns best-guess insurance codes for review by human coders
• Coding system: CPT-4
• Six levels of coding for specimens, plus add-ons:
  • Intraoperative consultations
  • Microscopic stains
  • Decalcification of bone

Other pathology modalities not covered:
• Flow cytometry
• Cytology
• Anatomical pathology (autopsies)
• Drug testing
• Molecular genetics
• Immunology
• ... etc. ...

3. SURGICAL PATHOLOGY REPORT

CLINICAL INFORMATION: HIDRADENITIS

DIAGNOSIS:
A - Right axillary lymph node, biopsy: Benign reactive lymph node.
B - Skin and soft tissue of right axilla, excision: Hidradenitis suppurativa. See description.

SPECIMEN:
A) AXILLARY NODE RT SIDE
B) AXILLARY RT HIDRADENITIS

GROSS EXAMINATION:
A - The container is labeled FIRSTNAME LASTNAME, right axillary lymph node. Received fresh is an enlarged lymph node with surrounding yellow adipose tissue. The entire specimen measures 5 x 3 x 1.5 cm. and the lymph node measures 3 x 1.5 x 1.5 cm. The lymph node has a tan nodal parenchyma which is partially replaced by fat. A representative section is submitted.
B - The container is labeled FIRSTNAME LASTNAME, right axillary hidradenitis. Received in formalin is an unoriented segment of skin and subcutaneous tissue measuring 9.5 x 4 cm. and is excised to a depth of 2.5 cm. The skin surface is brown with several ulcerations, the largest of which measures 3 cm. in greatest dimension. Sections through the specimen reveal no large masses. There is a 0.6 cm. cystic space in the dermis along one edge of the specimen. Representative sections are submitted in two cassettes.
AMG/DKR:cac

MICROSCOPIC EXAMINATION:
A - This is a benign appearing lymph node which has reactive follicular and interfollicular hyperplasia.
B - Skin is partly ulcerated in association with a sinus tract lined by granulation tissue and a mixed chronic and purulent inflammatory response. This forms a small cystic abscess cavity in dermis/subcutis which appears to extend to one margin of the sample.

4. Structure of surgical pathology notes

• Regions:
  • Clinical Information (why surgery was done)
  • Final Diagnosis (what pathologist found)
  • Gross Exam (by eye; chain of responsibility)
  • Microscopic Exam (optional)
• Each specimen gets its own CPT code
• Information about specimens is distributed throughout the note
• Ergo, it is crucial to keep track of which piece of text refers to which specimen
• Good news: Most pathologists (90%?) use labels for coreference
• Bad news: There can be more than one label per line

DIAGNOSIS:
A, C & D - Specimens labeled "possible left submandibular gland", excisional biopsies: Benign lymph nodes. No salivary gland tissue is present.
B - Specimen labeled "possible left submandibular gland", excisional biopsy: Segment of unremarkable fibrofatty connective tissue.
E - Left submandibular gland, excisional biopsy: Chronic sialoadenitis with mild fibrosis.

• Sample label combinations to “fan out” (sketched below):
  • “A and B”, “A & B”
  • “A-C”, “A through C” (includes B)
  • “A, C & D” (doesn't include B)
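
A minimal sketch of the label fan-out in Python (expand_labels is a hypothetical helper for illustration, not the system's actual code):

  import re

  def expand_labels(expr):
      """Expand a specimen-label expression such as "A-C" or "A, C & D"
      into individual labels."""
      expr = expr.upper()
      labels = set()
      for part in re.split(r",|\bAND\b|&", expr):
          part = part.strip()
          m = re.match(r"^([A-Z])\s*(?:-|THROUGH)\s*([A-Z])$", part)
          if m:  # a range like "A-C" includes the letters in between
              labels.update(chr(c) for c in range(ord(m.group(1)), ord(m.group(2)) + 1))
          elif re.fullmatch(r"[A-Z]", part):  # a single label
              labels.add(part)
      return sorted(labels)

  assert expand_labels("A and B") == ["A", "B"]
  assert expand_labels("A through C") == ["A", "B", "C"]   # includes B
  assert expand_labels("A, C & D") == ["A", "C", "D"]      # doesn't include B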

5. Vocabulary mining

Used tf-idf to extract distinctive/characteristic language (sketched below):
• High frequency in the target set
• Low frequency in the overall corpus

Various sets of comparison corpora:
• All notes that received a given code vs. all notes that didn't
  (e.g., "biopsy" preponderated in Level IV notes)
• Surgical pathology notes vs. radiology notes
  (e.g., "nevus", "polyp", "Giemsa stain", "resection", ...)

Largely manual categorization and synonym detection, done in collaboration with SMEs.
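
A sketch of the tf-idf contrast in Python (the idea, not the original tooling; distinctive_terms is a hypothetical name):

  import math
  from collections import Counter

  def distinctive_terms(target_docs, comparison_docs, top_n=20):
      """Rank terms by their frequency in the target set, discounted by
      how common they are across the whole corpus."""
      corpus = target_docs + comparison_docs
      tf = Counter(w for doc in target_docs for w in doc.lower().split())
      df = Counter()
      for doc in corpus:
          df.update(set(doc.lower().split()))
      n = len(corpus)
      score = {w: tf[w] * math.log(n / df[w]) for w in tf}
      return sorted(score, key=score.get, reverse=True)[:top_n]

  # e.g. distinctive_terms(level_iv_notes, other_notes) might surface "biopsy"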

6. Structure of lexicon

• Categories: bodypart, procedure, stain type, other
• Entry = norm with optional variants (synonyms, acronyms, misspellings, irregular plurals)
• Longest match preferred (sketched below)
• Regular plurals automatically computed
• Terms containing non-word characters converted into regexes, indexed by first word for efficiency

  <entry cat="procedure"><norm>ASPIRATION</norm></entry>

  <entry cat="procedure">
    <norm>FNA</norm>
    <var>FINE NEEDLE ASPIRATION</var>
  </entry>

  <entry cat="bodypart">
    <norm>BILE DUCT</norm>
    <var>BILIARY DUCT</var>
    <var>BILE DUCK</var> <!-- i kid you not -->
  </entry>
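
A sketch of longest-match tagging in Python, with the lexicon reduced to a plain dict (illustrative data structure, not the XML machinery above):

  def tag_longest_match(tokens, lexicon):
      """Greedy longest-match against a lexicon mapping token tuples to
      (category, norm) pairs."""
      max_len = max(len(key) for key in lexicon)
      i, tags = 0, []
      while i < len(tokens):
          for span in range(min(max_len, len(tokens) - i), 0, -1):
              key = tuple(t.upper() for t in tokens[i:i + span])
              if key in lexicon:
                  cat, norm = lexicon[key]
                  tags.append((" ".join(key), cat, norm))
                  i += span
                  break
          else:
              i += 1  # no entry starts here; skip the token
      return tags

  lexicon = {
      ("FNA",): ("procedure", "FNA"),
      ("FINE", "NEEDLE", "ASPIRATION"): ("procedure", "FNA"),  # variant -> norm
      ("BILE", "DUCT"): ("bodypart", "BILE DUCT"),
      ("BILE", "DUCK"): ("bodypart", "BILE DUCT"),  # misspelling variant
  }
  print(tag_longest_match("status post fine needle aspiration".split(), lexicon))
  # [('FINE NEEDLE ASPIRATION', 'procedure', 'FNA')]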

7. Lexical exceptions

• Avoid mis-tagging ambiguous words/phrases
• Example: “Gram” stain, used to identify microorganisms

DON'T want to assign a stain code for these phrases:
• “Specimen weighs 140 grams”
• “140-gram specimen”
• “Weight in grams:”

DO want to assign a stain code for these phrases:
• “Negative for diplococci by Gram”
• “Stains used: Gram, GMS, H&E”
• “Gram reveals no evidence of anthrax bacilli”

  <entry cat="stain12">
    <norm>GRAM</norm>
    <!-- grams (as in "weighs 3 grams") isn't a plural of this entry,
         so create an exception for it -->
    <except>GRAMS</except>
    <!-- "<number> gram" is probably a measure of weight; ditto -->
    <except regex="true">\d+\W+GRAM</except>
  </entry>
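
The exception logic, sketched in Python (a simplified stand-in for the lexicon machinery; mentions_gram_stain is a hypothetical name):

  import re

  GRAM = re.compile(r"\bGRAM\b", re.I)
  # Mirrors the <except> entries above: plural "grams" and "<number> gram"
  EXCEPTIONS = [re.compile(r"\bGRAMS\b", re.I),
                re.compile(r"\d+\W+GRAM", re.I)]

  def mentions_gram_stain(text):
      """True when "Gram" looks like a stain mention rather than a weight."""
      return bool(GRAM.search(text)) and not any(p.search(text) for p in EXCEPTIONS)

  assert not mentions_gram_stain("Specimen weighs 140 grams")
  assert not mentions_gram_stain("140-gram specimen")
  assert mentions_gram_stain("Stains used: Gram, GMS, H&E")
  assert mentions_gram_stain("Negative for diplococci by Gram")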

8. Suffixes

• Tacked onto all sorts of stems (“bunionectomy”!)
• Often useful even if you don't know what the root means
• Examples:
  • -ectomy: Cutting out (ex- 'out' + -tomy 'cut')
  • -(o)cele: Rupture, hernia
  • -oma: Tumor

Lexical entries:

  <entry cat="procedure">
    <norm>RESECTION</norm>
    <var>RESECT</var>
    <var type="suffix">-ECTOMY</var>
    <var type="suffix">-OSTOMY</var>
  </entry>

  <entry cat="bodypart">
    <norm>UTERUS</norm>
    <var>HYSTERECTOMY</var>
  </entry>

Result of tagging “HYSTERECTOMY” (sketched below):

  <TAG cat="bodypart" norm="UTERUS">HYSTER</TAG><TAG cat="procedure" norm="RESECTION">ECTOMY</TAG>
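
A minimal sketch of the suffix split in Python (split_suffix is a hypothetical helper; it shows only the stem/suffix division, not the full two-tag output above):

  SUFFIXES = {
      "ECTOMY": ("procedure", "RESECTION"),
      "OSTOMY": ("procedure", "RESECTION"),
      "OMA": ("other", "TUMOR"),
  }

  def split_suffix(word):
      """Split a word into (stem, (category, norm)) when it ends in a
      known medical suffix."""
      w = word.upper()
      for suffix, tag in SUFFIXES.items():
          if w.endswith(suffix):
              return w[: -len(suffix)], tag
      return w, None

  print(split_suffix("HYSTERECTOMY"))   # ('HYSTER', ('procedure', 'RESECTION'))
  print(split_suffix("BUNIONECTOMY"))   # ('BUNION', ...) -- root unknown, suffix still useful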

9. Suffixation exceptions

Example: -oma 'tumor'
As in: carcinoma, melanoma, adenoma, ...
But not:
• stoma (surgical opening in the abdomen)
• hematoma (lit. “blood tumor”, but really just a swelling)
• lipoma (lump of fatty tissue, not technically a tumor for coding purposes)

“Stoma” is easy: Tell the tagger that a stem must contain vowels. For the others, though, we add an anti-de-suffixation attribute in the lexicon (sketched below):

  <entry cat="other">
    <norm>TUMOR</norm>
    <var type="suffix">-OMA</var>
    <var type="suffix">-OMATA</var>
    <var type="suffix">-OMATOUS</var>
  </entry>

  <entry cat="other" suffix="false"><norm>HEMATOMA</norm></entry>
  <entry cat="other" suffix="false"><norm>LIPOMA</norm></entry>
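
Both exception mechanisms in one Python sketch (hypothetical names; NO_SUFFIX mirrors the suffix="false" entries, and the vowel rule protects "stoma"):

  NO_SUFFIX = {"HEMATOMA", "LIPOMA"}
  SUFFIXES = {"OMA": ("other", "TUMOR"), "OMATA": ("other", "TUMOR")}

  def split_oma(word):
      """Return (stem, tag) only when de-suffixation is allowed: the word
      is not an exception and the remaining stem contains a vowel."""
      w = word.upper()
      if w in NO_SUFFIX:
          return w, None
      for suffix, tag in SUFFIXES.items():
          stem = w[: -len(suffix)]
          if w.endswith(suffix) and any(v in stem for v in "AEIOUY"):
              return stem, tag
      return w, None

  print(split_oma("CARCINOMA"))   # ('CARCIN', ('other', 'TUMOR'))
  print(split_oma("STOMA"))       # ('STOMA', None) -- stem "ST" has no vowel
  print(split_oma("HEMATOMA"))    # ('HEMATOMA', None) -- suffix="false"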

10. Specimen codes

Six CPT codes, ordered roughly by invasiveness of surgery:
• 88300 - Level I: Gross exam only (the rest require microscopic examination)
• 88302 - Level II: e.g. appendix, incidental; fallopian tube, sterilization; fingers/toes, traumatic (accidental) amputation; foreskin, newborn; hernia sac; hydrocele sac; nerve; skin, plastic repair; ...
• 88304 - Level III: e.g. abortion; abscess; appendix, non-incidental; arterial or ventricular aneurysm; anal tags; appendectomy; biopsy of conjunctiva; bone fragments, except pathologic fracture; arterial plaque; certain cysts; ...
• 88305 - Level IV: e.g. miscarriage; biopsy of artery, axilla, bladder, bone marrow, cervix (except cone biopsy) ..., tonsil, urethra, vocal cord, vulva; bone exostosis (e.g. calcaneal spurs); brain tissue, except biopsy or ...
• 88307 - Level V: e.g. adrenal glands; bone biopsy/curettings; bone fragments from pathologic fracture; biopsy of bone marrow, brain, heart, liver, lung, pancreas, prostate, testis; cervical cone biopsy; brain tumor resection; colon, segmental resection not for tumor; breast excisional biopsy; mastectomy without lymph nodes; ...
• 88309 - Level VI: e.g. bone resection; mastectomy with lymph nodes; colon, segmental resection for tumor; esophagus, partial/total resection; ...

Big buckets, few generalizations possible.

11. Automatic code assignment?

• Good: 30,000 coded surgical-pathology reports from customer
• Bad: All were coded by a single coder, inexperienced in pathology
• Lacking a gold standard, statistical approaches infeasible
• Found a “crib sheet” (humans can't do it either!): http://www.pathology.med.umich.edu/intra/templates/cribsheet.pdf

Sample:
  88305  Prostate transurethral resection (TUR)
  88304  Pterygium
  88307  Pulmonary wedge resection
  88307  Quadrantectomy
  88305  Rectal biopsy
  88305  Rectal polyp
  88305  Reduction mammoplasty
  88305  Renal biopsy
  88309  Retroperitoneal mass--for tumor
  88307  Retroperitoneal mass--other than for tumor
  88305  Adnexa--ovary w/ or w/o tube, non-neoplastic
  88305  Ovary biopsy or wedge resection
  88305  Ovary w/ or w/o fallopian tube--non-neoplastic
  88307  Adnexa--ovary w/ or w/o tube, neoplastic
  88307  Ovary with or without fallopian tube--neoplastic

12. Generating rules for code assignment

• Tagged the crib sheet using the lexicon; auto-generated about 350 rules
• Edited by hand for exceptions etc. -- about a day's work
• Rule application mechanism: competition -- first to apply wins
• Rules ordered by (sketched below):
  • Code -- from Level VI down to Level I
  • Presence of “other”-type features (e.g. neoplasm)
  • Number of components
  • Bodyparts (alphabetical)
  • Procedures (alphabetical)
• Performance substantially improved over prototype:
  • Perfectly coded reports: 52.6% -> 79.2%
  • Overall F-measure: 66% -> 84%
• Bonus: Resulting system largely maintainable by SMEs!
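
The rule ordering, sketched as a Python sort key (the field names and rule representation are hypothetical, chosen just to illustrate the five-way ordering):

  def rule_sort_key(rule):
      """Sort key for the first-to-apply-wins competition; negations put
      higher levels, "other" features, and more components first."""
      return (
          -rule["level"],            # Level VI before Level I
          -len(rule["other"]),       # "other"-type features (e.g. neoplasm) first
          -len(rule["components"]),  # more components = more specific
          sorted(rule["bodyparts"]),   # then alphabetical tie-breaks
          sorted(rule["procedures"]),
      )

  rules = [
      {"code": "88305", "level": 4, "other": [],
       "components": ["RECTUM", "BIOPSY"],
       "bodyparts": ["RECTUM"], "procedures": ["BIOPSY"]},
      {"code": "88309", "level": 6, "other": ["NEOPLASM"],
       "components": ["COLON", "RESECTION", "NEOPLASM"],
       "bodyparts": ["COLON"], "procedures": ["RESECTION"]},
  ]
  rules.sort(key=rule_sort_key)   # the Level VI rule now competes first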

13. Recent work: Diagnostic coding

• Now have ~150,000 surgical-pathology reports from various development partners
• Susceptible to statistical approaches
• Diagnostic coding system: ICD-9 (soon ICD-10)

Vocabulary discovery:
• Work done by a colleague for an independent CPT-related effort
• Inferred a hierarchy from regularities in the descriptions of ICD-9 codes:
  891.0: Open wound of knee, leg (except thigh), and ankle, without mention of complication
  891.1: Open wound of knee, leg (except thigh), and ankle, complicated
• Correlated n-grams not in the lexicon with inferred nodes in the hierarchy (correlation measures: log likelihood, mutual information, Dice, chi-squared; one is sketched below)
• Mined for terms not in the lexicon
• Head start on coding... but what about terms highly correlated across multiple nodes?
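
One of the named correlation measures, sketched in Python over document sets (illustrative only):

  def dice(docs_with_ngram, docs_at_node):
      """Dice coefficient between the set of documents containing an
      n-gram and the set of documents assigned to a hierarchy node."""
      a, b = set(docs_with_ngram), set(docs_at_node)
      return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 0.0

  print(dice({1, 2, 3}, {2, 3, 4}))  # 0.666... -- strong overlap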

14. Expectation maximization

• “Iterative method for computing maximum-likelihood estimates of parameters in statistical models” (thank you, Wikipedia)
• Given a model, what parameters make the observed results most probable?
• Applications: psychometrics, computer vision, image processing, finance, ...
• In NLP:
  • Baum-Welch (“forward-backward”) algorithm for HMMs
  • Inside-outside algorithm for probabilistic CFGs
• Expectation: Create a function for the expected log-likelihood under the current parameters
• Maximization: Compute the parameters that maximize the expectation function (see the equations below)
• Analogy: Pumping water through a system of pipes (thank you, Philip Resnik)
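
In symbols (the generic EM updates in standard notation; X = observations, Z = hidden variables, theta = parameters -- not specific to the alignment model that follows):

  Q(\theta \mid \theta^{(t)}) = \mathbb{E}_{Z \mid X,\, \theta^{(t)}}\!\left[\log L(\theta; X, Z)\right]            (E-step)
  \theta^{(t+1)} = \operatorname{arg\,max}_{\theta}\; Q(\theta \mid \theta^{(t)})                                   (M-step)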

15. Expectation maximization for alignment

Recipe (sketched below):
• EM over Bayesian probabilities
• Per report: m linguistic feature vectors, n codes
• Start with a uniform distribution: all alignments get equal probability weight 1/(m*n)
• E-step: For each alignment (i, j), compute E(i,j) = sum of probability weights over all reports
• M-step: For each report:
  • Insert the E's computed for each alignment
  • Then re-normalize the table so the E's sum to 1 (redistributing probability mass)
• Lather, rinse, repeat
• Probability mass magically converges on the correct alignments (even when incorrect alignments have similar numbers of instances)
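
A runnable Python sketch of this recipe (joint-probability version, as stated here; the next two slides walk through it on toy data and then revise it):

  from collections import defaultdict

  def em_align(reports, iterations=11):
      """EM alignment per the recipe above. Each report is a
      (feature_vectors, codes) pair of lists."""
      # One alignment table per report, initialized uniformly to 1/(m*n)
      tables = []
      for feats, codes in reports:
          w = 1.0 / (len(feats) * len(codes))
          tables.append({(f, c): w for f in feats for c in codes})
      for _ in range(iterations):
          # E-step: E(i,j) = sum of probability weights over all reports
          expect = defaultdict(float)
          for table in tables:
              for pair, w in table.items():
                  expect[pair] += w
          # M-step: plug the E's back in, renormalize each table to sum to 1
          for table in tables:
              total = sum(expect[pair] for pair in table)
              for pair in table:
                  table[pair] = expect[pair] / total
      return tables

  # The toy data from the next slide:
  reports = [(["FV1", "FV2"], ["C1", "C2"]),
             (["FV1", "FV3"], ["C1", "C3"]),
             (["FV2", "FV4"], ["C2", "C4"])]
  for table in em_align(reports):
      print({pair: round(w, 2) for pair, w in table.items()})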

16. Simple example

Three reports: FV1/FV2 with codes C1/C2; FV1/FV3 with C1/C3; FV2/FV4 with C2/C4.

Initial:

        C1     C2           C1     C3           C2     C4
  FV1 | 0.25 | 0.25   FV1 | 0.25 | 0.25   FV2 | 0.25 | 0.25
  FV2 | 0.25 | 0.25   FV3 | 0.25 | 0.25   FV4 | 0.25 | 0.25

Loop 1: Sum E(i,j) over notes, then plug it back into each cell.

        C1     C2           C1     C3           C2     C4
  FV1 | 0.50 | 0.25   FV1 | 0.50 | 0.25   FV2 | 0.50 | 0.25
  FV2 | 0.25 | 0.50   FV3 | 0.25 | 0.25   FV4 | 0.25 | 0.25

Then normalize the E(i,j)'s back to probabilities.

        C1     C2           C1     C3           C2     C4
  FV1 | 0.33 | 0.17   FV1 | 0.40 | 0.20   FV2 | 0.40 | 0.20
  FV2 | 0.17 | 0.33   FV3 | 0.20 | 0.20   FV4 | 0.20 | 0.20

By Loop 11, we have (with some rounding errors):

        C1     C2           C1     C3           C2     C4
  FV1 | 0.50 | 0.00   FV1 | 0.98 | 0.01   FV2 | 0.98 | 0.01
  FV2 | 0.00 | 0.50   FV3 | 0.01 | 0.01   FV4 | 0.01 | 0.01

17. Oops: Conditional probability, not joint

Revised recipe:
• Per report: m linguistic feature vectors, n codes
• Joint probability: Start with all alignments at equal probability weight 1/(m*n)
  • Oops! Conditional probability: 1/n (the number of codes)
• E-step: For each alignment (i, j), compute E(i,j) = sum of probability weights over all reports
• M-step: For each report:
  • Insert the E's computed for each alignment
  • Then re-normalize so the E's sum to 1 (redistributing probability mass)
    • Joint: Probabilities over the whole table sum to 1
    • Oops! Conditional: Probabilities for each column (i.e. code) sum to 1 (sketched below)
• Lather, rinse, repeat
• Probability mass magically converges on the correct alignments (even when incorrect alignments have similar numbers of instances)

But in the problem domain, it didn't make much difference, because...
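
Relative to the em_align sketch under slide 15, only two things change: each cell starts at 1/n instead of 1/(m*n), and the M-step renormalizes per column. A sketch of the revised M-step (same hypothetical table/expect structures as before):

  def m_step_conditional(table, expect):
      """Renormalize each code's column so that P(feature | code) sums
      to 1, instead of renormalizing the whole table (joint version)."""
      for code in {c for (_, c) in table}:
          column = [pair for pair in table if pair[1] == code]
          total = sum(expect[pair] for pair in column)
          for pair in column:
              table[pair] = expect[pair] / total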
