
REGNET

This study presents REGNET, a relatedness analysis approach that uses a regulatory repository to compare regulations and to support e-rulemaking. The approach involves feature extraction in XML, feature and structural matching, and score refinements based on regulation structure. In a user-survey evaluation, its average RMSE is smaller than that of Latent Semantic Indexing (LSI).


Presentation Transcript


  1. REGNET: A Relatedness Analysis Approach for Regulation Comparison and E-Rulemaking Applications • Gloria Lau, Haoyi Wang, Kincho Law, Gio Wiederhold • Stanford University • May 16th, 2005

  2. Motivation: regulatory comparison • Multiple sources of regulations • Multiple jurisdictions: federal, state, local, etc. • Different formats, terminologies, contexts • Amending rules, conflicting ideas [Figure labels: ADAAG in HTML; UK DDA in HTML; IBC in PDF]

  3. Motivation: e-rulemaking • Increasing amount of electronic data in e-rulemaking • Example • Alcohol and Tobacco Tax and Trade Bureau received over 14,000 comments in 7 months, the majority of which were emails, on a flavored malt beverages proposal • Originally in the Federal Register: • “All comments posted on our Web site will show the name of the commenter but will not show street addresses, telephone numbers, or e-mail addresses.” • Later in the Federal Register: • due to the “unusually large number of comments received,” the Bureau later announced that it was difficult to remove all street addresses, telephone numbers and email addresses “in a timely manner.”

  4. Relatedness analysis based on a regulatory repository • XML regulatory repository with features extracted • Shallow parser to consolidate regulations • HTML, PDF, plain text → XML regulations • Features, references, etc. • Relatedness analysis to help understanding of regulations and the relationships between them • Feature matching • Structural matching • Application to e-rulemaking • Comparisons of drafted regulations and public comments

  5. Development of a Regulatory Repository

  6. Feature Extraction in XML <regulation id="ibc" name="international building code" type="private"> <regElement id="ibc.1107" name="special occupancies"> … <regElement id="ibc.1107.2" name="assembly area seating"> <reference id="ibc.1107.2.4.1" times="1" /> <concept name="assembl area" times="1" /> … <regText>Assembly areas with fixed seating shall comply … </regText> <regElement id="ibc.1107.2.1" name="services">...</regElement> <regElement id="ibc.1107.2.2" name="wheelchair …">...</regElement> </regElement> </regElement> </regulation> [Figure labels: reference parse tree]
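
The tagged features on the slide above can be read back out of the XML with a standard parser. The following is a minimal sketch, assuming a simplified fragment modelled on the slide (the elided parts are left out, and the function name `collect_features` is hypothetical, not part of REGNET):

```python
import xml.etree.ElementTree as ET

# Simplified fragment modelled on the slide; elided content is omitted.
SAMPLE = """
<regulation id="ibc" name="international building code" type="private">
  <regElement id="ibc.1107" name="special occupancies">
    <regElement id="ibc.1107.2" name="assembly area seating">
      <reference id="ibc.1107.2.4.1" times="1" />
      <concept name="assembl area" times="1" />
      <regText>Assembly areas with fixed seating shall comply</regText>
      <regElement id="ibc.1107.2.1" name="services" />
    </regElement>
  </regElement>
</regulation>
"""


def collect_features(xml_text):
    """Map each regElement id to the concepts and references tagged directly on it."""
    root = ET.fromstring(xml_text)
    features = {}
    for element in root.iter("regElement"):
        # findall() only looks at direct children, so nested provisions
        # keep their own features separate from their parent's.
        features[element.get("id")] = {
            "concepts": {
                c.get("name"): int(c.get("times")) for c in element.findall("concept")
            },
            "references": {
                r.get("id"): int(r.get("times")) for r in element.findall("reference")
            },
        }
    return features


features = collect_features(SAMPLE)
print(features["ibc.1107.2"])
```

Each provision ends up with its own sparse count vectors, which is the form a vector-model comparison needs.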

  7. Relatedness analysis: structural comparisons • ADAAG 4.1.6(3)(d) Doors: (i) Where it is technically infeasible to comply with clear opening width requirements of 4.13.5, a projection ... • UFAS 4.14.1 Minimum Number: Entrances required to be accessible by 4.1 shall be part of an accessible route and shall comply with ... • Related elements: door and entrance

  8. Relatedness analysis • To utilize the computational properties of regulations for a complete comparison • Measure • Degree of relatedness: similarity score f(A, U) ∈ (0, 1) • Nodes A and U are provisions from two different regulation trees

  9. Base score f0 computation • Linear combination of feature matching scores • F(A,U,i) = similarity score between Sections (A,U) based on feature i • N = total number of features • Each feature has its own weighting coefficient • Feature matching • Based on the vector model, using cosine similarity to compare feature vectors • Non-Boolean features • A measurement of “2 inches max” can be a 70% match to “2 inches” • Synonyms exist, e.g., ontology defined for chemicals • Perform vector-space transformation prior to cosine computation
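
The base score described above, a weighted sum of per-feature cosine similarities, can be sketched as follows. This is an illustration only: the feature names, the example vectors and the weights are assumptions, and the sketch does not cover the non-Boolean matching or the synonym-driven vector-space transformation mentioned on the slide.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse feature vectors stored as dicts."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)


def base_score(features_a, features_u, weights):
    """Weighted sum over features of the cosine similarity for that feature."""
    return sum(
        weight * cosine_similarity(features_a.get(name, {}), features_u.get(name, {}))
        for name, weight in weights.items()
    )


# Hypothetical provisions A and U with two feature types each.
section_a = {"concept": {"door": 1, "width": 1}, "measurement": {"32 inch": 1}}
section_u = {"concept": {"door": 1, "entrance": 1}, "measurement": {"32 inch": 1}}
weights = {"concept": 0.5, "measurement": 0.5}  # assumed weights summing to 1

f0 = base_score(section_a, section_u, weights)
print(round(f0, 3))
```

Keeping the weights summing to 1 keeps the combined score in the same range as the individual cosine scores.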

  10. Score refinements based on regulation structure • Neighbor inclusion • Diffusion of similarity between clusters of nodes in the tree
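
The slide does not give the update rule for neighbor inclusion. As a rough sketch of the idea only, one could blend the base score of a pair with the mean base score of the pairs formed by their neighbors in each tree. The blending factor `alpha`, the choice of which nodes count as neighbors, and the function name are all assumptions made for illustration.

```python
def neighbor_inclusion(base, neighbors_a, neighbors_u, a, u, alpha=0.5):
    """Blend the base score of (a, u) with the mean base score of their neighbor pairs.

    base        -- dict mapping (node_a, node_u) pairs to base scores
    neighbors_a -- dict mapping a node in tree A to its neighboring nodes
    neighbors_u -- same for tree U
    alpha       -- assumed share of the score taken from the neighbors
    """
    own = base.get((a, u), 0.0)
    pairs = [(na, nu) for na in neighbors_a.get(a, []) for nu in neighbors_u.get(u, [])]
    if not pairs:
        return own  # nothing to diffuse from
    neighbor_mean = sum(base.get(pair, 0.0) for pair in pairs) / len(pairs)
    return (1 - alpha) * own + alpha * neighbor_mean


# Hypothetical scores: the parents look unrelated, but their children match well.
base = {("A1", "U1"): 0.2, ("A1.1", "U1.1"): 0.8, ("A1.1", "U1.2"): 0.4}
neighbors_a = {"A1": ["A1.1"]}
neighbors_u = {"U1": ["U1.1", "U1.2"]}

refined = neighbor_inclusion(base, neighbors_a, neighbors_u, "A1", "U1")
print(round(refined, 3))
```

In this toy case the pair's score rises because the surrounding provisions are similar even though the two provisions share few features directly.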

  11. Score refinements based on regulation structure • Reference distribution • Diffusion of similarity between referencing nodes and referenced nodes in the tree • E.g., f(A5.3, U6.4(a)) updates f(A2.1, U3.3)
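
The example on the slide, where f(A5.3, U6.4(a)) updates f(A2.1, U3.3), suggests a pass over all scored pairs. The sketch below assumes that A2.1 references A5.3 and U3.3 references U6.4(a), and that a referencing pair absorbs a fixed share `beta` of the mean score of the pairs it references. Both the direction of the update and the value of `beta` are assumptions, since the slide does not state them.

```python
def reference_distribution(scores, refs_a, refs_u, beta=0.3):
    """One diffusion pass: each pair absorbs a share of the scores of the pairs it references.

    scores -- dict mapping (node_a, node_u) pairs to current similarity scores
    refs_a -- dict mapping a node in tree A to the nodes it references
    refs_u -- same for tree U
    """
    updated = {}
    for (a, u), score in scores.items():
        referenced = [(ra, ru) for ra in refs_a.get(a, []) for ru in refs_u.get(u, [])]
        if referenced:
            # Read from the old table so the pass does not depend on iteration order.
            mean_ref = sum(scores.get(pair, 0.0) for pair in referenced) / len(referenced)
            score = (1 - beta) * score + beta * mean_ref
        updated[(a, u)] = score
    return updated


scores = {("A2.1", "U3.3"): 0.1, ("A5.3", "U6.4(a)"): 0.9}
refs_a = {"A2.1": ["A5.3"]}
refs_u = {"U3.3": ["U6.4(a)"]}

updated = reference_distribution(scores, refs_a, refs_u)
print(round(updated[("A2.1", "U3.3")], 3))
```

Pairs with no outgoing references on both sides keep their score unchanged.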

  12. Performance evaluation • Conduct a user survey of rankings of similarity • 10 randomly chosen sections from the ADAAG and UFAS • Ranks 1 to 100 in the order of relevance • Root mean square error (RMSE) between the user-generated ranking vector and the machine-predicted ranking vector
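
The RMSE between a user-generated ranking and a machine-predicted ranking can be computed as in the sketch below. It uses the standard definition (square root of the mean squared difference); the slide does not show the exact normalization the authors used, and the example rankings are made up.

```python
import math


def rmse(user_ranks, machine_ranks):
    """Root mean square error between two equally long ranking vectors."""
    if len(user_ranks) != len(machine_ranks) or not user_ranks:
        raise ValueError("ranking vectors must be non-empty and of equal length")
    squared_error = sum((u - m) ** 2 for u, m in zip(user_ranks, machine_ranks))
    return math.sqrt(squared_error / len(user_ranks))


# Hypothetical rankings of four provisions.
user = [1, 2, 3, 4]
machine = [2, 1, 3, 6]
error = rmse(user, machine)
print(round(error, 4))
```

A lower value means the machine ranking sits closer to the ranking the surveyed users produced.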

  13. Survey results - Tabulated RMSEs • Compared our analysis to Latent Semantic Indexing (LSI) • Table parameters: a structural weighting coefficient and a feature weighting coefficient • Average RMSE smaller than LSI • Measurement feature performs best • No improvement in result observed for structural comparison

  14. Results of comparisons: ADAAG vs. UFAS • Related accessible elements: door and entrance • No ontological information • Neighbor inclusion reveals higher similarity • Content of neighbors implies similarity between Section 4.1.6(3)(d) in ADAAG and Section 4.14.1 in UFAS

  15. Results of comparisons: UFAS vs. BS8300 • Terminological differences - revealed through neighbor inclusion

  16. Results of comparisons: UFAS vs. Scottish Technical Standards • Terminological differences - revealed through reference distribution • Stairs and ramps

  17. Application to e-rulemaking • Application domain: e-rulemaking • Comparison between draft of rules and the associated public comments • ADAAG Chapter 11, rights-of-way draft • Less than 15 pages • Over 1400 public comments received within 4 months • Comments ~10 MB in size; most are several pages long → A new regulation draft can easily generate a huge amount of data that needs to be reviewed and analyzed • Parsing of the draft and comments • From HTML to XML • Recreate structure of the draft using our shallow parser • Extract features from the draft and comments • Treat individual comments as provisions
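
Treating each comment as a provision means a comment can be scored against every section of the draft in the same way two provisions are compared. The sketch below is a simplified stand-in for that step: it uses plain word counts with cosine similarity instead of the extracted features, and the draft text, the section ids and the `threshold` value are made up for illustration.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words count vector over lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(vec_a, vec_b):
    # Counter returns 0 for missing terms, so no explicit default is needed.
    dot = sum(count * vec_b[term] for term, count in vec_a.items())
    norm = math.sqrt(sum(c * c for c in vec_a.values())) * math.sqrt(
        sum(c * c for c in vec_b.values())
    )
    return dot / norm if norm else 0.0


def related_sections(comment, draft_sections, threshold=0.2):
    """Return draft section ids scoring at or above threshold, best match first."""
    comment_vec = term_vector(comment)
    scored = [
        (cosine(comment_vec, term_vector(text)), section_id)
        for section_id, text in draft_sections.items()
    ]
    return [sid for score, sid in sorted(scored, reverse=True) if score >= threshold]


# Hypothetical draft sections and comments.
draft = {
    "s1": "curb ramps shall have detectable warnings",
    "s2": "pedestrian signals shall be accessible",
}
print(related_sections("detectable warnings on curb ramps are slippery", draft))
print(related_sections("please extend the deadline", draft))
```

An empty result corresponds to the case where no related provision is identified, for example a comment raising a concern that the draft does not address.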

  18. Application to E-Rulemaking Drafted regulations compared with public comments

  19. Results from e-rulemaking application • Related section in draft and public comment

  20. Results from e-rulemaking application • No related provisions identified • Concern not addressed in the draft

  21. Results from e-rulemaking application • Related section in draft and public comment • Commenting per provision • Forward to right personnel

  22. Results from e-rulemaking application • Related section in draft and public comment • Suggested revision cannot be located automatically • Linguistic analysis can potentially help

  23. Results from e-rulemaking application • Comment on the general intent of the draft • Clustering of comments might help

  24. Conclusions • Prototype for relatedness comparisons of regulations • Contextual comparisons • Domain knowledge • Structural comparisons • Performance Evaluation, Results and Applications • User survey and comparisons with LSI • Observations of comparisons between Federal, State, non-profit organization mandated codes and European standards • Application to e-rulemaking • Compare drafted rules with public comments • Observations of comparisons based on a rights-of-way draft

  25. Future research directions • Regulatory comparison • Regulatory competition • Cross-border data transfer laws • Especially in the polyglot countries in the EU • Regulatory updates • Track changes in updates • Track cross references between regulations • E-rulemaking • Automated routing of comments to the person in charge • Clustering of comments • Web portal for comment submission per provision, in addition to per draft • Linguistic analysis to match patterns of suggested revisions embedded in comments

  26. Thank You!
