http://dx.doi.org/10.6084/m9.figshare.701216

Benchmarking Web Accessibility Evaluation Tools:

Measuring the Harm of Sole Reliance on Automated Tests

Markel Vigo University of Manchester (UK)

Justin Brown Edith Cowan University (Australia)

Vivienne Conway Edith Cowan University (Australia)

10th International Cross-Disciplinary Conference on Web Accessibility

W4A2013

Problem & Fact

WWW is not accessible

13 May 2013

Evidence

Webmasters are familiar with accessibility guidelines

Lazar et al., 2004

Improving web accessibility: a study of webmaster perceptions

Computers in Human Behavior 20(2), 269–288

Hypothesis I

Assuming guidelines do a good job...

H1: Awareness of accessibility guidelines is not that widespread.

Evidence II

Webmasters put compliance logos on non-compliant websites

Gilbertson and Machin, 2012

Guidelines, icons and marketable skills: an accessibility evaluation of 100 web development company homepages

W4A 2012

Hypothesis II

Assuming webmasters are not trying to cheat...

H2: There is a lack of awareness of the negative effects of overreliance on automated tools.

Expanding on H2: Why we rely on automated tests

  • It's easy
  • In some scenarios it seems like the only option: web observatories, real-time...
  • We don't know how harmful they can be

Expanding on H2: Knowing the limitations of tools

  • If we are able to measure these limitations, we can raise awareness
  • Inform developers and researchers
  • We ran a study with 6 tools
  • Compute coverage, completeness and correctness with respect to WCAG 2.0

Method: Computed Metrics
  • Coverage: whether a given Success Criterion (SC) is reported at least once
  • Completeness:
  • Correctness:
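
The definitions of completeness and correctness did not survive in this transcript. The following is a minimal sketch, assuming a recall-like reading of completeness (share of ground-truth violations that a tool reports) and a precision-like reading of correctness (share of reported violations that are real). The function names, the `(page, SC)` representation and the sample data are hypothetical and are not taken from the study.

```python
# Assumption: a violation is identified by a (page, success criterion) pair.

def coverage(reported, considered_sc):
    """Share of the considered SC that the tool reports at least once."""
    reported_sc = {sc for _page, sc in reported}
    if not considered_sc:
        return 0.0
    return len(reported_sc & considered_sc) / len(considered_sc)

def completeness(reported, ground_truth):
    """Assumed definition: true positives / violations in the ground truth."""
    if not ground_truth:
        return 0.0
    return len(reported & ground_truth) / len(ground_truth)

def correctness(reported, ground_truth):
    """Assumed definition: true positives / violations reported by the tool."""
    if not reported:
        return 0.0
    return len(reported & ground_truth) / len(reported)

# Made-up ground truth and tool report for two pages.
ground_truth = {("home", "1.1.1"), ("home", "1.3.1"),
                ("home", "2.4.4"), ("contact", "3.3.2")}
reported = {("home", "1.1.1"), ("home", "2.4.4"), ("home", "1.4.3")}
considered_sc = {sc for _page, sc in ground_truth}

print(coverage(reported, considered_sc))     # 2 of 4 SC reported at least once
print(completeness(reported, ground_truth))  # 2 of 4 real violations found
print(correctness(reported, ground_truth))   # 2 of 3 reports are real
```

Under these assumptions, a tool that reports more aggressively can raise completeness while lowering correctness, which is the trade-off the results slides describe.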

Method: Stimuli

Vision Australia (www.visionaustralia.org.au)

  • Non-profit
  • Non-government
  • Accessibility resource

Prime Minister (www.pm.gov.au)

  • Federal Government
  • Should abide by the Transition Strategy

Transperth (www.transperth.wa.gov.au)

  • Government affiliated
  • Used by people with disabilities

Method: Obtaining the "Ground Truth"

Ad-hoc sampling

Manual evaluation

Agreement

Ground truth

Method: Computing Metrics

For every page in the sample...

  • Evaluate
  • Get reports
  • Compare with the GT
  • Compute metrics

[Diagram: tools T1 to T6 each produce a report (R1 to R6); each report is compared with the ground truth (GT) to compute metrics M1 to M6]

Accessibility of Stimuli

Vision Australia

www.visionaustralia.org.au

Prime Minister

www.pm.gov.au

Transperth

www.transperth.wa.gov.au

Results: Coverage

  • 650 WCAG Success Criteria violations (A and AA)
  • 23-50% of SC are covered by automated tests
  • Coverage varies across guidelines and tools

Results: Completeness per tool

  • Completeness ranges from 14% to 38%
  • Variable across tools and principles

Results: Completeness per type of SC

  • How conformance levels influence completeness
  • Wilcoxon Signed Rank: W=21, p<0.05
  • Completeness is higher for 'A level' SC
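
To make the paired test above more concrete, here is a minimal sketch of the signed-rank statistic in plain Python. It assumes one pair of completeness scores per tool ('A level' vs. 'AA level'); the scores are made up and are not the study's data, and `signed_rank_sums` is a hypothetical helper name.

```python
def signed_rank_sums(xs, ys):
    """Return (W_plus, W_minus) for paired samples, averaging ranks on ties."""
    # Zero differences carry no sign information and are dropped.
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of equal absolute differences starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

# Hypothetical completeness percentages for six tools.
a_level = [40, 35, 22, 30, 45, 28]
aa_level = [30, 20, 19, 12, 21, 27]

print(signed_rank_sums(a_level, aa_level))
```

With six pairs that all differ in the same direction, the positive rank sum reaches its maximum of 1+2+3+4+5+6 = 21. If the reported test was run over the six tools (an assumption), W=21 would mean every tool was more complete on 'A level' SC. For six untied pairs the exact two-sided p-value of that outcome is 2/64, roughly 0.031.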

Results: Completeness vs. accessibility

  • How accessibility levels influence completeness
  • ANOVA: F(2,10)=19.82, p<0.001
  • The less accessible a page is, the higher the completeness

Results: Tool Similarity on Completeness

  • Cronbach's α = 0.96
  • Multidimensional Scaling (MDS)
  • Tools behave similarly

Results: Correctness

  • Tools with lower completeness scores exhibit higher levels of correctness (93-96%)
  • Tools that obtain higher completeness yield lower correctness (66-71%)
  • Tools with higher completeness are also the most incorrect ones

Implications: Coverage

  • We corroborate that 50% is the upper limit for automatising guidelines
  • Natural Language Processing?
    • Language: 3.1.2 Language of parts
    • Domain: 3.3.4 Error prevention

Implications: Completeness I

  • Automated tests do a better job...
    • ...on non-accessible sites
    • ...on 'A level' success criteria
  • Automated tests aim at catching stereotypical errors

Implications: Completeness II

  • Strengths of tools can be identified across WCAG principles and SC
  • A method to inform decision making
  • Maximising completeness in our sample of pages
    • On all tools: 55% (+17 percentage points)
    • On non-commercial tools: 52%
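
The transcript does not describe how tools were combined. One plausible reading, sketched here under that assumption, is to pool the true positives of several tools and measure completeness on the union. The tool names, violation ids and `combined_completeness` helper are all hypothetical.

```python
def combined_completeness(reports, ground_truth):
    """Completeness of the pooled reports of several tools (assumed: simple union)."""
    if not ground_truth:
        return 0.0
    pooled = set().union(*reports.values())
    return len(pooled & ground_truth) / len(ground_truth)

# Ten made-up ground-truth violations, identified by integer ids.
ground_truth = set(range(1, 11))
reports = {
    "tool_a": {1, 2, 3, 4},
    "tool_b": {3, 4, 5},
    "tool_c": {9, 11},  # id 11 is a false positive and does not count
}

best_single = max(
    combined_completeness({name: found}, ground_truth)
    for name, found in reports.items()
)
print(best_single)
print(combined_completeness(reports, ground_truth))
```

In this toy data the pooled tools find more of the real violations than the best single tool does. The "+17 percentage points" on the slide is consistent with a comparison against the best single-tool completeness of 38%, although the transcript does not state the baseline.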

Conclusions
  • Coverage: 23-50%
  • Completeness: 14-38%
  • Higher completeness leads to lower correctness

Follow up

Contact

@markelvigo | [email protected]

Presentation DOI

http://dx.doi.org/10.6084/m9.figshare.701216

Datasets

http://www.markelvigo.info/ds/bench12/index.html
