
Conversation

@spectranaut

spectranaut commented Jul 26, 2024

Here is a link to the markdown file rendered: https://github.com/Igalia/rfcs/blob/wpt-for-aams-rfc/rfcs/wpt-for-aams.md

There is a lot of background context in the RFC for those who wish to understand more about the underlying accessibility technologies.

Here are the notes for the WPT meeting where this was first brought up: https://pad.0b101010.services/wpt-infra-meeting?view

spectranaut changed the title from "RFC ??: WPT testing for AAMs" to "RFC 204: WPT testing for AAMs" on Jul 26, 2024
@annevk
Member

annevk commented Aug 15, 2024

Per-platform tests

I think it would be ideal if we could avoid per-platform tests by instead expressing the per-platform mapping for a given feature in some configuration file that's included in tests. I think that will make it easier to contribute new tests as you don't necessarily have to know about the platform mappings (unless you're adding coverage for a completely new feature) and increase the likelihood of cross-platform coverage (platforms will only have to update these configuration files and not duplicate tests and such).
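To make that concrete, a shared configuration entry might look roughly like this (a hypothetical sketch; the key names and mapped values are illustrative, not taken from any spec):

```js
// Hypothetical shared configuration: one entry per spec mapping row, with
// the expected result for each platform accessibility API.
const mappings = {
  "core-aam:blockquote": {
    "ATK":          { "role": "ROLE_BLOCK_QUOTE" },
    "AXAPI":        { "AXRole": "AXGroup" },
    "IAccessible2": { "role": "IA2_ROLE_BLOCK_QUOTE" },
    "UIA":          { "ControlType": "Group" }
  }
};
```

A test would then look up the entry for whichever API it is exercising, so test authors would only touch this file when a new mapping row is introduced.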

@zcorpan
Member

zcorpan commented Aug 19, 2024

@annevk interesting idea. The tests would assert that the appropriate mapping is used (e.g., core-aam "aria-autocomplete=inline, list, or both" or html-aam "input (type attribute in the Color state)").

@spectranaut
Author

@annevk I remember talking with you about this at the Web Engines Hackfest -- I think this might be a good idea and I've started to explore it. In this scenario, the JSON file contains mappings defined in a single table, like @zcorpan mentioned -- core-aam's "aria-autocomplete=inline, list, or both" and html-aam's "input (type attribute in the Color state)". Then the test checks the mapping appropriate for the platform it runs on, as defined in the JSON.

The core-aam manual tests already have this data in a similar structure; see the manual test for alert, which has a JavaScript object with the mapping expectations for each platform: https://github.com/web-platform-tests/wpt/blob/master/core-aam/alert-manual.html#L18
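For context, those manual tests hand an ATTAcomm helper a per-API table of assertions, roughly of this shape (abridged, and the assertion values here are illustrative rather than copied from the linked file):

```js
// Abridged sketch of the core-aam manual-test structure; values illustrative.
var theTest = new ATTAcomm({
  "title": "alert",
  "steps": [{
    "type": "test",
    "title": "step 1",
    "element": "test", // DOM ID of the element under test
    "test": {
      "ATK":          [["property", "role", "is", "ROLE_ALERT"]],
      "AXAPI":        [["property", "AXRole", "is", "AXGroup"]],
      "IAccessible2": [["property", "role", "is", "ROLE_SYSTEM_ALERT"]],
      "UIA":          [["property", "ControlType", "is", "Group"]]
    }
  }]
});
```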

@spectranaut
Author

Ok, @annevk, here are some tests designed to use a configuration file containing all of the platform mapping details: Igalia/wpt#14

We can write tests that do not look operating-system dependent this way, but the results of the tests are operating-system dependent. I tried to get around this, but it's a bit awkward -- in this design, every test expands to 4 subtests, one per API, regardless of the operating system the test was run on.
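The pattern looks roughly like this (a sketch: mappings is the shared configuration object, and get_accessibility_api_node stands in for the prototype's utility, sketched further down in this thread):

```js
// One subtest per accessibility API; only the API that applies to the
// current operating system runs real assertions.
for (const api of ["ATK", "AXAPI", "IAccessible2", "UIA"]) {
  promise_test(async () => {
    const node = await get_accessibility_api_node(api, "test");
    if (node === null) return; // Backend can't test this API here; trivially pass.
    for (const [prop, value] of Object.entries(mappings["core-aam:blockquote"][api])) {
      assert_equals(node[prop], value);
    }
  }, `${api}: role=blockquote`);
}
```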

Ideally, we would have a single subtest that says to the backend: "Hey, test the appropriate API and let us know the result", but we can't, because operating systems and APIs are not a 1-1 mapping. Windows has two accessibility APIs (UIA and IAccessible2) and we need to know the status of both APIs separately (different ATs use different APIs, and both need to be supported). So at minimum we need 2 different subtests for Windows, whereas we only need one for Linux and one for macOS.

I wrote about some alternative test designs in the linked PR's description. They could probably all be used with this "configuration file" idea; the configuration file design doesn't solve the platform-specific test and/or results problem.

@annevk
Member

annevk commented Aug 22, 2024

That is tricky! Would it be possible, for these tests, to handle the operating system and API together as a single platform? So we run the test for Windows UIA and again for Windows IAccessible2? (Likely with skips when we already know the combination isn't supported for a given browser.)

This would impact how we have to display the results for these tests (and possibly how we run them, but that already needs changes anyway), but it would keep everything else consistent.

@spectranaut
Author

Hmm, I'm not totally sure I understand the implications of your suggestion -- or it's what I'm already doing :) Here is an example of the linked test design's "results" for a run on Linux:
https://spectranaut.github.io/examples/wpt/role_blockquote_test.html

You have to expand the asserts to understand what is really going on. Only the Linux test (Atspi) ran any assertions, and it passed. On Windows, both the UIA and IAccessible2 rows would have data. The APIs that don't apply to the platform just trivially pass with no assertions.

Essentially, a utility function asks the backend "can you run this test for this API?" for each API, and the backend either sends a result or answers "no, I can't run that test".
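In testdriver terms, that utility might look something like this (the extra api argument on the prototype's get_platform_accessibility_node endpoint is an assumption, not the final design):

```js
// Hypothetical utility: ask the backend whether it can test the given API on
// this platform, returning the platform node or null for "can't run that".
async function get_accessibility_api_node(api, domId) {
  try {
    return await test_driver.get_platform_accessibility_node(domId, api);
  } catch (e) {
    return null; // Backend answered "no, I can't run that test for this API".
  }
}
```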

@annevk
Member

annevk commented Aug 22, 2024

It's a bit hard to explain, but what I'm contemplating is that you'd have a single subtest, where e.g. Firefox Linux, Firefox macOS, Firefox Windows UIA, and Firefox Windows IAccessible2 each have their own run of that single subtest. And the same for other browsers, skipping variants that are not applicable. And the subtest expectations would adjust based on the platform AT API that is running.

Basically I'm saying that we solve your

Ideally, we would have a single subtest that says to the backend: "Hey, test the appropriate API and let us know the result", but we can't because of operating systems and APIs are not a 1-1 mapping.

by making the APIs count and "run" these tests and not the operating systems, if you will.

@zcorpan
Member

zcorpan commented Aug 23, 2024

Reporting results to wpt.fyi separately for each (OS, accessibility API, browser) combination allows for side-by-side comparison. However, we wouldn't want to run all of wpt for each accessibility API.

Maybe there can be one combination where all tests are run, and for the remaining applicable accessibility APIs only the AAM tests are run.

@annevk
Member

annevk commented Aug 23, 2024

However, we wouldn't want to run all of wpt for each accessibility API.

Right, this is what we would have to special case for "this directory" (or some other signifier) in a couple of places.

@spectranaut
Author

Ah, now I understand what you are trying to say. It would be nice to see the support for "blockquote" as a single row across all API/OS/browser combinations, and it seems like the right direction to go in.

We already need a special case for this directory, because these are the only tests that need "accessibility turned on". The browser typically does not compute the accessibility tree or populate the accessibility APIs otherwise.

So in this scenario, we would need to:

  • change the data collection for all the browser/OS pairs for wpt.fyi, to collect data for the as-yet-untested APIs. At this time I'm not sure where all of that data collection happens.
  • update wpt.fyi to show the API information in the column for the special folders.
  • update the harness to be able to specify which API to test.

@jcsteh

jcsteh commented Sep 1, 2024

I think it would be ideal if we could avoid per-platform tests by instead expressing the per-platform mapping for a given feature in some configuration file that's included in tests.

I have a couple of concerns with a declarative, config-style approach:

  1. It separates the expected results from the tests. I'd argue tests are a lot easier to read, understand and write when there is less indirection. To be fair, the indirection is only one level deep here, and I do also see the downside of having separate tests, though I wonder whether that could be resolved by ensuring the tests for all platforms are always in the same file.
  2. This works well enough when dealing with simple property comparisons. However, mappings aren't always simple properties. Examples are text attributes, table navigation and accessibility events. I'm sure we could find a way to express these declaratively, but I think it would become difficult to work with.

I think that will make it easier to contribute new tests as you don't necessarily have to know about the platform mappings

I don't quite follow this. New tests will necessarily have different expected results, which means the mapping to platforms will be different, even if that just means different values for the different properties. And if you don't know the platform mappings, I don't quite follow why you'd be writing a new AAM test, since AAMs are all about platforms. I guess many non-core AAM entries map to ARIA roles/states, so those could be common somehow, but Core AAM in particular (and to a lesser extent the other AAMs) inherently requires an understanding of platform mappings. I feel like I'm missing something here, though. What's the scenario where a new AAM test would be added without any concern for platform mappings?

@jcsteh

jcsteh commented Sep 2, 2024

The tests may be flaky due to timing issues with the accessibility tree being built. The accessibility tree lags behind the DOM and even paint, with no feedback available to the page about what the status of the accessibility tree is.

This could potentially be solved by waiting for events. I wonder whether this new API might need to support events to solve this problem.

@annevk
Member

annevk commented Sep 2, 2024

What's the scenario where a new AAM test would be added without any concern for platform mappings?

Some possible scenarios:

  1. You are contributing a new test that uses an existing mapping. E.g., role=search has coverage, but <search> did not. It would be straightforward to add coverage for <search> (see the sketch after this list).
  2. You are contributing a new test that does not use an existing mapping. In this case you need to know about at least one platform, but you don't need to know about all the platforms. Those can be filled in by others later.
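For scenario 1, a minimal sketch of what reusing an existing entry could look like, assuming the hypothetical mappings object and get_accessibility_api_node helper from earlier in this thread, and a page containing <search id="test">:

```js
// Hypothetical test body: reuse the existing role=search mapping row, so no
// platform knowledge is needed to add coverage for the <search> element.
promise_test(async () => {
  const entry = mappings["core-aam:search"];
  for (const [api, expected] of Object.entries(entry)) {
    const node = await get_accessibility_api_node(api, "test");
    if (node === null) continue; // API not testable on this platform.
    for (const [prop, value] of Object.entries(expected)) {
      assert_equals(node[prop], value);
    }
  }
}, "<search> maps like role=search");
```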

I'm sure we could find a way to express these declaratively, but I think it would become difficult to work with.

I don't think it has to necessarily be declarative. Just abstracted.

@spectranaut
Author

The tests may be flaky due to timing issues with the accessibility tree being built. The accessibility tree lags behind the DOM and even paint, with no feedback available to the page about what the status of the accessibility tree is.

This could potentially be solved by waiting for events. I wonder whether this new API might need to support events to solve this problem.

Right, in many scenarios we can wait for the appropriate event on each platform before testing -- for example, before starting the test, we could listen for document:load-complete on Linux, and I did have this implemented at one point in the prototype. We switched to polling for whether the tab exists with the correct URL in this scenario.

Eventually, these tests will need to support events in order to test the events described in the AAMs, and we will probably be able to resolve most if not all sources of flakiness by listening for events. Maybe we are being overly cautious by saying the tests might be flaky...

@foolip
Member

foolip commented Dec 10, 2025

I've reviewed this RFC and something like this does seem necessary to adequately test AAM specs, in particular how the browser's accessibility tree is mapped to platform accessibility APIs.

w3c/webdriver-bidi#443 proposes to expose the accessibility tree through WebDriver BiDi, which would allow testing the shape and properties of the accessibility tree; that's also important, but it's a different layer. There might still be some overlap and common problems, however, so I'm commenting on both discussions to cross-link them.

My thoughts on some of the open questions listed:

Adding dependencies on the Python bindings for the platform APIs

Adding dependencies to the appropriate requirements.txt file should do the trick.

Using the testharness test type rather than some new thing

Since you need to use the testharness.js assertions and testdriver.js automation bits, I think sticking with the existing test type is best. But if you do want to be able to filter these tests, then a new test type is probably the most straightforward option. #229 is an in-progress example of this, with code to copy if needed.

GetAccessibilityAPINodeAction/PlatformAccessibilityProtocolPart design

As long as the API is "roughly like WebDriver" it should be fine, and having a string argument and returning an object with properties is quite like WebDriver.

Per-platform tests

There is some support for this in the form of __dir__.ini files in expectations metadata, which could be used to skip a directory based on the platform. But we don't use this metadata in the main WPT repo; it's something that each browser vendor can use to manage their test expectations.

I think the style of the tests using test_driver.get_platform_accessibility_node in Igalia/wpt#2 is alright, and if some test only makes sense for one platform, you could use assert_implements_optional to skip it on other platforms. This isn't great for understanding cross-browser results, but it's infinitely better than no cross-browser test results at all.

Getting the browser PID

w3c/webdriver#1833 would be a good solution to this I think.

@foolip
Member

foolip commented Dec 11, 2025

A thought on the testdriver.js API shape. The tests in Igalia/wpt#2 call test_driver.get_platform_accessibility_node('test').

To avoid the test having to give each element a unique ID, I wonder if the argument could instead be an element, and if the implementation could ensure there's a way to identify it in the platform accessibility tree? Some existing APIs use a get_selector helper to create a unique selector. I assume that platform accessibility APIs don't support CSS selectors, but maybe nodes can be uniquely identified by some other data that can be automatically generated and propagated all the way?

Relatedly, I wonder if "put random data in DOM, wait for it to appear in platform accessibility tree" could also make the test faster and more reliable, by having an unambiguous signal that the tree has been updated.
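A rough sketch of that sentinel idea, again using the hypothetical helper from earlier in this thread (the polling strategy is the only point here; the helper is assumed to return null while the node is not yet exposed):

```js
// Hypothetical readiness check: inject a uniquely-identified node, then poll
// the platform accessibility tree until it appears, which signals that the
// tree has caught up with the DOM mutation.
async function wait_for_accessibility_tree(api) {
  const sentinel = document.createElement("div");
  sentinel.id = "sentinel-" + Math.random().toString(36).slice(2);
  sentinel.textContent = "sentinel";
  document.body.appendChild(sentinel);
  for (let i = 0; i < 100; i++) {
    if (await get_accessibility_api_node(api, sentinel.id) !== null) {
      sentinel.remove();
      return;
    }
    await new Promise(resolve => setTimeout(resolve, 50));
  }
  throw new Error("accessibility tree did not update in time");
}
```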

@spectranaut
Author

spectranaut commented Dec 12, 2025

Hi @foolip!

Thanks so much for the review, I really appreciate the thoughts! I'm sorry -- I really should have updated the above link to point to the PR I've been working on against WPT directly, which has moved things along in a few directions since the PR you reviewed: web-platform-tests/wpt#53733

Status of the progress since this RFC, on the new PR, in relation to your comments:

  1. On the test design

The actual design of the test endpoint is not finalized, and I'm discussing it with people in ARIA (most specifically James Teh from Firefox), but it probably won't be "get_platform_accessibility_node" like you can see in the first POC in Igalia/wpt#2. The properties/endpoints of the different accessibility APIs are too varied, and it doesn't really make sense to squash them into a single object. The current tests in web-platform-tests/wpt#53733 use a test object that was made/used by Joanie Diggs to test accessibility APIs outside of WPT -- you can see them in all the manual tests in the core-aam directory of wpt. Jamie is advocating in this comment for writing the tests as Python strings, so the tests can be written directly in terms of the accessibility API and the string executed on the backend.
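To illustrate that last suggestion, such a test might look something like this (the test_driver.run_accessibility_script endpoint is entirely hypothetical, and the AT-SPI snippet is only indicative):

```js
promise_test(async () => {
  // Hypothetical endpoint: the backend executes the string with the platform
  // bindings in scope and returns the value of the last expression.
  const role = await test_driver.run_accessibility_script(`
node = find_node_by_id(document, "test")  # backend-side helper, assumed
node.get_role_name()
`, {api: "ATK"});
  assert_equals(role, "block quote");
}, "ATK: role written directly against the accessibility API");
```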

  2. Testdriver vs a new test type

You mentioned looking into adding another test type, but for testing platform-specific accessibility events I'll want to be able to call WebDriver commands like click, so I think I'll be sticking to extending testdriver unless there are any big yet-unheard objections. Jonathan Lee from Google just reviewed the CL, and specifically the changes to testdriver; I've been really tight on time lately but am slowly making the changes he's requested.

  3. Python dependencies and wpt.fyi

You can read the details in the description of the PR, but the Python dependencies have been added :) However, they are only installed if you run the tests with --enable-accessibility-api, because the Linux tests have a system dependency that is probably not on any bots that run WPT downstream, and I didn't want to break downstream runs of WPT. So landing this PR as is will not run the accessibility tests on wpt.fyi; they will individually fail with "accessibility API testing not enabled".

However, I pushed a commit to the branch with --enable-accessibility-api set to true, so you can see a successful run of the tests from the CI on wpt.fyi: https://wpt.fyi/results/core-aam?sha=d508e4016d&label=pr_head

  4. About per-platform tests

You mentioned a way to ignore folders... but I'm a bit attached to the idea of a single test file containing all assertions for all 4 accessibility APIs; it will be more obvious whether a new role or specific markup is fully supported across all APIs if there is a single test that covers all the platforms.

This is how the current PR's tests are designed: each subtest is a different accessibility API, and the accessibility APIs not relevant for that platform just "pass" in order to not raise false flags: https://wpt.fyi/results/core-aam/role/blockquote.tentative.html?sha=d508e4016d&label=pr_head

I think, ideally, those subtests (like the AXAPI (mac) subtests when running the test on Linux) could be marked as "not applicable", but I haven't looked into how much work it would be to introduce those concepts to WPT and wpt.fyi.

  5. About DOM IDs

I'm not sure there is a way around the DOM IDs for this purpose -- at least, not a low-hanging-fruit solution -- but it's worth a larger/continued conversation. For now DOM IDs will allow for a lot of testing, and they aren't that much test-writing overhead. Is it OK to put that aside for now in terms of getting this current PR landed?

My current goals

What I'd really like is support for the extension to testharness and the changes I've made to wptrunner, and ideally to land the current PR against WPT, then solve the remaining problems in follow-up PRs.

The follow-up PRs would be iterating on the test design and, separately, figuring out a big issue -- how to turn on accessibility only for these tests. Once I can turn accessibility on for a subset of tests, I think we can switch --enable-accessibility-api to true, and these tests will be run on wpt.fyi by default. I think I can take advantage of the way wpt does chunking for tests, which is usually by directory, and force the browser to only turn on accessibility when the chunk includes an accessibility API directory.

@spectranaut
Author

I forgot to say I landed a change on chromedriver to get the PID: https://chromium-review.googlesource.com/c/chromium/src/+/5825307

So all I need is the same change in safaridriver and there is no problem with parallel tests :)

@foolip
Member

foolip commented Dec 15, 2025

  1. On the test design

The actual design of the test endpoint is not finalized, and I'm discussing it with people in ARIA (most specifically James Teh from Firefox), but it probably won't be "get_platform_accessibility_node" like you can see in the first POC in Igalia/wpt#2. The properties/endpoints of the different accessibility APIs are too varied, and it doesn't really make sense to squash them into a single object. The current tests in web-platform-tests/wpt#53733 use a test object that was made/used by Joanie Diggs to test accessibility APIs outside of WPT -- you can see them in all the manual tests in the core-aam directory of wpt. Jamie is advocating in this comment for writing the tests as Python strings, so the tests can be written directly in terms of the accessibility API and the string executed on the backend.

If the tests need to do setup as HTML/CSS/JS and also run assertions in Python, I could see these approaches:

  1. Use a Python test harness and let the Python test code load HTML/CSS/JS into the browser, similar to how WebDriver (wdspec) tests work. (example) This is very different from your current approach, but might make sense if the HTML/CSS/JS parts of the tests are very simple but the assertions against the Python libraries are complicated.
  2. Stick with testharness.js and embed Python code as string literals, sent to be executed on the backend. (I think this is the suggestion above.) This is a style of test I haven't seen in WPT, but I don't see any technical blockers to this.
  3. Stick with testharness.js and use custom Python handlers to interrogate the platform accessibility APIs and run the assertions. The main downside is that the test would be split into two files.
  4. About per-platform tests

You mentioned a way to ignore folders... but I'm a bit attached to the idea of a single test file containing all assertions for all 4 accessibility APIs; it will be more obvious whether a new role or specific markup is fully supported across all APIs if there is a single test that covers all the platforms.

This is how the current PR's tests are designed: each subtest is a different accessibility API, and the accessibility APIs not relevant for that platform just "pass" in order to not raise false flags: https://wpt.fyi/results/core-aam/role/blockquote.tentative.html?sha=d508e4016d&label=pr_head

I think, ideally, those subtests (like the AXAPI (mac) subtests when running the test on Linux) could be marked as "not applicable", but I haven't looked into how much work it would be to introduce those concepts to WPT and wpt.fyi.

I think assert_implements_optional would be a good fit; failures turn into the PRECONDITION_FAILED subtest status. I've filed web-platform-tests/wpt.fyi#4672 about the ability to filter out such results on wpt.fyi, which might help make sense of the results.
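For example, a subtest for one API could be guarded like this (a sketch using the hypothetical helper from earlier in the thread; the expected value is illustrative):

```js
promise_test(async () => {
  const node = await get_accessibility_api_node("AXAPI", "test");
  // On platforms without AXAPI this reports PRECONDITION_FAILED rather than
  // FAIL, so the result can be filtered out on wpt.fyi.
  assert_implements_optional(node !== null, "AXAPI not available on this platform");
  assert_equals(node.AXRole, "AXGroup"); // expected value illustrative
}, "AXAPI: role=blockquote");
```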

  5. About DOM IDs

I'm not sure there is a way around the DOM IDs for this purpose -- at least, not a low-hanging-fruit solution -- but it's worth a larger/continued conversation. For now DOM IDs will allow for a lot of testing, and they aren't that much test-writing overhead. Is it OK to put that aside for now in terms of getting this current PR landed?

Sure, if there's no other way to identify an element in the platform accessibility tree, then ID it is.

My current goals

What I'd really like is support for the extension to testharness and the changes I've made to wptrunner, and ideally to land the current PR against WPT, then solve the remaining problems in follow-up PRs.

As long as the RFC has a fairly clear intended end state, iterating on the actual implementation SGTM. And making small changes to the RFC itself to clarify after the fact would be fine too, I think.

The follow-up PRs would be iterating on the test design and, separately, figuring out a big issue -- how to turn on accessibility only for these tests. Once I can turn accessibility on for a subset of tests, I think we can switch --enable-accessibility-api to true, and these tests will be run on wpt.fyi by default. I think I can take advantage of the way wpt does chunking for tests, which is usually by directory, and force the browser to only turn on accessibility when the chunk includes an accessibility API directory.

I can't find a clear pattern to follow, but you could imagine arguments that depend on the test type in executor_kwargs for each browser or perhaps in the shared setup. I'm not sure if there's a clean way to do this based on a directory, however. @jgraham do you know?

