RFC 204: WPT testing for AAMs (platform accessibility APIs) #204
I think it would be ideal if we could avoid per-platform tests by instead expressing the per-platform mapping for a given feature in some configuration file that's included in tests. I think that will make it easier to contribute new tests as you don't necessarily have to know about the platform mappings (unless you're adding coverage for a completely new feature) and increase the likelihood of cross-platform coverage (platforms will only have to update these configuration files and not duplicate tests and such). |
|
@annevk interesting idea. The tests would assert that the appropriate mapping is used (e.g., core-aam "aria-autocomplete=inline, list, or both" or html-aam "input (type attribute in the Color state)"). |
|
@annevk I remember talking with you about this at the Web Engines Hackfest -- I think this might be a good idea and I've started to explore it. In this scenario, this JSON file contains mappings defined in a single table, like @zcorpan mentioned -- core-aam's aria-autocomplete=inline, list, or both and html-aam's input (type attribute in the Color state). Then the test will check the mapping appropriate for that platform as defined in the JSON. The core-aam manual tests already have this data in a similar structure; see the manual test for alert, which has a JavaScript object with the mapping expectations for each platform: https://github.com/web-platform-tests/wpt/blob/master/core-aam/alert-manual.html#L18 |
|
Ok, @annevk, here are some tests designed to use a configuration file containing all of the platform mapping details: Igalia/wpt#14 We can write tests that do not look operating system dependent this way, but the results of the test are operating system dependent. I tried to get around this, but it's a bit awkward -- in this design, every test expands to 4 subtests, one per API, regardless of the operating system the test was run on. Ideally, we would have a single subtest that says to the backend: "Hey, test the appropriate API and let us know the result", but we can't, because operating systems and APIs are not a 1-1 mapping. Windows has two accessibility APIs (UIA and IAccessible2) and we need to know the status of both APIs separately (different ATs use different APIs, both need to be supported). So at minimum, we need 2 different subtests for Windows, whereas we only need one for Linux and one for macOS. I wrote about some alternative test designs in the linked PR's description. They could probably all be used with this "configuration file" idea; the configuration file design doesn't solve the platform-specific test and/or results problem. |
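To make the shape of the problem concrete, here is a minimal Python sketch of a shared mapping table plus an OS-to-API lookup. It is not the format used in Igalia/wpt#14; the names `MAPPINGS`, `APIS_BY_OS`, and `expected_for` are hypothetical, and the role strings are illustrative placeholders rather than values copied from core-aam.

```python
# Hypothetical mapping table: one entry per feature, one sub-entry per API.
# The role strings are placeholders for illustration only.
MAPPINGS = {
    "blockquote": {
        "Atspi": {"role": "ROLE_BLOCK_QUOTE"},
        "AXAPI": {"role": "AXGroup"},
        "IAccessible2": {"role": "IA2_ROLE_BLOCK_QUOTE"},
        "UIA": {"role": "Group"},
    }
}

# Operating systems and accessibility APIs are not 1-1: Windows has two APIs.
APIS_BY_OS = {
    "linux": ["Atspi"],
    "mac": ["AXAPI"],
    "windows": ["UIA", "IAccessible2"],
}


def expected_for(feature, os_name):
    """Return the expectations that apply to one feature on one OS, keyed by API."""
    return {api: MAPPINGS[feature][api] for api in APIS_BY_OS[os_name]}
```

Looking up `expected_for("blockquote", "windows")` yields two entries, which is why a single OS-agnostic subtest cannot report the Windows result as one value.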
|
That is tricky! Would it be possible that for these tests we handle the operating system and API together as a single platform? So we run the test for Windows UIA and again for Windows IAccessible2? (Likely with skips when we already know the combination isn't supported for a given browser.) This would impact how we have to display the results for these tests (and possibly how we run them, but that already needs changes anyway), but it would keep everything else consistent. |
|
Hmm I think I'm not totally sure I understand the implications of your suggestion, or, it's what I'm already doing :) Here is an example of the linked test designs "results" for a run on linux: You have to expand the asserts to understand what is really going on. Only the linux test (Atspi) ran any assertions, and passed. On Windows, both the UIA and IAccessible2 rows would have data. The APIs that don't apply to the platform just trivially pass with no assertions. Essentially, a utility function asks the back end "can you run this test for this API" for each API, and the back end either sends a result or "no I can't run that test". |
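The "ask the back end for each API" flow described above could be sketched roughly as follows. This is a simplified Python illustration, not the code in the linked PR: `run_subtests`, `query_backend`, and the result tuples are hypothetical, and a return value of `None` stands in for the back end saying "no I can't run that test".

```python
ALL_APIS = ["Atspi", "AXAPI", "IAccessible2", "UIA"]


def run_subtests(query_backend, dom_id, expectations):
    """Produce one (status, assertion_count) result per API.

    query_backend(api, dom_id) is a hypothetical callable that returns a dict
    of node properties, or None when the API is not available on this platform.
    """
    results = {}
    for api in ALL_APIS:
        node = query_backend(api, dom_id)
        if node is None:
            # API does not apply here: trivially pass with no assertions.
            results[api] = ("PASS", 0)
            continue
        checks = expectations[api]
        failed = [key for key, value in checks.items() if node.get(key) != value]
        results[api] = ("FAIL" if failed else "PASS", len(checks))
    return results
```

The assertion count is what distinguishes a real pass from a trivial one, which matches the need to expand the asserts to see what actually ran.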
|
It's a bit hard to explain but what I'm contemplating is that you'd have a single subtest, but that e.g. Firefox Linux, Firefox macOS, Firefox Windows UIA, and Firefox Windows IAccessible2 all have their own run for that single subtest. And the same for other browsers. Skipping variants that are not applicable. And the subtest expectations would adjust based on the platform AT API that is running. Basically I'm saying that we solve your
by making the APIs count and "run" these tests and not the operating systems, if you will. |
|
Reporting results to wpt.fyi separately for each (OS, accessibility API, browser) combination allows for side-by-side comparison. However, we wouldn't want to run all of wpt for each accessibility API. Maybe there can be one combination where all tests are run, and for the remaining applicable accessibility APIs only the AAM tests are run. |
Right, this is what we would have to special case for "this directory" (or some other signifier) in a couple of places. |
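One way to picture the split discussed above (a full run for one combination, AAM-only runs for the remaining APIs) is the following Python sketch. The function `plan_runs`, the dictionary keys, and the directory names passed in are assumptions for illustration and do not correspond to existing wptrunner or wpt.fyi configuration.

```python
def plan_runs(browser, apis_by_os, aam_dirs):
    """Build a list of runs, one per (OS, accessibility API, browser) combination.

    The first API listed for each OS runs everything ("/"); any further APIs
    on that OS run only the AAM test directories.
    """
    runs = []
    for os_name, apis in apis_by_os.items():
        for index, api in enumerate(apis):
            runs.append({
                "browser": browser,
                "os": os_name,
                "api": api,
                "include": ["/"] if index == 0 else list(aam_dirs),
            })
    return runs
```

With Windows listed as `["UIA", "IAccessible2"]`, only the IAccessible2 run would be restricted to the AAM directories, which is the special case for "this directory" mentioned above.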
|
Ah, now I understand what you are trying to say. It would be nice to see the support for "blockquote" as a single row across all API/OS/Browser combinations and it seems like the right direction to go in. We already need a special case for this directory, because these are the only tests that need "accessibility turned on". The browser typically does not compute the accessibility tree/accessibility APIs. So in this scenario, we would need to:
|
I have a couple of concerns with a declarative config style approach:
I don't quite follow this. New tests will necessarily have different expected results, which means the mapping to platforms will be different, even if that's just that there are different values for the different properties. And if you don't know the platform mappings, I don't quite follow why you'd be writing a new AAM test, since AAMs are all about platforms. I guess many non-core AAM entries map to ARIA roles/states, so those could be common somehow, but Core AAM in particular (and to a lesser extent other AAMs) inherently requires an understanding of platform mappings. I feel like I'm missing something here, though. What's the scenario where a new AAM test would be added without any concern for platform mappings? |
This could potentially be solved by waiting for events. I wonder whether this new API might need to support events to solve this problem. |
Some possible scenarios:
I don't think it has to necessarily be declarative. Just abstracted. |
Right, in many scenarios we can wait for the appropriate event on each platform before testing -- for example, before starting the test, we could listen for Eventually, these tests will need to support events for testing events described in the AAMs, and we will probably be able to resolve most if not all sources of flakiness by listening for events. Maybe we are being overly cautious by saying the tests might be flaky... |
|
I've reviewed this RFC and something like this does seem necessary to adequately test AAM specs, in particular how the browser's accessibility tree is mapped to platform accessibility APIs. w3c/webdriver-bidi#443 proposes to expose the accessibility tree through WebDriver BiDi, which would allow testing the shape and properties of the accessibility tree, also important but a different layer. There might still be some overlap and common problems however, so I'm commenting on both discussions to cross-link them. My thoughts on some of the open questions listed:
Adding dependencies to the appropriate
Since you need to use the testharness.js assertions and testdriver.js automation bits, I think sticking with the existing test type is best. But if you do want to be able to filter these tests, then a new test type is probably the most straightforward. #229 is an example of this that's in progress, with code to copy if needed.
As long as the API is "roughly like WebDriver" it should be fine, and having a string argument and returning an object with properties is quite like WebDriver.
There is some support for this in the form of I think the style of the tests using
w3c/webdriver#1833 would be a good solution to this I think. |
|
A thought on the testdriver.js API shape. The tests in Igalia/wpt#2 call To avoid the test having to give each element a unique ID, I wonder if the argument could instead be an element and if the implementation could ensure there's a way to identify it in the platform accessible tree? Some existing APIs use a Relatedly, I wonder if "put random data in DOM, wait for it to appear in platform accessibility tree" could also make the test faster and more reliable, by having an unambiguous signal that the tree has been updated. |
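The "put random data in DOM, wait for it to appear in platform accessibility tree" idea could look roughly like this on the Python side. This is only a sketch under assumptions: `read_tree_names` is a hypothetical callable that returns the set of names currently visible in the platform accessibility tree, and `make_marker` and `wait_for_marker` are not existing testdriver or wptrunner APIs.

```python
import time
import uuid


def make_marker():
    """Generate a random string that the test would inject into the DOM."""
    return "wpt-a11y-" + uuid.uuid4().hex


def wait_for_marker(read_tree_names, marker, timeout=5.0, interval=0.05,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll until the marker shows up in the accessibility tree or time runs out.

    Returns True if the marker was seen, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if marker in read_tree_names():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Polling is used here only to keep the sketch self-contained; waiting on a platform accessibility event, as discussed earlier in the thread, would serve the same purpose without the delay.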
|
Hi @foolip! Thanks so much for the review, I really appreciate the thoughts! I'm sorry, I really should have updated the above link to point to the PR I've been working on against WPT directly, which has moved things along in a few directions since the PR you reviewed: web-platform-tests/wpt#53733 Status of the progress since this RFC, on the new PR, in relation to your comments:
The actual design of the test endpoint is not finalized, and I'm discussing it with people in ARIA (and most specifically James Teh from Firefox), but it probably won't be "get_platform_accessibility_node", like you can see in the first POC in Igalia/wpt#2. The properties/end points of the different accessibility APIs are too varied and it doesn't really make sense to squash them into a single object. The current tests in web-platform-tests/wpt#53733 use a test object that was made/used by Joanie Diggs to test accessibility APIs outside of WPT -- you can see them in all the manual tests in the core-aam directory of wpt. Jamie is advocating in this comment to actually write the tests as Python strings, to be able to write the tests directly in terms of the accessibility API, and then execute the string on the backend.
You mentioned looking into adding another test type, but for testing platform-specific accessibility events, I'll want to be able to call webdriver commands like click, so I think I'll be sticking to extending testdriver unless there are any big yet-unheard objections. Jonathan Lee from Google just reviewed the CL, and specifically the changes to testdriver. I've been really tight on time lately but am slowly making the changes he's requested.
You can read the details in the description of the PR, but the Python dependencies have been added :) However, they are only installed if you run the tests with --enable-accessibility-api, because the Linux tests have a system dependency that is probably not on any bots that run WPT downstream, and I didn't want to break downstream runs of WPT. So landing this PR as is will not run the accessibility tests on wpt.fyi; they will individually fail with "accessibility API testing not enabled". However, I pushed a commit to the branch with --enable-accessibility-api set to true, and you can see a successful run of the tests from the CI on wpt.fyi: https://wpt.fyi/results/core-aam?sha=d508e4016d&label=pr_head
You mentioned a way to ignore folders... but I'm a bit attached to the idea of a single test file containing all assertions for all 4 accessibility APIs. It will be more obvious whether a new role or specific markup is fully supported, across all APIs, if there is a single test that tests all the platforms. This is how the current PR's tests are designed: each subtest is a different accessibility API, and the accessibility APIs not relevant for that platform just "pass" in order to not raise false flags: https://wpt.fyi/results/core-aam/role/blockquote.tentative.html?sha=d508e4016d&label=pr_head I think, ideally, those subtests (like the AXAPI (mac) subtests, when running the test on Linux) could be marked as "not applicable", but I haven't looked into how much work it would be to introduce those concepts to WPT and wpt.fyi.
I'm not sure there is a way around the DOM IDs for this purpose, at least, not a low-hanging fruit solution, but it's worth a larger/continued conversation. For now DOM IDs will allow for a lot of testing, and aren't that much test writing overhead. Is it ok to put that aside for now in terms of getting this current PR landed?
My current goals
What I'd really like is support for the extension to testharness and the changes I've made to wptrunner, and ideally to land the current PR against WPT, then solve the remaining problems in follow-up PRs. The follow-up PRs would be iterating on test design and, separately, figuring out a big issue -- which is how to turn on accessibility only for these tests. Once I can turn accessibility on for a subset of tests, I think we can switch |
|
I forgot to say I landed a change on chromedriver to get the PID: https://chromium-review.googlesource.com/c/chromium/src/+/5825307 So all I need is the same change in safaridriver and there is no problem with parallel tests :) |
If the tests need to do both setup as HTML/CSS/JS and do some assertions using Python, I could see these approaches:
I think
Sure, if there's no other way to identify an element in the platform accessibility tree, then ID it is.
As long as the RFC has a fairly clear intended end state, iterating on the actual implementation SGTM. And making small changes to the RFC itself to clarify after the fact would be fine too, I think.
I can't find a clear pattern to follow, but you could imagine arguments that depend on the test type in |
Here is a link to the markdown file rendered: https://github.com/Igalia/rfcs/blob/wpt-for-aams-rfc/rfcs/wpt-for-aams.md
There is a lot of background context in the RFC for those who wish to understand more about the underlying accessibility technologies.
Here are the notes for the WPT meeting where this was first brought up: https://pad.0b101010.services/wpt-infra-meeting?view