Conversation
Copilot AI commented Jan 27, 2026

Adds automated validation that the agent container environment maintains parity with GitHub Actions runners for utilities, runtimes, environment variables, and shared libraries. This prevents regressions as we close the gap between the agent and runner environments.

Integration Tests (pkg/workflow/agent_parity_test.go)

Five test functions validate workflow compilation for environment parity checks:

  • Utilities: 12 essential tools (jq, curl, git, wget, tar, gzip, unzip, sed, awk, grep, find, xargs)
  • Runtimes: Node.js, Python, Go, Ruby version checks
  • Environment Variables: JAVA_HOME, ANDROID_HOME, GOROOT, PATH, HOME, USER
  • Shared Libraries: ldd validation for python3, node, git, curl binaries
  • Comprehensive: All categories in single workflow

Tests generate workflow markdown, compile to lock files, and assert expected content is present.
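
A rough sketch of that pattern (the `gh aw compile` invocation and the lock-file naming here are assumptions, not the repository's actual test API):

```go
package workflow

import (
	"os"
	"os/exec"
	"path/filepath"
	"strings"
	"testing"
)

// TestAgentParityUtilities sketches the compile-and-assert flow.
func TestAgentParityUtilities(t *testing.T) {
	dir := t.TempDir()
	md := filepath.Join(dir, "parity.md")
	src := "---\non: workflow_dispatch\n---\n\nCheck that jq, curl, and git are on PATH and report the results.\n"
	if err := os.WriteFile(md, []byte(src), 0o644); err != nil {
		t.Fatal(err)
	}
	// Hypothetical compile step; the real tests may call the compiler package directly.
	if out, err := exec.Command("gh", "aw", "compile", md).CombinedOutput(); err != nil {
		t.Fatalf("compile failed: %v\n%s", err, out)
	}
	lock, err := os.ReadFile(filepath.Join(dir, "parity.lock.yml"))
	if err != nil {
		t.Fatal(err)
	}
	// Assert that the expected utility checks survived compilation.
	for _, want := range []string{"jq", "curl", "git"} {
		if !strings.Contains(string(lock), want) {
			t.Errorf("lock file missing expected utility check %q", want)
		}
	}
}
```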

Smoke Test Workflow (.github/workflows/agent-runner-parity.md)

Production validation workflow that:

  • Runs every 6 hours and on workflow_dispatch
  • Triggers on "test-parity" PR label
  • Executes validation commands via Copilot + bash tools
  • Posts concise pass/fail summaries as comments
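
Assuming the workflow frontmatter mirrors standard GitHub Actions trigger syntax, the trigger block would look roughly like this (the label-filter mechanism is an assumption):

```yaml
on:
  schedule:
    - cron: "0 */6 * * *"   # every 6 hours
  workflow_dispatch:
  pull_request:
    types: [labeled]        # paired with a check that the label is "test-parity"
```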

Example output format:

🔍 Agent-Runner Environment Parity Test Results

Utilities: 11/12 ❌
  - Missing: unzip

Runtimes: 4/4 ✅
Environment Variables: 4/5 ❌
  - Missing: ANDROID_HOME
  
Shared Libraries: 4/4 ✅

Overall: FAIL ❌
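
The shared-library category reduces to scanning ldd output for unresolved entries. A minimal Go sketch of that probe (the workflow runs the equivalent via bash, and the binary paths below are illustrative):

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// checkSharedLibs runs ldd on a binary and flags unresolved dependencies.
func checkSharedLibs(binary string) error {
	// ldd prints one dependency per line; missing libraries show "not found".
	out, err := exec.Command("ldd", binary).CombinedOutput()
	if err != nil {
		return fmt.Errorf("ldd %s: %v", binary, err)
	}
	for _, line := range strings.Split(string(out), "\n") {
		if strings.Contains(line, "not found") {
			return fmt.Errorf("%s: unresolved dependency: %s", binary, strings.TrimSpace(line))
		}
	}
	return nil
}

func main() {
	for _, bin := range []string{"/usr/bin/python3", "/usr/bin/node", "/usr/bin/git", "/usr/bin/curl"} {
		if err := checkSharedLibs(bin); err != nil {
			fmt.Println("FAIL:", err)
		} else {
			fmt.Println("OK:", bin)
		}
	}
}
```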

Documentation (specs/agent-container-testing.md)

Coverage tables for all test categories, documented test execution flows, instructions for extending the suite, and a troubleshooting guide.

Build Integration

A make test-parity target runs the integration tests with the required build tag and test-name filter.
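
Presumably the target wraps a tagged go test run; the build constraint such a setup relies on would sit at the top of the test file (the tag name is an assumption):

```go
//go:build integration

// This constraint keeps the parity tests out of a plain `go test ./...` run.
// The make target would then invoke something like:
//
//	go test -tags integration -run 'TestAgentParity' ./pkg/workflow/
package workflow
```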

Original prompt

This section details the original issue you should resolve

Issue title: [plan] Create integration test suite for agent-runner environment parity

Objective

Develop a comprehensive integration test suite that validates the agent container environment has parity with the GitHub Actions runner environment for essential tools, libraries, and environment variables.

Context

As we close the gap between the agent container and the action runner, we need automated tests to verify that:

  1. Mounted utilities are accessible and functional
  2. Runtime tools (Python, Node.js, Go, Ruby) can be found and executed
  3. Environment variables are correctly mirrored
  4. Shared libraries are properly linked

Without these tests, we risk regressions and incomplete implementations.

Approach

  1. Create a test workflow that runs inside the agent container and verifies:
    • Essential /usr/bin utilities are accessible (jq, curl, git, etc.)
    • All runtime binaries are in PATH and executable (node, python3, go, ruby)
    • Environment variables are set correctly (JAVA_HOME, ANDROID_HOME, etc.)
    • Shared libraries can be loaded (test with ldd on key binaries)
  2. Add Go integration tests in pkg/workflow/*_test.go files
  3. Create a smoke test workflow in .github/workflows/agent-runner-parity.md
  4. Document the test approach in specs/agent-container-testing.md

Files to Create/Modify

  • Create: pkg/workflow/agent_parity_test.go (integration tests)
  • Create: .github/workflows/agent-runner-parity.md (smoke test workflow)
  • Create: specs/agent-container-testing.md (test documentation)
  • Modify: Makefile (add make test-parity target)

Acceptance Criteria

  • Integration tests cover utility accessibility (at least 10 utilities)
  • Integration tests verify runtime availability (node, python3, go, ruby)
  • Integration tests validate environment variable presence (at least 5 vars)
  • Smoke test workflow runs successfully in CI
  • Test documentation explains testing strategy and coverage
  • All tests pass consistently across multiple runs

Related to epic: build/test environment for agentic workflow #11970

AI generated by Plan Command for #11970

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 2 commits on January 27, 2026 at 23:03
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add integration test suite for agent-runner environment parity" to "Add agent-runner environment parity test suite" on January 27, 2026
Copilot AI requested a review from pelikhan on January 27, 2026 at 23:17
@github-actions

🔍 PR Triage Results

Category: test | Risk: low | Priority: 47/100

Scores Breakdown

  • Impact: 25/50 - Test improvement - prevents regressions
  • Urgency: 7/30 - Recent PR, CI unstable
  • Quality: 15/20 - CI passing

📋 Recommended Action: batch_review

Excellent addition of comprehensive integration tests for agent-runner environment parity. This PR validates that the agent container has access to essential utilities, runtimes, environment variables, and shared libraries.

Test Coverage:

  • 12 essential utilities (jq, curl, git, etc.)
  • Runtime checks (Node.js, Python, Go, Ruby)
  • Environment variables (JAVA_HOME, GOROOT, etc.)
  • Shared library validation

Related PRs: This test suite complements #12062 and #12085, which mount binaries and libraries.


Triaged by PR Triage Agent on 2026-01-28T00:34:49Z

AI generated by PR Triage Agent

@github-actions

🔍 PR Triage Results

Category: test | Risk: low | Priority: 37/100

Scores Breakdown

  • Impact: 25/50 - Test improvements prevent regressions in agent-runner environment parity
  • Urgency: 5/30 - PR is 7 hours old, standard urgency
  • Quality: 7/20 - Draft status, CI status unknown

📋 Recommended Action: defer

Rationale: This PR adds comprehensive integration tests and smoke test workflows for agent-runner environment parity (utilities, runtimes, environment variables, shared libraries). While important for preventing regressions, it should remain deferred until out of draft. The changes are substantial (2,340 additions across 7 files) and need CI validation before review.

Next steps: Mark as ready for review when all integration tests pass and CI is green.


Triaged by PR Triage Agent on 2026-01-28

AI generated by PR Triage Agent

@github-actions

🔍 PR Triage Results

Category: feature | Risk: high | Priority: 47/100

Scores Breakdown

  • Impact: 35/50 - Test infrastructure improvement with significant value
  • Urgency: 7/30 - Standard feature development timeline
  • Quality: 5/20 - Needs comprehensive description and CI validation

📋 Recommended Action: batch_review

This PR is part of batch-feature-high (6 PRs) - high-risk feature additions requiring careful review. Test suite additions are valuable but need thorough review for coverage and correctness.

Batch PRs: #12322, #12321, #12320, #12311, #12130, #12085


Triaged by PR Triage Agent on 2026-01-28T18:16:26Z

AI generated by PR Triage Agent

@pelikhan

/q update the PR triage agent to only add labels from a limited set of values. It is adding too many labels. See this issue.

@github-actions

github-actions bot commented Jan 28, 2026

🎩 Mission equipment ready! Q has optimized your workflow. Use wisely, 007! 🔫

@github-actions

🔍 Q Investigation Complete

I've investigated the label proliferation issue you mentioned and created a fix.

Problem Found

This PR (#12130) demonstrates the issue perfectly:

  • 10 total labels with conflicts
  • Both pr-risk:low AND pr-risk:high present
  • Both pr-priority:medium AND pr-priority:low present
  • Multiple pr-type and pr-action labels

Root Cause

The PR triage agent workflow (.github/workflows/pr-triage-agent.md) was configured to allow dynamic label creation without constraints:

```yaml
safe-outputs:
  add-labels:
    max: 100
    # Omitting 'allowed' to permit dynamic label creation
```

This allowed unlimited labels including:

  • Multiple labels from same category
  • Dynamic pr-agent:{name} labels
  • Dynamic pr-batch:{id} labels

Solution

I've created a PR that:

  1. Constrains labels to 25 predefined values across 4 categories
  2. Enforces ONE label per category (type, risk, priority, action)
  3. Reduces labels per PR from 10+ down to exactly 4
  4. Removes dynamic label support for agent/batch labels
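
A constrained configuration would look roughly like this (the entries below are illustrative; the exact 25-value list lives in the fix PR):

```yaml
safe-outputs:
  add-labels:
    max: 4                  # one label per category: type, risk, priority, action
    allowed:
      - pr-type:test
      - pr-type:feature
      - pr-risk:low
      - pr-risk:high
      - pr-priority:low
      - pr-priority:medium
      - pr-action:batch_review
      - pr-action:defer
      # ...remaining predefined values
```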

Next Steps

Once the PR is merged, the triage agent will only add 4 well-defined labels per PR, preventing the conflicts seen here.


Q - Workflow Optimizer

AI generated by Q
