Conversation
Copilot AI commented Jan 27, 2026

Adds automated validation that the agent container environment maintains parity with GitHub Actions runners for utilities, runtimes, environment variables, and shared libraries. This prevents regressions as we close the gap between the agent and runner environments.

Integration Tests (pkg/workflow/agent_parity_test.go)

Five test functions validate workflow compilation for environment parity checks:

  • Utilities: 12 essential tools (jq, curl, git, wget, tar, gzip, unzip, sed, awk, grep, find, xargs)
  • Runtimes: Node.js, Python, Go, Ruby version checks
  • Environment Variables: JAVA_HOME, ANDROID_HOME, GOROOT, PATH, HOME, USER
  • Shared Libraries: ldd validation for python3, node, git, curl binaries
  • Comprehensive: All categories in single workflow

Tests generate workflow markdown, compile to lock files, and assert expected content is present.
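
A rough sketch of that pattern (the `gh aw compile` invocation and the lock-file naming here are assumptions, not the repository's actual test API):

```go
package workflow

import (
	"os"
	"os/exec"
	"path/filepath"
	"strings"
	"testing"
)

// TestAgentParityUtilities sketches the compile-and-assert flow.
func TestAgentParityUtilities(t *testing.T) {
	dir := t.TempDir()
	md := filepath.Join(dir, "parity.md")
	src := "---\non: workflow_dispatch\n---\n\nCheck that jq, curl, and git are on PATH and report the results.\n"
	if err := os.WriteFile(md, []byte(src), 0o644); err != nil {
		t.Fatal(err)
	}
	// Hypothetical compile step; the real tests may call the compiler package directly.
	if out, err := exec.Command("gh", "aw", "compile", md).CombinedOutput(); err != nil {
		t.Fatalf("compile failed: %v\n%s", err, out)
	}
	lock, err := os.ReadFile(filepath.Join(dir, "parity.lock.yml"))
	if err != nil {
		t.Fatal(err)
	}
	// Assert that the expected utility checks survived compilation.
	for _, want := range []string{"jq", "curl", "git"} {
		if !strings.Contains(string(lock), want) {
			t.Errorf("lock file missing expected utility check %q", want)
		}
	}
}
```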

Smoke Test Workflow (.github/workflows/agent-runner-parity.md)

Production validation workflow that:

  • Runs every 6 hours and on workflow_dispatch
  • Triggers on "test-parity" PR label
  • Executes validation commands via Copilot + bash tools
  • Posts concise pass/fail summaries as comments
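
Assuming the workflow frontmatter mirrors standard GitHub Actions trigger syntax, the trigger block would look roughly like this (the label-filter mechanism is an assumption):

```yaml
on:
  schedule:
    - cron: "0 */6 * * *"   # every 6 hours
  workflow_dispatch:
  pull_request:
    types: [labeled]        # paired with a check that the label is "test-parity"
```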

Example output format:

🔍 Agent-Runner Environment Parity Test Results

Utilities: 11/12 ❌
  - Missing: unzip

Runtimes: 4/4 ✅
Environment Variables: 4/5 ❌
  - Missing: ANDROID_HOME
  
Shared Libraries: 4/4 ✅

Overall: FAIL ❌
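
The shared-library category reduces to scanning ldd output for unresolved entries. A minimal Go sketch of that probe (the workflow runs the equivalent via bash, and the binary paths below are illustrative):

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// checkSharedLibs runs ldd on a binary and flags unresolved dependencies.
func checkSharedLibs(binary string) error {
	// ldd prints one dependency per line; missing libraries show "not found".
	out, err := exec.Command("ldd", binary).CombinedOutput()
	if err != nil {
		return fmt.Errorf("ldd %s: %v", binary, err)
	}
	for _, line := range strings.Split(string(out), "\n") {
		if strings.Contains(line, "not found") {
			return fmt.Errorf("%s: unresolved dependency: %s", binary, strings.TrimSpace(line))
		}
	}
	return nil
}

func main() {
	for _, bin := range []string{"/usr/bin/python3", "/usr/bin/node", "/usr/bin/git", "/usr/bin/curl"} {
		if err := checkSharedLibs(bin); err != nil {
			fmt.Println("FAIL:", err)
		} else {
			fmt.Println("OK:", bin)
		}
	}
}
```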

Documentation (specs/agent-container-testing.md)

Coverage tables for all test categories, documented test execution flows, instructions for extending the suite, and a troubleshooting guide.

Build Integration

A make test-parity target runs the integration tests with the required build tag and test-name filter.
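
Presumably the target wraps a tagged go test run; the build constraint such a setup relies on would sit at the top of the test file (the tag name is an assumption):

```go
//go:build integration

// This constraint keeps the parity tests out of a plain `go test ./...` run.
// The make target would then invoke something like:
//
//	go test -tags integration -run 'TestAgentParity' ./pkg/workflow/
package workflow
```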

Original prompt

This section details the original issue you should resolve

Issue title: [plan] Create integration test suite for agent-runner environment parity

Objective

Develop a comprehensive integration test suite that validates the agent container environment has parity with the GitHub Actions runner environment for essential tools, libraries, and environment variables.

Context

As we close the gap between the agent container and the action runner, we need automated tests to verify that:

  1. Mounted utilities are accessible and functional
  2. Runtime tools (Python, Node.js, Go, Ruby) can be found and executed
  3. Environment variables are correctly mirrored
  4. Shared libraries are properly linked

Without these tests, we risk regressions and incomplete implementations.

Approach

  1. Create a test workflow that runs inside the agent container and verifies:
    • Essential /usr/bin utilities are accessible (jq, curl, git, etc.)
    • All runtime binaries are in PATH and executable (node, python3, go, ruby)
    • Environment variables are set correctly (JAVA_HOME, ANDROID_HOME, etc.)
    • Shared libraries can be loaded (test with ldd on key binaries)
  2. Add Go integration tests in pkg/workflow/*_test.go files
  3. Create a smoke test workflow in .github/workflows/agent-runner-parity.md
  4. Document the test approach in specs/agent-container-testing.md

Files to Create/Modify

  • Create: pkg/workflow/agent_parity_test.go (integration tests)
  • Create: .github/workflows/agent-runner-parity.md (smoke test workflow)
  • Create: specs/agent-container-testing.md (test documentation)
  • Modify: Makefile (add make test-parity target)

Acceptance Criteria

  • Integration tests cover utility accessibility (at least 10 utilities)
  • Integration tests verify runtime availability (node, python3, go, ruby)
  • Integration tests validate environment variable presence (at least 5 vars)
  • Smoke test workflow runs successfully in CI
  • Test documentation explains testing strategy and coverage
  • All tests pass consistently across multiple runs

Related to epic: build/test environment for agentic workflow #11970

AI generated by Plan Command for #11970

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 2 commits on January 27, 2026 at 23:03
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add integration test suite for agent-runner environment parity" to "Add agent-runner environment parity test suite" on January 27, 2026
Copilot AI requested a review from pelikhan on January 27, 2026 at 23:17
@github-actions

🔍 PR Triage Results

Category: test | Risk: low | Priority: 47/100

Scores Breakdown

  • Impact: 25/50 - Test improvement - prevents regressions
  • Urgency: 7/30 - Recent PR, CI unstable
  • Quality: 15/20 - CI passing

📋 Recommended Action: batch_review

Excellent addition of comprehensive integration tests for agent-runner environment parity. This PR validates that the agent container has access to essential utilities, runtimes, environment variables, and shared libraries.

Test Coverage:

  • 12 essential utilities (jq, curl, git, etc.)
  • Runtime checks (Node.js, Python, Go, Ruby)
  • Environment variables (JAVA_HOME, GOROOT, etc.)
  • Shared library validation

Related PRs: This test suite complements #12062 and #12085, which mount binaries and libraries.


Triaged by PR Triage Agent on 2026-01-28T00:34:49Z

AI generated by PR Triage Agent

@github-actions

🔍 PR Triage Results

Category: test | Risk: low | Priority: 37/100

Scores Breakdown

  • Impact: 25/50 - Test improvements prevent regressions in agent-runner environment parity
  • Urgency: 5/30 - PR is 7 hours old, standard urgency
  • Quality: 7/20 - Draft status, CI status unknown

📋 Recommended Action: defer

Rationale: This PR adds comprehensive integration tests and smoke test workflows for agent-runner environment parity (utilities, runtimes, environment variables, shared libraries). While important for preventing regressions, it should remain deferred until out of draft. The changes are substantial (2,340 additions across 7 files) and need CI validation before review.

Next steps: Mark as ready for review when all integration tests pass and CI is green.


Triaged by PR Triage Agent on 2026-01-28

AI generated by PR Triage Agent

@github-actions

🔍 PR Triage Results

Category: feature | Risk: high | Priority: 47/100

Scores Breakdown

  • Impact: 35/50 - Test infrastructure improvement with significant value
  • Urgency: 7/30 - Standard feature development timeline
  • Quality: 5/20 - Needs comprehensive description and CI validation

📋 Recommended Action: batch_review

This PR is part of batch-feature-high (6 PRs) - high-risk feature additions requiring careful review. Test suite additions are valuable but need thorough review for coverage and correctness.

Batch PRs: #12322, #12321, #12320, #12311, #12130, #12085


Triaged by PR Triage Agent on 2026-01-28T18:16:26Z

AI generated by PR Triage Agent

@pelikhan

/q update the PR triage agent to only add labels from a limited set of values. It is adding too many labels. See this issue.

@github-actions

github-actions bot commented Jan 28, 2026

🎩 Mission equipment ready! Q has optimized your workflow. Use wisely, 007! 🔫

@github-actions

🔍 Q Investigation Complete

I've investigated the label proliferation issue you mentioned and created a fix.

Problem Found

This PR (#12130) demonstrates the issue perfectly:

  • 10 total labels with conflicts
  • Both pr-risk:low AND pr-risk:high present
  • Both pr-priority:medium AND pr-priority:low present
  • Multiple pr-type and pr-action labels

Root Cause

The PR triage agent workflow (.github/workflows/pr-triage-agent.md) was configured to allow dynamic label creation without constraints:

```yaml
safe-outputs:
  add-labels:
    max: 100
    # Omitting 'allowed' to permit dynamic label creation
```

This allowed unlimited labels including:

  • Multiple labels from same category
  • Dynamic pr-agent:{name} labels
  • Dynamic pr-batch:{id} labels

Solution

I've created a PR that:

  1. Constrains labels to 25 predefined values across 4 categories
  2. Enforces ONE label per category (type, risk, priority, action)
  3. Reduces labels per PR from 10+ down to exactly 4
  4. Removes dynamic label support for agent/batch labels
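
A constrained configuration would look roughly like this (the entries below are illustrative; the exact 25-value list lives in the fix PR):

```yaml
safe-outputs:
  add-labels:
    max: 4                  # one label per category: type, risk, priority, action
    allowed:
      - pr-type:test
      - pr-type:feature
      - pr-risk:low
      - pr-risk:high
      - pr-priority:low
      - pr-priority:medium
      - pr-action:batch_review
      - pr-action:defer
      # ...remaining predefined values
```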

Next Steps

Once the PR is merged, the triage agent will only add 4 well-defined labels per PR, preventing the conflicts seen here.


Q - Workflow Optimizer

AI generated by Q
