feat(webhook): support multi-region SQS dispatch in runner webhook #5099

Open
bogdankrasko wants to merge 1 commit into github-aws-runners:main from
bogdankrasko:pr/bk/update-dispatch-runner-lambda-sqs-multi-region


@bogdankrasko bogdankrasko commented Apr 9, 2026

Description

This PR updates the webhook SQS dispatch logic to select the SQS client region from the target queue URL instead of always using AWS_REGION.

This fixes cross-region dispatch scenarios where the webhook needs to send workflow job messages to an SQS queue in a different AWS region than the Lambda runtime. The change also caches SQS clients per region so repeated sends reuse the correct client while still creating separate clients when queues are in different regions.
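As a rough illustration of the region selection described above, the sketch below extracts the region from a standard SQS queue URL. The function name `regionFromQueueUrl` and the regular expression are hypothetical and not taken from the PR diff; the pattern only assumes the common `https://sqs.<region>.amazonaws.com/<account>/<queue>` form and returns `undefined` for anything else (legacy or VPC endpoint hostnames, for example).

```typescript
// Illustrative sketch only: names and pattern are assumptions, not the PR's code.
// Matches standard regional queue URLs such as
// https://sqs.us-west-2.amazonaws.com/123456789012/my-queue
const SQS_QUEUE_URL_PATTERN = /^https:\/\/sqs\.([a-z0-9-]+)\.amazonaws\.com(?:\.cn)?\//;

function regionFromQueueUrl(queueUrl: string): string | undefined {
  const match = SQS_QUEUE_URL_PATTERN.exec(queueUrl);
  // Capture group 1 holds the region segment of the hostname.
  return match ? match[1] : undefined;
}
```

A caller would pass the returned region to the SQS client constructor and treat `undefined` as the signal to fall back to another region source.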

To support this model, the webhook runner_matcher_config must include each runner group's matcherConfig together with the region-specific SQS queue URL (id) and queue ARN (arn) for every runner group in every region. This allows the webhook to match the job and dispatch it to the correct regional queue.

This also requires runner pool labels to be unique across all configured regions so webhook matching remains deterministic and a given job maps to exactly one regional queue.
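The uniqueness requirement could be checked before deployment with something like the sketch below. The entry shape is an assumption: `id`, `arn`, and `matcherConfig` come from the description above, while the `labelMatchers` and `exactMatch` fields and the helper name `findDuplicateLabelSets` are hypothetical. The check only flags label sets that are identical across different queues; it does not model how the webhook resolves partially overlapping label sets.

```typescript
// Assumed shape of one runner_matcher_config entry (field names inside
// matcherConfig are an assumption for illustration).
interface MatcherConfig {
  labelMatchers: string[][];
  exactMatch: boolean;
}

interface RunnerMatcherEntry {
  id: string; // region-specific SQS queue URL
  arn: string; // region-specific SQS queue ARN
  matcherConfig: MatcherConfig;
}

// Returns the normalized label sets that are claimed by more than one queue.
function findDuplicateLabelSets(entries: RunnerMatcherEntry[]): string[] {
  const ownerByLabelSet = new Map<string, string>(); // label-set key -> queue URL
  const duplicates: string[] = [];
  for (const entry of entries) {
    for (const labels of entry.matcherConfig.labelMatchers) {
      // Normalize so that label order and case do not hide a collision.
      const key = labels.map((label) => label.toLowerCase()).sort().join(",");
      const owner = ownerByLabelSet.get(key);
      if (owner !== undefined && owner !== entry.id) {
        duplicates.push(key);
      } else {
        ownerByLabelSet.set(key, entry.id);
      }
    }
  }
  return duplicates;
}
```

An empty result means no two regional queues claim the same label set, so each matching job maps to a single queue under this simplified model.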

Test Plan

  • Added and updated unit tests in lambdas/functions/webhook/src/sqs/index.test.ts to cover:
    • queue URL region parsing
    • fallback to AWS_REGION when the queue URL cannot be parsed
    • behavior when no region can be resolved
    • client reuse for multiple queues in the same region
    • separate clients for different queue regions
  • Ran yarn vitest run functions/webhook/src/sqs/index.test.ts
  • Deployed the webhook stack only in the primary region, us-east-2, so a single API Gateway/webhook entrypoint handles workflow dispatch for runner queues across multiple AWS regions
  • Deployed validation stacks across multiple AWS regions using modules/runners to validate both same-region and cross-region dispatch behavior. This setup provisions the per-runner-group Lambda functions (scale-up, scale-down, pool, and job-retry) along with the SQS queues defined in modules/multi-runner/queues.tf.
    To make this possible, an additional wrapper was introduced within modules/runners to incorporate the required SQS queues.
    • us-east-2: Ubuntu 24.04 runner pools for both amd64 and arm64. This region was used to validate that jobs with architecture-specific labels were matched correctly and dispatched successfully to queues in the local region.
    • us-west-2: Ubuntu 24.04 amd64-2 runner pool. This region was used as the secondary-region target to validate that the webhook could dispatch a workflow job to an SQS queue outside the Lambda runtime region.
  • Triggered workflow job dispatch in the deployed environment and verified that the webhook sent messages successfully to the expected queue in each case, including the cross-region queue path this change is intended to support

Related Issues

Derive the SQS client region from the queue URL instead of always using AWS_REGION. This prevents cross-region SignatureDoesNotMatch failures when the webhook sends messages to queues outside the Lambda region.

Cache traced SQS clients by region to avoid recreating clients for repeated sends, while still creating separate clients for different queue regions. Fall back to AWS_REGION when the queue URL cannot be parsed, and fall back to the SDK default region resolution when no region is available.
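The caching and fallback order described here might look roughly like the following sketch. `RegionalClientCache` and `resolveRegion` are hypothetical names, and the generic `TClient` stands in for the traced SQS client so the example stays dependency-free; in the real Lambda the factory would presumably construct an `SQSClient` from `@aws-sdk/client-sqs`. An `undefined` region is passed through to the factory unchanged, which is assumed to leave region resolution to the SDK's defaults.

```typescript
// Hypothetical sketch: one client per resolved region, created lazily.
class RegionalClientCache<TClient> {
  private readonly clients = new Map<string, TClient>();
  private readonly createClient: (region: string | undefined) => TClient;

  constructor(createClient: (region: string | undefined) => TClient) {
    this.createClient = createClient;
  }

  get(region: string | undefined): TClient {
    // The empty string keys the "no explicit region" client.
    const key = region ?? "";
    const existing = this.clients.get(key);
    if (existing !== undefined) {
      return existing;
    }
    const created = this.createClient(region);
    this.clients.set(key, created);
    return created;
  }
}

// Fallback order: region parsed from the queue URL, then AWS_REGION,
// then undefined so the client factory can rely on SDK defaults.
function resolveRegion(
  parsedRegion: string | undefined,
  awsRegionEnv: string | undefined,
): string | undefined {
  return parsedRegion || awsRegionEnv || undefined;
}
```

Keying the cache by the resolved region rather than by queue URL means several queues in one region share a client, while a queue in another region gets its own.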

Expand unit coverage for queue URL region parsing, AWS_REGION fallback, missing-region behavior, same-region client reuse, and per-region client separation.
@bogdankrasko bogdankrasko requested a review from a team as a code owner April 9, 2026 14:27
