fix(job-retry): wait for IAM policy before creating SQS event source mapping #5098

Open
phergoualch wants to merge 1 commit into github-aws-runners:main from phergoualch:fix/job-retry-event-source-depends-on-main

phergoualch commented Apr 9, 2026

Fixes #5097

Summary

  • add an explicit dependency from the job-retry event source mapping to the retry Lambda IAM policy
  • ensure the event source mapping is created only after the retry Lambda role has the required SQS permissions

Problem

When job_retry is enabled, Terraform creates the retry queue, retry Lambda, retry role, retry IAM policy, and event source mapping in the same apply.

The retry IAM policy already contains the correct queue permissions, but the event source mapping did not explicitly depend on that policy resource. As a result, AWS could validate the Lambda execution role before the policy was attached and reject the mapping with:

InvalidParameterValueException: The function execution role does not have permissions to call ReceiveMessage on SQS

In my case this was consistently reproducible. Apply failed every time before this change.

Change

This PR adds:

depends_on = [aws_iam_role_policy.job_retry]

to aws_lambda_event_source_mapping.job_retry in modules/runners/job-retry/main.tf.
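
To illustrate why the explicit dependency matters, here is a minimal, self-contained sketch of the same pattern. All resource names, the runtime, and the `retry.zip` package are hypothetical placeholders and are not copied from `modules/runners/job-retry/main.tf`. It assumes the mapping references only the queue and the function, in which case Terraform has no implicit edge from the mapping to the role policy.

```hcl
# Minimal sketch with hypothetical names; not the module's actual code.

resource "aws_sqs_queue" "retry" {
  name = "example-job-retry"
}

# Trust policy that lets the Lambda service assume the role.
data "aws_iam_policy_document" "assume_lambda" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "retry" {
  name               = "example-job-retry"
  assume_role_policy = data.aws_iam_policy_document.assume_lambda.json
}

# Queue permissions that Lambda needs in order to poll the SQS queue.
data "aws_iam_policy_document" "retry_sqs" {
  statement {
    actions = [
      "sqs:ReceiveMessage",
      "sqs:DeleteMessage",
      "sqs:GetQueueAttributes",
    ]
    resources = [aws_sqs_queue.retry.arn]
  }
}

resource "aws_iam_role_policy" "retry" {
  name   = "example-job-retry-sqs"
  role   = aws_iam_role.retry.name
  policy = data.aws_iam_policy_document.retry_sqs.json
}

resource "aws_lambda_function" "retry" {
  function_name = "example-job-retry"
  role          = aws_iam_role.retry.arn
  runtime       = "nodejs20.x"
  handler       = "index.handler"
  filename      = "retry.zip" # placeholder deployment package
}

resource "aws_lambda_event_source_mapping" "retry" {
  # These two references order the mapping after the queue and the function,
  # but nothing here refers to aws_iam_role_policy.retry.
  event_source_arn = aws_sqs_queue.retry.arn
  function_name    = aws_lambda_function.retry.arn
  batch_size       = 1

  # Explicit edge: create the mapping only after the policy is attached.
  depends_on = [aws_iam_role_policy.retry]
}
```

Without the `depends_on` line, Terraform is free to create the role policy and the mapping in parallel, which would match the failure described above. With it, the policy is created first. IAM changes can still take a short time to propagate, so this sketch only guarantees ordering within the apply.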

Result

With this change in place, I was able to apply the same configuration successfully without the event source mapping errors.

Why this is safe

  • no permission set is changed
  • no runtime behavior is changed after successful apply
  • this only makes the intended creation order explicit

@phergoualch phergoualch requested a review from a team as a code owner April 9, 2026 11:55

npalm (Member) commented Apr 9, 2026

I have not seen this problem. Can you give a bit more detail on what type of configuration is causing this? I have not had this problem with the multi-runner setup. As far as I can see, Terraform should calculate the dependency.

Development

Successfully merging this pull request may close these issues.

job-retry: terraform apply always fails creating retry SQS event source mappings
