Add proposal for queues and services in Tyger#291

Open
johnstairs wants to merge 1 commit into main from johnstairs/queue-proposal

@johnstairs
Member

Adding a functional proposal for queues and services in Tyger. The proposal does not get into implementation details. Comments welcome!

Collaborator

@hansenms left a comment

This looks pretty nice. Any thoughts on the underlying implementation? Would we handle this in the database, or would we use Azure Storage Queue? Would this be supported in Docker mode (I think not)?

My main concern is having the CLI authenticated in the container without its access being scoped down.


while true; do
# Receive returns {"status": "...", "items": [...]}
response=$(tyger queue receive "$queue" --create-output-buffers)
Collaborator

So this implies that the container running in the service has Tyger control plane access. We have not had that before, but I see how it could be needed now. We should think about whether and how this could be scoped down. Would it be possible for it to have only queue access, and only to the queues and associated buffers relevant to this service?

Member Author

A few models for this:

  1. tyger is logged in with some unspecified account that has contributor access.
  2. When creating the codespec, you explicitly link the tyger credentials to an existing workload identity.
  3. We have a separate token, independent of Entra, that is mounted as a secret in the container and rotated (like we do for SAS tokens). That token would carry claims that have been explicitly granted to the service or run. The claims could grant access to one or more queues, read and write access to buffers related to those queues, permission to create buffers, etc.
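
To make the third model a little more concrete, here is a minimal sketch of how scoped claims might be checked. Everything in it is an assumption made for illustration: the claim format (`<resource>:<name>:<action>`), the action names, and the `is_allowed` helper are hypothetical and are not part of the proposal or of the tyger CLI.

```shell
# Hypothetical claim check. A claim has the form "<resource>:<name>:<action>",
# and "*" in a claim acts as a wildcard. None of this is an existing tyger API.
#
# Usage: is_allowed REQUESTED_PERMISSION CLAIM...
is_allowed() {
  requested=$1
  shift
  for claim in "$@"; do
    # $claim is intentionally unquoted so that "*" is treated as a pattern.
    case "$requested" in
      $claim) return 0 ;;
    esac
  done
  return 1
}

# Example: a token granted access to one queue and to buffer creation.
if is_allowed "queue:inference-requests:receive" \
     "queue:inference-requests:*" "buffer:*:create"; then
  echo "receive permitted"
fi
```

With claims like these, a service container could receive from its own queue and create output buffers, while a request such as `queue:other:receive` would be rejected.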

output_buffer=$(echo "$item_json" | jq -r '.outputs.result')

# Start heartbeat in background
tyger queue item heartbeat "$item_id" --lease "$lease" --while-alive &
Collaborator

I know this is just an example, but there is a good chance that, if something goes wrong below, this will turn into a zombie and the queue will stay open forever.

Member Author

This is a convenience feature. Ideally, you would take care of the heartbeat in your own code in a background thread with proper error handling etc. In this case, if the main script exits because of a failure, the heartbeat process would detect that its parent is no longer alive and exit.
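
For scripts that want to manage the heartbeat themselves, a minimal shell sketch of the pattern follows. It assumes a hypothetical `heartbeat_loop` function standing in for whatever renews the lease (here it only sleeps), and a hypothetical `process_item` wrapper; neither is part of the proposal. The point is only that the background process is stopped whether the work succeeds or fails.

```shell
# Hypothetical stand-in for the lease renewal; a real script would call the
# heartbeat mechanism here instead of just sleeping.
heartbeat_loop() {
  while true; do
    sleep 1
  done
}

# Run the given command with a background heartbeat, and always stop the
# heartbeat afterwards, regardless of the command's exit status.
process_item() {
  heartbeat_loop &
  heartbeat_pid=$!

  status=0
  "$@" || status=$?

  # Stop and reap the heartbeat so it cannot outlive the work.
  kill "$heartbeat_pid" 2>/dev/null || true
  wait "$heartbeat_pid" 2>/dev/null || true
  return "$status"
}

# Example: the work fails, and the heartbeat is still cleaned up.
process_item sh -c 'exit 3' || echo "work failed with status $?"
```

This does not cover the script itself being killed abruptly; that case is what the parent-liveness check described above is meant to handle.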

`<QUEUE_KEY>_QUEUE_NAME` containing the actual queue name (e.g.
`REQUESTS_QUEUE_NAME=inference-requests`).

#### Scaling Services
Collaborator

Should we (at some point) have autoscaling? If you let it scale to zero when idle for long enough, it ends up somewhere between a trigger and a service.

Member Author

@johnstairs Feb 13, 2026

Yeah, I think autoscaling would be a valuable feature, but how it is defined (as a function of queue length? based on a target average response time?) would require a whole other spec. Scale to 0 would be a really interesting capability.
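
As an illustration of the simplest of those definitions, here is a sketch of a replica count computed purely from queue length. The function name, its parameters, and the clamping policy are assumptions made for this example; the proposal does not define an autoscaling rule.

```shell
# Hypothetical scaling rule: enough replicas that each handles at most
# ITEMS_PER_REPLICA queued items, clamped to [MIN, MAX].
#
# Usage: desired_replicas QUEUE_LENGTH ITEMS_PER_REPLICA MIN MAX
desired_replicas() {
  queue_length=$1
  items_per_replica=$2
  min_replicas=$3
  max_replicas=$4

  # Ceiling division using integer arithmetic.
  desired=$(( (queue_length + items_per_replica - 1) / items_per_replica ))

  if [ "$desired" -lt "$min_replicas" ]; then
    desired=$min_replicas
  fi
  if [ "$desired" -gt "$max_replicas" ]; then
    desired=$max_replicas
  fi
  echo "$desired"
}

desired_replicas 25 10 0 5   # 25 queued items, 10 per replica -> 3
desired_replicas 0 10 0 5    # empty queue with a minimum of 0 -> 0
```

Setting the minimum to 0 gives scale-to-zero behavior with this rule; a rule based on response time would need latency measurements that this sketch does not model.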

@@ -0,0 +1,697 @@
# Proposal: Queues and Services in Tyger
Collaborator

Not sure where to park this comment, so it will go at the top: any thoughts on this in a multi-tenant environment? Multiple tenants could really benefit from this there.

Member Author

I had not thought about multi-tenancy. All of the scenarios in this document are scoped per organization. Are you thinking about queues that are shared across organizations?

Member Author

I guess you're thinking about a shared inferencing service. OK, this gets complicated, because now a service has to be able to access queues and buffers across multiple organizations.

Collaborator

Yes. I don't think we should consider it right now, but if we look at our own multi-tenancy use cases, this would be pretty handy. Anyway, also a reminder that creating these services is probably a privileged operation.

The container is completely unaware of queues—it just reads from input pipes and
writes to output pipes.

#### Parameter Forwarding
Collaborator

It is a bit unclear what should happen if you have output parameters, since the processor is required to provide them.

Member Author

I think it should be an error to create a trigger on a queue with output parameters.

@johnstairs
Member Author

> This looks pretty nice. Any thoughts on the underlying implementation? Would we handle this in the database, or would we use Azure Storage Queue? Would this be supported in Docker mode (I think not)?
>
> My main concern is having the CLI authenticated in the container without its access being scoped down.

I think the implementation would probably be handled in the database, because the functionality differs a little from normal message queues: the queue message is durable and can be updated with a response. If we implement it in the database, it would also be relatively straightforward to get this working in Docker mode.
