Conversation
|
@cjh1 fyi |
|
My vote would be that everything (FS/Compute) is always async and returns 202 Accepted. But that does change the API semantics (old submit/new submit)... Should we separate this as a new |
|
Sounds good to me and I like the 202 response idea. I can start a |
|
I will also move delete and update to the async model. |
|
@gabor-lbl Sounds great! Feel free to proceed. I will adapt to your changes for all the new stuff that is pending, too! |
|
Sounds good, I will do this next week (I'm out tomorrow 4/17). |
The problem: one of our users needs to launch hundreds of jobs at the same time, and they don't want to rewrite their workflow. In the NERSC implementation, jobs are launched asynchronously, and waiting for this queue to clear can time out. Even if we made the NERSC job submission API synchronous, starting hundreds of jobs would eventually use up all API resources, or the user would hit the API rate limits.
This solution re-uses the existing task queue (used for file system commands) with no updates needed by facility implementations.
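To make the flow concrete, here is a rough sketch of how a submit call could hand work to a task queue and return 202 Accepted right away. This is not the code in this PR: `TaskQueue`, `Task`, `submit_job`, and the in-memory storage are all hypothetical stand-ins, and the scheduler call is a placeholder.

```python
import uuid
from dataclasses import dataclass
from http import HTTPStatus
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Task:
    # Hypothetical task record; the real task model may differ.
    id: str
    status: str = "pending"  # pending -> completed / failed
    result: Optional[dict] = None


class TaskQueue:
    """In-memory stand-in for the existing file system task queue."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Task] = {}
        self._pending: List[Tuple[Task, Callable[[], dict]]] = []

    def enqueue(self, work: Callable[[], dict]) -> Task:
        task = Task(id=str(uuid.uuid4()))
        self._tasks[task.id] = task
        self._pending.append((task, work))
        return task

    def run_pending(self) -> None:
        # A real deployment would do this in a background worker.
        while self._pending:
            task, work = self._pending.pop(0)
            try:
                task.result = work()
                task.status = "completed"
            except Exception as exc:
                task.result = {"error": str(exc)}
                task.status = "failed"

    def get(self, task_id: str) -> Task:
        return self._tasks[task_id]


def submit_job(queue: TaskQueue, script: str) -> Tuple[int, dict]:
    """Queue the submission and return immediately with 202 Accepted."""

    def work() -> dict:
        # Placeholder for the facility's real scheduler call.
        return {"job_id": "sched-" + str(len(script))}

    task = queue.enqueue(work)
    # As described in this PR, the task id travels in the job's id field.
    return HTTPStatus.ACCEPTED.value, {"id": task.id}
```

The client gets the 202 response without waiting on the scheduler, then looks up the task by the returned id to see when the submission has actually gone through.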
The downside is that `Job.id` is reused to contain a task_id. Maybe it would be better to have an explicit `Job.task_id` field and make `Job.id` nullable?
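For discussion, a minimal sketch of what the explicit-field alternative might look like. Only `Job.id` and `Job.task_id` come from the description above; the dataclass shape, the `is_pending` property, and the `resolve` helper are assumptions for illustration, not the actual model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    # Scheduler job id; None until the queued submission has run.
    id: Optional[str] = None
    # Id of the queued task that performs the submission.
    task_id: Optional[str] = None

    @property
    def is_pending(self) -> bool:
        # Hypothetical helper: accepted by the API but not yet submitted.
        return self.id is None and self.task_id is not None


def resolve(job: Job, scheduler_job_id: str) -> Job:
    """Return a copy of the job with the real scheduler id filled in."""
    return Job(id=scheduler_job_id, task_id=job.task_id)
```

With this shape a client can tell a task id from a scheduler job id by which field is set, at the cost of a schema change that existing clients would need to handle.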