Skip to content

SF-3773 Improve onboarding request save speed to fix timeouts#3785

Merged
Nateowami merged 1 commit intomasterfrom
fix/SF-3773-onboarding-request-submission-timeouts
Apr 16, 2026
Merged

SF-3773 Improve onboarding request save speed to fix timeouts#3785
Nateowami merged 1 commit intomasterfrom
fix/SF-3773-onboarding-request-submission-timeouts

Conversation

@Nateowami
Copy link
Copy Markdown
Collaborator

@Nateowami Nateowami commented Apr 9, 2026

I believe the reason we are getting duplicate onboarding requests is because the endpoint to submit the form is too slow and sometimes times out. This results in the user re-submitting the form, but by this point the projects are mostly all connected, and the second time it succeeds.

Instead, we should accept the form submission, and queue the email and project sync tasks, as these don't have to succeed for the form submission to be accepted.


Open with Devin

This change is Reviewable

@Nateowami Nateowami added the will require testing PR should not be merged until testers confirm testing is complete label Apr 9, 2026
@Nateowami Nateowami force-pushed the fix/SF-3773-onboarding-request-submission-timeouts branch from 239d8ea to 03a4ce4 Compare April 9, 2026 01:18
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 2 additional findings in Devin Review.

Open in Devin Review

Comment thread src/SIL.XForge.Scripture/Controllers/OnboardingRequestRpcController.cs Outdated
Comment on lines 198 to 202
else if (existingProject.Id == projectId)
{
// Verify that the source project is not the same as the target project
return InvalidParamsError("Source project cannot be the same as the target project");
// Skip syncing if the source project is the same as the target project.
continue;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Behavior change: source==target project error now silently skipped

The old code returned InvalidParamsError("Source project cannot be the same as the target project") when a referenced Paratext project resolved to the same SF project as the target. The new code at line 200-201 simply does continue, silently skipping the project. This is a deliberate design change since the background task can't return RPC errors to the caller, but it means invalid form data (listing the target project as a source) is no longer validated. If this validation is important, it should be performed before the Task.Run block, while the request can still return an error.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

github-advanced-security[bot]

This comment was marked as resolved.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 9, 2026

Codecov Report

❌ Patch coverage is 0% with 92 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.35%. Comparing base (9010370) to head (c00e394).
⚠️ Report is 1 commits behind head on master.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...ture/Controllers/OnboardingRequestRpcController.cs 0.00% 92 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3785      +/-   ##
==========================================
- Coverage   81.43%   81.35%   -0.09%     
==========================================
  Files         623      623              
  Lines       39451    39490      +39     
  Branches     6398     6422      +24     
==========================================
  Hits        32128    32128              
- Misses       6335     6361      +26     
- Partials      988     1001      +13     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Nateowami Nateowami force-pushed the fix/SF-3773-onboarding-request-submission-timeouts branch from 03a4ce4 to ff7bc4a Compare April 13, 2026 16:43
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 4 additional findings in Devin Review.

Open in Devin Review

Comment on lines +76 to +81
_ = Task.Run(async () =>
{
// Query for assignees in drafting signup requests, filter out duplicates, and then look up their email
var adminUserIds = onboardingRequestRepository
.Query()
.Where(r => !string.IsNullOrEmpty(r.AssigneeId))
.Select(r => r.AssigneeId)
.Distinct()
.ToList();

await using IConnection conn = await _realtimeService.ConnectAsync(UserId);
var adminEmails = (await conn.GetAndFetchDocsAsync<User>(adminUserIds)).Select(u => u.Data.Email);

// Send email to each admin using EmailService
string userName = await userService.GetUsernameFromUserId(UserId, UserId);
string subject = $"Onboarding request for {projectDoc.ShortName}";
string link = $"{httpRequestAccessor.SiteRoot}/serval-administration/draft-requests/{request.Id}";
string body =
$@"
using IServiceScope scope = _serviceScopeFactory.CreateScope();
await SendOnboardingRequestEmailsAsync(request, submittingUserId, projectDoc, siteRoot, scope);
await SyncReferencedProjectsAsync(projectId, submittingUserId, formData, scope);
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Fire-and-forget Task.Run failure in SendOnboardingRequestEmailsAsync silently prevents SyncReferencedProjectsAsync from running

The Task.Run lambda at lines 76-81 calls SendOnboardingRequestEmailsAsync and then SyncReferencedProjectsAsync sequentially, with no top-level try-catch. In SendOnboardingRequestEmailsAsync, the service resolution at lines 107-112 (GetRequiredService calls) happens outside the method's try-catch block (which starts at line 114). If any service resolution fails, the exception propagates out of the method, preventing SyncReferencedProjectsAsync from executing and causing the exception to be silently lost (the task is discarded with _ = Task.Run(...)). This is a regression from the old code, where the email-sending try-catch (old lines 75-115) was independent of the project-syncing loop — email failures could not prevent project syncing. The same issue exists in SyncReferencedProjectsAsync where service resolution at lines 177-178 is outside the per-iteration try-catch at line 181.

Prompt for agents
The Task.Run lambda at lines 76-81 needs a top-level try-catch to ensure that (1) an exception in SendOnboardingRequestEmailsAsync cannot prevent SyncReferencedProjectsAsync from running, and (2) any unhandled exception is properly reported rather than silently lost.

There are two complementary approaches:

1. Wrap the Task.Run body in a top-level try-catch that calls _exceptionHandler.ReportException. This ensures no exception is ever silently swallowed.

2. Make the two operations independent by calling each in its own try-catch within the Task.Run lambda. This preserves the old behavior where email failures don't prevent project syncing.

Additionally, move the GetRequiredService calls inside the try-catch blocks of SendOnboardingRequestEmailsAsync (lines 107-112) and SyncReferencedProjectsAsync (lines 177-178) so that service resolution failures are caught by the existing error handling rather than escaping the methods.

The fix in the lambda might look like:

_ = Task.Run(async () =>
{
    try
    {
        using IServiceScope scope = _serviceScopeFactory.CreateScope();
        await SendOnboardingRequestEmailsAsync(request, submittingUserId, projectDoc, siteRoot, scope);
        await SyncReferencedProjectsAsync(projectId, submittingUserId, formData, scope);
    }
    catch (Exception ex)
    {
        _exceptionHandler.ReportException(ex);
    }
});

Or better yet, ensure the two operations are independent by catching each separately within the lambda.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@pmachapman pmachapman self-requested a review April 13, 2026 19:15
Copy link
Copy Markdown
Collaborator

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmachapman reviewed 1 file and all commit messages, made 1 comment, resolved 2 discussions, and dismissed @github-advanced-security[bot] from 2 discussions.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on Nateowami).


src/SIL.XForge.Scripture/Controllers/OnboardingRequestRpcController.cs line 81 at r2 (raw file):

                await SendOnboardingRequestEmailsAsync(request, submittingUserId, projectDoc, siteRoot, scope);
                await SyncReferencedProjectsAsync(projectId, submittingUserId, formData, scope);
            });

I wonder if it might be easier and to run this code via a hangfire job, in which case on failure, the error can be caught and reported?

It should also remove the need for the scoped services below. It will also give you the non-blocking behaviour that Task.Run can provide.

I did this for ApplyPreTranslationToProjectAsync and RetrievePreTranslationStatusAsync.

Code quote:

            _ = Task.Run(async () =>
            {
                using IServiceScope scope = _serviceScopeFactory.CreateScope();
                await SendOnboardingRequestEmailsAsync(request, submittingUserId, projectDoc, siteRoot, scope);
                await SyncReferencedProjectsAsync(projectId, submittingUserId, formData, scope);
            });

@pmachapman pmachapman self-assigned this Apr 14, 2026
@Nateowami Nateowami force-pushed the fix/SF-3773-onboarding-request-submission-timeouts branch from ff7bc4a to d38b3b0 Compare April 14, 2026 22:53
devin-ai-integration[bot]

This comment was marked as resolved.

Copy link
Copy Markdown
Collaborator Author

@Nateowami Nateowami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nateowami made 1 comment.
Reviewable status: 0 of 1 files reviewed, 6 unresolved discussions (waiting on pmachapman).


src/SIL.XForge.Scripture/Controllers/OnboardingRequestRpcController.cs line 81 at r2 (raw file):

Previously, pmachapman (Peter Chapman) wrote…

I wonder if it might be easier and to run this code via a hangfire job, in which case on failure, the error can be caught and reported?

It should also remove the need for the scoped services below. It will also give you the non-blocking behaviour that Task.Run can provide.

I did this for ApplyPreTranslationToProjectAsync and RetrievePreTranslationStatusAsync.

I'm not sure how it could be easier. Better maybe, though I'm not really sure how it's better either. I'm not actually sure how to make them hangfire jobs, but I can try to work on it if you think it's better. Or if it's trivial for you, feel free to make the change.

Both methods report errors to Bugsnag, though the try/catch doesn't cover all lines. Maybe I should make them broader?

Copy link
Copy Markdown
Collaborator

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmachapman reviewed 1 file and all commit messages, made 2 comments, resolved 3 discussions, and dismissed @github-advanced-security[bot] from 2 discussions.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on Nateowami).


src/SIL.XForge.Scripture/Controllers/OnboardingRequestRpcController.cs line 81 at r2 (raw file):

Previously, Nateowami wrote…

I'm not sure how it could be easier. Better maybe, though I'm not really sure how it's better either. I'm not actually sure how to make them hangfire jobs, but I can try to work on it if you think it's better. Or if it's trivial for you, feel free to make the change.

Both methods report errors to Bugsnag, though the try/catch doesn't cover all lines. Maybe I should make them broader?

It doesn't look trivial, as I would need to rewrite the controller to use a service instead of containing logic withing the controller actions. Sounds like a job for a later time.

@Nateowami Nateowami force-pushed the fix/SF-3773-onboarding-request-submission-timeouts branch from d38b3b0 to c00e394 Compare April 15, 2026 19:56
Copy link
Copy Markdown
Collaborator Author

@Nateowami Nateowami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nateowami made 1 comment and resolved 1 discussion.
Reviewable status: 0 of 1 files reviewed, 4 unresolved discussions (waiting on pmachapman).

Comment thread src/SIL.XForge.Scripture/Controllers/OnboardingRequestRpcController.cs Dismissed
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 5 additional findings in Devin Review.

Open in Devin Review

Comment on lines +167 to 226
try
{
// Find all the paratext project ids in the sign up request.
List<string> paratextProjectIds =
[
.. new List<string>
{
formData.sourceProjectA,
formData.sourceProjectB,
formData.sourceProjectC,
formData.DraftingSourceProject,
formData.BackTranslationProject,
}
.Where(id => !string.IsNullOrEmpty(id))
.Distinct(),
];

// Connect each Paratext project that isn't already connected by creating resource projects
foreach (string paratextId in paratextProjectIds)
{
// Check if project already exists
SFProject existingProject = _realtimeService
SFProject existingProject = scopedRealtimeService
.QuerySnapshots<SFProject>()
.FirstOrDefault(p => p.ParatextId == paratextId);

if (existingProject is null)
{
// Create the resource/source project and add the user to it
string sourceProjectId = await _projectService.CreateResourceProjectAsync(
UserId,
string sourceProjectId = await scopedProjectService.CreateResourceProjectAsync(
userId,
paratextId,
addUser: true
);

// Sync the newly created project to get its data
await _projectService.SyncAsync(UserId, sourceProjectId);
await scopedProjectService.SyncAsync(userId, sourceProjectId);
}
else if (existingProject.Id == projectId)
{
// Verify that the source project is not the same as the target project
return InvalidParamsError("Source project cannot be the same as the target project");
// Skip syncing if the source project is the same as the target project.
continue;
}
else if (existingProject.Sync.LastSyncSuccessful == false)
{
// If the project exists but last sync failed, retry the sync
await _projectService.SyncAsync(UserId, existingProject.Id);
await scopedProjectService.SyncAsync(userId, existingProject.Id);
}
}

return Ok(request.Id);
}
catch (Exception)
catch (Exception exception)
{
_exceptionHandler.RecordEndpointInfoForException(
scopedExceptionHandler.RecordEndpointInfoForException(
new Dictionary<string, string>
{
{ "method", "SubmitOnboardingRequest" },
{ "method", "SyncReferencedProjectsAsync" },
{ "projectId", projectId },
{ "userId", UserId },
{ "userId", userId },
}
);
throw;
scopedExceptionHandler.ReportException(exception);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Single catch around entire foreach silently skips remaining projects on first sync failure

In SyncReferencedProjectsAsync, the foreach loop iterating over paratextProjectIds is wrapped in a single try-catch at lines 167-226. If CreateResourceProjectAsync or SyncAsync throws an exception for one project (e.g., project A), the catch block at line 215 catches it, reports it to Bugsnag, and returns — all subsequent projects (B, C, etc.) are silently never processed.

This is worse than the old code because previously (OnboardingRequestRpcController.cs:163-164 in old code) the exception would propagate to the outer catch and the user would receive an error response, prompting them to retry. Now, since this runs in a fire-and-forget Task.Run (line 76), the user receives a success response (Ok(request.Id) at line 83) and has no indication that some referenced projects were never synced. The per-iteration try-catch pattern would be more appropriate here to ensure all projects are attempted independently.

Prompt for agents
In SyncReferencedProjectsAsync (lines 156-227 in OnboardingRequestRpcController.cs), the try-catch wraps the entire foreach loop over paratextProjectIds. If any single project sync fails with an exception, all remaining projects are skipped silently. Since this now runs in a fire-and-forget background task (the user already received a success response), each iteration of the foreach should have its own try-catch so that one project failure does not prevent processing of the remaining projects. Move the try-catch to be inside the foreach loop body, keeping the exception reporting logic for each individual project failure.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Copy Markdown
Collaborator

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@pmachapman reviewed 1 file and all commit messages, made 1 comment, resolved 1 discussion, and dismissed @github-advanced-security[bot] from a discussion.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on Nateowami).

@Nateowami Nateowami added ready to test testing complete Testing of PR is complete and should no longer hold up merging of the PR and removed will require testing PR should not be merged until testers confirm testing is complete ready to test labels Apr 16, 2026
@Nateowami Nateowami merged commit 3cf1e22 into master Apr 16, 2026
33 of 34 checks passed
@Nateowami Nateowami deleted the fix/SF-3773-onboarding-request-submission-timeouts branch April 16, 2026 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

testing complete Testing of PR is complete and should no longer hold up merging of the PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants