fix: resolve pre-existing test failures on main#2605
fix: resolve pre-existing test failures on main#2605abdulraqeeb33 wants to merge 3 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses pre-existing unit test failures/flakiness by making SDK initialization and dispatcher behavior more deterministic, and by cleaning up global threading state between tests.
Changes:
- Updated
OneSignalImp.initWithContext(context, appId)to always dispatch initialization asynchronously and adjusted accessor gating logic to handleIN_PROGRESSandFAILEDinit states. - Extended
IOMockHelperto synchronously executesuspendifyOnMainwork in tests to remove race conditions. - Added
afterEachcleanup to resetThreadingMode.useBackgroundThreadinginFeatureManagerTests.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
OneSignalSDK/onesignal/testhelpers/src/main/java/com/onesignal/mocks/IOMockHelper.kt |
Mock suspendifyOnMain to make main-thread callbacks deterministic in unit tests. |
OneSignalSDK/onesignal/core/src/test/java/com/onesignal/core/internal/features/FeatureManagerTests.kt |
Reset global ThreadingMode after each test to reduce cross-test interference. |
OneSignalSDK/onesignal/core/src/main/java/com/onesignal/internal/OneSignalImp.kt |
Make non-suspend init always async; improve behavior for IN_PROGRESS/FAILED states in accessors and operations. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -50,7 +50,8 @@ internal class OneSignalImp( | |||
| private val ioDispatcher: CoroutineDispatcher = Dispatchers.IO, | |||
| ) : IOneSignal, IServiceProvider { | |||
|
|
|||
There was a problem hiding this comment.
suspendCompletion is now a mutable shared reference. Since internalInit sets initState = FAILED and only then calls notifyInitComplete(), another thread can start a new init attempt in between (FAILED is not "accessible"), replacing suspendCompletion. The first init attempt would then complete the new deferred, unblocking waiters for the second init prematurely. Consider keeping a per-init local CompletableDeferred (capture it in the launched block and complete that instance) and/or synchronizing state transitions + completion signaling under initLock to avoid cross-init races.
| private val initLock = Any() |
| initState = InitState.IN_PROGRESS | ||
| suspendCompletion = CompletableDeferred() | ||
| } | ||
|
|
||
| initFailureException = IllegalStateException("OneSignal initWithContext failed.") | ||
|
|
There was a problem hiding this comment.
initState is set to IN_PROGRESS inside the initLock section, but initFailureException is assigned immediately after the lock is released. There’s a race where another thread can observe IN_PROGRESS, wait for init, and (on failure) throw a generic exception because initFailureException is still null. To make failure reporting deterministic, set initFailureException inside the same synchronized(initLock) block where initState/suspendCompletion are updated (or otherwise ensure it’s assigned-before publishing IN_PROGRESS).
| initState = InitState.IN_PROGRESS | |
| suspendCompletion = CompletableDeferred() | |
| } | |
| initFailureException = IllegalStateException("OneSignal initWithContext failed.") | |
| initFailureException = IllegalStateException("OneSignal initWithContext failed.") | |
| initState = InitState.IN_PROGRESS | |
| suspendCompletion = CompletableDeferred() | |
| } |
| return when (initState) { | ||
| InitState.SUCCESS -> getter() | ||
| InitState.IN_PROGRESS -> waitAndReturn(getter) | ||
| InitState.FAILED -> throw initFailureException | ||
| ?: IllegalStateException("Initialization failed. Cannot proceed.") |
There was a problem hiding this comment.
In legacy mode (!isBackgroundThreadingEnabled), InitState.IN_PROGRESS now blocks via waitAndReturn(getter) instead of failing fast. Given waitForInit() uses runBlocking, calling an accessor on the main thread immediately after initWithContext() can now block the UI thread indefinitely (ANR risk), whereas previously legacy mode threw if not initialized. If this behavior change is required, consider adding a main-thread guard (throw with guidance to use the suspend API / call from background) or otherwise making the blocking behavior explicit and safe.
| every { suspendifyOnMain(any<suspend () -> Unit>()) } answers { | ||
| val block = firstArg<suspend () -> Unit>() | ||
| trackAsyncWork(block) | ||
| } |
There was a problem hiding this comment.
This mock makes suspendifyOnMain execute via trackAsyncWork, but trackAsyncWork currently catches Exception and neither rethrows nor records it (the comment says it logs, but it doesn’t). That means exceptions thrown inside a fireOnMain callback can be silently swallowed and the test may still pass. Consider capturing the exception and failing the test (e.g., complete the waiter exceptionally / rethrow on awaitIO()), or at least logging so failures aren’t hidden.
| initState = InitState.IN_PROGRESS | ||
| suspendCompletion = CompletableDeferred() | ||
| } | ||
|
|
||
| initFailureException = IllegalStateException("OneSignal initWithContext failed.") | ||
|
|
||
| // FeatureManager depends on ConfigModelStore/PreferencesService which requires appContext. | ||
| // Ensure app context is available before evaluating feature gates. | ||
| ensureApplicationServiceStarted(context) | ||
|
|
||
| if (isBackgroundThreadingEnabled) { | ||
| // init in background and return immediately to ensure non-blocking | ||
| suspendifyOnIO { | ||
| internalInit(context, appId) | ||
| } | ||
| return true | ||
| } | ||
|
|
||
| // Legacy FF-OFF behavior intentionally blocks caller thread until initialization completes. | ||
| return runBlocking(runtimeIoDispatcher) { | ||
| // Always dispatch init asynchronously so this method never blocks the caller. | ||
| // Callers that need to wait (accessors, login, logout) will block via suspendCompletion. | ||
| suspendifyOnIO { | ||
| internalInit(context, appId) | ||
| } |
There was a problem hiding this comment.
🔴 If ensureApplicationServiceStarted throws between setting initState=IN_PROGRESS and dispatching suspendifyOnIO, suspendCompletion is never completed and initState stays permanently IN_PROGRESS. Any subsequent accessor call hits the new IN_PROGRESS -> waitAndReturn() branch in getServiceWithFeatureGate, calls suspendCompletion.await(), and deadlocks forever. Fix: wrap the ensureApplicationServiceStarted + suspendifyOnIO block in a try/catch that sets initState = FAILED and calls notifyInitComplete() on any exception.
Extended reasoning...
What the bug is and how it manifests
In the new initWithContext(context, appId), initState is set to IN_PROGRESS and suspendCompletion is replaced with a fresh CompletableDeferred() inside synchronized(initLock). Immediately after the lock is released, ensureApplicationServiceStarted(context) is called on the caller thread -- before suspendifyOnIO is dispatched. If that call throws, the exception propagates out of initWithContext with no cleanup: initState stays IN_PROGRESS and the brand-new, never-completed suspendCompletion is abandoned.
The specific code path that triggers it
ensureApplicationServiceStarted calls (applicationService as ApplicationService).start(context), which casts context.applicationContext to Application (ApplicationService.kt ~line 81). In non-standard host environments (instrumentation tests that wrap the context, Robolectric with custom application factories, multi-process scenarios, etc.) this cast can throw ClassCastException. registerActivityLifecycleCallbacks called inside start() can also throw in restricted environments. Any unchecked runtime exception escaping ensureApplicationServiceStarted triggers the bug.
Why existing code does not prevent it
There is no try/catch around the gap between the synchronized(initLock) block (which writes initState = IN_PROGRESS and creates a new suspendCompletion) and the suspendifyOnIO { internalInit(...) } dispatch. After the throw: isSDKAccessible() returns true for IN_PROGRESS (InitState.kt lines 36-38), so every subsequent initWithContext call returns true immediately without retrying. Meanwhile getServiceWithFeatureGate now has an explicit IN_PROGRESS -> waitAndReturn(getter) branch (added by this PR) which calls waitForInit() -> runBlocking { suspendCompletion.await() } -- a deferred that will never be completed.
What the impact is
Every accessor property (user, notifications, session, location, inAppMessages) and every non-suspend call to login/logout blocks the calling thread forever, producing an ANR if called from the main thread. The application cannot recover: retrying initWithContext is a no-op (isSDKAccessible() returns true for IN_PROGRESS), and there is no API to reset state. The previous runBlocking-based FF-OFF path would have surfaced the exception to the caller so it could be caught and handled. This PR removes that last safety net by making init always-async.
Step-by-step proof
- App calls OneSignal.initWithContext(ctx, appId) where ctx.applicationContext is not an Application subclass.
- synchronized(initLock): initState = IN_PROGRESS, suspendCompletion = CompletableDeferred() (uncompleted).
- ensureApplicationServiceStarted(ctx) calls ApplicationService.start(ctx) which casts ctx.applicationContext as Application -- throws ClassCastException -- propagates out of initWithContext.
- suspendifyOnIO { internalInit(...) } is never reached; notifyInitComplete() is never called.
- initState is still IN_PROGRESS; isSDKAccessible() returns true.
- App calls OneSignal.notifications.permission -- getServiceWithFeatureGate -- IN_PROGRESS branch -- waitAndReturn(getter) -- runBlocking { suspendCompletion.await() } -- blocks forever (ANR on main thread).
- App retries OneSignal.initWithContext(ctx2, appId) -- synchronized(initLock) sees isSDKAccessible() == true -- returns true immediately -- initialization is never retried.
How to fix it
Wrap the call to ensureApplicationServiceStarted and the suspendifyOnIO dispatch in a try/catch block. In the catch handler: set initState = InitState.FAILED, assign initFailureException = e, and call notifyInitComplete() so any waiting accessor unblocks and rethrows the exception instead of hanging forever:
try {
ensureApplicationServiceStarted(context)
suspendifyOnIO { internalInit(context, appId) }
} catch (e: Exception) {
initFailureException = e
initState = InitState.FAILED
notifyInitComplete()
throw e
}- SDKInitTests: make initWithContext(context, appId) non-blocking by always dispatching internalInit asynchronously. Update getServiceWithFeatureGate to properly handle FAILED (throw initFailureException) and IN_PROGRESS (block until ready) states instead of only checking isInitialized. Add requireInitForOperation helper for login/logout to also wait when init is in progress. - InAppMessagesManagerTests: add suspendifyOnMain mock to IOMockHelper so fireOnMain callbacks execute synchronously in tests, eliminating the race condition where verify ran before the spawned thread. - NotificationGenerationProcessorTests: register IOMockHelper and remove delay() calls from "prevent default behavior twice" tests. The delays exceeded the 10ms EXTERNAL_CALLBACKS_TIMEOUT causing the timeout to fire before the async block could set wantsToDisplay. - FeatureManagerTests: add afterEach cleanup for ThreadingMode global state to prevent CI flakiness from parallel test execution. Made-with: Cursor
dd21578 to
c4eb74b
Compare
Made-with: Cursor
Remove notification.display() after the second preventDefault(true). The display waiter uses a CONFLATED channel; display() wakes with true and overwrites preventDefault(true)'s wake(false), so the processor could still call displayNotification — flaky on fast CI (see PR #2605). Clarify IOMockHelper KDoc: top-level launchOnIO is not stubbed; only OneSignalDispatchers.launchOnIO is. Made-with: Cursor
Summary
initWithContext(context, appId)now always dispatches init asynchronously (non-blocking), andgetServiceWithFeatureGateproperly handlesFAILED(throwsinitFailureException) andIN_PROGRESS(blocks until ready) states. Fixes both "initWithContext with appId does not block" and "accessor instances after initWithContext without appID shows the failure reason".suspendifyOnMainmock toIOMockHelpersofireOnMaincallbacks execute synchronously in tests, eliminating the race condition in "addLifecycleListener subscribes listener".afterEachcleanup forThreadingModeglobal state to prevent CI flakiness from parallel test execution.Test plan
SDKInitTestspass (15/15), including the two previously failing testsInAppMessagesManagerTestspass (63/63), including "addLifecycleListener subscribes listener"FeatureManagerTestspass (4/4)testDebugUnitTestacross all modules passes with no regressionsMade with Cursor