Conversation
4509a8f to
457b745
Compare
113e96b to
00d3799
Compare
|
@claude re-review |
|
Claude finished @nan-li's task in 3m 55s —— View job Re-review: feat/identity_verification_5.8
All four critical issues flagged in the previous review have been addressed. Here's a status breakdown: Previously Critical — Now Fixed ✅1. Runtime 401 not firing Fixed correctly. _operationRepo.setJwtInvalidatedHandler { externalId ->
_userManager.fireJwtInvalidated(externalId)
}And in the 2. Logout race condition — Fixed correctly. 3. Fixed correctly. The JSON parsing is wrapped in 4. Fixed correctly. The field now defaults to useIdentityVerification = responseJson.safeBool("jwt_required") ?: falseNon-IV apps get Remaining Observations (non-blocking)A. private var _jwtInvalidatedHandler: ((String) -> Unit)? = nullThis field is written from B. Expired JWT at logout time blocks the unsubscribe operation ( When IV=true and a 401 has already invalidated the JWT before the user calls This is an edge case (JWT expired + user logs out before providing a refresh) and the overall security posture of the IV feature prevents worse outcomes, but it may be worth documenting or handling explicitly — for example, by skipping the JWT gate for the unsubscribe-on-logout operation. Not blocking the PR. The implementation is solid. All critical issues are resolved and the approach is well-structured. The |
|
@claude re-review |
|
Claude finished @nan-li's task in 2m 39s —— View job Re-review: feat/identity_verification_5.8
Both remaining observations from the previous re-review have been addressed. The PR is clean. Previously Non-blocking — Now Fixed ✅A. Fixed in commit @Volatile
private var _jwtInvalidatedHandler: ((String) -> Unit)? = nullB. Expired JWT at logout blocks the push subscription disable ( Fixed in commit // Operation.kt:69
open val requiresJwt: Boolean get() = true
// UpdateSubscriptionOperation.kt:89
override val requiresJwt: Boolean get() = false
Minor Observation (no action needed)The The implementation is solid. All six issues identified across both review passes are resolved. The architecture is clean: |
972eace to
f6e4227
Compare
| <CurrentIssues> | ||
| <ID>ComplexCondition:InAppMessagesManager.kt$InAppMessagesManager$!message.isTriggerChanged && isMessageDisplayed && (isTriggerOnMessage || isNewTriggerAdded && isOnlyDynamicTriggers)</ID> | ||
| <ID>ComplexMethod:TriggerController.kt$TriggerController$private fun evaluateTrigger(trigger: Trigger): Boolean</ID> | ||
| <ID>ComplexMethod:InAppMessagesManager.kt$InAppMessagesManager$private suspend fun fetchMessages(rywData: RywData)</ID> |
There was a problem hiding this comment.
we should break it down if its a complex method.
There was a problem hiding this comment.
Yes, I undid the detekt changes and then added them back in the last few commits with reasoning. I actually didn't end up making changes to in-app-messages in detekt.
| * @param externalId The external ID of the user whose token is being updated. | ||
| * @param token The new JWT bearer token. | ||
| */ | ||
| fun updateUserJwt( |
There was a problem hiding this comment.
like i mentioned elsewhere, we need suspend methods for these
ca64cdf to
258c8e6
Compare
abdulraqeeb33
left a comment
There was a problem hiding this comment.
Lets try and do a bug bash on this today/tomorrow?
|
@claude review |
b445a01 to
17fc13b
Compare
There was a problem hiding this comment.
The two bugs I flagged in the last pass (IAM rate-limit poisoning and JWT exposure in login() logs) have been fixed in the latest commit — thanks for addressing them. Given the security-sensitive nature of this feature (JWT-based identity gating across 86 files), a final human look before merge is warranted.
Extended reasoning...
Overview
This PR adds Identity Verification (JWT-based authentication gating) to the Android SDK, spanning 86 files including new public API, a new JwtTokenStore, OperationRepo changes, IAM fetch gating, and HttpClient JWT header injection.
Prior Review Rounds
- Rounds 1–2 (late March): 6 bugs found and fixed.
- Round 3 (2026-04-14): 4 new bugs filed as inline comments. The latest commit (82678a5 "Address Claude review bot feedback on IV PR") addresses all of them:
- ✅ IAM rate-limit poisoning: Guard now checks both and , returning before is set.
- ✅ JWT logging in / : Both now use .
- ✅ FAIL_UNAUTHORIZED loop when IV=OFF: Gated on .
- ✅ JWT in HttpClient logs: Authorization header filtered from logged request properties.
- 403/401 conflation and stale ops at logout explicitly deferred by author (out of scope / by design).
Security Risks
The PR touches auth-critical paths: JWT storage/retrieval, JWT injection into HTTP headers, FAIL_UNAUTHORIZED error routing, and the public-facing / APIs. No new vulnerabilities found in this pass, but the breadth of the change means a human reviewer should give final sign-off.
Level of Scrutiny
This is production-facing security code that gates API calls behind developer-issued JWTs. The implementation looks correct across the key paths reviewed, but the sheer scope (86 files, new public API surface, threading concerns) means automated review alone is insufficient.
Other Factors
- The LoginHelper pre-HYDRATE logic is sound: the null case falls through to IV=OFF behavior, and IdentityVerificationService cleans up on HYDRATE.
- intentionally omits UNAUTHORIZED handling (requiresJwt=false by design; JWT is passed opportunistically).
- Test infrastructure () correctly gates test suites from the IV hold path.
16f9650 to
fbc53f2
Compare
…ication nullable, OptionalHeaders.jwt Foundational models and infrastructure for identity verification: - Create JwtTokenStore: persistent Map<externalId, JWT> backed by SharedPreferences, supporting multi-user JWT storage with getJwt/putJwt/invalidateJwt/pruneToExternalIds - Add var externalId to Operation base class so OperationRepo can stamp and gate operations per-user; remove redundant externalId from LoginUserOperation and TrackCustomEventOperation (same Model data-map key, no migration needed) - Change ConfigModel.useIdentityVerification from Boolean to Boolean? (null = unknown, false = off, true = on) to eliminate race between operation processing and remote params - Add jwt field to OptionalHeaders for passing Bearer tokens through HTTP layer - Add PREFS_OS_JWT_TOKENS key to PreferenceOneSignalKeys Made-with: Cursor
When the server omits jwt_required from the params response, safeBool() returned null, leaving useIdentityVerification unset. getNextOps() treated null as "not yet known" and returned null permanently, silently blocking every SDK operation for the session. Default to false when the field is absent so non-IV apps are unaffected by the identity verification gating logic. Made-with: Cursor
The field is written from the main thread in IdentityVerificationService.start() and read on the OperationRepo coroutine thread. @volatile guarantees cross-thread visibility. Made-with: Cursor
When IV is enabled and the JWT has already been invalidated (e.g. by a prior 401), the UpdateSubscriptionOperation to disable the push subscription would be permanently blocked by hasValidJwtIfRequired. Since the backend call would fail with 401 anyway, skip it entirely and just switch to the new anonymous user. Made-with: Cursor Revert "Skip push subscription disable on logout when JWT is already expired" This reverts commit 5ce284257b6e03b8c862745ff58fcf4293299577. Exempt UpdateSubscriptionOperation from JWT gating The subscription update endpoint does not require a JWT on the backend. Add Operation.requiresJwt (default true) and override it to false in UpdateSubscriptionOperation so these operations are not blocked by hasValidJwtIfRequired when the JWT is expired or missing. This fixes the edge case where logging out with an expired JWT would permanently block the push subscription disable operation. Made-with: Cursor
Made-with: Cursor
…ationRepo Pass externalId alongside onesignalId in every operation constructor, eliminating the fragile stampExternalId workaround and parentExternalId propagation from OperationRepo. Every creation site (model store listeners, session listener, executors, etc.) now explicitly provides the externalId at construction time. Made-with: Cursor
Add suspend variant of updateUserJwt for API consistency with login/loginSuspend. Also add waitForInit to the non-suspend updateUserJwt so it safely handles calls before SDK initialization.
- Baseline 2 new UseCheckOrError entries for OneSignalImp.kt for updateUserJwt/updateUserJwtSuspend init guards that follow the existing login/logout pattern. - Baseline ReturnCount for hasValidJwtIfRequired — the guard-clause style is clearer than collapsing into a single boolean expression for this JWT gating logic. - Narrow JwtTokenStore catch from Exception to JSONException to satisfy detekt's TooGenericExceptionCaught rule, no baseline added.
…r params All new _prefix constructor parameters (e.g. _jwtTokenStore, _identityModelStore, _configModelStore) follow the existing codebase convention — 80+ identical entries already exist in the baseline. These are DI-injected dependencies across IdentityVerificationService, OperationRepo, JwtTokenStore, and various executors/listeners.
- Update 4 existing baseline entries (added jwt/_jwtTokenStore to signatures) - Baseline 4 new LongParameterList entries for IV constructors: CreateSubscriptionOperation, UpdateSubscriptionOperation, RefreshUserOperationExecutor, and UpdateUserOperationExecutor crossed the constructor threshold (8) after adding externalId and _jwtTokenStore params. This follows existing patterns.
OperationRepo: on FAIL_UNAUTHORIZED, invalidate JWT and re-queue before notifying the app; invoke setJwtInvalidatedHandler synchronously with try/catch so listener failures do not hit executeOperations outer catch. UserManager.fireJwtInvalidated performs the single async hop (SupervisorJob + Dispatchers.Default) with Otel-style launch try/catch and per-listener isolation. IdentityVerificationService: call forceExecuteOperations before fireJwtInvalidated on HYDRATE. Documentation: IOperationRepo handler runs on the op-repo thread and must return quickly; public API documents background-thread delivery for IUserJwtInvalidatedListener, addUserJwtInvalidatedListener, and UserJwtInvalidatedEvent (LiveData postValue guidance). Tests: FAIL_UNAUTHORIZED JWT handler coverage; no artificial delay after sync repo handler.
…2600) Co-authored-by: AR Abdul Azeez <abdul@onesignal.com>
…s post-enqueue OperationRepo: gate FAIL_UNAUTHORIZED re-queue on useIdentityVerification == true. When IV is OFF, hasValidJwtIfRequired() always returns true so re-queued ops were immediately eligible, creating a ~200ms infinite retry loop. Now IV-OFF treats FAIL_UNAUTHORIZED as FAIL_NORETRY (drop + wake waiters). OperationRepo: in removeOperationsWithoutExternalId(), clear local existingOnesignalId on queued LoginUserOperations. When IV=ON arrives via HYDRATE, anonymous CreateUserOperations are purged, orphaning the local ID that LoginUserOperation.canStartExecute was waiting on translateIds to resolve. Clearing it unblocks the operation and routes the executor through createUser(). LoginUserOperation: widen existingOnesignalId setter to internal. Fix Operation.externalId KDoc to reflect that subclass constructors set this field, not IOperationRepo at enqueue time. Made-with: Cursor
LogoutHelper: add explicit null branch for unknown IV state (pre-HYDRATE). Disables push subscription and suppresses backend op (like IV=ON) while also enqueuing a LoginUserOperation so the anonymous user is created on the backend if IV turns out to be OFF. If IV=ON, removeOperationsWithoutExternalId() purges the anonymous op at HYDRATE. Known IV=ON and IV=OFF paths are unchanged. HttpClient: filter Authorization header from requestProperties before passing to logHTTPSent() so the full Bearer token is not written to debug logs. OneSignalImp: redact token in updateUserJwt/updateUserJwtSuspend debug logs. Made-with: Cursor
- Redact JWT from login/loginSuspend logs (show last 8 chars) - Fix error message in updateUserJwtSuspend - Add FAIL_UNAUTHORIZED to deleteSubscription and CustomEvent executors - Drop TransferSubscriptionOperation when IV=ON - Skip IAM fetch when JWT is invalidated (prevent rate-limit poisoning) - Clear existingOnesignalId unconditionally in removeOperationsWithoutExternalId
- MagicNumber(8): log suffix length in takeLast(8) — extracting a named constant for a single-use log format arg adds noise without improving readability - ReturnCount(transferSubscription): early-exit guard follows the same pattern as LoginUserFromSubscriptionOperationExecutor; restructuring to reduce returns would obscure the guard-clause intent - NestedBlockDepth(executeOperations): pre-existing from IV branch, the when-block nesting is inherent to the dispatch logic - UseCheckOrError: pre-existing non-suspend login/logout/accessor throws — consistent with existing codebase pattern - Removed stale entry for renamed error message
fbc53f2 to
fb2353f
Compare
- LogoutHelper: remove suppressBackendOperation=true in pre-HYDRATE branch so CreateSubscriptionOperation is enqueued alongside the anonymous LoginUserOperation (fixes IV=OFF path) - InAppMessagesManager: capture JWT once before guard check to eliminate TOCTOU between guard and backend call - nit: LoginUserOperationExecutor: add error log when anonymous LoginUserOperation is dropped with no subscription op
- Move anonymous operation suppression from individual model store listeners into OperationRepo.enqueue/enqueueAndWait so all enqueue sites (SessionListener, TrackGooglePurchase, CustomEventController, etc.) are covered. LoginUserOperation is exempt. - Remove redundant shouldSuppressForAnonymousUser() from IdentityModelStoreListener, PropertiesModelStoreListener, and SubscriptionModelStoreListener. - Add LogoutHelper tests for IV=true and IV=null (pre-HYDRATE) branches.
c5086c3 to
3f3dd09
Compare
| operationRepo.forceExecuteOperations() | ||
| } else { | ||
| if (!isInitialized) { | ||
| throw IllegalStateException("Must call 'initWithContext' before 'updateUserJwt'") | ||
| } | ||
| jwtTokenStore.putJwt(externalId, token) | ||
| operationRepo.forceExecuteOperations() | ||
| } | ||
| } | ||
|
|
||
| override suspend fun updateUserJwtSuspend( | ||
| externalId: String, | ||
| token: String, | ||
| ) = withContext(runtimeIoDispatcher) { | ||
| Logging.log(LogLevel.DEBUG, "updateUserJwtSuspend(externalId: $externalId, token: ...${token.takeLast(8)})") | ||
|
|
||
| suspendUntilInit(operationName = "updateUserJwt") | ||
|
|
||
| if (!isInitialized) { | ||
| throw IllegalStateException("Must call 'initWithContext' before 'updateUserJwt'") | ||
| } | ||
|
|
||
| jwtTokenStore.putJwt(externalId, token) | ||
| operationRepo.forceExecuteOperations() |
There was a problem hiding this comment.
🔴 Two issues in updateUserJwt()/updateUserJwtSuspend(): (1) forceExecuteOperations() wakes the operation queue with force=false, so after a 401-triggered re-queue the retry is delayed by the full ~5-second opRepoExecutionInterval batching window instead of executing immediately; (2) neither method signals the IAM subsystem to re-fetch in-app messages, leaving any IAM fetch that was skipped due to a missing JWT permanently stuck until the next session-start or push-subscription event.
Extended reasoning...
Bug 1 — forceExecuteOperations() uses force=false, causing a ~5-second delay after JWT renewal
When the operation queue receives a FAIL_UNAUTHORIZED result it re-enqueues the failing operation and sets the JWT to null. Control returns to processQueueForever(), which calls waitForNewOperationAndExecutionInterval() and enters the idle-wait state. When the app calls updateUserJwt() or updateUserJwtSuspend(), the implementation calls operationRepo.forceExecuteOperations() (OperationRepo.kt line 218). However forceExecuteOperations() wakes the main waiter with LoopWaiterMessage(false) — force=false. Inside waitForNewOperationAndExecutionInterval() (lines 237–253), the while (\!wakeMessage.force) loop stays true and the function waits up to opRepoExecutionInterval (default 5 000 ms) before breaking out to execute. The retryWaiter.wake(LoopWaiterMessage(true)) call also in forceExecuteOperations() only shortens the delayBeforeNextExecution() retry-backoff wait, which is not the active state here.
Concrete proof: App calls login("alice", jwt1) → LoginUserOperation enqueued → backend returns HTTP 401 → executor sets jwtTokenStore.putJwt("alice", null) and returns FAIL_UNAUTHORIZED → op re-queued → processQueueForever loop calls waitForNewOperationAndExecutionInterval(). Now app calls updateUserJwt("alice", jwt2). forceExecuteOperations() fires waiter.wake(LoopWaiterMessage(false)). waitForNewOperationAndExecutionInterval receives the wake but wakeMessage.force == false, so it continues waiting for up to opRepoExecutionInterval (5 000 ms). The retry only fires after that full delay.
Fix: Change the waiter.wake call inside forceExecuteOperations() to LoopWaiterMessage(true), or add a dedicated immediateExecuteOperations() path that passes force=true to the main waiter.
Bug 2 — JWT renewal does not trigger an IAM re-fetch
InAppMessagesManager.fetchMessagesWhenConditionIsMet() is called from exactly four places: initial appId configuration, ConfigModel HYDRATE, push-subscription ID change, and explicit session-start. The new JWT guard added in this PR (InAppMessagesManager.kt lines 310–313) correctly avoids fetching IAMs when JWT is null, but there is no corresponding call to re-fetch once a valid JWT becomes available via updateUserJwt().
Concrete proof: App is configured with IV required. SDK starts, JWT is missing → fetchMessagesWhenConditionIsMet() short-circuits because jwtTokenStore.getJwt(externalId) == null. App later calls updateUserJwt("alice", validJwt). OneSignalImp.updateUserJwt() (lines 424–444) calls jwtTokenStore.putJwt() then operationRepo.forceExecuteOperations() — nothing else. InAppMessagesManager has no listener on JwtTokenStore and is never signalled. Result: in-app messages are never fetched for this session unless a push-subscription ID change or new session start coincidentally fires.
Fix: Call inAppMessagesManager.fetchMessagesWhenConditionIsMet() inside both updateUserJwt() and updateUserJwtSuspend() after putJwt(), or adopt the IJwtUpdateListener pattern already used in the companion PR #2613 and have InAppMessagesManager subscribe to JWT-renewal events.
| lastTimeFetchedIAMs = now | ||
| } | ||
|
|
||
| val (aliasLabel, aliasValue) = | ||
| IdentityConstants.resolveAlias( | ||
| _configModelStore.model.useIdentityVerification, | ||
| externalId, | ||
| _identityModelStore.model.onesignalId, | ||
| ) | ||
| // lambda so that it is updated on each potential retry | ||
| val sessionDurationProvider = { _time.currentTimeMillis - _sessionService.startTime } | ||
| val newMessages = _backend.listInAppMessages(appId, subscriptionId, rywData, sessionDurationProvider) | ||
| val newMessages = _backend.listInAppMessages(appId, aliasLabel, aliasValue, subscriptionId, rywData, sessionDurationProvider, jwt) | ||
|
|
||
| if (newMessages != null) { | ||
| this.messages = newMessages as MutableList<InAppMessage> |
There was a problem hiding this comment.
🔴 In fetchMessages(), externalId and jwt are captured before fetchIAMMutex.withLock as a TOCTOU guard, but _identityModelStore.model.onesignalId is read live after the mutex suspension point at line 332. If a user switch occurs while suspended on the mutex, the IV=true path will fetch IAM messages for the old user (using stale externalId+jwt) while the new user is active. The fix is to snapshot onesignalId alongside externalId and jwt before the mutex, so the entire identity triple is consistent.
Extended reasoning...
The inconsistency
The PR carefully snapshots two identity fields before fetchIAMMutex.withLock to avoid TOCTOU issues — externalId at line 304 and jwt at line 307 (the comment on line 305 even explains the rationale). However, a third identity field is still read live: _identityModelStore.model.onesignalId is accessed inside resolveAlias() at line 332, which runs after the coroutine suspends on the mutex.
Why a user switch can happen during the mutex wait
fetchIAMMutex is a coroutine Mutex, not a blocking lock. When fetchIAMMutex.withLock { ... } cannot immediately acquire the lock, the coroutine suspends, yielding the thread. During that suspension window other coroutines can run — including a user login/logout that replaces _identityModelStore.model with a new user's identity model. When fetchMessages() resumes after the mutex, externalId and jwt are stale (user A), but _identityModelStore.model.onesignalId is live (user B).
Impact per IV mode
IV=true path: resolveAlias(true, externalId, onesignalId) returns ("external_id", externalId) — onesignalId is ignored. The backend call uses user A's externalId as the alias and user A's jwt for authentication. This fetches and applies user A's in-app messages while user B is the active user — a cross-user data leak.
IV=false path: resolveAlias(false, externalId, onesignalId) returns ("onesignal_id", onesignalId) — externalId is ignored. Here the live read of onesignalId is actually correct: it reflects the current user B, so the fetch targets the right user. This path is benign.
Addressing the refutation
A refutation argued that resolveAlias uses XOR logic, so externalId and onesignalId are never both used in the same URL, meaning there's no "mixed" snapshot. This is correct but beside the point. The problem is not that two fields are mixed for the same URL — it is that the entire fetch (URL + JWT) belongs to user A while user B is active. The XOR logic means we consistently use user A's identity for the request; the inconsistency is between which user is fetched vs. which user is displayed.
Concrete proof
- User A (externalId="alice") is logged in;
fetchMessages()is called. - Line 304:
externalId = "alice". Line 307:jwt = alice_jwt. IV guards pass. fetchIAMMutex.withLockcannot acquire lock — coroutine suspends.- Concurrent coroutine: user switches to user B (externalId="bob");
_identityModelStore.modelis replaced. - Mutex is released;
fetchMessages()resumes. - Line 332:
resolveAlias(true, "alice", <bob's onesignalId>)→ returns("external_id", "alice"). _backend.listInAppMessages(..., "external_id", "alice", ..., alice_jwt)is called — fetching alice's IAMs, applying them to bob's session.
Suggested fix
Snapshot onesignalId alongside the other two identity fields, before the mutex:
val externalId = _identityModelStore.model.externalId
val onesignalId = _identityModelStore.model.onesignalId // snapshot here for consistent identity triple
val jwt = externalId?.let { _jwtTokenStore.getJwt(it) }Then pass the captured onesignalId variable to resolveAlias() instead of the live model read.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| externalId: String, | ||
| jwtBearerToken: String?, | ||
| ) { | ||
| Logging.log(LogLevel.DEBUG, "Calling deprecated login(externalId: $externalId, jwtBearerToken: $jwtBearerToken)") | ||
| Logging.log(LogLevel.DEBUG, "Calling deprecated login(externalId: $externalId, jwtBearerToken: ...${jwtBearerToken?.takeLast(8)})") | ||
|
|
||
| if (isBackgroundThreadingEnabled) { | ||
| waitForInit(operationName = "login") | ||
| suspendifyOnIO { loginHelper.login(externalId, jwtBearerToken) } | ||
| } else { | ||
| if (!isInitialized) { | ||
| throw IllegalStateException("Must call 'initWithContext' before 'login'") | ||
| } | ||
| } | ||
|
|
||
| val context = loginHelper.switchUser(externalId, jwtBearerToken) ?: return | ||
|
|
||
| if (isBackgroundThreadingEnabled) { | ||
| suspendifyOnIO { loginHelper.enqueueLogin(context) } | ||
| } else { | ||
| Thread { | ||
| runBlocking(runtimeIoDispatcher) { | ||
| loginHelper.login(externalId, jwtBearerToken) | ||
| loginHelper.enqueueLogin(context) | ||
| } | ||
| }.start() | ||
| } |
There was a problem hiding this comment.
🔴 The refactored non-suspend login() now calls loginHelper.switchUser() synchronously on the caller's thread before spawning the background Thread — exposing jwtTokenStore.putJwt() and userSwitcher.createAndSwitchToNewUser() (both SharedPreferences writes) to the main/UI thread for apps with isBackgroundThreadingEnabled=false (Java, Unity, React Native). The fix is to move switchUser() inside the Thread { runBlocking(runtimeIoDispatcher) { ... } } block alongside enqueueLogin(), restoring the pre-PR behaviour where the entire login operation ran off the calling thread.
Extended reasoning...
What the bug is and how it manifests
Prior to this PR the non-suspend login(externalId, jwtBearerToken) override wrapped its entire body (then a single call to loginHelper.login()) inside Thread { runBlocking(runtimeIoDispatcher) { ... } }.start() when isBackgroundThreadingEnabled was false. This PR split that single helper call into two steps — switchUser() and enqueueLogin() — but only moved enqueueLogin() into the background Thread. switchUser() is now called synchronously at line 393, before the Thread is even constructed, meaning it executes on whatever thread called login().
The specific code path that triggers it
OneSignalImp.login(externalId, jwtBearerToken) // caller's thread (UI thread in typical apps)
└─ loginHelper.switchUser(externalId, jwtBearerToken) // ← still on caller's thread
├─ jwtTokenStore.putJwt(externalId, jwtBearerToken)
│ └─ _prefs.saveString(...) // SharedPreferences commit — DISK WRITE
└─ userSwitcher.createAndSwitchToNewUser(...)
├─ identityModelStore.replace(...) // SharedPreferences commit — DISK WRITE
├─ propertiesModelStore.replace(...) // SharedPreferences commit — DISK WRITE
└─ subscriptionModelStore.replace(...) // SharedPreferences commit — DISK WRITE
Thread { runBlocking(runtimeIoDispatcher) {
loginHelper.enqueueLogin(context) // correctly off-thread
}}.start()
Why existing code doesn't prevent it
The isBackgroundThreadingEnabled = false branch intentionally skips waitForInit() (which suspends on an IO dispatcher) and jumps straight to line 393. There is no coroutine context switch between the return of the if/else block and the switchUser() call, so no dispatcher redirect occurs. StrictMode will fire a DiskWriteViolation and potentially a DiskReadViolation (SharedPreferences lazy-loads on first access) on every call to this overload from the UI thread.
Impact
Any Java, Unity, or React Native consumer that calls the deprecated synchronous login() from the UI thread — the common pattern shown in OneSignal's own migration guides — will now trigger StrictMode violations that were absent before this PR. Under SharedPreferences contention (e.g., another thread flushing a large prefs file) the commit can block for tens of milliseconds, contributing to jank or, in edge cases, an ANR. This is a regression introduced solely by this PR's refactoring.
How to fix it
Move switchUser() inside the background Thread block so both steps run off the calling thread, matching the original pre-PR behaviour:
} else {
Thread {
runBlocking(runtimeIoDispatcher) {
val context = loginHelper.switchUser(externalId, jwtBearerToken) ?: return@runBlocking
loginHelper.enqueueLogin(context)
}
}.start()
}Step-by-step proof
- A React Native app calls
OneSignal.login("user42")on the JS/UI thread after SDK init. isBackgroundThreadingEnabledisfalse(default for non-Kotlin hosts).- Execution reaches line 393:
loginHelper.switchUser("user42", null)— still on UI thread. - Inside
switchUser:jwtTokenStore.putJwt("user42", null)callsSharedPreferences.edit().putString(...).commit()— synchronous disk write on UI thread → StrictMode DiskWriteViolation. userSwitcher.createAndSwitchToNewUser()performs three more SharedPreferences commits (identity, properties, subscriptions) — three additional violations.- Only after all four disk writes complete does the code reach the
Thread { ... }.start()line. - In the old code, step 4–6 were entirely inside the spawned
Thread, so the UI thread returned immediately after.start().
| verify(atLeast = 1) { mockUserSwitcher.createAndSwitchToNewUser() } | ||
| verify(atLeast = 1) { mockOperationRepo.enqueue(any()) } | ||
| } |
There was a problem hiding this comment.
🟡 The thread-safety test 'logout is thread-safe with synchronized block' uses verify(atLeast = 1) with a relaxed UserSwitcher mock that never updates the identity model, so all 10 concurrent threads pass the early-exit guard and execute the full logout path — meaning the test never actually verifies the single-switch invariant that synchronized(lock) is meant to guarantee. Fix by configuring the mock to set externalId = null after the first createAndSwitchToNewUser() call and asserting exactly = 1.
Extended reasoning...
What the bug is and how it manifests
The test spawns 10 threads all calling logoutHelper.logout(). The intent is to verify that only the first thread executes the full logout path; subsequent threads should hit the early-exit guard (if (identityModelStore.model.externalId == null) return) and do nothing. However, because mockUserSwitcher is constructed as mockk<UserSwitcher>(relaxed = true), the call to createAndSwitchToNewUser() is a no-op — it never touches identityModelStore. As a result, identityModelStore.model.externalId remains "current-user" (non-null) for every thread from start to finish.
The specific code path that triggers it
LogoutHelper.logout() checks identityModelStore.model.externalId == null as its early-exit guard. In production, the real UserSwitcher.createAndSwitchToNewUser() switches the active user and clears the external ID, so only the first thread to acquire the lock proceeds; the rest exit early. In the test, the relaxed mock leaves the external ID untouched, so every one of the 10 threads independently acquires the lock, passes the guard, calls createAndSwitchToNewUser(), and enqueues a LoginUserOperation.
Why existing code doesn't prevent it
The assertions on lines 178–179 use verify(atLeast = 1):
verify(atLeast = 1) { mockUserSwitcher.createAndSwitchToNewUser() }
verify(atLeast = 1) { mockOperationRepo.enqueue(any()) }This trivially passes whether the methods were called 1 time or 10 times. There is no upper-bound check, so the over-invocation (10 actual calls vs. 1 expected) is invisible to the test framework.
Impact
The test provides false confidence that the synchronized(lock) block enforces the single-switch invariant. A future refactor that accidentally removes the early-exit guard would still pass this test, because verify(atLeast = 1) is satisfied by any non-zero call count. The bug is in test code only; production behavior is unaffected.
How to fix it
Configure the mock to simulate the real side-effect: after createAndSwitchToNewUser() is called, clear the external ID so subsequent threads exit early.
every { mockUserSwitcher.createAndSwitchToNewUser() } answers {
mockIdentityModelStore.model.externalId = null
}Then tighten the assertions to exactly = 1:
verify(exactly = 1) { mockUserSwitcher.createAndSwitchToNewUser() }
verify(exactly = 1) { mockOperationRepo.enqueue(any()) }Step-by-step proof
- Test sets up
mockIdentityModelStorewithexternalId = "current-user". - 10 threads are started; each calls
logoutHelper.logout(). - Thread 1 acquires the lock, reads
externalId = "current-user"(non-null), proceeds past the guard. - Thread 1 calls
mockUserSwitcher.createAndSwitchToNewUser()— relaxed mock, no-op,externalIdstill"current-user". - Thread 1 enqueues a
LoginUserOperation, releases the lock. - Thread 2 acquires the lock, reads
externalId = "current-user"(still non-null), proceeds past the guard — executes the full logout path again. - Steps 5–6 repeat for all 10 threads.
verify(atLeast = 1)passes becausecreateAndSwitchToNewUserandenqueuewere each called 10 times.
| */ | ||
| var useIdentityVerification: Boolean | ||
| get() = getBooleanProperty(::useIdentityVerification.name) { false } | ||
| var useIdentityVerification: Boolean? |
There was a problem hiding this comment.
should we consider maintaining an Enum state here as opposed to a boolean that signifies three states?
| con.setRequestProperty("OneSignal-Session-Duration", headers.sessionDuration.toString()) | ||
| } | ||
|
|
||
| if (headers?.jwt != null) { |
There was a problem hiding this comment.
can we try using the kotlin version of null check here?
|
|
||
| val ops = getNextOps(executeBucket) | ||
| Logging.debug("processQueueForever:ops:\n$ops") | ||
| val queueSnapshot = synchronized(queue) { queue.toList() } |
There was a problem hiding this comment.
Can you please elaborate on what we are doing here?
| } | ||
| ExecutionResult.FAIL_UNAUTHORIZED, // TODO: Need to provide callback for app to reset JWT. For now, fail with no retry. | ||
| ExecutionResult.FAIL_UNAUTHORIZED -> { | ||
| val identityVerificationEnabled = _configModelStore.model.useIdentityVerification == true |
There was a problem hiding this comment.
should we move this as a convenience method in the configModelStore so that its easy enough to refer to it. you could event create a Kotlin extension for it. also it could be testable
| } | ||
|
|
||
| private fun dispatchJwtInvalidatedToApp(externalId: String) { | ||
| _jwtInvalidatedHandler?.let { handler -> |
There was a problem hiding this comment.
if these handlers are not setup in time, we could silently skip this. should we queue up the invocations and invoke them when the handler is actually set?
| */ | ||
| private fun shouldSuppressAnonymousOp(op: Operation): Boolean { | ||
| if (op is LoginUserOperation) return false | ||
| return _configModelStore.model.useIdentityVerification == true && op.externalId == null |
There was a problem hiding this comment.
is useIdentityVerification set to be volatile? that way has the accurate value in every read?
Description
One Line Summary
Add Identity Verification (JWT-based) support to the Android SDK, gating API operations behind server-issued JWTs when enabled.
Details
Motivation
Identity Verification allows app developers to authenticate users with JWTs before the SDK sends operations to the OneSignal backend. This prevents unauthorized API calls by requiring a valid JWT for all user-scoped operations when the feature is enabled server-side via the
jwt_requiredremote param.Scope
Public API additions:
OneSignal.login(externalId, jwtBearerToken)— accepts an optional JWT on loginOneSignal.updateUserJwt(externalId, token)— supply a fresh JWT when the current one expiresOneSignal.addUserJwtInvalidatedListener(listener)/removeUserJwtInvalidatedListener(listener)— get notified when a JWT is invalidated (e.g. 401 from backend) so the app can provide a new oneIUserJwtInvalidatedListener/UserJwtInvalidatedEvent— listener interface and event classInternal changes:
JwtTokenStore— persistent store mapping externalId → JWT, backed by SharedPreferencesIdentityVerificationService— reacts to thejwt_requiredremote param arriving via config HYDRATE; purges anonymous operations when IV is enabled; wires JWT invalidation callbacksOperationRepo— gates operation execution on valid JWT when IV is enabled (hasValidJwtIfRequired); handlesFAIL_UNAUTHORIZEDby invalidating the JWT, re-queuing operations, and notifying the developer; stampsexternalIdsynchronously at enqueue time to avoid race conditionsLogoutHelper— when IV is enabled, disables the push subscription internally and suppresses the backend login operation (the device-scoped user is local-only)ConfigModelStoreListener/ParamsBackendService— readsjwt_requiredfrom remote params, defaults tofalsewhen absentAuthorization: Bearer <jwt>header to requests when IV is enabledexternalIdto result operations and attach JWT to backend requestsWhat is NOT affected:
jwt_requireddefaults tofalseand no JWT gating occursTesting
Unit testing
WIP Adding more testing
OperationRepoTests— added tests forFAIL_UNAUTHORIZEDhandling (identified user fires handler + retries; anonymous user drops operations), and synchronousexternalIdstamping before async dispatchLoginUserOperationExecutorTests— updated for identity verification parametersLogoutHelperTests— covers IV-enabled logout flowLoginHelperTests— covers IV-enabled login flow with JWTManual testing
Tested with the demo app on an Android emulator (Pixel 7, API 35):
IUserJwtInvalidatedListener, callupdateUserJwt→ operations resumeAffected code checklist
Checklist
Overview
Testing
Final pass