feat: identity verification 5.8 by nan-li · Pull Request #2599 · OneSignal/OneSignal-Android-SDK

nan-li · 2026-03-30T09:21:48Z

Description

One Line Summary

Add Identity Verification (JWT-based) support to the Android SDK, gating API operations behind server-issued JWTs when enabled.

Details

Motivation

Identity Verification allows app developers to authenticate users with JWTs before the SDK sends operations to the OneSignal backend. This prevents unauthorized API calls by requiring a valid JWT for all user-scoped operations when the feature is enabled server-side via the jwt_required remote param.

Scope

Public API additions:

OneSignal.login(externalId, jwtBearerToken) — accepts an optional JWT on login
OneSignal.updateUserJwt(externalId, token) — supply a fresh JWT when the current one expires
OneSignal.addUserJwtInvalidatedListener(listener) / removeUserJwtInvalidatedListener(listener) — get notified when a JWT is invalidated (e.g. 401 from backend) so the app can provide a new one
IUserJwtInvalidatedListener / UserJwtInvalidatedEvent — listener interface and event class

Internal changes:

JwtTokenStore — persistent store mapping externalId → JWT, backed by SharedPreferences
IdentityVerificationService — reacts to the jwt_required remote param arriving via config HYDRATE; purges anonymous operations when IV is enabled; wires JWT invalidation callbacks
OperationRepo — gates operation execution on valid JWT when IV is enabled (hasValidJwtIfRequired); handles FAIL_UNAUTHORIZED by invalidating the JWT, re-queuing operations, and notifying the developer; stamps externalId synchronously at enqueue time to avoid race conditions
LogoutHelper — when IV is enabled, disables the push subscription internally and suppresses the backend login operation (the device-scoped user is local-only)
ConfigModelStoreListener / ParamsBackendService — reads jwt_required from remote params, defaults to false when absent
HTTP layer — attaches Authorization: Bearer <jwt> header to requests when IV is enabled
In-App Messaging — uses alias-based fetch endpoint and includes JWT in IAM backend calls
Operation executors — propagate externalId to result operations and attach JWT to backend requests

What is NOT affected:

Apps without Identity Verification enabled behave exactly as before — jwt_required defaults to false and no JWT gating occurs
Notification display, open handling, push processing, and confirm deliveries are unchanged

Testing

Unit testing

WIP Adding more testing

OperationRepoTests — added tests for FAIL_UNAUTHORIZED handling (identified user fires handler + retries; anonymous user drops operations), and synchronous externalId stamping before async dispatch
LoginUserOperationExecutorTests — updated for identity verification parameters
LogoutHelperTests — covers IV-enabled logout flow
LoginHelperTests — covers IV-enabled login flow with JWT

Manual testing

Tested with the demo app on an Android emulator (Pixel 7, API 35):

Login with valid JWT → operations execute normally
Login with expired JWT → 401 triggers IUserJwtInvalidatedListener, call updateUserJwt → operations resume
Logout with IV enabled → push subscription disabled, new anonymous user created locally
App without IV enabled → SDK operates normally, no regression

Affected code checklist

Checklist

Overview

I have filled out all REQUIRED sections above
PR does one thing
Any Public API changes are explained in the PR details and conform to existing APIs

Testing

I have included test coverage for these changes, or explained why they are not needed
All automated tests pass, or I explained why that is not possible
I have personally tested this on my device, or explained why that is not possible

Final pass

Code is as readable as possible.
I have reviewed this PR myself, ensuring it meets each checklist item

nan-li · 2026-03-31T07:26:01Z

@claude re-review

claude · 2026-03-31T07:26:17Z

Claude finished @nan-li's task in 3m 55s —— View job

Re-review: feat/identity_verification_5.8

Read current state of files related to previous review findings
Verify fixes for the 4 critical issues flagged
Look for any remaining or new issues
Post updated review

All four critical issues flagged in the previous review have been addressed. Here's a status breakdown:

Previously Critical — Now Fixed ✅

1. Runtime 401 not firing IUserJwtInvalidatedListener (OperationRepo.kt:292-297)

Fixed correctly. OperationRepo now holds a _jwtInvalidatedHandler: ((String) -> Unit)? that is set in IdentityVerificationService.start():

_operationRepo.setJwtInvalidatedHandler { externalId ->
    _userManager.fireJwtInvalidated(externalId)
}

And in the FAIL_UNAUTHORIZED branch, _jwtInvalidatedHandler?.invoke(externalId) is called after _jwtTokenStore.invalidateJwt(externalId). The queue correctly re-enqueues the operations, and the developer receives the callback to provide a fresh JWT. When updateUserJwt is called, forceExecuteOperations() wakes the queue — the full round trip is wired.

2. Logout race condition — UpdateSubscriptionOperation stamped with null externalId (OperationRepo.kt:165-168)

Fixed correctly. stampExternalId(operation) is now called synchronously in enqueue() before scope.launch { internalEnqueue(...) }. Since SubscriptionModelStoreListener.getUpdateOperation() fires synchronously when isDisabledInternally = true is set — and createAndSwitchToNewUser() has not yet run at that point — _identityModelStore.model.externalId is still the old user's ID when it's captured.

3. JwtTokenStore.ensureLoaded() throws uncaught JSONException (JwtTokenStore.kt:32-42)

Fixed correctly. The JSON parsing is wrapped in try { } catch (e: Exception) and isLoaded = true is placed unconditionally at the end of ensureLoaded(). Corrupted SharedPreferences data is now logged and discarded gracefully instead of causing an infinite throw loop.

4. useIdentityVerification stays null for non-IV apps, permanently blocking the queue (ParamsBackendService.kt:87)

Fixed correctly. The field now defaults to false when jwt_required is absent from the backend response:

useIdentityVerification = responseJson.safeBool("jwt_required") ?: false

Non-IV apps get false after HYDRATE, getNextOps() proceeds, and IdentityVerificationService.onModelReplaced() still calls forceExecuteOperations() to wake the queue.

Remaining Observations (non-blocking)

A. _jwtInvalidatedHandler is not @Volatile (OperationRepo.kt:47)

private var _jwtInvalidatedHandler: ((String) -> Unit)? = null

This field is written from IdentityVerificationService.start() (likely the main thread) and read from executeOperations() running on OSOperationRepoScope. In practice this is safe because start() runs before any operations execute, but the JMM doesn't guarantee visibility without @Volatile or a synchronisation point. Worth annotating for correctness. Fix this →

B. Expired JWT at logout time blocks the unsubscribe operation (LogoutHelper.kt)

When IV=true and a 401 has already invalidated the JWT before the user calls logout(), the UpdateSubscriptionOperation (to disable the push subscription) gets a null result from _jwtTokenStore.getJwt(externalId) and is permanently held by hasValidJwtIfRequired. The developer would need to call updateUserJwt with a fresh token to unblock it — but if they're logging out, they likely won't. The backend push subscription stays enabled for the logged-out user until the next cold start (when pruneToExternalIds cleans up the orphaned operation).

This is an edge case (JWT expired + user logs out before providing a refresh) and the overall security posture of the IV feature prevents worse outcomes, but it may be worth documenting or handling explicitly — for example, by skipping the JWT gate for the unsubscribe-on-logout operation. Not blocking the PR.

The implementation is solid. All critical issues are resolved and the approach is well-structured. The IdentityVerificationService / OperationRepo / JwtTokenStore responsibilities are cleanly separated.

nan-li · 2026-03-31T08:19:47Z

@claude re-review

claude · 2026-03-31T08:20:06Z

Claude finished @nan-li's task in 2m 39s —— View job

Re-review: feat/identity_verification_5.8

Read current state of key files
Verify fixes for the 2 observations from previous review
Look for any remaining or new issues
Post updated review

Both remaining observations from the previous re-review have been addressed. The PR is clean.

Previously Non-blocking — Now Fixed ✅

A. _jwtInvalidatedHandler not @Volatile (OperationRepo.kt:47)

Fixed in commit 76dd958. The field is now correctly annotated:

@Volatile
private var _jwtInvalidatedHandler: ((String) -> Unit)? = null

B. Expired JWT at logout blocks the push subscription disable (UpdateSubscriptionOperation.kt:89)

Fixed in commit 130aa16. The first approach (skip in LogoutHelper) was tried and reverted; the final solution is cleaner — it adds Operation.requiresJwt: Boolean = true as an open property on the base class and overrides it in UpdateSubscriptionOperation:

// Operation.kt:69
open val requiresJwt: Boolean get() = true

// UpdateSubscriptionOperation.kt:89
override val requiresJwt: Boolean get() = false

OperationRepo.hasValidJwtIfRequired() short-circuits on !op.requiresJwt, so subscription updates always proceed regardless of JWT state. This is backend-aligned — the subscription endpoint doesn't enforce JWT auth — and the executor still passes the JWT opportunistically when one is available (val jwt = lastOperation.externalId?.let { _jwtTokenStore.getJwt(it) }).

Minor Observation (no action needed)

The requiresJwt = false behavior on UpdateSubscriptionOperation has no dedicated test in OperationRepoTests. The behavior is implicitly covered by LogoutHelperTests, but a unit test verifying that an operation with requiresJwt = false executes when hasValidJwtIfRequired would otherwise block could add confidence. Given the simplicity of the change, this is not blocking.

The implementation is solid. All six issues identified across both review passes are resolved. The architecture is clean: IdentityVerificationService / OperationRepo / JwtTokenStore responsibilities are well-separated, the requiresJwt flag gives per-operation control over JWT gating, and the developer-facing callback round-trip is fully wired. Good to merge.

abdulraqeeb33 · 2026-03-31T17:08:48Z

  <CurrentIssues>
    <ID>ComplexCondition:InAppMessagesManager.kt$InAppMessagesManager$!message.isTriggerChanged &amp;&amp; isMessageDisplayed &amp;&amp; (isTriggerOnMessage || isNewTriggerAdded &amp;&amp; isOnlyDynamicTriggers)</ID>
-    <ID>ComplexMethod:TriggerController.kt$TriggerController$private fun evaluateTrigger(trigger: Trigger): Boolean</ID>
+    <ID>ComplexMethod:InAppMessagesManager.kt$InAppMessagesManager$private suspend fun fetchMessages(rywData: RywData)</ID>


we should break it down if its a complex method.

Yes, I undid the detekt changes and then added them back in the last few commits with reasoning. I actually didn't end up making changes to in-app-messages in detekt.

abdulraqeeb33 · 2026-03-31T18:11:01Z

+     * @param externalId The external ID of the user whose token is being updated.
+     * @param token The new JWT bearer token.
+     */
+    fun updateUserJwt(


like i mentioned elsewhere, we need suspend methods for these

abdulraqeeb33

Lets try and do a bug bash on this today/tomorrow?

nan-li · 2026-04-13T22:31:24Z

@claude review

claude

The two bugs I flagged in the last pass (IAM rate-limit poisoning and JWT exposure in login() logs) have been fixed in the latest commit — thanks for addressing them. Given the security-sensitive nature of this feature (JWT-based identity gating across 86 files), a final human look before merge is warranted.

Extended reasoning...

Overview

This PR adds Identity Verification (JWT-based authentication gating) to the Android SDK, spanning 86 files including new public API, a new JwtTokenStore, OperationRepo changes, IAM fetch gating, and HttpClient JWT header injection.

Prior Review Rounds

Rounds 1–2 (late March): 6 bugs found and fixed.
Round 3 (2026-04-14): 4 new bugs filed as inline comments. The latest commit (82678a5 "Address Claude review bot feedback on IV PR") addresses all of them:
- ✅ IAM rate-limit poisoning: Guard now checks both and , returning before is set.
- ✅ JWT logging in / : Both now use .
- ✅ FAIL_UNAUTHORIZED loop when IV=OFF: Gated on .
- ✅ JWT in HttpClient logs: Authorization header filtered from logged request properties.
- 403/401 conflation and stale ops at logout explicitly deferred by author (out of scope / by design).

Security Risks

The PR touches auth-critical paths: JWT storage/retrieval, JWT injection into HTTP headers, FAIL_UNAUTHORIZED error routing, and the public-facing / APIs. No new vulnerabilities found in this pass, but the breadth of the change means a human reviewer should give final sign-off.

Level of Scrutiny

This is production-facing security code that gates API calls behind developer-issued JWTs. The implementation looks correct across the key paths reviewed, but the sheer scope (86 files, new public API surface, threading concerns) means automated review alone is insufficient.

Other Factors

The LoginHelper pre-HYDRATE logic is sound: the null case falls through to IV=OFF behavior, and IdentityVerificationService cleans up on HYDRATE.
intentionally omits UNAUTHORIZED handling (requiresJwt=false by design; JWT is passed opportunistically).
Test infrastructure () correctly gates test suites from the IV hold path.

…ication nullable, OptionalHeaders.jwt Foundational models and infrastructure for identity verification: - Create JwtTokenStore: persistent Map<externalId, JWT> backed by SharedPreferences, supporting multi-user JWT storage with getJwt/putJwt/invalidateJwt/pruneToExternalIds - Add var externalId to Operation base class so OperationRepo can stamp and gate operations per-user; remove redundant externalId from LoginUserOperation and TrackCustomEventOperation (same Model data-map key, no migration needed) - Change ConfigModel.useIdentityVerification from Boolean to Boolean? (null = unknown, false = off, true = on) to eliminate race between operation processing and remote params - Add jwt field to OptionalHeaders for passing Bearer tokens through HTTP layer - Add PREFS_OS_JWT_TOKENS key to PreferenceOneSignalKeys Made-with: Cursor

When the server omits jwt_required from the params response, safeBool() returned null, leaving useIdentityVerification unset. getNextOps() treated null as "not yet known" and returned null permanently, silently blocking every SDK operation for the session. Default to false when the field is absent so non-IV apps are unaffected by the identity verification gating logic. Made-with: Cursor

The field is written from the main thread in IdentityVerificationService.start() and read on the OperationRepo coroutine thread. @volatile guarantees cross-thread visibility. Made-with: Cursor

When IV is enabled and the JWT has already been invalidated (e.g. by a prior 401), the UpdateSubscriptionOperation to disable the push subscription would be permanently blocked by hasValidJwtIfRequired. Since the backend call would fail with 401 anyway, skip it entirely and just switch to the new anonymous user. Made-with: Cursor Revert "Skip push subscription disable on logout when JWT is already expired" This reverts commit 5ce284257b6e03b8c862745ff58fcf4293299577. Exempt UpdateSubscriptionOperation from JWT gating The subscription update endpoint does not require a JWT on the backend. Add Operation.requiresJwt (default true) and override it to false in UpdateSubscriptionOperation so these operations are not blocked by hasValidJwtIfRequired when the JWT is expired or missing. This fixes the edge case where logging out with an expired JWT would permanently block the push subscription disable operation. Made-with: Cursor

Made-with: Cursor

…ationRepo Pass externalId alongside onesignalId in every operation constructor, eliminating the fragile stampExternalId workaround and parentExternalId propagation from OperationRepo. Every creation site (model store listeners, session listener, executors, etc.) now explicitly provides the externalId at construction time. Made-with: Cursor

Add suspend variant of updateUserJwt for API consistency with login/loginSuspend. Also add waitForInit to the non-suspend updateUserJwt so it safely handles calls before SDK initialization.

- Baseline 2 new UseCheckOrError entries for OneSignalImp.kt for updateUserJwt/updateUserJwtSuspend init guards that follow the existing login/logout pattern. - Baseline ReturnCount for hasValidJwtIfRequired — the guard-clause style is clearer than collapsing into a single boolean expression for this JWT gating logic. - Narrow JwtTokenStore catch from Exception to JSONException to satisfy detekt's TooGenericExceptionCaught rule, no baseline added.

…r params All new _prefix constructor parameters (e.g. _jwtTokenStore, _identityModelStore, _configModelStore) follow the existing codebase convention — 80+ identical entries already exist in the baseline. These are DI-injected dependencies across IdentityVerificationService, OperationRepo, JwtTokenStore, and various executors/listeners.

- Update 4 existing baseline entries (added jwt/_jwtTokenStore to signatures) - Baseline 4 new LongParameterList entries for IV constructors: CreateSubscriptionOperation, UpdateSubscriptionOperation, RefreshUserOperationExecutor, and UpdateUserOperationExecutor crossed the constructor threshold (8) after adding externalId and _jwtTokenStore params. This follows existing patterns.

OperationRepo: on FAIL_UNAUTHORIZED, invalidate JWT and re-queue before notifying the app; invoke setJwtInvalidatedHandler synchronously with try/catch so listener failures do not hit executeOperations outer catch. UserManager.fireJwtInvalidated performs the single async hop (SupervisorJob + Dispatchers.Default) with Otel-style launch try/catch and per-listener isolation. IdentityVerificationService: call forceExecuteOperations before fireJwtInvalidated on HYDRATE. Documentation: IOperationRepo handler runs on the op-repo thread and must return quickly; public API documents background-thread delivery for IUserJwtInvalidatedListener, addUserJwtInvalidatedListener, and UserJwtInvalidatedEvent (LiveData postValue guidance). Tests: FAIL_UNAUTHORIZED JWT handler coverage; no artificial delay after sync repo handler.

…2600) Co-authored-by: AR Abdul Azeez <abdul@onesignal.com>

…s post-enqueue OperationRepo: gate FAIL_UNAUTHORIZED re-queue on useIdentityVerification == true. When IV is OFF, hasValidJwtIfRequired() always returns true so re-queued ops were immediately eligible, creating a ~200ms infinite retry loop. Now IV-OFF treats FAIL_UNAUTHORIZED as FAIL_NORETRY (drop + wake waiters). OperationRepo: in removeOperationsWithoutExternalId(), clear local existingOnesignalId on queued LoginUserOperations. When IV=ON arrives via HYDRATE, anonymous CreateUserOperations are purged, orphaning the local ID that LoginUserOperation.canStartExecute was waiting on translateIds to resolve. Clearing it unblocks the operation and routes the executor through createUser(). LoginUserOperation: widen existingOnesignalId setter to internal. Fix Operation.externalId KDoc to reflect that subclass constructors set this field, not IOperationRepo at enqueue time. Made-with: Cursor

LogoutHelper: add explicit null branch for unknown IV state (pre-HYDRATE). Disables push subscription and suppresses backend op (like IV=ON) while also enqueuing a LoginUserOperation so the anonymous user is created on the backend if IV turns out to be OFF. If IV=ON, removeOperationsWithoutExternalId() purges the anonymous op at HYDRATE. Known IV=ON and IV=OFF paths are unchanged. HttpClient: filter Authorization header from requestProperties before passing to logHTTPSent() so the full Bearer token is not written to debug logs. OneSignalImp: redact token in updateUserJwt/updateUserJwtSuspend debug logs. Made-with: Cursor

- Redact JWT from login/loginSuspend logs (show last 8 chars) - Fix error message in updateUserJwtSuspend - Add FAIL_UNAUTHORIZED to deleteSubscription and CustomEvent executors - Drop TransferSubscriptionOperation when IV=ON - Skip IAM fetch when JWT is invalidated (prevent rate-limit poisoning) - Clear existingOnesignalId unconditionally in removeOperationsWithoutExternalId

- MagicNumber(8): log suffix length in takeLast(8) — extracting a named constant for a single-use log format arg adds noise without improving readability - ReturnCount(transferSubscription): early-exit guard follows the same pattern as LoginUserFromSubscriptionOperationExecutor; restructuring to reduce returns would obscure the guard-clause intent - NestedBlockDepth(executeOperations): pre-existing from IV branch, the when-block nesting is inherent to the dispatch logic - UseCheckOrError: pre-existing non-suspend login/logout/accessor throws — consistent with existing codebase pattern - Removed stale entry for renamed error message

- LogoutHelper: remove suppressBackendOperation=true in pre-HYDRATE branch so CreateSubscriptionOperation is enqueued alongside the anonymous LoginUserOperation (fixes IV=OFF path) - InAppMessagesManager: capture JWT once before guard check to eliminate TOCTOU between guard and backend call - nit: LoginUserOperationExecutor: add error log when anonymous LoginUserOperation is dropped with no subscription op

- Move anonymous operation suppression from individual model store listeners into OperationRepo.enqueue/enqueueAndWait so all enqueue sites (SessionListener, TrackGooglePurchase, CustomEventController, etc.) are covered. LoginUserOperation is exempt. - Remove redundant shouldSuppressForAnonymousUser() from IdentityModelStoreListener, PropertiesModelStoreListener, and SubscriptionModelStoreListener. - Add LogoutHelper tests for IV=true and IV=null (pre-HYDRATE) branches.

claude · 2026-04-16T08:52:41Z

+            operationRepo.forceExecuteOperations()
+        } else {
+            if (!isInitialized) {
+                throw IllegalStateException("Must call 'initWithContext' before 'updateUserJwt'")
+            }
+            jwtTokenStore.putJwt(externalId, token)
+            operationRepo.forceExecuteOperations()
+        }
+    }
+
+    override suspend fun updateUserJwtSuspend(
+        externalId: String,
+        token: String,
+    ) = withContext(runtimeIoDispatcher) {
+        Logging.log(LogLevel.DEBUG, "updateUserJwtSuspend(externalId: $externalId, token: ...${token.takeLast(8)})")
+
+        suspendUntilInit(operationName = "updateUserJwt")
+
+        if (!isInitialized) {
+            throw IllegalStateException("Must call 'initWithContext' before 'updateUserJwt'")
+        }
+
+        jwtTokenStore.putJwt(externalId, token)
+        operationRepo.forceExecuteOperations()


🔴 Two issues in updateUserJwt()/updateUserJwtSuspend(): (1) forceExecuteOperations() wakes the operation queue with force=false, so after a 401-triggered re-queue the retry is delayed by the full ~5-second opRepoExecutionInterval batching window instead of executing immediately; (2) neither method signals the IAM subsystem to re-fetch in-app messages, leaving any IAM fetch that was skipped due to a missing JWT permanently stuck until the next session-start or push-subscription event.

Extended reasoning...

Bug 1 — forceExecuteOperations() uses force=false, causing a ~5-second delay after JWT renewal

When the operation queue receives a FAIL_UNAUTHORIZED result it re-enqueues the failing operation and sets the JWT to null. Control returns to processQueueForever(), which calls waitForNewOperationAndExecutionInterval() and enters the idle-wait state. When the app calls updateUserJwt() or updateUserJwtSuspend(), the implementation calls operationRepo.forceExecuteOperations() (OperationRepo.kt line 218). However forceExecuteOperations() wakes the main waiter with LoopWaiterMessage(false) — force=false. Inside waitForNewOperationAndExecutionInterval() (lines 237–253), the while (\!wakeMessage.force) loop stays true and the function waits up to opRepoExecutionInterval (default 5 000 ms) before breaking out to execute. The retryWaiter.wake(LoopWaiterMessage(true)) call also in forceExecuteOperations() only shortens the delayBeforeNextExecution() retry-backoff wait, which is not the active state here.

Concrete proof: App calls login("alice", jwt1) → LoginUserOperation enqueued → backend returns HTTP 401 → executor sets jwtTokenStore.putJwt("alice", null) and returns FAIL_UNAUTHORIZED → op re-queued → processQueueForever loop calls waitForNewOperationAndExecutionInterval(). Now app calls updateUserJwt("alice", jwt2). forceExecuteOperations() fires waiter.wake(LoopWaiterMessage(false)). waitForNewOperationAndExecutionInterval receives the wake but wakeMessage.force == false, so it continues waiting for up to opRepoExecutionInterval (5 000 ms). The retry only fires after that full delay.

Fix: Change the waiter.wake call inside forceExecuteOperations() to LoopWaiterMessage(true), or add a dedicated immediateExecuteOperations() path that passes force=true to the main waiter.

Bug 2 — JWT renewal does not trigger an IAM re-fetch

InAppMessagesManager.fetchMessagesWhenConditionIsMet() is called from exactly four places: initial appId configuration, ConfigModel HYDRATE, push-subscription ID change, and explicit session-start. The new JWT guard added in this PR (InAppMessagesManager.kt lines 310–313) correctly avoids fetching IAMs when JWT is null, but there is no corresponding call to re-fetch once a valid JWT becomes available via updateUserJwt().

Concrete proof: App is configured with IV required. SDK starts, JWT is missing → fetchMessagesWhenConditionIsMet() short-circuits because jwtTokenStore.getJwt(externalId) == null. App later calls updateUserJwt("alice", validJwt). OneSignalImp.updateUserJwt() (lines 424–444) calls jwtTokenStore.putJwt() then operationRepo.forceExecuteOperations() — nothing else. InAppMessagesManager has no listener on JwtTokenStore and is never signalled. Result: in-app messages are never fetched for this session unless a push-subscription ID change or new session start coincidentally fires.

Fix: Call inAppMessagesManager.fetchMessagesWhenConditionIsMet() inside both updateUserJwt() and updateUserJwtSuspend() after putJwt(), or adopt the IJwtUpdateListener pattern already used in the companion PR #2613 and have InAppMessagesManager subscribe to JWT-renewal events.

claude · 2026-04-16T08:52:41Z

            lastTimeFetchedIAMs = now
        }

+        val (aliasLabel, aliasValue) =
+            IdentityConstants.resolveAlias(
+                _configModelStore.model.useIdentityVerification,
+                externalId,
+                _identityModelStore.model.onesignalId,
+            )
        // lambda so that it is updated on each potential retry
        val sessionDurationProvider = { _time.currentTimeMillis - _sessionService.startTime }
-        val newMessages = _backend.listInAppMessages(appId, subscriptionId, rywData, sessionDurationProvider)
+        val newMessages = _backend.listInAppMessages(appId, aliasLabel, aliasValue, subscriptionId, rywData, sessionDurationProvider, jwt)

        if (newMessages != null) {
            this.messages = newMessages as MutableList<InAppMessage>


🔴 In fetchMessages(), externalId and jwt are captured before fetchIAMMutex.withLock as a TOCTOU guard, but _identityModelStore.model.onesignalId is read live after the mutex suspension point at line 332. If a user switch occurs while suspended on the mutex, the IV=true path will fetch IAM messages for the old user (using stale externalId+jwt) while the new user is active. The fix is to snapshot onesignalId alongside externalId and jwt before the mutex, so the entire identity triple is consistent.

Extended reasoning...

The inconsistency

The PR carefully snapshots two identity fields before fetchIAMMutex.withLock to avoid TOCTOU issues — externalId at line 304 and jwt at line 307 (the comment on line 305 even explains the rationale). However, a third identity field is still read live: _identityModelStore.model.onesignalId is accessed inside resolveAlias() at line 332, which runs after the coroutine suspends on the mutex.

Why a user switch can happen during the mutex wait

fetchIAMMutex is a coroutine Mutex, not a blocking lock. When fetchIAMMutex.withLock { ... } cannot immediately acquire the lock, the coroutine suspends, yielding the thread. During that suspension window other coroutines can run — including a user login/logout that replaces _identityModelStore.model with a new user's identity model. When fetchMessages() resumes after the mutex, externalId and jwt are stale (user A), but _identityModelStore.model.onesignalId is live (user B).

Impact per IV mode

IV=true path: resolveAlias(true, externalId, onesignalId) returns ("external_id", externalId) — onesignalId is ignored. The backend call uses user A's externalId as the alias and user A's jwt for authentication. This fetches and applies user A's in-app messages while user B is the active user — a cross-user data leak.

IV=false path: resolveAlias(false, externalId, onesignalId) returns ("onesignal_id", onesignalId) — externalId is ignored. Here the live read of onesignalId is actually correct: it reflects the current user B, so the fetch targets the right user. This path is benign.

Addressing the refutation

A refutation argued that resolveAlias uses XOR logic, so externalId and onesignalId are never both used in the same URL, meaning there's no "mixed" snapshot. This is correct but beside the point. The problem is not that two fields are mixed for the same URL — it is that the entire fetch (URL + JWT) belongs to user A while user B is active. The XOR logic means we consistently use user A's identity for the request; the inconsistency is between which user is fetched vs. which user is displayed.

Concrete proof

User A (externalId="alice") is logged in; fetchMessages() is called.

Line 304: externalId = "alice". Line 307: jwt = alice_jwt. IV guards pass.

fetchIAMMutex.withLock cannot acquire lock — coroutine suspends.

Concurrent coroutine: user switches to user B (externalId="bob"); _identityModelStore.model is replaced.

Mutex is released; fetchMessages() resumes.

Line 332: resolveAlias(true, "alice", <bob's onesignalId>) → returns ("external_id", "alice").

_backend.listInAppMessages(..., "external_id", "alice", ..., alice_jwt) is called — fetching alice's IAMs, applying them to bob's session.

Suggested fix

Snapshot onesignalId alongside the other two identity fields, before the mutex:

val externalId = _identityModelStore.model.externalId val onesignalId = _identityModelStore.model.onesignalId // snapshot here for consistent identity triple val jwt = externalId?.let { _jwtTokenStore.getJwt(it) }

Then pass the captured onesignalId variable to resolveAlias() instead of the live model read.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

claude · 2026-04-16T19:02:18Z

        externalId: String,
        jwtBearerToken: String?,
    ) {
-        Logging.log(LogLevel.DEBUG, "Calling deprecated login(externalId: $externalId, jwtBearerToken: $jwtBearerToken)")
+        Logging.log(LogLevel.DEBUG, "Calling deprecated login(externalId: $externalId, jwtBearerToken: ...${jwtBearerToken?.takeLast(8)})")

        if (isBackgroundThreadingEnabled) {
            waitForInit(operationName = "login")
-            suspendifyOnIO { loginHelper.login(externalId, jwtBearerToken) }
        } else {
            if (!isInitialized) {
                throw IllegalStateException("Must call 'initWithContext' before 'login'")
            }
+        }
+
+        val context = loginHelper.switchUser(externalId, jwtBearerToken) ?: return
+
+        if (isBackgroundThreadingEnabled) {
+            suspendifyOnIO { loginHelper.enqueueLogin(context) }
+        } else {
            Thread {
                runBlocking(runtimeIoDispatcher) {
-                    loginHelper.login(externalId, jwtBearerToken)
+                    loginHelper.enqueueLogin(context)
                }
            }.start()
        }


🔴 The refactored non-suspend login() now calls loginHelper.switchUser() synchronously on the caller's thread before spawning the background Thread — exposing jwtTokenStore.putJwt() and userSwitcher.createAndSwitchToNewUser() (both SharedPreferences writes) to the main/UI thread for apps with isBackgroundThreadingEnabled=false (Java, Unity, React Native). The fix is to move switchUser() inside the Thread { runBlocking(runtimeIoDispatcher) { ... } } block alongside enqueueLogin(), restoring the pre-PR behaviour where the entire login operation ran off the calling thread.

Extended reasoning...

What the bug is and how it manifests

Prior to this PR the non-suspend login(externalId, jwtBearerToken) override wrapped its entire body (then a single call to loginHelper.login()) inside Thread { runBlocking(runtimeIoDispatcher) { ... } }.start() when isBackgroundThreadingEnabled was false. This PR split that single helper call into two steps — switchUser() and enqueueLogin() — but only moved enqueueLogin() into the background Thread. switchUser() is now called synchronously at line 393, before the Thread is even constructed, meaning it executes on whatever thread called login().

The specific code path that triggers it

OneSignalImp.login(externalId, jwtBearerToken) // caller's thread (UI thread in typical apps) └─ loginHelper.switchUser(externalId, jwtBearerToken) // ← still on caller's thread ├─ jwtTokenStore.putJwt(externalId, jwtBearerToken) │ └─ _prefs.saveString(...) // SharedPreferences commit — DISK WRITE └─ userSwitcher.createAndSwitchToNewUser(...) ├─ identityModelStore.replace(...) // SharedPreferences commit — DISK WRITE ├─ propertiesModelStore.replace(...) // SharedPreferences commit — DISK WRITE └─ subscriptionModelStore.replace(...) // SharedPreferences commit — DISK WRITE Thread { runBlocking(runtimeIoDispatcher) { loginHelper.enqueueLogin(context) // correctly off-thread }}.start()

Why existing code doesn't prevent it

The isBackgroundThreadingEnabled = false branch intentionally skips waitForInit() (which suspends on an IO dispatcher) and jumps straight to line 393. There is no coroutine context switch between the return of the if/else block and the switchUser() call, so no dispatcher redirect occurs. StrictMode will fire a DiskWriteViolation and potentially a DiskReadViolation (SharedPreferences lazy-loads on first access) on every call to this overload from the UI thread.

Impact

Any Java, Unity, or React Native consumer that calls the deprecated synchronous login() from the UI thread — the common pattern shown in OneSignal's own migration guides — will now trigger StrictMode violations that were absent before this PR. Under SharedPreferences contention (e.g., another thread flushing a large prefs file) the commit can block for tens of milliseconds, contributing to jank or, in edge cases, an ANR. This is a regression introduced solely by this PR's refactoring.

How to fix it

Move switchUser() inside the background Thread block so both steps run off the calling thread, matching the original pre-PR behaviour:

} else { Thread { runBlocking(runtimeIoDispatcher) { val context = loginHelper.switchUser(externalId, jwtBearerToken) ?: return@runBlocking loginHelper.enqueueLogin(context) } }.start() }

Step-by-step proof

A React Native app calls OneSignal.login("user42") on the JS/UI thread after SDK init.

isBackgroundThreadingEnabled is false (default for non-Kotlin hosts).

Execution reaches line 393: loginHelper.switchUser("user42", null) — still on UI thread.

Inside switchUser: jwtTokenStore.putJwt("user42", null) calls SharedPreferences.edit().putString(...).commit() — synchronous disk write on UI thread → StrictMode DiskWriteViolation.

userSwitcher.createAndSwitchToNewUser() performs three more SharedPreferences commits (identity, properties, subscriptions) — three additional violations.

Only after all four disk writes complete does the code reach the Thread { ... }.start() line.

In the old code, step 4–6 were entirely inside the spawned Thread, so the UI thread returned immediately after .start().

claude · 2026-04-16T19:02:18Z

        verify(atLeast = 1) { mockUserSwitcher.createAndSwitchToNewUser() }
        verify(atLeast = 1) { mockOperationRepo.enqueue(any()) }
    }


🟡 The thread-safety test 'logout is thread-safe with synchronized block' uses verify(atLeast = 1) with a relaxed UserSwitcher mock that never updates the identity model, so all 10 concurrent threads pass the early-exit guard and execute the full logout path — meaning the test never actually verifies the single-switch invariant that synchronized(lock) is meant to guarantee. Fix by configuring the mock to set externalId = null after the first createAndSwitchToNewUser() call and asserting exactly = 1.

Extended reasoning...

What the bug is and how it manifests

The test spawns 10 threads all calling logoutHelper.logout(). The intent is to verify that only the first thread executes the full logout path; subsequent threads should hit the early-exit guard (if (identityModelStore.model.externalId == null) return) and do nothing. However, because mockUserSwitcher is constructed as mockk<UserSwitcher>(relaxed = true), the call to createAndSwitchToNewUser() is a no-op — it never touches identityModelStore. As a result, identityModelStore.model.externalId remains "current-user" (non-null) for every thread from start to finish.

The specific code path that triggers it

LogoutHelper.logout() checks identityModelStore.model.externalId == null as its early-exit guard. In production, the real UserSwitcher.createAndSwitchToNewUser() switches the active user and clears the external ID, so only the first thread to acquire the lock proceeds; the rest exit early. In the test, the relaxed mock leaves the external ID untouched, so every one of the 10 threads independently acquires the lock, passes the guard, calls createAndSwitchToNewUser(), and enqueues a LoginUserOperation.

Why existing code doesn't prevent it

The assertions on lines 178–179 use verify(atLeast = 1):

verify(atLeast = 1) { mockUserSwitcher.createAndSwitchToNewUser() } verify(atLeast = 1) { mockOperationRepo.enqueue(any()) }

This trivially passes whether the methods were called 1 time or 10 times. There is no upper-bound check, so the over-invocation (10 actual calls vs. 1 expected) is invisible to the test framework.

Impact

The test provides false confidence that the synchronized(lock) block enforces the single-switch invariant. A future refactor that accidentally removes the early-exit guard would still pass this test, because verify(atLeast = 1) is satisfied by any non-zero call count. The bug is in test code only; production behavior is unaffected.

How to fix it

Configure the mock to simulate the real side-effect: after createAndSwitchToNewUser() is called, clear the external ID so subsequent threads exit early.

every { mockUserSwitcher.createAndSwitchToNewUser() } answers { mockIdentityModelStore.model.externalId = null }

Then tighten the assertions to exactly = 1:

verify(exactly = 1) { mockUserSwitcher.createAndSwitchToNewUser() } verify(exactly = 1) { mockOperationRepo.enqueue(any()) }

Step-by-step proof

Test sets up mockIdentityModelStore with externalId = "current-user".

10 threads are started; each calls logoutHelper.logout().

Thread 1 acquires the lock, reads externalId = "current-user" (non-null), proceeds past the guard.

Thread 1 calls mockUserSwitcher.createAndSwitchToNewUser() — relaxed mock, no-op, externalId still "current-user".

Thread 1 enqueues a LoginUserOperation, releases the lock.

Thread 2 acquires the lock, reads externalId = "current-user" (still non-null), proceeds past the guard — executes the full logout path again.

Steps 5–6 repeat for all 10 threads.

verify(atLeast = 1) passes because createAndSwitchToNewUser and enqueue were each called 10 times.

abdulraqeeb33 · 2026-04-16T12:54:25Z

     */
-    var useIdentityVerification: Boolean
-        get() = getBooleanProperty(::useIdentityVerification.name) { false }
+    var useIdentityVerification: Boolean?


should we consider maintaining an Enum state here as opposed to a boolean that signifies three states?

abdulraqeeb33 · 2026-04-16T12:55:16Z

                        con.setRequestProperty("OneSignal-Session-Duration", headers.sessionDuration.toString())
                    }

+                    if (headers?.jwt != null) {


can we try using the kotlin version of null check here?

abdulraqeeb33 · 2026-04-16T12:58:46Z


            val ops = getNextOps(executeBucket)
-            Logging.debug("processQueueForever:ops:\n$ops")
+            val queueSnapshot = synchronized(queue) { queue.toList() }


Can you please elaborate on what we are doing here?

abdulraqeeb33 · 2026-04-16T13:03:14Z

                }
-                ExecutionResult.FAIL_UNAUTHORIZED, // TODO: Need to provide callback for app to reset JWT. For now, fail with no retry.
+                ExecutionResult.FAIL_UNAUTHORIZED -> {
+                    val identityVerificationEnabled = _configModelStore.model.useIdentityVerification == true


should we move this as a convenience method in the configModelStore so that its easy enough to refer to it. you could event create a Kotlin extension for it. also it could be testable

abdulraqeeb33 · 2026-04-16T19:24:52Z

    }

+    private fun dispatchJwtInvalidatedToApp(externalId: String) {
+        _jwtInvalidatedHandler?.let { handler ->


if these handlers are not setup in time, we could silently skip this. should we queue up the invocations and invoke them when the handler is actually set?

abdulraqeeb33 · 2026-04-16T19:27:37Z

+     */
+    private fun shouldSuppressAnonymousOp(op: Operation): Boolean {
+        if (op is LoginUserOperation) return false
+        return _configModelStore.model.useIdentityVerification == true && op.externalId == null


is useIdentityVerification set to be volatile? that way has the accurate value in every read?

nan-li changed the title ~~Feat/identity verification 5.8~~ Feat: identity verification 5.8 Mar 30, 2026

nan-li changed the title ~~Feat: identity verification 5.8~~ feat: identity verification 5.8 Mar 30, 2026

nan-li force-pushed the feat/identity_verification_5.8 branch from 4509a8f to 457b745 Compare March 30, 2026 09:35

claude bot reviewed Mar 30, 2026

View reviewed changes

nan-li force-pushed the feat/identity_verification_5.8 branch 4 times, most recently from 113e96b to 00d3799 Compare March 30, 2026 23:05

nan-li force-pushed the feat/identity_verification_5.8 branch from 972eace to f6e4227 Compare March 31, 2026 08:49

nan-li requested a review from a team March 31, 2026 16:01

abdulraqeeb33 requested changes Mar 31, 2026

View reviewed changes

nan-li force-pushed the feat/identity_verification_5.8 branch 3 times, most recently from ca64cdf to 258c8e6 Compare April 3, 2026 03:46

nan-li requested a review from abdulraqeeb33 April 3, 2026 06:57

abdulraqeeb33 approved these changes Apr 9, 2026

View reviewed changes

claude bot reviewed Apr 13, 2026

View reviewed changes

claude bot reviewed Apr 14, 2026

View reviewed changes

nan-li force-pushed the feat/identity_verification_5.8 branch from b445a01 to 17fc13b Compare April 14, 2026 00:55

claude bot reviewed Apr 14, 2026

View reviewed changes

Comment thread ...DK/onesignal/core/src/main/java/com/onesignal/core/internal/operations/impl/OperationRepo.kt

Comment thread ...DK/onesignal/core/src/main/java/com/onesignal/core/internal/operations/impl/OperationRepo.kt

claude bot reviewed Apr 14, 2026

View reviewed changes

claude bot reviewed Apr 15, 2026

View reviewed changes

nan-li force-pushed the feat/identity_verification_5.8 branch from 16f9650 to fbc53f2 Compare April 15, 2026 02:56

nan-li added 2 commits April 14, 2026 20:03

add public API to interface

ca6974e

nan-li and others added 19 commits April 14, 2026 20:03

Add @volatile to _jwtInvalidatedHandler for JMM visibility

c7bc6b2

The field is written from the main thread in IdentityVerificationService.start() and read on the OperationRepo coroutine thread. @volatile guarantees cross-thread visibility. Made-with: Cursor

Add identity verification manual test plan

e818c0a

Made-with: Cursor

Update IDENTITY_VERIFICATION_MANUAL_TEST_PLAN.md

d663e43

Add KDoc to hasValidJwtIfRequired explain gating

ab453bb

fireJwtInvalidated off main

e4bf212

nit: update formatting

374528b

Add updateUserJwtSuspend and waitForInit to updateUserJwt

8606964

Add suspend variant of updateUserJwt for API consistency with login/loginSuspend. Also add waitForInit to the non-suspend updateUserJwt so it safely handles calls before SDK initialization.

fix: login race condition and JWT FAIL_UNAUTHORIZED blocking callers (#…

d198153

…2600) Co-authored-by: AR Abdul Azeez <abdul@onesignal.com>

nan-li force-pushed the feat/identity_verification_5.8 branch from fbc53f2 to fb2353f Compare April 15, 2026 03:04

claude bot reviewed Apr 15, 2026

View reviewed changes

Comment thread OneSignalSDK/onesignal/core/src/test/java/com/onesignal/user/internal/LogoutHelperTests.kt

nan-li force-pushed the feat/identity_verification_5.8 branch from c5086c3 to 3f3dd09 Compare April 16, 2026 08:06

claude bot reviewed Apr 16, 2026

View reviewed changes

Remove manual test plan doc from repo

9d3d46f

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

claude bot reviewed Apr 16, 2026

View reviewed changes

abdulraqeeb33 reviewed Apr 16, 2026

View reviewed changes

Conversation

nan-li commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

One Line Summary

Details

Motivation

Scope

Testing

Unit testing

Manual testing

Affected code checklist

Checklist

Overview

Testing

Final pass

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nan-li commented Mar 31, 2026

Uh oh!

claude bot commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Re-review: feat/identity_verification_5.8

Previously Critical — Now Fixed ✅

Remaining Observations (non-blocking)

Uh oh!

nan-li commented Mar 31, 2026

Uh oh!

claude bot commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Re-review: feat/identity_verification_5.8

Previously Non-blocking — Now Fixed ✅

Minor Observation (no action needed)

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

abdulraqeeb33 left a comment

Choose a reason for hiding this comment

Uh oh!

nan-li commented Apr 13, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Overview

Prior Review Rounds

Security Risks

Level of Scrutiny

Other Factors

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nan-li commented Mar 30, 2026 •

edited

Loading

claude bot commented Mar 31, 2026 •

edited

Loading

claude bot commented Mar 31, 2026 •

edited

Loading