Skip to content

[WIP] address ably-cocoa push notification token storage issues#441

Draft
lawrence-forooghian wants to merge 6 commits intomainfrom
2026-03-20-investigating-ably-cocoa-push-registration-failures
Draft

[WIP] address ably-cocoa push notification token storage issues#441
lawrence-forooghian wants to merge 6 commits intomainfrom
2026-03-20-investigating-ably-cocoa-push-registration-failures

Conversation

@lawrence-forooghian
Copy link
Collaborator

@lawrence-forooghian lawrence-forooghian commented Mar 23, 2026

Partially addresses the issue of ably-cocoa storing device secret in keychain and not wiping the device token when it fails to load the secret. WIP, and largely Claude-gnerated. There are still things to address here and need to think about whether we want to try and fetch a new token instead of just validating one (see commit messages for why haven't done this yet). Also we need to decide whether we're happy migrating to a storage for token and secret that's always available. Will finish add references to existing conversations once back from illness; we should discuss

lawrence-forooghian and others added 6 commits March 20, 2026 10:08
The note in RSH8k2 mentioned RSH3a2b three times but only linked
the first reference. Link the other two for consistency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The note previously said "our implementations" all generate eagerly,
but this isn't accurate. ably-java follows RSH3a2b (lazy generation
on CalledActivate); ably-cocoa and ably-js generate eagerly at device
fetch time.

Checked against:
- ably-cocoa 745e7b7
- ably-java da4c60f
- ably-js 17be43e

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RSH8j describes the failure path of RSH8a's loading step, so it
belongs as a subpoint of RSH8a rather than a standalone point that
reaches across into the state machine. Move it to RSH8a1, keeping
the original text, and leave a replacement stub at RSH8j.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document the problems with this spec point:

- It doesn't clear the deviceIdentityToken, so RSH3a2a will try to
  use a stale token against an absent or non-matching device ID
- It's ambiguous whether "generate new" is a new instruction or a
  restatement of RSH3a2b, and if the latter, RSH3a2a runs first
- It bypasses the state machine's event-driven model and doesn't
  specify what should happen to an in-flight activate() call
- No implementations currently implement this spec point

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add RSH3h to require loading and verifying LocalDevice details at
state machine init time, before processing any events. If the load
fails, the state machine starts in NotActivated with all device
state discarded (RSH3h1).

Add RSH8a2 to document the atomicity requirement for the (id,
deviceSecret, deviceIdentityToken) tuple, with allowance for
split storage (e.g. platform keychain) provided the implementation
can detect when the loaded tuple diverges from what was persisted.
RSH8a2a provides a fallback for legacy data without an atomicity
mechanism.

Update RSH8a to reference RSH3h for the state machine init trigger.
Replace RSH8a1 with a stub pointing to RSH3h.

TODO: squash this branch's commits before merging, and consider
removing the RSH8a1 and RSH8j replacement stubs entirely (since
neither was in a released spec version).

Note: this change does not take a position on whether implementations
should or should not use secure-but-sometimes-unavailable storage
(e.g. a platform keychain) for sensitive device details. The intent
is to meet ably-cocoa where it is and not rule out this possibility
in the future (customers have requested the ability to store device
secrets securely [1]). We should still consider how to handle the
case where such storage is temporarily unavailable (e.g. iOS
Keychain before first unlock after reboot) — the current spec
treats this as a load failure and discards the data, but a future
enhancement could allow the state machine to wait for the storage
to become available.

[1] ably/ably-java#593

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add RSH3i (ValidatingDeviceIdentityToken state) for validating
legacy ably-cocoa data where the deviceIdentityToken may not belong
to the current device id. On CalledActivate, a non-mutating GET
to /push/deviceRegistrations/:deviceId authenticated with the token
confirms whether it matches the device id. Success re-persists with
the atomicity mechanism; server rejection discards everything and
transitions to NotActivated; transient errors remain in state.

Add RSH8a2a with ably-cocoa-specific handling for legacy data:
- RSH8a2a1: invariant checks (id/secret both present or absent,
  token only with id+secret) — violation is a load failure
- RSH8a2a2: if invariants pass and all three present, indicate to
  RSH3h2 that the token needs server validation

Update RSH3h2 to enter ValidatingDeviceIdentityToken when RSH8a2a
indicates validation is needed.

The context for this state is: ably-cocoa stores the deviceSecret
in the Keychain keyed by device id, so if a secret is present it
is guaranteed to belong to the current id. The deviceIdentityToken
is stored under a fixed key in NSUserDefaults and may belong to a
previous device id. The token is opaque to the client so we cannot
verify the binding locally.

We considered three options for handling this:

1. Validate the token with a non-mutating GET; if invalid, discard
   everything and re-register. Simple, no custom callback needed,
   but orphans the server-side registration.

2. Recreate the token by PATCHing with the deviceSecret; the server
   would return a fresh token. Preserves the registration, but
   the spec's custom registerCallback has no way to distinguish
   between POST/PUT/PATCH semantics, making it unclear whether the
   callback would do the right thing.

3. Abandon the token and use the deviceSecret for all subsequent
   authentication. Simple, but unknown consequences — the token
   exists for a reason and this would diverge from the spec's
   authentication model.

We chose option 1. The trade-off (orphaning a registration) is
acceptable: it's what already happens today with the bug, the
orphaned registration is already unusable, and it only occurs
once per device during migration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@lawrence-forooghian lawrence-forooghian changed the title [WIP] address ably-cocoa [WIP] address ably-cocoa push notification token storage issues Mar 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant