TKTokenDriverConfiguration becomes permanently unusable after ctkd process restart

Background

We're building a macOS application that acts as a CryptoTokenKit software token. The architecture follows the documented pattern: a container app (a long-running agent process) manages token registration and identity updates via TKTokenDriverConfiguration, and a separate appex extension process handles the actual signing operations for client sessions.

What we're doing

At agent startup, the container app calls [TKTokenDriverConfiguration driverConfigurations] to obtain our token driver, then registers a token instance ID:

NSDictionary<TKTokenDriverClassID, TKTokenDriverConfiguration *> *driverConfigurations = [TKTokenDriverConfiguration driverConfigurations];
TKTokenDriverConfiguration *driver = /* first value from driverConfigurations */;

[driver addTokenConfigurationForTokenInstanceID:@"setoken"];

When the agent renews a certificate, it pushes updated TKTokenKeychainItem objects to ctkd by setting keychainItems on the TKTokenConfiguration:

TKTokenConfiguration *tokenCfg = driver.tokenConfigurations[@"setoken"];
tokenCfg.keychainItems = updatedItems;

This works correctly during normal operation.
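
For context, here is a minimal sketch of how an array like updatedItems could be assembled. It is an illustration under stated assumptions, not the code from the app described here: it assumes a SecCertificateRef for the renewed certificate is already in hand, and the helper name KeychainItemsForCertificate and the object IDs @"cert-1" and @"key-1" are hypothetical placeholders.

```objc
#import <CryptoTokenKit/CryptoTokenKit.h>
#import <Security/Security.h>

// Hypothetical helper: builds one certificate item and one key item for a
// renewed certificate. Returns nil if the certificate is missing or if
// either item cannot be created from it.
static NSArray<TKTokenKeychainItem *> *KeychainItemsForCertificate(SecCertificateRef certificate) {
    if (certificate == NULL) {
        return nil;
    }

    // Object IDs are placeholders; a real token would use whatever IDs its
    // extension expects when it later looks these objects up.
    TKTokenKeychainCertificate *certItem =
        [[TKTokenKeychainCertificate alloc] initWithCertificate:certificate
                                                       objectID:@"cert-1"];
    TKTokenKeychainKey *keyItem =
        [[TKTokenKeychainKey alloc] initWithCertificate:certificate
                                               objectID:@"key-1"];
    if (certItem == nil || keyItem == nil) {
        return nil;
    }

    // Advertise the key as usable for signing, which is the operation the
    // extension handles in this architecture.
    keyItem.canSign = YES;

    return @[certItem, keyItem];
}
```

The result would be checked for nil before being assigned to tokenCfg.keychainItems.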

The failure

When ctkd is restarted (e.g., killall ctkd, or the system restarts the daemon), all subsequent calls through the existing TKTokenDriverConfiguration reference silently fail. Specifically:

  1. [TKTokenDriverConfiguration driverConfigurations] returns the same stale object - it does not establish a new connection to the newly-started ctkd process. There is no error, no exception, and no indication the returned object is invalid.
  2. driver.tokenConfigurations[@"setoken"] still returns a non-nil value reflecting the pre-restart state - so any nil check intended to detect "token not registered with ctkd" does not fire.
  3. [driver addTokenConfigurationForTokenInstanceID:@"setoken"] appears to succeed (no error) but the token is not actually registered with the new ctkd instance.
  4. Setting tokenCfg.keychainItems = updatedItems appears to succeed but the new ctkd instance has no knowledge of the update.

The only reliable recovery we've found is restarting the container app process itself, at which point [TKTokenDriverConfiguration driverConfigurations] returns a fresh object connected to the new ctkd instance.

What we've investigated

  • There is no public API on TKTokenDriverConfiguration to invalidate or refresh the internal XPC connection to ctkd
  • TKTokenWatcher can observe token insertions/removals, but we found no documented way to use it to detect a ctkd process restart specifically
  • The NSXPCConnection invalidation handler pattern is not accessible through the TKTokenDriverConfiguration abstraction
  • Moving credential management into the appex extension. Because the extension is recreated when the ctkd process restarts, we are able to update keychainItems from the extension. However, this comes with its own set of problems: the extension is ephemeral, and using the keychain APIs directly from the extension is not well documented and does not appear to be a supported pattern.
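
The TKTokenWatcher avenue mentioned above could take the form of a point-in-time probe like the sketch below. It rests on two assumptions that this thread does not confirm: that token IDs have the form "<driver class ID>:<instance ID>", and that a newly created TKTokenWatcher reflects the current system state in its tokenIDs property. The helper name TokenIsListedBySystem is hypothetical. The probe does not detect a ctkd restart as such; it only reports whether the token is currently listed.

```objc
#import <CryptoTokenKit/CryptoTokenKit.h>

// Hypothetical helper: returns YES if a token with the expected ID is
// currently listed by a fresh TKTokenWatcher.
static BOOL TokenIsListedBySystem(TKTokenDriverClassID classID,
                                  TKTokenInstanceID instanceID) {
    // Assumption: token IDs are "<driver class ID>:<instance ID>".
    NSString *expectedTokenID =
        [NSString stringWithFormat:@"%@:%@", classID, instanceID];

    // Use a new watcher for each probe rather than caching one, so the
    // probe does not depend on a long-lived object.
    TKTokenWatcher *watcher = [[TKTokenWatcher alloc] init];
    return [watcher.tokenIDs containsObject:expectedTokenID];
}
```

A container app might call TokenIsListedBySystem(driver.classID, @"setoken") after addTokenConfigurationForTokenInstanceID: and treat NO as a hint, not proof, that the driver reference has gone stale.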

Questions

  1. Is there a supported API to detect that ctkd has restarted and that the existing TKTokenDriverConfiguration reference is no longer valid?
  2. Is there a supported way to obtain a fresh TKTokenDriverConfiguration without restarting the container app?
  3. Should the container app be re-architected to avoid holding long-lived TKTokenDriverConfiguration references?
Answered by DTS Engineer in 882222022

I recommend that you file a bug about this. Once you’re done, please post your bug number, just for the record.

In general, frameworks that use XPC to talk to a daemon should recover if the daemon terminates for some reason. Now, you manually killing the daemon is an extraordinary thing, but I understand that you’re doing that just to illustrate your point. ctkd could be terminated by the system for other reasons [1], or it could crash, and the framework should handle that.

Are you actually seeing this in practice? I mean, without you manually killing the daemon?

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] Notably, both the daemon and agent variants specifically opt in to memory pressure termination:

% plutil -p /System/Library/LaunchDaemons/com.apple.ctkd.plist
{
  "EnablePressuredExit" => true
  …
}
% plutil -p /System/Library/LaunchAgents/com.apple.ctkd.plist
{
  "EnablePressuredExit" => true
  …
}

Hey Quinn, thanks for the response. I will open a bug and post the bug number.

We are actually seeing this happen in practice. I only recently started correlating it with ctkd restarts. The behavior we see is that the XPC connection from our container app loses connectivity to ctkd. It happens after running without issue for some time, and it's not deterministic.

We are actually seeing this happen in practice.

OK.

it's not deterministic.

Right. It’s hard to say what’s going on without more logging, but the most likely cause is a memory pressure exit, and that’s very non-deterministic.

I will open a bug and post the bug number.

Much appreciated.

Ideally this would include a sysdiagnose log taken:

  • On a device with extra CryptoTokenKit logging enabled (see below)
  • Shortly after seeing the problem

But that might be hard to get, so you should feel free to file a bug with only your ‘kill ctkd’ step as evidence.

On the logging front, our Bug Reporting > Profiles and Logs only has a CryptoTokenKit profile for iOS. However, you can install that profile on macOS, and I think it’ll just work. Do that by navigating to System Settings > General > Device Management, clicking the add (+) button under the list, and choosing the profile in the file selection sheet.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
