Our driver DEXT fails due to DriverKit attempting to call SetPowerState()

We are developing an IOUserSCSIParallelInterfaceController driver for a legacy controller.

After init() completes successfully, no other methods are called before a DriverKit assertion failure:

  .driver.dext) Assertion failed: (notsync || !remote || (msgid == IOService_Start_ID) || queue->OnQueue()), function Invoke, file uioserver.cpp, line 1654.

This leads immediately to "IOPCIDevice::ClientCrashed_Impl() for client ".

This appears to be because the framework is attempting a SetPowerState() during DEXT load, which is delivered off-queue and panics in OSMetaClassBase::Invoke (uioserver.cpp) before any other dext method runs.

Apple Silicon Mac, macOS 26 (DriverKit 25.1 SDK), Xcode 26.x.

Expected: The SetPowerState() power-up is delivered on the driver object's dispatch queue (or otherwise handled by the framework), allowing the dext's SetPowerState override to run, ACK via super::SetPowerState(powerFlags, SUPERDISPATCH), and proceed to UserInitializeController.

Actual behavior: init() completes (our trace logs "init" then "completed init"). Neither Start_Impl nor SetPowerState_Impl ever executes. Instead, the process fails with the assertion above, and IOPCIDevice::ClientCrashed_Impl() reports "client … does not have open session … skipping recovery".

  • The dext crash-loops ("Driver … has crashed N time(s)").

Any suggestions?

Answered by DTS Engineer in 890880022

First off, as a disclaimer, this is a crazy week for me so you may get a lot more "quick guess" than well thought out answers. If you're still stuck, then next week I can look at things more deeply.

Getting into things:

This appears to be because the framework is attempting a SetPowerState() during DEXT load, which is delivered off-queue and panics in OSMetaClassBase::Invoke (uioserver.cpp) before any other dext method runs.

A few suggestions off the top of my head:

  • Start by using IOService as your provider, NOT the SCSI stack. In my experience, it's MUCH easier to try to figure out why a DEXT that worked broke than it is to figure out "why it just doesn't work".

  • Similarly, particularly in the early stage, take VERY small steps. You can end up wasting SO much time because you did just a LITTLE too much work at once.

  • This sounds a bit like this panic.

In terms of the specifics:

Expected: The SetPowerState() power-up is delivered on the driver object's dispatch queue (or otherwise handled by the framework), allowing the dext's SetPowerState override to run, ACK via super::SetPowerState(powerFlags, SUPERDISPATCH), and proceed to UserInitializeController.

...SCSIControllerDriverKit has VERY specific expectations and rules about which DispatchQueues you'll use and how you'll use them. Watch "Modernize PCI and SCSI drivers with DriverKit" and make sure you create exactly the same queues in exactly the same places.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

The title should obviously be Our driver DEXT fails due to DriverKit attempting to call SetPowerState()

Thank you very much for your thoughtful response.

Start by using IOService as your provider, NOT the SCSI stack. In my experience, it's MUCH easier to try to figure out why a DEXT that worked broke than it is to figure out "why it just doesn't work".

Yes, rapidly learning that. Tried something part way towards that - but not all the way down to IOService as the base class, can try that next.

One thing to emphasize is that if we do not attempt to override init(), absolutely no DEXT methods get called before that assert fails.

This sounds a bit like this panic.

Indeed, was very excited to find that one before posting here! But alas, it doesn't seem to be the same sequence of events: it's immediate, not a timeout, and in the simplest case we haven't had a chance to run UserInitializeController() (or any other code) yet.

Do any declarations of queues in the header matter, or is it only about the initialization code?

Watch "Modernize PCI and SCSI drivers with DriverKit" and make sure you create exactly the same queues in exactly the same places.

Have watched it but will again, surely it must be something embarrassingly simple. (As an aside, wish the source code in that video was available).

Thanks again.

Yes, rapidly learning that.

Yeah... my rule of thumb used to be that it ALWAYS took 2 days to get a KEXT (and now DEXT) matching, loading, and unloading: not working or useful, just able to behave correctly in the "basic" case. Critically, that time didn't change with experience, nor was there any consistent reason. It just always seemed to take a while because kernel development is trifficult[1].

As a side note, that should be the first state you're trying to get to, particularly the unloading case. As in many other places, it's MUCH easier to figure out why a change you "just" made broke unloading vs. why a fully functional driver won't unload cleanly. Clean unloading also means you’re less likely to be forced to reboot.

Tried something part way towards that - but not all the way down to IOService as the base class, can try that next.

Also, coming at this from the "other" direction, "PEX8733" inside the IOPCIFamily opensource release is a full DEXT implementation. It's fairly easy to remove the guts of its implementation (which will obviously be different than your hardware) and adjust its matching, at which point you'll have a basic DEXT implementation to move forward "from".

One thing to emphasize is that if we do not attempt to override init(), absolutely no DEXT methods get called before that assert fails.

So, take a look at my "Basic introduction to DEXT Matching and Loading" post, as my guess is that your IOKitPersonalities dictionary is actually telling the kernel to do something you didn't really want it to do. My psychic guess would be this:

"Note that the common mistake many developers make is leaving "IOUserService" in place when they should have specified a family-specific subclass (case 2 above). This is an undocumented implementation detail, but if there is a mismatch between your DEXT driver ("IOUserSCSIPeripheralDeviceType00") and your kernel driver ("IOUserService"), you end up trying to call unimplemented kernel methods. When a method is "missing" like that, the codegen system ends up handling that by returning kIOReturnUnsupported."

...but there are lots of ways the world can blow up. Speaking of which, what was the panic's own description and panic stack?

As a side note, the panic trace actually contains a stack trace from every thread on the system, which can actually be symbolicated using the (somewhat complicated) process described in this post. And, yes, we should have a tool/script for this and, yes, you should file a bug asking for that.

If you're still stuck, then I'd need to see your IOKitPersonalities dictionary and the panic info I mentioned above. If you want me to take a look, you can post that here or file a bug and post the bug number here.

Do any declarations of queues in the header matter, or is it only about the initialization code?

There are some queue declarations in the header that can matter, but the easiest way to solve that is to simplify the contents of the header to the point that there isn't anything to fail. PCIDriverKitPEX8733.iig only declares 5 methods, and at this point you don't even need "InterruptOccurred".

[1] "Tricky and Difficult", Alexander J. Elliott

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thank you so much for your help. This has gotten us over the problem (to other problems, but better ones).

To recap:

  1. our IOUserSCSIParallelInterfaceController DEXT was matching, loading and getting its init() method called
  2. but then hit this assert - DriverKit assert failure - .driver.dext) Assertion failed: (notsync || !remote || (msgid == IOService_Start_ID) || queue->OnQueue()), function Invoke, file uioserver.cpp, line 1654. - sorry, I haven't had a chance to get the stacktrace you asked for yet
  3. it appears that the assert fails because the kernel was unhappy trying to dispatch a SetPowerState call to the wrong or no queue, before the driver was fully initialized
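
To make point 3 concrete, here is a minimal sketch that models the boolean from the assertion message as plain C++. It is not DriverKit code: InvokeContext, kStartMsgId and kSetPowerStateMsgId are hypothetical stand-ins, and the meaning attached to each flag is an assumption read off the names in the log, not something confirmed by the framework source.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the message IDs named in the assertion text.
// The real values are defined inside DriverKit and are not reproduced here.
constexpr std::uint64_t kStartMsgId = 1;          // stands in for IOService_Start_ID
constexpr std::uint64_t kSetPowerStateMsgId = 2;  // any message other than Start

// Hypothetical bundle of the four terms that appear in the assertion.
struct InvokeContext {
    bool notsync;         // assumed: the call does not require queue-synchronous delivery
    bool remote;          // assumed: the call originates outside the dext process
    std::uint64_t msgid;  // which method is being invoked
    bool onQueue;         // assumed: what queue->OnQueue() would return
};

// Mirrors the expression from the log:
// (notsync || !remote || (msgid == IOService_Start_ID) || queue->OnQueue())
bool invokeAssertionHolds(const InvokeContext& c) {
    return c.notsync || !c.remote || (c.msgid == kStartMsgId) || c.onQueue;
}
```

Under this reading, invokeAssertionHolds({false, true, kSetPowerStateMsgId, false}) is false, which is the only failing combination: a synchronous remote call other than Start arriving off the object's queue. The same call with onQueue set to true, or a Start call off-queue, passes.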

This led to a lot of questions and debugging about initialization, queues and C++ inheritance.

as my guess is that your IOKitPersonalities dictionary is actually telling the kernel to do something you didn't really want it to do. My psychic guess would be this

But instead, your answer here was exactly the problem! The IOKitPersonalities were wrong, which apparently (but not obviously to a new user) controls which methods the kernel tries to call in the driver. I mean, it makes sense in retrospect but it's trifficult as you say.

For SCSI, it needs to be:

  CFBundleIdentifierKernel: com.apple.iokit.IOSCSIParallelFamily
  IOClass: IOUserSCSIParallelInterfaceController
  IOProviderClass: IOPCIDevice
  IOUserClass: [ourclassname]

We had:

  IOClass: IOService
  IOProviderClass: IOPCIDevice
  IOUserClass: [ourclassname]

so missing the CFBundleIdentifierKernel and the wrong IOClass.
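
The kind of check that would have caught this can be sketched as a small standalone lint, assuming the personality has already been read into a string map. This is not an Apple tool or API: Personality and checkScsiPersonality are hypothetical names, and the expected values are only the two keys quoted above for the SCSI parallel controller case.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory form of one IOKitPersonalities entry (key -> value).
using Personality = std::map<std::string, std::string>;

// Returns a list of human-readable problems; an empty list means the two
// kernel-side keys match what this thread found was needed for a
// SCSI parallel interface controller dext.
std::vector<std::string> checkScsiPersonality(const Personality& p) {
    const Personality expected = {
        {"CFBundleIdentifierKernel", "com.apple.iokit.IOSCSIParallelFamily"},
        {"IOClass", "IOUserSCSIParallelInterfaceController"},
    };
    std::vector<std::string> problems;
    for (const auto& kv : expected) {
        const auto it = p.find(kv.first);
        if (it == p.end()) {
            problems.push_back("missing " + kv.first + " (expected " + kv.second + ")");
        } else if (it->second != kv.second) {
            problems.push_back(kv.first + " is " + it->second + ", expected " + kv.second);
        }
    }
    return problems;
}
```

Fed the "We had" personality above, this reports two problems (the missing CFBundleIdentifierKernel and the IOClass mismatch); fed the corrected one, it reports none.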

Such a silly thing really, especially since it tells you exactly what to put in there at the very start of the IOUserSCSI documentation... but as mitigating circumstances:

  1. most of the examples out there are stripped-down examples (like the PEX8733), not the SCSI case
  2. the DEXT did match and start running, so it was easy to think that all of the .plist issues were solved
  3. it would've been easier to figure out if the kernel (or Xcode) provided some validation of the C++ vtables against the .plist declarations - it never occurred to me to.

There was an erroneous bias towards looking at the .iig and C++ code, I think, rather than the property files... Hopefully some of this is helpful to others in the future.

Anyway, thanks again for your help. I'm sure there will be more questions with the next set of problems.
