Audio

Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.

.sf3 sound font

...when will they be supported natively?

Media Technologies Audio

AVPlayerNode + AVAudioEngine

What's the best way to detect whether a player node or audio engine is in a broken state (for example, as a result of AVAudioSession.mediaServicesWereResetNotification)? Sometimes it needs to be reset, but from time to time resetting alone is not enough, for example:

    playerNode.reset()
    engine.reset()

and the objects also have to be reinitialized:

    playerNode.reset()
    playerNode = .init()
    engine.reset()
    engine = .init()

Can we subscribe to the broken states and reinitialize the objects proactively?
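One way to approach the question above is to stop trying to detect a "broken" flag and instead rebuild the objects whenever the system announces a reset. The sketch below keeps the player node and engine behind a small holder that can recreate them from a factory. `Rebuildable` and `AudioGraph` are hypothetical names invented for illustration. The iOS-only part assumes the graph is rebuilt on `AVAudioSession.mediaServicesWereResetNotification`, the notification named in the post; it does not claim to cover every failure mode.

```swift
import Foundation

/// Hypothetical helper: owns a value that can be discarded and recreated.
final class Rebuildable<Value> {
    private let make: () -> Value
    private(set) var value: Value
    private(set) var rebuildCount = 0

    init(make: @escaping () -> Value) {
        self.make = make
        self.value = make()
    }

    /// Drops the current value and creates a fresh one from the factory.
    func rebuild() {
        value = make()
        rebuildCount += 1
    }
}

#if os(iOS)
import AVFAudio

/// Hypothetical bundle of the objects that get recreated together.
final class AudioGraph {
    let engine = AVAudioEngine()
    let playerNode = AVAudioPlayerNode()

    init() {
        engine.attach(playerNode)
        engine.connect(playerNode, to: engine.mainMixerNode, format: nil)
    }
}

/// Rebuilds the whole graph when media services are reset.
/// Keep the returned token alive for as long as observation is needed.
func rebuildOnMediaServicesReset(_ graph: Rebuildable<AudioGraph>) -> NSObjectProtocol {
    NotificationCenter.default.addObserver(
        forName: AVAudioSession.mediaServicesWereResetNotification,
        object: nil,
        queue: .main
    ) { _ in
        graph.rebuild()
    }
}
#endif
```

After a rebuild, the audio session configuration and any scheduled buffers would also need to be reapplied, since the new engine starts empty.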

Media Technologies Audio

kAudioUnitProperty_ParameterList vs AUParameterTree

This is kind of related to my other question, where I think AUv3's KVO to AUv2 C API notification mapping doesn't seem to work. For example, Reaper supports AUv3 plug-ins but I think it uses AUv2 APIs, and the developer told me that they observe these: kAudioUnitEvent_PropertyChange / kAudioUnitScope_Global / kAudioUnitProperty_ParameterList If one of them changes, they will rescan the parameter list/tree. My AUv3 Mela has a dynamic parameter tree, and changes based on the preset loaded, and it works well enough in Logic Pro for iPad/Mac and AUM on iOS. I know it sends out KVOs for the parameterTree property and I've also tried sending out allParameterValues KVOs but it seems Reaper is not getting these notifications. Anything I can do? (Reaper forum discussion, check last few messages https://forum.cockos.com/showthread.php?t=300840)

Media Technologies Audio

Are there plans to evolve the AUv3 API?

AUv3 was introduced back when iOS 9 came out, and it has only had very minor updates since. In a way, it didn't seem important enough for the desktop world, which continues to use AUv2, and only iOS musicians/developers embraced AUv3. Even Logic Pro doesn't feel like it fully embraces it. It is almost as if there was no good enough reason for the desktop world to switch over. Also, because of this, I think many bugs remain. One of the bugs I keep hitting, I think, is an AUv2-to-AUv3 API mapping issue: AUv3 uses KVO for certain properties, and that somehow doesn't work well if the host uses the C API. Here's one example: https://developer.apple.com/forums/thread/828549

Media Technologies Audio

a problem

Handling multilingual text. When a user selects a block of text that is mostly Chinese but contains embedded English words (e.g., technical terms in parentheses), the system reader often stutters, stops, or skips the English entirely. What is the best way to handle mixed-language text processing so that the speech engine can seamlessly and fluidly read Chinese and English together without dropping words?
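One possible approach to the question above is to split the selection into runs by script before speaking, so that each run can be given a voice for its own language. The sketch below is pure Swift and only does the splitting. The `Run` type, the `splitByLanguage` function, and the fixed language codes "zh-CN" and "en-US" are assumptions made for illustration; whether per-run utterances actually remove the stutter is not verified here.

```swift
enum Script { case han, latin, other }

struct Run: Equatable {
    let language: String
    let text: String
}

/// Very rough script classification based on Unicode ranges.
func script(of scalar: Unicode.Scalar) -> Script {
    switch scalar.value {
    case 0x4E00...0x9FFF, 0x3400...0x4DBF:
        return .han      // CJK Unified Ideographs and Extension A
    case 0x41...0x5A, 0x61...0x7A:
        return .latin    // ASCII letters
    default:
        return .other    // digits, spaces, punctuation
    }
}

/// Splits text into consecutive runs that share a language.
/// Neutral characters (spaces, punctuation) stay with the current run.
func splitByLanguage(_ text: String, defaultLanguage: String = "zh-CN") -> [Run] {
    var runs: [Run] = []
    var current = ""
    var currentLanguage = defaultLanguage

    for character in text {
        let kind = character.unicodeScalars.first.map(script(of:)) ?? .other
        let language: String
        switch kind {
        case .han: language = "zh-CN"
        case .latin: language = "en-US"
        case .other: language = currentLanguage
        }
        if language != currentLanguage && !current.isEmpty {
            runs.append(Run(language: currentLanguage, text: current))
            current = ""
        }
        currentLanguage = language
        current.append(character)
    }
    if !current.isEmpty {
        runs.append(Run(language: currentLanguage, text: current))
    }
    return runs
}
```

Each run could then become its own AVSpeechUtterance with `voice = AVSpeechSynthesisVoice(language: run.language)`, queued in order on a single AVSpeechSynthesizer.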

Media Technologies Audio

Native diarization in '27?

I'm working on a macOS transcription utility that uses Apple's Speech framework (SpeechAnalyzer) for speech-to-text. This is for meetings/interviews/podcasts where speaker identification is critical. The current limitation is speaker attribution: I need to identify which speaker is producing each segment of transcribed text. I have three questions:

Native diarization in iOS 27 / macOS Golden Gate
Is native diarization coming in the fall release? I've reviewed the WWDC 2026 session catalog and found no mention of diarization in SpeechAnalyzer or elsewhere. I'm probably going to use FluidAudio for speaker attribution, but I'd strongly prefer a native solution if one exists or is planned. Do I need to stay with third-party libraries, or is this coming?

Core AI and custom models
The new Core AI framework was announced for on-device model deployment. Can I train or integrate a custom diarization model via Core AI? If yes, are there sample implementations or documentation for audio-processing models?

Core Audio framework updates
Were there any Core Audio API-level additions announced at WWDC 2026 that might support audio analysis or speaker detection downstream? I saw no dedicated session, but wanted to verify.

Thanks for any guidance on this.

Media Technologies Audio

How to Fix the Emotionless and Cold Tone of Machine-Read Text?

I am designing an educational app. I notice that current system text-to-speech (like AVSpeechSynthesizer) often sounds too mechanical because the time intervals between characters are strictly equal, so it lacks natural human prosody, phrasing, and warmth, which is a huge dealbreaker for sensitive users like children. How can we customize text-to-speech to break this uniform word spacing, manage prosody dynamically, and make the AI voice sound more emotionally engaging and natural rather than like a cold robot? I really want to create an elegant listening experience that feels like real human storytelling, not just machine reading.
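One avenue that may be worth trying for the question above is SSML, which lets the text carry per-phrase rate and pause hints; AVSpeechUtterance has an `init?(ssmlRepresentation:)` initializer on iOS 16 and later. The sketch below only builds the SSML string in pure Swift. The `Phrase` type and the `ssml(for:)` function are hypothetical, and how much of the markup a given system voice honors is not something this sketch verifies.

```swift
import Foundation

/// Hypothetical description of one spoken phrase.
struct Phrase {
    let text: String
    let rate: String        // SSML prosody rate, e.g. "90%" or "slow"
    let pauseAfterMs: Int   // silence inserted after the phrase
}

/// Escapes the characters that are special in XML text and attributes.
func xmlEscaped(_ text: String) -> String {
    text.replacingOccurrences(of: "&", with: "&amp;")
        .replacingOccurrences(of: "<", with: "&lt;")
        .replacingOccurrences(of: ">", with: "&gt;")
        .replacingOccurrences(of: "\"", with: "&quot;")
}

/// Wraps each phrase in a prosody element followed by a break.
func ssml(for phrases: [Phrase]) -> String {
    let body = phrases.map { phrase in
        "<prosody rate=\"\(xmlEscaped(phrase.rate))\">\(xmlEscaped(phrase.text))</prosody>"
            + "<break time=\"\(phrase.pauseAfterMs)ms\"/>"
    }.joined()
    return "<speak>\(body)</speak>"
}
```

Without SSML, similar variation can be approximated by splitting the story into several utterances and varying the `rate`, `pitchMultiplier`, `preUtteranceDelay`, and `postUtteranceDelay` properties of each AVSpeechUtterance.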

Media Technologies Audio

Native diarization in '27?

I'm working on a macOS transcription utility that uses Apple's Speech framework (SpeechAnalyzer) for speech-to-text. This is for meetings/interviews/podcasts where speaker identification is critical. The current challenge is speaker attribution: I need to identify which speaker is producing each segment of transcribed text, and Apple doesn't support this in '26. I have three questions:

Is native diarization coming in the fall release? I've reviewed the WWDC 2026 session catalog and found no mention of diarization in SpeechAnalyzer or elsewhere. I'm probably going to use FluidAudio for speaker attribution, but strongly prefer a native solution if one exists or is planned. Do I need to stay with third-party libraries, or is this coming?

Core AI and custom models
The new Core AI framework was announced for on-device model deployment. Can I train or integrate a custom diarization model via Core AI? If yes, are there sample implementations or documentation for audio-processing models?

Core Audio framework updates
Were there any Core Audio API-level additions announced at WWDC 2026 that might support audio analysis or speaker detection downstream? I saw no dedicated session, but wanted to verify.

Thanks for any guidance on this.

Media Technologies Audio

Real-time Audio Analysis of Audio Played by Other Apps on iPhone

I’m evaluating a simple iOS application that would perform real-time beat detection and audio analysis. My question is: Can an App Store-compliant iOS application access or analyze audio that is being played by other applications on the same device (e.g. Spotify, Apple Music, YouTube, TikTok, Safari, etc.) in real time, without using the microphone?

Specifically:

1. Is there any Apple-supported framework that allows access to system audio for real-time beat detection or frequency analysis?
2. Can ReplayKit be used to analyze audio buffers from other applications in real time without recording or saving the audio?
3. If direct access is not permitted, what Apple-approved architecture would be recommended for synchronizing external hardware with music being played on the iPhone?
4. Would such an implementation be acceptable under App Store Review Guidelines?

I am trying to determine whether real-time beat detection from audio played by other apps is technically and policy-wise supported on iOS. Thank you.

Media Technologies Audio Audio ReplayKit AVFoundation iOS

148

Push Notification sounds with AVAudioSession, AVAudioEngine

I am using AVAudioSession, AVAudioEngine and SpeechAnalyzer to listen to commands, including when the phone is locked. At the same time, I can receive push notifications with a pre-defined sound. However, the pre-defined sound is not played when the AVAudioEngine is running and the phone is locked. With the code below I have run many experiments, all of them "Receive push notification while the phone is locked", and I have the following results:

1. If audioEngine has started, I only see the alert, but hear no sound.
2. If I comment out audioEngine.start, all works as expected and I hear the APNs sound on the speaker.
3. If I change the AVAudioSession category to 'record', I don't receive the push message at all!

I wonder if anyone has seen this. Here is my code:

    private func doStartListening() async {
        print("SpeechService: doStartListening called")
        guard !audioEngine.isRunning else {
            print("SpeechService: Audio engine already running")
            return
        }

        do {
            try configureAudioSession()

            let recordingFormat = audioEngine.inputNode.outputFormat(forBus: 0)
            audioEngine.inputNode.removeTap(onBus: 0)

            guard let locale = await SpeechTranscriber.supportedLocale(equivalentTo: Locale(identifier: "en-US")) else {
                print("English is not supported on this device")
                return
            }
            let transcriber = SpeechTranscriber(locale: locale, preset: .transcription)

            if let installationRequest = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) {
                try await installationRequest.downloadAndInstall()
            }

            let (inputSequence, inputBuilder) = AsyncStream.makeStream(of: AnalyzerInput.self)
            let audioFormat = await SpeechAnalyzer.bestAvailableAudioFormat(compatibleWith: [transcriber])

            // Initialize the modern SpeechAnalyzer
            let analyzer = SpeechAnalyzer(modules: [transcriber])
            self.analyzer = analyzer

            task = Task {
                print("SpeechService: Starting analyzer results loop")
                do {
                    for try await result in transcriber.results {
                        if Task.isCancelled { break }
                        self.handleAnalyzerResult(result)
                    }
                } catch {
                    print("SpeechService: Analyzer error: \(error.localizedDescription)")
                    let nsError = error as NSError
                    if nsError.domain == "kAFAssistantErrorDomain" && nsError.code == 203 {
                        self.addLog(NSLocalizedString("error_siri_disabled", comment: ""))
                        Task { await self.stopListening() }
                    } else if self.isListening {
                        self.restartRecognition()
                    }
                }
            }

            audioEngine.inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { [weak self] buffer, _ in
                // Bail out instead of force-unwrapping if the service has gone away.
                guard let self, let audioFormat else { return }
                do {
                    let converted = try self.converter.convertBuffer(buffer, to: audioFormat)
                    inputBuilder.yield(AnalyzerInput(buffer: converted))
                } catch {
                    print("Exception when converting audio")
                }
            }

            audioEngine.prepare()
            try audioEngine.start()
            print("SpeechService: Audio engine started")

            try await analyzer.start(inputSequence: inputSequence)

            isListening = true
            addLog(NSLocalizedString("waiting_wakeup", comment: ""))
        } catch {
            print("SpeechService: Error starting listening: \(error.localizedDescription)")
            addLog("Error starting listening: \(error.localizedDescription)")
            lastError = error.localizedDescription
            isListening = false
        }
    }

    private func configureAudioSession() throws {
        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(.playAndRecord, mode: .default, options: [.mixWithOthers, .defaultToSpeaker])
        try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
    }

Media Technologies Audio APNS Media Player AVAudioSession AVAudioEngine

430

Entitlement "com.apple.developer.carplay-driving-task" not allowing audio playback for voice controlled interaction

According to https://developer.apple.com/download/files/CarPlay-Developer-Guide.pdf, apps with the entitlement com.apple.developer.carplay-driving-task are allowed to use voice control. In my current implementation the voice recording is working fine, but the voice response (AVPlayer with category "playback" set) does not output any audio. I suspect that it is an entitlement limitation, because if I quickly tap to play some music while the voice assistant AVPlayer is "playing", then I can hear the response; without this trick it keeps playing but stays muted. In parallel I have now requested the com.apple.developer.carplay-voice-based-conversation entitlement, but I don't even know whether, once it is approved, I will be able to use two entitlements for the same CarPlay app. Long story short:

1 - Should an app be able to play audio responses when its CarPlay entitlement is com.apple.developer.carplay-driving-task?
2 - If not, can I combine the entitlements com.apple.developer.carplay-driving-task and com.apple.developer.carplay-voice-based-conversation?

Media Technologies Audio CarPlay Audio Siri and Voice

865

macOS 26 – NSSound/CoreAudio causes SIGILL crash in caulk allocator

Hi everyone,

We are the engineering team behind an enterprise communications application for macOS. We are experiencing a critical crash on macOS 26 that did not occur on any previous macOS version. We are seeking clarification from Apple engineers or anyone who may have insight into this behaviour.

Environment

    Architecture:    x86_64
    macOS:           26.4.1 (25E253)
    Hardware:        Mac15,13 (MacBook Pro)
    Exception:       SIGILL / ILL_ILLOPC
    Crashed Thread:  Thread 0 (Main Thread)
    Trigger:         Playing a notification sound via NSSound during an incoming call

Crash Stack

    0   caulk             consolidating_free_map::maybe_create_free_node + 119   ← SIGILL
    1   caulk             tiered_allocator + 1469
    2   caulk             exported_resource::do_allocate + 15
    3   AudioToolboxCore  EABLImpl::create + 204
    4   CoreAudio         AUNotQuiteSoSimpleTimeFactory + 33267
    8   AudioToolboxCore  AudioUnitInitialize + 189
    9   AudioToolbox      XAudioUnit::Initialize + 19
    10  AudioToolbox      MESubmixGraph::initialize + 125
    11  AudioToolbox      MESubmixGraph::connectInputChannel + 1172
    12  AudioToolbox      MEDeviceStreamClient::AddRunningClient + 509
    15  AudioToolbox      AudioQueueObject::StartRunning + 194
    16  AudioToolbox      AudioQueueObject::Start + 1447
    22  AudioToolbox      AQ::API::V2Impl::AudioQueueStartWithFlags + 805
    23  AVFAudio          AVAudioPlayerCpp::playQueue + 354
    24  AVFAudio          AVAudioPlayerCpp::DoAction + 134
    25  AVFAudio          -[AVAudioPlayer play] + 26
    26  AppKit            -[NSSound play] + 100
    27  Our App           -[AudioHelper tryToStartSound:ofType:] + 569
    28  Our App           block_invoke + 59

Behaviour Difference Between macOS Versions

The exact same code path that triggers this crash on macOS 26 works without any issue on macOS 14 and macOS 15: no crash, no warning, no log output of any kind. The crash occurs inside Apple's private caulk memory allocator during CoreAudio audio engine initialisation, triggered by a call to [NSSound play]. The SIGILL / ILL_ILLOPC at maybe_create_free_node + 119 suggests a hard ud2 trap, an intentional abort guard inserted at compile time. This strongly suggests that something changed in macOS 26 within NSSound / CoreAudio / caulk that causes this code path to fail in a way it previously did not.

Questions

We have the following specific questions:

1. Was there a deliberate threading policy change in NSSound / CoreAudio in macOS 26?
2. Is the SIGILL in caulk::consolidating_free_map::maybe_create_free_node an intentional thread-affinity assertion introduced in macOS 26?
3. Are there any other NSSound / AVAudioPlayer / AudioQueue APIs that have similarly tightened their requirements in macOS 26 that we should be aware of?
4. Is there a migration guide, release note, or WWDC session that covers CoreAudio changes in macOS 26 that we may have missed?
5. Has anyone else in the developer community encountered a similar SIGILL crash in caulk on macOS 26 during audio playback?

Media Technologies Audio AudioToolbox Audio Sound and Haptics AVFoundation

2.5k

SIGILL crash in AudioToolbox/caulk during AudioQueue creation on macOS 26.4.1 (Apple Silicon + Rosetta)

Product: macOS
Version: macOS 26.4.1 (25E253)
Area: Audio / AVFoundation / AudioToolbox

Summary:
We are observing a reproducible crash during audio playback initialization in our macOS application on Apple Silicon systems running macOS 26.4.1. The crash occurs inside Apple audio frameworks while creating an AudioQueue through AVAudioPlayer/NSSound APIs.

Environment:
Application: Avaya Workplace 3.41.0
Hardware: Apple Silicon (Mac14,7)
OS: macOS 26.4.1
Application architecture: x86_64 running under Rosetta
Frameworks involved: AppKit (NSSound), AVFAudio, AudioToolbox, caulk

Crash Type: SIGILL (ILL_ILLOPC)

Observed Stack:

    -[NSSound play]
    AVAudioPlayer play
    AudioQueueNewOutput
    AudioConverterNewWithOptions
    caulk::alloc::consolidating_free_map::maybe_create_free_node

Details:
The crash occurs while attempting to start ringtone/notification playback from the application. The failure happens during AudioQueue initialization, before actual playback begins. The crashing thread consistently shows: caulk, AudioToolboxCore, AudioToolbox, AVFAudio, AppKit, Application audio helper. We also observed similar AudioQueue initialization stacks on multiple threads, which may indicate concurrent audio queue initialization.

Questions:
1. Is there any known regression in AudioToolbox/AVFAudio/caulk on macOS 26.4.1 affecting x86_64 applications running under Rosetta?
2. Are there known limitations or unsupported scenarios involving AudioQueue creation from Rosetta-translated applications?
3. Are there recommended alternatives or mitigations for NSSound/AVAudioPlayer usage on macOS 26?

Reproduction:
1. Launch the application on an Apple Silicon Mac.
2. Trigger ringtone/notification playback.
3. The application intermittently crashes during AudioQueue initialization.

Additional Notes:
The crash is intermittent but reproducible in customer environments. The application currently uses NSSound/AVAudioPlayer for ringtone playback. We are also investigating whether concurrent sound initialization may contribute to the issue.
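The report above mentions that concurrent sound initialization is being investigated as a possible contributor. One way to test that hypothesis is to funnel every playback start through a single serial queue so that two starts can never overlap. The sketch below shows only that funnel in pure Swift. `SerialSoundStarter` is a hypothetical helper, the closure stands in for the real `play` call, and nothing here confirms that serializing the calls avoids the crash.

```swift
import Foundation

/// Hypothetical helper: runs every "start playback" request one at a time.
final class SerialSoundStarter: @unchecked Sendable {
    // A serial queue executes its work items strictly one after another.
    private let queue = DispatchQueue(label: "example.sound-start.serial")

    /// Enqueues a playback start; callers may be on any thread.
    func start(_ play: @escaping @Sendable () -> Void) {
        queue.async(execute: play)
    }

    /// Blocks until everything enqueued so far has finished.
    func waitUntilIdle() {
        queue.sync {}
    }
}
```

If the underlying API turns out to require the main thread, the same idea applies with `DispatchQueue.main` in place of the private queue.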

Media Technologies Audio AudioToolbox AVAudioEngine AVKit AVFoundation

177

Public API for controlling AirPods listening modes on macOS?

Hello, I am developing a macOS menu bar app, and I would like to let users switch AirPods listening modes from within the app, such as Transparency mode or Noise Cancellation. I reviewed Apple’s official documentation and the macOS SDK public headers for AVFoundation, AVFAudio, CoreBluetooth, IOBluetooth, MediaPlayer, Shortcuts/App Intents, and audio routing APIs, but I could not find a documented public API that allows a third-party macOS app to directly set AirPods listening modes. Is there any public, supported API, entitlement, or Apple-recommended integration path for implementing this feature? If no such public API exists, should third-party macOS apps treat direct AirPods listening mode control as unsupported, and only guide users to change the setting themselves through system UI or Shortcuts? I would like to implement this using supported APIs and avoid relying on undocumented or private APIs. Thank you.

Media Technologies Audio

Issues with monitoring and changing WebRTC audio output device in WKWebView

I am developing a VoIP app that uses WebRTC inside a WKWebView.

Question 1: How can I monitor which audio output device WebRTC is currently using? I want to display this information in the UI for the user.

Question 2: How can I change the current audio output device for WebRTC?

I am using a JS Bridge to Objective-C code, attempting to change the audio device with the following code:

    void set_speaker(int n)
    {
        session = [AVAudioSession sharedInstance];
        NSError *err = nil;
        if (n == 1) {
            [session overrideOutputAudioPort:AVAudioSessionPortOverrideSpeaker error:&err];
        } else {
            [session overrideOutputAudioPort:AVAudioSessionPortOverrideNone error:&err];
        }
    }

However, this approach does not work. I am testing on an iPhone with iOS 16.7. Is a higher iOS version required?

Media Technologies Audio

588

audiomxd PVM misclassifies CarPlay head unit as BTHeadphones — CarPlay never activates despite successful iAP2 auth

iPhone 16 Pro, iOS 26.5 (23F77), Nissan Sentra 2022, USB-C wired CarPlay.

CarPlay fails every attempt despite the iAP2 handshake completing successfully. Same cable + same car works with a different iPhone. After digging through sysdiagnose logs I found the root cause in audiomxd. The head unit's Bluetooth MAC (BC:42:8C:B8:06:AF) is permanently stored in the PVM as HeadphonesBT:

    PVM: Route [58] = HeadphonesBT~BC:42:8C:B8:06:AF

This causes FigRoutingManagerProcessCustomizedRouting to return:

    isPortOfTypeCarPlayAtIndex = NO
    modelID = BTHeadphones4236,19521

And vaemProcessCarPlayCustomizedRouting never adds a starkPort:

    portsToAdd[0] = AirPlayHandoffDevice
    portsToAdd does NOT contain a starkPort

Without a starkPort the audio routing never activates. The endpoint gets published under endpointManager=Bluetooth as BluetoothHFPInput/Output instead of CarPlay. Disconnect follows with errors -16723 and -16617.

This survives Forget Device, Forget Car, Reset Network Settings, and reboots. The only thing that clears it is Reset All Settings.

Two questions:

1. Is there a supported way to purge audiomxd's PVM for a specific MAC address without Reset All Settings?
2. Is BTHeadphones4236,19521 a known misclassification for Nissan head units in iOS 26?

Sysdiagnose attached to Feedback Assistant report FB22815215.

Media Technologies Audio IOBluetooth CarPlay Audio

368

Resuming Audio at full volume immediately after Siri command

I'm working on a podcast app and I'm running into a small quirk I'd like to fix. On Apple's Podcast app and on the Spotify app, when I say, for example, "Hey Siri, skip", the audio pauses, the app performs the operation, and then immediately resumes playing the audio at the previous volume without waiting for the Siri overlay to dismiss. But my app doesn't do that. When I say "Hey Siri, skip" it pauses the audio and performs the operation, but then, depending on which route I go, either the audio stays paused until the overlay dismisses or it resumes playing at a reduced volume until the overlay dismisses.

What I've tried:

Stays paused until overlay dismisses:
- AVAudioSession.setCategory(.playback, mode: .spokenAudio), setActive(true)
- Register for AVAudioSession.interruptionNotification
- On .began interruption, capture whether audio is currently playing
- On .ended interruption: if it was playing before, call play() again

Plays at reduced volume until the overlay dismisses:
Same as above, plus: inside MPRemoteCommandCenter.shared().skipBackwardCommand, I call seek and then:

    AVAudioSession.sharedInstance().setActive(false, options: .notifyOthersOnDeactivation)
    AVAudioSession.sharedInstance().setCategory(.playback, mode: .spokenAudio, policy: .longFormAudio, options: [])
    AVAudioSession.sharedInstance().setActive(true)
    player.play()
    player.rate = playbackSpeed
    player.volume = 1.0

AVAudioSession.interruptionNotification finally arrives with .ended + .shouldResume, at which point the volume snaps to normal.

I tried that with and without setPrefersNoInterruptionsFromSystemAlerts(true) but there was no difference. It seems like .ended only arrives when the Siri overlay dismisses, and not during Siri's active state?

While I was trying things, Xcode warned me:

    Ignoring setPlaybackState because application does not contain entitlement com.apple.mediaremote.set-playback-state for platform

Which, of course, I can't add b/c it's a private API. Do I need that to do what I want? Or am I missing something else? Thanks!

Media Technologies Audio AVAudioSession

173

MusicKit playback completely broken after Apple Music “What’s New?” update screen until native app is opened

I’m developing a third-party Apple Music streaming app using MusicKit (ApplicationMusicPlayer + catalog requests).

Issue:
Whenever Apple releases an Apple Music update that shows the “What’s New?” onboarding/modal screen in the native Apple Music app, MusicKit in our app completely breaks for all users. Attempts to play anything (queue, prepareToPlay, etc.) fail silently or with service-related errors. Playback and most MusicKit operations remain broken until the user opens the native Apple Music app, dismisses the “What’s New?” screen, and returns to our app. After that single native interaction (we deliberately stopped users from going any further within Apple Music to verify this), everything works perfectly again.

Reproduction Steps:
1. Apple Music receives an update with a “What’s New?” screen.
2. User launches our third-party app and attempts playback.
3. MusicKit fails.
4. User opens Apple Music → dismisses modal → returns to our app.
5. MusicKit works again.

Expected Behavior:
Third-party MusicKit apps should not become non-functional because the native Apple Music app has a pending onboarding screen. Shared backend services (account readiness, tokens, subscription state, etc.) should initialize independently.

Environment:
iOS 26.4.2
Devices verified to be affected: iPhone 13 Pro, iPhone XR, iPhone 15

Workarounds attempted:
- Re-requesting MusicAuthorization
- Recreating ApplicationMusicPlayer
- Stopping/re-queuing
- Background/foreground app

None resolve it without the native Apple Music interaction. This appears to be a recurring integration fragility with shared Apple Music services. Has anyone else seen this? Any recommended recovery path or API to force service initialization? Thanks!

Media Technologies Audio Apple Music API Media Player MusicKit MusicKit JS

1.1k

May ’26

AVCaptureSession runtime error -11800 / 'what' on startRunning() with audio input — what's holding the HAL?

AVCaptureSession.startRunning() triggers AVCaptureSessionRuntimeErrorNotification with AVError.unknown (-11800), underlying OSStatus 2003329396 → fourCC 'what', every cold launch, but only when an audio AVCaptureDeviceInput is attached. Removing only the audio input makes the error disappear. The same code in a fresh project records audio fine; the bug only appears in this app's binary. AVAudioApplication.shared.recordPermission == .granted. Info.plist has NSMicrophoneUsageDescription. No interruption notifications fire. Test device: iPhone 16 Pro, iOS 26.4.2. iOS deployment target 17.1.

Minimal reproducer

    import AVFoundation

    let session = AVCaptureSession()
    session.beginConfiguration()
    let camera = AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: .back)!
    session.addInput(try AVCaptureDeviceInput(device: camera))
    // Removing ONLY this line makes the error disappear:
    let mic = AVCaptureDevice.default(for: .audio)!
    session.addInput(try AVCaptureDeviceInput(device: mic))
    session.addOutput(AVCaptureMovieFileOutput())
    session.addOutput(AVCapturePhotoOutput())
    session.commitConfiguration()

    NotificationCenter.default.addObserver(
        forName: .AVCaptureSessionRuntimeError, object: session, queue: nil
    ) { print($0.userInfo ?? [:]) }

    session.startRunning()  // -11800 / 'what' fires within ~2 sec

Observed state at error time

    AVError.unknown (-11800)
      underlyingError = NSError(NSOSStatusErrorDomain, 2003329396)
      userInfo[AVErrorFourCharCode] = 'what'
    captureSession.isRunning = false        ← never came up
    captureSession.isInterrupted = false
    captureSession.preset = .high
    captureSession.inputs = [Back Triple Camera, iPhone Microphone]
    AVAudioSession.sharedInstance():
      category = .playAndRecord
      mode = .videoRecording
      sampleRate = 48000.0
      isInputAvailable = true
      isOtherAudioPlaying = false
      availableInputs = [MicrophoneBuiltIn] (no BT/Continuity/AirPods)
      currentRoute.inputs = []              ← EMPTY
      currentRoute.outputs = [Speaker|Speaker]

2003329396 = 0x77686174 = 'what'.
From a few SO threads this maps to AURemoteIO::StartIO returning a HAL-bring-up failure. The smoking gun: currentRoute.inputs is empty even though availableInputs contains the built-in mic, isInputAvailable is true, the category is .playAndRecord, and isOtherAudioPlaying is false. The HAL never routes the mic into the session, then 'what' follows. Nothing observable from AVAudioSession indicates a competing client.

Environment / SDKs linked

Firebase (SPM: Crashlytics, Performance, Messaging, Analytics, AppCheck, RemoteConfig, DynamicLinks), FBSDK, Kingfisher, MetalPetal. Multiple Google ad mediation pods present, but their audio session takeover is already disabled (audioVideoManager.isAudioSessionApplicationManaged = true, IMSdk.shouldAutoManageAVAudioSession(false)).

What I've ruled out (all still produce 'what')

Audio session config:
- .playAndRecord/.videoRecording, .playAndRecord/.default, .record/.measurement, .record/.default.
- With/without .defaultToSpeaker, .allowBluetooth, .allowBluetoothA2DP, .mixWithOthers.
- setActive(true) before vs. after attaching audio input.
- setPreferredInput(builtInMic) (verified accepted).
- 200ms Thread.sleep between setActive(true) and startRunning().
- Setting usesApplicationAudioSession = false swaps the fourCC to '!rec' but produces the same outcome.

Topology:
- sessionPreset = .high / .hd1920x1080 / .hd1280x720 / .medium.
- Camera = .builtInTripleCamera / .builtInDualWideCamera / .builtInWideAngleCamera.
- AVCam-style always-attached graph.
- Setting sessionPreset before vs. after adding inputs.

Threading:
- All session mutations on a single dedicated DispatchQueue (vs. Swift actor).
- 1× and 2× full stopRunning()+startRunning() recovery cycles ("do it twice" pattern); both re-fail with 'what'.

SDK takeover prevention:
- GoogleMobileAdsMediation pods (Vungle, Mintegral, Pangle, Unity, InMobi), Google-Mobile-Ads-SDK, MediaPipeTasksVision removed via full pod uninstall + clean build; 'what' persists.

Notifications during the failure window:
- 3 × AVAudioSession.routeChangeNotification with reason categoryChange before the error fires, even though the category stays .playAndRecord/.videoRecording. Disabling automaticallyConfiguresApplicationAudioSession drops this to 1, but the runtime error still fires.
- No AVAudioSession.interruptionNotification.
- No AVCaptureSessionWasInterruptedNotification.

Symbol audit

otool -L and nm of the bundle confirm none of the linked frameworks reference AVAudioRecorder, AudioComponentInstanceNew, AURemoteIO, or AudioUnitInitialize in their symbol tables. Only the app's own files reference any audio API. Yet adding AVCaptureDeviceInput(.audio) reproduces 100% in this binary and 0% in a fresh project.

My questions

1. Who is most likely holding the audio HAL in a process where no linked framework references the AudioUnit / HAL APIs directly? Are there framework load-time audio initializations that don't show up in symbol tables (e.g., dynamic dlopen, CFBundleLoadExecutable) that could grab the HAL?
2. Is there an os_log subsystem / category that surfaces the underlying AURemoteIO::StartIO failure reason at runtime? com.apple.coreaudio shows 'what' but not the originating cause.
3. currentRoute.inputs is empty at error time even though availableInputs = [MicrophoneBuiltIn], isInputAvailable = true, and the category is .playAndRecord. What does an empty input route under those conditions imply, and what other system-level holders could be preventing the HAL from routing the mic in?
4. Has anyone seen 'what' resolve with a device reboot, an iOS update, or by removing a specific framework?

Happy to share a sysdiagnose. Thanks!
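The post converts the OSStatus by hand (2003329396 = 0x77686174 = 'what'). For logging, that conversion can be sketched as a small pure-Swift helper. The function name `fourCC(from:)` is an assumption made for illustration, not an Apple API. It returns nil when the status is not made of four printable ASCII bytes, which is the case for ordinary numeric codes such as -11800.

```swift
/// Interprets an OSStatus-style Int32 as a four-character code.
/// Returns nil if any of the four bytes is not printable ASCII.
func fourCC(from status: Int32) -> String? {
    let value = UInt32(bitPattern: status)
    let shifts: [UInt32] = [24, 16, 8, 0]
    // Most significant byte first, matching how fourCCs are written.
    let bytes = shifts.map { UInt8((value >> $0) & 0xFF) }
    guard bytes.allSatisfy({ (0x20...0x7E).contains($0) }) else {
        return nil
    }
    return String(decoding: bytes, as: UTF8.self)
}
```

For example, `fourCC(from: 2003329396)` yields "what", and the same helper decodes the '!rec' code mentioned in the post from its numeric form 561145187.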

Media Technologies Audio AVAudioSession Core Audio AVFoundation

482

May ’26

Mac (Designed for iPad) cannot access microphone

I have an application that is a VOIP application of sorts that needs access to the microphone. I am using the Mac (Designed for iPad) support to not have to do huge amounts of conditional building and support for all the many iOS specific things my app includes. I never get prompted to allow microphone permissions and I never see my app name appear in Privacy & Security -> Microphone permissions setup. So is it that Mac is just a dead end for any form of an application that needs a microphone and is running under Mac (Designed for iPad) compatibility mode? Why doesn't TCC have some mechanism to notice and grant access to mic use?

Media Technologies Audio Mac Catalyst Audio Entitlements Privacy

641

May ’26

.sf3 sound font

...when will they be supported natively?

Media Technologies Audio

Replies: 1
Boosts: 0
Views: 35
Activity: 6d

AVPlayerNode + AVAudioEngine

Media Technologies Audio

Replies: 1
Boosts: 0
Views: 61
Activity: 6d

kAudioUnitProperty_ParameterList vs AUParameterTree

Media Technologies Audio

Replies: 1
Boosts: 0
Views: 40
Activity: 6d

Are there plans to evolve the AUv3 API?

Media Technologies Audio

Replies: 1
Boosts: 0
Views: 61
Activity: 6d

a problem

Media Technologies Audio

Replies: 0
Boosts: 0
Views: 41
Activity: 6d

Native diarization in '27?

Media Technologies Audio

Replies: 0
Boosts: 0
Views: 23
Activity: 6d

How to Fix the Emotionless and Cold Tone of Machine-Read Text?

Media Technologies Audio

Replies: 0
Boosts: 0
Views: 43
Activity: 6d
