Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.

Audio Documentation

Posts under Audio subtopic

Post

Replies

Boosts

Views

Activity

Issue with Audio Sample Rate Conversion in Video Calls
Hey everyone, I'm encountering an issue with audio sample rate conversion that I'm hoping someone can help with. Here's the breakdown: Issue Description: I've installed a tap on an input device to convert audio to an optimal sample rate. There's a converter node added on top of this setup. The problem arises when joining Zoom or FaceTime calls—the converter gets deallocated from memory, causing the program to crash. Symptoms: The converter node is being deallocated during video calls. The program crashes entirely when this happens. Traditional methods of monitoring sample rate changes (tracking nominal or actual sample rates) aren't working as expected. The Big Challenge: I can't figure out how to properly monitor sample rate changes. Listeners set up to track these changes don't trigger when the device joins a Zoom or FaceTime call. Please, if anyone has experience with this or knows a solution, I'd really appreciate your help. Thanks in advance! ⁠
0
0
121
Apr ’25
Creating RTP-MIDI Sessions via MIDINetworkSession C API (dlopen/dlsym) on macOS 15?
I’m an amateur developer working on a free utility for composers/producers, for which the macOS release needs to create and name RTP-MIDI sessions in Audio MIDI Setup from the command line (so I can ship a small C helper instead of telling users to click through the UI). Here’s what I’ve tried so far, without luck: • Plist hacks: Injecting entries into ~/Library/Audio/MIDI Configurations/*.mcfg works when AMS is closed, but AMS immediately locks and reverts my changes when it’s open. • CoreMIDI C API: I can create virtual ports with MIDISourceCreate, but attempting MIDIObjectGetDataProperty on the apple.midirtp.session plugin always returns err –10836. • Obj-C & Swift: Loading MIDINetworkSession and calling defaultSession, init, setNetworkName: and setting enabled = YES doesn’t produce a new session object in the Network panel. • dlopen/dlsym: I extracted the real CoreMIDI binary out of the dyld shared cache and tried binding _MIDINetworkSessionCreate, _SetName, _SetEnabled, etc., but all the symbols come back null or my tool segfaults. • Plugin registration: I’ve pulled the factory UUID (70C9C5EA-7C65-11D8-B317-000393A34B5A) from /System/Library/Extensions/AppleMIDIRTPDriver.plugin/Contents/Info.plist and called CFPlugInRegisterFactories, but it still never exposes the session-creation calls. At this point I’m convinced I’m either loading the wrong binary or missing one critical step in registering the RTP-MIDI plugin’s private API. Can anyone point me to: The exact path of the dylib or bundle that actually exports the MIDINetworkSessionCreate/MIDINetworkSessionSetName/MIDINetworkSessionSetEnabled symbols? A minimal working snippet (C or Obj-C) that reliably creates and names a Network-MIDI session? Any pointers, sample code, or even ideas about where Apple hides this functionality on macOS 15 would be hugely appreciated. Thanks!
0
1
218
Jun ’25
Unique identifier of a MIDI device
Hello, I need to know what is a unique identifier of a MIDI device (source/destination). Important note: I want to get the same ID when a device is reconnected (unplugged and then plugged again). The main candidate is kMIDIPropertyUniqueID property. But I don't know if it meets the requirement above or not. Additional question: is it always available for any endpoint? Also there is kMIDIPropertyDeviceID property. What about it? And one more option is just MIDIEndpointRef returned by MIDIGetSource or MIDIGetDestination. So what is the proper way to get ID which persists between device reconnections?
0
0
81
Jan ’26
Is there a way to get lossless music playback on macOS?
I noticed that while playing back the same tracks via MusicKit on different OSes I get different results regarding the audio files being streamed. Playing back a lossless file with 24Bit 48kHz and watching the Console for RemotePlayerService I get: on iPadOS: Lossless; groupID: audio-alac-stereo-48000-24; bitDepth: 24-bit; sampleRate: 48khz; codec: alac; channels: 2; layout: Stereo; on macOS: Creating AudioQueue with format:'paac', framesPerPacket:1024, sampleRate:44100 While the iPad looks perfect, the Mac does not. Is there a way to fix this issue on macOS. BTW: I switched the Audio-Midi Settings before, after and while the macOS App was lunched. I also switched to different output devices. I wasn't able to change the bad audio-output on the mac. I tested this under Sequoia 15.5 and Tahoe beta 1, Xcode 16.4 and 26 beta 1. The AudioVariants of the Album/Tracks are .dolbyAtmos, .lossless, .lossyStereo Apple Music displays Lossless 24 Bit/48 kHz ALAC when clicking on the playercontroll icon on macOS I hope there are only some missing or misconfigured properties to get macOS up to par. Thanks :-)
0
1
162
Jun ’25
Correct way for an Audio Unit v3 to return fewer than requested number of samples given a buffer
I have an AUv3 plugin which uses an FFT - which requires n samples before it can produce any output - so, depending on the relation between the host's buffer size and the FFT window size, it may receive a several buffers of samples, producing no output, and then dumping out what it has once a sufficient number of samples have been received. This means that output is produced in fits and starts, in batches that match the FFT size (modulo oversampling) - e.g. if being fed buffers of 256 samples with an fft size of 1024, the output buffer sizes will be 0 for the first 3 buffers, and upon the fourth, the first 256 processed samples are returned and the remaining 768 cached; the next three buffers will return the remaining cached samples while processing and buffering subsequent ones, and so forth. The internal mechanics of that I have solved, caching output if the current output buffer is too small, and so forth - so it all works as advertised, and the plugin reports its latency correctly. And when run as an app in demo-mode, playback works as expected. In the plugin's render block, it captures the number of frames written, and if it is less than the number of frames passed in, adjusts the mDataByteSize of the output buffers to match the actual quantity of data being returned: unsigned int framesWritten = (unsigned int) processHelper->processWithEvents(inAudioBufferList, outAudioBufferList, timestamp, frameCount, realtimeEventListHead); if (framesWritten < frameCount) { for (UInt32 i = 0; i < outAudioBufferList->mNumberBuffers; ++i) { outAudioBufferList->mBuffers[i].mDataByteSize = framesWritten * 4; // assume 4 byte floats } } However, there are a couple of serious issues: auval -v fails it with - Render Test at 64 frames, sample rate: 22050 Hz ERROR: Output Buffer Size does not match requested When connected to Logic Pro, it appears that mDataByteSize is ignored, and the entire allocated buffer is read - audio has sections of silence snipped into it which corresponds the number of empty buffers being returned If I set Logic's buffer size to 1024 and use a 1024 sample FFT window, the plugin works correctly - but of course a plugin cannot dictate buffer size, and `1024 is too small a window size to be useful for anything but filtering very high frequencies This seems like it has to be a solvable problem, and most likely the issue is in how my code reports the number of usable samples in the returned buffer. So, what is the correct way for a plugin to report that it has no samples to return, but will, uh, real soon now? I know I could convert this plugin to be one that does offline rendering of the entire input, but this is real-time processing, just with a fixed amount of latency, so that should not be necessary.
0
0
391
Nov ’25
Ducking MusicKit output when playing another sound
I am developing an app that uses MusicKit to play music and then I need to have spoken words played to the user, while ducking the audio coming from MusicKit (application music player) the built in Siri voices are not off sufficient quality so I am using an external service to create an mp3 file and then play this back using AVAudioSession Sample code below the problem I am having is that .duckOthers is not ducking the Application Music Player output Is this a bug or am I doing this wrong? // Configure audio session for system-wide ducking try AVAudioSession.sharedInstance().setCategory(.playback, mode: .spokenAudio, options: [.duckOthers, .mixWithOthers]) try AVAudioSession.sharedInstance().setActive(true) // Set the ducking level to maximum try AVAudioSession.sharedInstance().setPreferredIOBufferDuration(0.005) // Create and configure audio player self.audioPlayer = try AVAudioPlayer(data: audioData) self.audioPlayer?.delegate = self self.audioPlayer?.volume = 1.0 // Ensure full volume for speech self.audioPlayer?.prepareToPlay() // Set the audio player's settings for maximum clarity self.audioPlayer?.enableRate = false self.audioPlayer?.pan = 0.0 // Center the audio self.audioPlayer?.play()
0
0
66
Apr ’25
Keeping PiP alive during third-party video recording (camera capture)
I’m building a teleprompter-style app that relies on Picture in Picture. PiP starts correctly on device. Everything works — until another app (e.g. TikTok / Instagram) starts active video recording. When camera capture begins in the foreground app, iOS terminates my PiP session. Some teleprompter apps appear to keep PiP active while recording in other apps, so I’m trying to understand the recommended architectural pattern for this scenario. Is there a documented approach or best practice to keep PiP stable during third-party camera capture? Looking specifically for guidance on the correct AVKit / AVAudioSession configuration for this use case.
0
0
269
Feb ’26
AVAudioPlayer/SKAudioNode audio no longer plays after interruption
Hi 👋! We have a SpriteKit-based app where we play AVAudio sounds in three different ways: Effects (incl. UI sounds) with AVAudioPlayer. Long looping tracks with AVAudioPlayer. Short animation effects on the timeline of SpriteKit's SKScene files (effectively SKAudioNode nodes). We've found that when you exit the app or otherwise interrupt audio plays, future audio plays often fail. For example, there's a WebKit-based video trailer inside the app, and if you play it, our looping background music track (2.) will stop playing, and won't resume as you close the trailer (return from WebKit). This is probably due to us not manually restarting the track (so may well be easily fixed). Periodically played AVAudioPlayer audio (1.) are not affected. However, the more concerning thing is that the audio tracks on SKScene file timelines (3.) will no longer play. My hypothesis is that AVAudioEngine gets interrupted, and needs to be restarted for those AVAudioNode elements to regain functionality. Thing is, we don't deal with AVAudioEngine at all currently in the app, meaning it is never initiated to begin with. Obviously things return to normal when you remove the app from short-term memory and restart it. However, it seems many of our users aren't doing this, and often report audio failing presumably due to some interruption in the past without the app ever being cleared from memory. Any idea why timeline-run SKAudioNodes would fail like this? Should the app react to app backgrounding/foregrounding regarding audio? Any help would be very much appreciated ✌️!
0
1
204
May ’25
Unable to match music with shazamkit for Android
Hello, i can successfully match music using shazamkit on Apple using SwiftUI, a simple app that let user to load an audio file and exctracts the relative match, while i am unable to match music using shamzamkit on Android. I am trying to make the same simple app but i cannot match music as i get MATCH_ATTEMPT_FAILED every time i try to. I don't know what i am doing wrong but the shazam part in the kotlin Android code is in this method : suspend fun processAudioFileInBackground( filePath: String, developerTokenProvider: DeveloperTokenProvider ) = withContext(Dispatchers.IO) { val bufferSize = 1024 * 1024 val audioFile = FileInputStream(filePath) val byteBuffer = ByteBuffer.allocate(bufferSize) byteBuffer.order(ByteOrder.LITTLE_ENDIAN) var bytesRead: Int while (audioFile.read(byteBuffer.array()).also { bytesRead = it } != -1) { val signatureGenerator = (ShazamKit.createSignatureGenerator(AudioSampleRateInHz.SAMPLE_RATE_44100) as ShazamKitResult.Success).data signatureGenerator.append(byteBuffer.array(), bytesRead, System.currentTimeMillis()) val signature = signatureGenerator.generateSignature() println("Signature: ${signature.durationInMs}") val catalog = ShazamKit.createShazamCatalog(developerTokenProvider, Locale.ENGLISH) val session = (ShazamKit.createSession(catalog) as ShazamKitResult.Success).data val matchResult = session.match(signature) println("MatchResult : $matchResult") setMatchResult(matchResult) byteBuffer.clear() } audioFile.close() } I noticed that changing Locale in catalog creation results in different result as i get NoMatch without exception. Can you please help me with this? Do i need to create a custom catalog?
0
0
145
May ’25
App Randomly Crashes During Continuous Sound Playback Using AVAudioPlayer
Environment→ ・Device: iPad 10th generation ・OS:**iOS18.3.2 We're using AVAudioPlayer to play a sound when a button is tapped. In our use case, this button can be tapped very frequently — roughly every 0.1 to 0.2 seconds. Each tap triggers the following function: var audioPlayer: AVAudioPlayer? func soundPlay(resource: String, type: String){ guard let path = Bundle.main.path(forResource: resource, ofType: type) else { return } do { audioPlayer = try AVAudioPlayer(contentsOf: URL(fileURLWithPath: path)) audioPlayer!.delegate = self try audioSession.setCategory(.playback) } catch { return } self.audioPlayer!.play() } The issue is that under high-frequency tapping (especially around 0.1–0.15s intervals), the app occasionally crashes. The crash does not occur every time, but it happens randomly — sometimes within 30 seconds, within 1 minute, or even 3 minutes of continuous tapping. Interestingly, adding a delay of 0.2 seconds between button taps seems to prevent the crash entirely. Delays shorter than 0.2 seconds (e.g.,0.15s,0.18s) still result in occasional crashes. My questions are: **Is this expected behavior from AVAudioPlayer or AVAudioSession? Could this be a known issue or a limitation in AVFoundation? Is there any documentation or guidance on handling frequent sound playback safely?** Any insights or recommendations on how to handle rapid, repeated audio playback more reliably would be appreciated.
0
0
227
May ’25
CMFormatDescription.audioStreamBasicDescription has wrong or unexpected sample rate for audio channels with different sample rates
In my app I use AVAssetReaderTrackOutput to extract PCM audio from a user-provided video or audio file and display it as a waveform. Recently a user reported that the waveform is not in sync with his video, and after receiving the video I noticed that the waveform is in fact double as long as the video duration, i.e. it shows the audio in slow-motion, so to speak. Until now I was using CMFormatDescription.audioStreamBasicDescription.mSampleRate which for this particular user video returns 22'050. But in this case it seems that this value is wrong... because the audio file has two audio channels with different sample rates, as returned by CMFormatDescription.audioFormatList.map({ $0.mASBD.mSampleRate }) The first channel has a sample rate of 44'100, the second one 22'050. If I use the first sample rate, the waveform is perfectly in sync with the video. The problem is given by the fact that the ratio between the audio data length and the sample rate multiplied by the audio duration is 8, double the ratio for the first audio file (4). In the code below this ratio is given by Double(length) / (sampleRate * asset.duration.seconds) When commenting out the line with the sampleRate variable definition in the code below and uncommenting the following line, the ratios for both audio files are 4, which is the expected result. I would expect audioStreamBasicDescription to return the correct sample rate, i.e. the one used by AVAssetReaderTrackOutput, which (I think) somehow merges the stereo tracks. The documentation is sparse, and in particular it’s not documented whether the lower or higher sample rate is used; in this case, it seems like the higher one is used, but audioStreamBasicDescription for some reason returns the lower one. Does anybody know why this is the case or how I should extract the sample rate of the produced PCM audio data? Should I always take the higher one? I created FB19620455. let openPanel = NSOpenPanel() openPanel.allowedContentTypes = [.audiovisualContent] openPanel.runModal() let url = openPanel.urls[0] let asset = AVURLAsset(url: url) let assetTrack = asset.tracks(withMediaType: .audio)[0] let assetReader = try! AVAssetReader(asset: asset) let readerOutput = AVAssetReaderTrackOutput(track: assetTrack, outputSettings: [AVFormatIDKey: Int(kAudioFormatLinearPCM), AVLinearPCMBitDepthKey: 16, AVLinearPCMIsBigEndianKey: false, AVLinearPCMIsFloatKey: false, AVLinearPCMIsNonInterleaved: false]) readerOutput.alwaysCopiesSampleData = false assetReader.add(readerOutput) let formatDescriptions = assetTrack.formatDescriptions as! [CMFormatDescription] let sampleRate = formatDescriptions[0].audioStreamBasicDescription!.mSampleRate //let sampleRate = formatDescriptions[0].audioFormatList.map({ $0.mASBD.mSampleRate }).max()! print(formatDescriptions[0].audioStreamBasicDescription!.mSampleRate) print(formatDescriptions[0].audioFormatList.map({ $0.mASBD.mSampleRate })) if !assetReader.startReading() { preconditionFailure() } var length = 0 while assetReader.status == .reading { guard let sampleBuffer = readerOutput.copyNextSampleBuffer(), let blockBuffer = sampleBuffer.dataBuffer else { break } length += blockBuffer.dataLength } print(Double(length) / (sampleRate * asset.duration.seconds))
0
1
132
Aug ’25
CarPlay outputs no audio
I have an application that includes custom artwork for the album cover and text details setup with the MPRemoteCommandCenter.shared() reference. I need the user to have a full featured "now playing" display to see all of this. My experience is that cannot find a set of parameters for AVAudioSession.setCategory() that route audio successfully, and display the full featured now playing deck. If I use .playAndRecord, the audio I send out plays out on the radio. But, the now-playing deck is empty and nothing I do with the command center seems to change that. If I instead use .playback, I cannot use .defaultToSpeaker option which is the only way I've found to cause the "now-playing" navigation button to appear so that the full featured deck will display. But, of course setCategory() fails with an error about .defaultToSpeaker only available with .playAndRecord, so some default or intermediate state is entered and I see the full featured deck, but no audio goes out to the radio. What combination is supposed to be used here and is this more likely a problem with thread use (@MainActor) and/or some ordering of operations that I've overlooked?
0
0
2
14m
Play Audio and Recognize Speech in Car
Hello, I'm trying to determine the best/recommended AVAudioSession configuration (i.e category, mode, and options) for the following use-case. Essentially, I'd like to switch between periods of playing an audio file and then recognizing speech. The audio file is typically speech and I don't intend for playback and speech recognition to occur simultaneously. I'd like for the user to sill be able to interact with Siri and I'd like for it to work with CarPlay where navigation prompts can occur. I would assume the category to use is 'playAndRecord', but I'm not sure if it's better to just set that once for the entire lifecycle, or set to 'playback' for audio file playback and then switch to 'playAndRecord' for speech recognition . I'm also not sure on the best 'mode' and 'options' to set. Any suggestions would be appreciated. Thanks.
0
0
608
Sep ’25
AVAudioSession setActive(true) fails after phone call when app is in background
I’m seeing what appears to be an iOS audio-session issue that occurs only when a phone call happens while the app is in the background. API: AVAudioSession, AVAudioRecorder Background Modes: Audio enabled (UIBackgroundModes = audio) Category: .playAndRecord Microphone permission: granted Expected Behavior If the app is recording audio in the background and a phone call interrupts it: AVAudioSession.interruptionNotification(.began) fires Call ends AVAudioSession.interruptionNotification(.ended) fires App should be able to re-activate its audio session and resume or restart recording Apple documentation suggests this should be supported for background audio apps. Actual Behavior When the app is in the background and phone call is ended: AVAudioSession.interruptionNotification(.ended) does fire Attempting to reactivate the audio session always fails: Error Domain=NSOSStatusErrorDomain Code=560557684 ("!int") "Session activation failed" The session appears to remain permanently “interrupted” Retrying activation (with delays) does not help Recreating AVAudioRecorder does not help Reactivation works only after the app is opened again
0
0
166
Jan ’26
How to inform Logic Pro that AU view does not have a fixed aspect ratio?
I have an AUv3 that passes all validation and can be loaded into Logic Pro without issue. The UI for the plug in can be any aspect ratio but Logic insists on presenting it in a view with a fixed aspect ratio. That is when resizing, both the height and width are resized. I have never managed to work out what it is I need to do specify to Logic to allow the user to resize width or height independently of each other. Can anyone tell me what I need to specify in the AU code that will inform Logic that the view can be resized from any side of the window/panel?
0
0
231
Apr ’25
Issue with Audio Sample Rate Conversion in Video Calls
Hey everyone, I'm encountering an issue with audio sample rate conversion that I'm hoping someone can help with. Here's the breakdown: Issue Description: I've installed a tap on an input device to convert audio to an optimal sample rate. There's a converter node added on top of this setup. The problem arises when joining Zoom or FaceTime calls—the converter gets deallocated from memory, causing the program to crash. Symptoms: The converter node is being deallocated during video calls. The program crashes entirely when this happens. Traditional methods of monitoring sample rate changes (tracking nominal or actual sample rates) aren't working as expected. The Big Challenge: I can't figure out how to properly monitor sample rate changes. Listeners set up to track these changes don't trigger when the device joins a Zoom or FaceTime call. Please, if anyone has experience with this or knows a solution, I'd really appreciate your help. Thanks in advance! ⁠
Replies
0
Boosts
0
Views
121
Activity
Apr ’25
Creating RTP-MIDI Sessions via MIDINetworkSession C API (dlopen/dlsym) on macOS 15?
I’m an amateur developer working on a free utility for composers/producers, for which the macOS release needs to create and name RTP-MIDI sessions in Audio MIDI Setup from the command line (so I can ship a small C helper instead of telling users to click through the UI). Here’s what I’ve tried so far, without luck: • Plist hacks: Injecting entries into ~/Library/Audio/MIDI Configurations/*.mcfg works when AMS is closed, but AMS immediately locks and reverts my changes when it’s open. • CoreMIDI C API: I can create virtual ports with MIDISourceCreate, but attempting MIDIObjectGetDataProperty on the apple.midirtp.session plugin always returns err –10836. • Obj-C & Swift: Loading MIDINetworkSession and calling defaultSession, init, setNetworkName: and setting enabled = YES doesn’t produce a new session object in the Network panel. • dlopen/dlsym: I extracted the real CoreMIDI binary out of the dyld shared cache and tried binding _MIDINetworkSessionCreate, _SetName, _SetEnabled, etc., but all the symbols come back null or my tool segfaults. • Plugin registration: I’ve pulled the factory UUID (70C9C5EA-7C65-11D8-B317-000393A34B5A) from /System/Library/Extensions/AppleMIDIRTPDriver.plugin/Contents/Info.plist and called CFPlugInRegisterFactories, but it still never exposes the session-creation calls. At this point I’m convinced I’m either loading the wrong binary or missing one critical step in registering the RTP-MIDI plugin’s private API. Can anyone point me to: The exact path of the dylib or bundle that actually exports the MIDINetworkSessionCreate/MIDINetworkSessionSetName/MIDINetworkSessionSetEnabled symbols? A minimal working snippet (C or Obj-C) that reliably creates and names a Network-MIDI session? Any pointers, sample code, or even ideas about where Apple hides this functionality on macOS 15 would be hugely appreciated. Thanks!
Replies
0
Boosts
1
Views
218
Activity
Jun ’25
Regarding the issue of obtaining input channels for aggregated devices
I found that the aggregated device correctly obtains input channels in the standard microphone mode. However, in voice isolation mode, it only retrieves channels from the first sub-device in the aggregated device's list. If I want to properly obtain channel information in voice isolation mode, how should I do it?
Replies
0
Boosts
0
Views
334
Activity
Jun ’25
Background audio player issue
I am work an app development on an app which request an audio function in background as an alert sound. during debug testing , the function work fine, but once I testing standalone without debugging , The function not work , it will play out the sound when I back to app. does any way to trace the issues ?
Replies
0
Boosts
0
Views
162
Activity
May ’25
Unique identifier of a MIDI device
Hello, I need to know what is a unique identifier of a MIDI device (source/destination). Important note: I want to get the same ID when a device is reconnected (unplugged and then plugged again). The main candidate is kMIDIPropertyUniqueID property. But I don't know if it meets the requirement above or not. Additional question: is it always available for any endpoint? Also there is kMIDIPropertyDeviceID property. What about it? And one more option is just MIDIEndpointRef returned by MIDIGetSource or MIDIGetDestination. So what is the proper way to get ID which persists between device reconnections?
Replies
0
Boosts
0
Views
81
Activity
Jan ’26
CoreAudio: Specification of Private Aggregate or Tap
If a Tap or AggregateDevice with the Private property set is created, does it automatically disappear when the process ends? If not, how can I remove the Tap or AggregateDevice before the main process terminates?
Replies
0
Boosts
0
Views
196
Activity
1w
Is there a way to get lossless music playback on macOS?
I noticed that while playing back the same tracks via MusicKit on different OSes I get different results regarding the audio files being streamed. Playing back a lossless file with 24Bit 48kHz and watching the Console for RemotePlayerService I get: on iPadOS: Lossless; groupID: audio-alac-stereo-48000-24; bitDepth: 24-bit; sampleRate: 48khz; codec: alac; channels: 2; layout: Stereo; on macOS: Creating AudioQueue with format:'paac', framesPerPacket:1024, sampleRate:44100 While the iPad looks perfect, the Mac does not. Is there a way to fix this issue on macOS. BTW: I switched the Audio-Midi Settings before, after and while the macOS App was lunched. I also switched to different output devices. I wasn't able to change the bad audio-output on the mac. I tested this under Sequoia 15.5 and Tahoe beta 1, Xcode 16.4 and 26 beta 1. The AudioVariants of the Album/Tracks are .dolbyAtmos, .lossless, .lossyStereo Apple Music displays Lossless 24 Bit/48 kHz ALAC when clicking on the playercontroll icon on macOS I hope there are only some missing or misconfigured properties to get macOS up to par. Thanks :-)
Replies
0
Boosts
1
Views
162
Activity
Jun ’25
Correct way for an Audio Unit v3 to return fewer than requested number of samples given a buffer
I have an AUv3 plugin which uses an FFT - which requires n samples before it can produce any output - so, depending on the relation between the host's buffer size and the FFT window size, it may receive a several buffers of samples, producing no output, and then dumping out what it has once a sufficient number of samples have been received. This means that output is produced in fits and starts, in batches that match the FFT size (modulo oversampling) - e.g. if being fed buffers of 256 samples with an fft size of 1024, the output buffer sizes will be 0 for the first 3 buffers, and upon the fourth, the first 256 processed samples are returned and the remaining 768 cached; the next three buffers will return the remaining cached samples while processing and buffering subsequent ones, and so forth. The internal mechanics of that I have solved, caching output if the current output buffer is too small, and so forth - so it all works as advertised, and the plugin reports its latency correctly. And when run as an app in demo-mode, playback works as expected. In the plugin's render block, it captures the number of frames written, and if it is less than the number of frames passed in, adjusts the mDataByteSize of the output buffers to match the actual quantity of data being returned: unsigned int framesWritten = (unsigned int) processHelper->processWithEvents(inAudioBufferList, outAudioBufferList, timestamp, frameCount, realtimeEventListHead); if (framesWritten < frameCount) { for (UInt32 i = 0; i < outAudioBufferList->mNumberBuffers; ++i) { outAudioBufferList->mBuffers[i].mDataByteSize = framesWritten * 4; // assume 4 byte floats } } However, there are a couple of serious issues: auval -v fails it with - Render Test at 64 frames, sample rate: 22050 Hz ERROR: Output Buffer Size does not match requested When connected to Logic Pro, it appears that mDataByteSize is ignored, and the entire allocated buffer is read - audio has sections of silence snipped into it which corresponds the number of empty buffers being returned If I set Logic's buffer size to 1024 and use a 1024 sample FFT window, the plugin works correctly - but of course a plugin cannot dictate buffer size, and `1024 is too small a window size to be useful for anything but filtering very high frequencies This seems like it has to be a solvable problem, and most likely the issue is in how my code reports the number of usable samples in the returned buffer. So, what is the correct way for a plugin to report that it has no samples to return, but will, uh, real soon now? I know I could convert this plugin to be one that does offline rendering of the entire input, but this is real-time processing, just with a fixed amount of latency, so that should not be necessary.
Replies
0
Boosts
0
Views
391
Activity
Nov ’25
Ducking MusicKit output when playing another sound
I am developing an app that uses MusicKit to play music and then I need to have spoken words played to the user, while ducking the audio coming from MusicKit (application music player) the built in Siri voices are not off sufficient quality so I am using an external service to create an mp3 file and then play this back using AVAudioSession Sample code below the problem I am having is that .duckOthers is not ducking the Application Music Player output Is this a bug or am I doing this wrong? // Configure audio session for system-wide ducking try AVAudioSession.sharedInstance().setCategory(.playback, mode: .spokenAudio, options: [.duckOthers, .mixWithOthers]) try AVAudioSession.sharedInstance().setActive(true) // Set the ducking level to maximum try AVAudioSession.sharedInstance().setPreferredIOBufferDuration(0.005) // Create and configure audio player self.audioPlayer = try AVAudioPlayer(data: audioData) self.audioPlayer?.delegate = self self.audioPlayer?.volume = 1.0 // Ensure full volume for speech self.audioPlayer?.prepareToPlay() // Set the audio player's settings for maximum clarity self.audioPlayer?.enableRate = false self.audioPlayer?.pan = 0.0 // Center the audio self.audioPlayer?.play()
Replies
0
Boosts
0
Views
66
Activity
Apr ’25
Keeping PiP alive during third-party video recording (camera capture)
I’m building a teleprompter-style app that relies on Picture in Picture. PiP starts correctly on device. Everything works — until another app (e.g. TikTok / Instagram) starts active video recording. When camera capture begins in the foreground app, iOS terminates my PiP session. Some teleprompter apps appear to keep PiP active while recording in other apps, so I’m trying to understand the recommended architectural pattern for this scenario. Is there a documented approach or best practice to keep PiP stable during third-party camera capture? Looking specifically for guidance on the correct AVKit / AVAudioSession configuration for this use case.
Replies
0
Boosts
0
Views
269
Activity
Feb ’26
AVAudioPlayer/SKAudioNode audio no longer plays after interruption
Hi 👋! We have a SpriteKit-based app where we play AVAudio sounds in three different ways: Effects (incl. UI sounds) with AVAudioPlayer. Long looping tracks with AVAudioPlayer. Short animation effects on the timeline of SpriteKit's SKScene files (effectively SKAudioNode nodes). We've found that when you exit the app or otherwise interrupt audio plays, future audio plays often fail. For example, there's a WebKit-based video trailer inside the app, and if you play it, our looping background music track (2.) will stop playing, and won't resume as you close the trailer (return from WebKit). This is probably due to us not manually restarting the track (so may well be easily fixed). Periodically played AVAudioPlayer audio (1.) are not affected. However, the more concerning thing is that the audio tracks on SKScene file timelines (3.) will no longer play. My hypothesis is that AVAudioEngine gets interrupted, and needs to be restarted for those AVAudioNode elements to regain functionality. Thing is, we don't deal with AVAudioEngine at all currently in the app, meaning it is never initiated to begin with. Obviously things return to normal when you remove the app from short-term memory and restart it. However, it seems many of our users aren't doing this, and often report audio failing presumably due to some interruption in the past without the app ever being cleared from memory. Any idea why timeline-run SKAudioNodes would fail like this? Should the app react to app backgrounding/foregrounding regarding audio? Any help would be very much appreciated ✌️!
Replies
0
Boosts
1
Views
204
Activity
May ’25
Unable to match music with shazamkit for Android
Hello, i can successfully match music using shazamkit on Apple using SwiftUI, a simple app that let user to load an audio file and exctracts the relative match, while i am unable to match music using shamzamkit on Android. I am trying to make the same simple app but i cannot match music as i get MATCH_ATTEMPT_FAILED every time i try to. I don't know what i am doing wrong but the shazam part in the kotlin Android code is in this method : suspend fun processAudioFileInBackground( filePath: String, developerTokenProvider: DeveloperTokenProvider ) = withContext(Dispatchers.IO) { val bufferSize = 1024 * 1024 val audioFile = FileInputStream(filePath) val byteBuffer = ByteBuffer.allocate(bufferSize) byteBuffer.order(ByteOrder.LITTLE_ENDIAN) var bytesRead: Int while (audioFile.read(byteBuffer.array()).also { bytesRead = it } != -1) { val signatureGenerator = (ShazamKit.createSignatureGenerator(AudioSampleRateInHz.SAMPLE_RATE_44100) as ShazamKitResult.Success).data signatureGenerator.append(byteBuffer.array(), bytesRead, System.currentTimeMillis()) val signature = signatureGenerator.generateSignature() println("Signature: ${signature.durationInMs}") val catalog = ShazamKit.createShazamCatalog(developerTokenProvider, Locale.ENGLISH) val session = (ShazamKit.createSession(catalog) as ShazamKitResult.Success).data val matchResult = session.match(signature) println("MatchResult : $matchResult") setMatchResult(matchResult) byteBuffer.clear() } audioFile.close() } I noticed that changing Locale in catalog creation results in different result as i get NoMatch without exception. Can you please help me with this? Do i need to create a custom catalog?
Replies
0
Boosts
0
Views
145
Activity
May ’25
Is there any way to disable PHASE/CoreAudio logging?
Is there a way to permanently disable PHASE SDK logging? It seems to be a lot chattier than Apple's other SDKs. While developing a RealityKit app that uses AudioPlaybackController, I must manually hide the PHASE SDK log output several times each day so I can see my app's log messages. Thank you.
Replies
0
Boosts
0
Views
356
Activity
Jun ’25
App Randomly Crashes During Continuous Sound Playback Using AVAudioPlayer
Environment→ ・Device: iPad 10th generation ・OS:**iOS18.3.2 We're using AVAudioPlayer to play a sound when a button is tapped. In our use case, this button can be tapped very frequently — roughly every 0.1 to 0.2 seconds. Each tap triggers the following function: var audioPlayer: AVAudioPlayer? func soundPlay(resource: String, type: String){ guard let path = Bundle.main.path(forResource: resource, ofType: type) else { return } do { audioPlayer = try AVAudioPlayer(contentsOf: URL(fileURLWithPath: path)) audioPlayer!.delegate = self try audioSession.setCategory(.playback) } catch { return } self.audioPlayer!.play() } The issue is that under high-frequency tapping (especially around 0.1–0.15s intervals), the app occasionally crashes. The crash does not occur every time, but it happens randomly — sometimes within 30 seconds, within 1 minute, or even 3 minutes of continuous tapping. Interestingly, adding a delay of 0.2 seconds between button taps seems to prevent the crash entirely. Delays shorter than 0.2 seconds (e.g.,0.15s,0.18s) still result in occasional crashes. My questions are: **Is this expected behavior from AVAudioPlayer or AVAudioSession? Could this be a known issue or a limitation in AVFoundation? Is there any documentation or guidance on handling frequent sound playback safely?** Any insights or recommendations on how to handle rapid, repeated audio playback more reliably would be appreciated.
Replies
0
Boosts
0
Views
227
Activity
May ’25
CMFormatDescription.audioStreamBasicDescription has wrong or unexpected sample rate for audio channels with different sample rates
In my app I use AVAssetReaderTrackOutput to extract PCM audio from a user-provided video or audio file and display it as a waveform. Recently a user reported that the waveform is not in sync with his video, and after receiving the video I noticed that the waveform is in fact double as long as the video duration, i.e. it shows the audio in slow-motion, so to speak. Until now I was using CMFormatDescription.audioStreamBasicDescription.mSampleRate which for this particular user video returns 22'050. But in this case it seems that this value is wrong... because the audio file has two audio channels with different sample rates, as returned by CMFormatDescription.audioFormatList.map({ $0.mASBD.mSampleRate }) The first channel has a sample rate of 44'100, the second one 22'050. If I use the first sample rate, the waveform is perfectly in sync with the video. The problem is given by the fact that the ratio between the audio data length and the sample rate multiplied by the audio duration is 8, double the ratio for the first audio file (4). In the code below this ratio is given by Double(length) / (sampleRate * asset.duration.seconds) When commenting out the line with the sampleRate variable definition in the code below and uncommenting the following line, the ratios for both audio files are 4, which is the expected result. I would expect audioStreamBasicDescription to return the correct sample rate, i.e. the one used by AVAssetReaderTrackOutput, which (I think) somehow merges the stereo tracks. The documentation is sparse, and in particular it’s not documented whether the lower or higher sample rate is used; in this case, it seems like the higher one is used, but audioStreamBasicDescription for some reason returns the lower one. Does anybody know why this is the case or how I should extract the sample rate of the produced PCM audio data? Should I always take the higher one? I created FB19620455. let openPanel = NSOpenPanel() openPanel.allowedContentTypes = [.audiovisualContent] openPanel.runModal() let url = openPanel.urls[0] let asset = AVURLAsset(url: url) let assetTrack = asset.tracks(withMediaType: .audio)[0] let assetReader = try! AVAssetReader(asset: asset) let readerOutput = AVAssetReaderTrackOutput(track: assetTrack, outputSettings: [AVFormatIDKey: Int(kAudioFormatLinearPCM), AVLinearPCMBitDepthKey: 16, AVLinearPCMIsBigEndianKey: false, AVLinearPCMIsFloatKey: false, AVLinearPCMIsNonInterleaved: false]) readerOutput.alwaysCopiesSampleData = false assetReader.add(readerOutput) let formatDescriptions = assetTrack.formatDescriptions as! [CMFormatDescription] let sampleRate = formatDescriptions[0].audioStreamBasicDescription!.mSampleRate //let sampleRate = formatDescriptions[0].audioFormatList.map({ $0.mASBD.mSampleRate }).max()! print(formatDescriptions[0].audioStreamBasicDescription!.mSampleRate) print(formatDescriptions[0].audioFormatList.map({ $0.mASBD.mSampleRate })) if !assetReader.startReading() { preconditionFailure() } var length = 0 while assetReader.status == .reading { guard let sampleBuffer = readerOutput.copyNextSampleBuffer(), let blockBuffer = sampleBuffer.dataBuffer else { break } length += blockBuffer.dataLength } print(Double(length) / (sampleRate * asset.duration.seconds))
Replies
0
Boosts
1
Views
132
Activity
Aug ’25
Ability to programatically download Premium and Enhanced voices
Please consider adding the ability to programatically download Premium and Enhanced voices. At the moment it is extremely inconvenient for our users, as they have to navigate to settings themselves to download voices. Our app relies heavily on SpeechSynthesis integration, and it would greatly benefit from this feature. FB16307193
Replies
0
Boosts
1
Views
86
Activity
Jun ’25
CarPlay outputs no audio
I have an application that includes custom artwork for the album cover and text details setup with the MPRemoteCommandCenter.shared() reference. I need the user to have a full featured "now playing" display to see all of this. My experience is that cannot find a set of parameters for AVAudioSession.setCategory() that route audio successfully, and display the full featured now playing deck. If I use .playAndRecord, the audio I send out plays out on the radio. But, the now-playing deck is empty and nothing I do with the command center seems to change that. If I instead use .playback, I cannot use .defaultToSpeaker option which is the only way I've found to cause the "now-playing" navigation button to appear so that the full featured deck will display. But, of course setCategory() fails with an error about .defaultToSpeaker only available with .playAndRecord, so some default or intermediate state is entered and I see the full featured deck, but no audio goes out to the radio. What combination is supposed to be used here and is this more likely a problem with thread use (@MainActor) and/or some ordering of operations that I've overlooked?
Replies
0
Boosts
0
Views
2
Activity
14m
Play Audio and Recognize Speech in Car
Hello, I'm trying to determine the best/recommended AVAudioSession configuration (i.e category, mode, and options) for the following use-case. Essentially, I'd like to switch between periods of playing an audio file and then recognizing speech. The audio file is typically speech and I don't intend for playback and speech recognition to occur simultaneously. I'd like for the user to sill be able to interact with Siri and I'd like for it to work with CarPlay where navigation prompts can occur. I would assume the category to use is 'playAndRecord', but I'm not sure if it's better to just set that once for the entire lifecycle, or set to 'playback' for audio file playback and then switch to 'playAndRecord' for speech recognition . I'm also not sure on the best 'mode' and 'options' to set. Any suggestions would be appreciated. Thanks.
Replies
0
Boosts
0
Views
608
Activity
Sep ’25
AVAudioSession setActive(true) fails after phone call when app is in background
I’m seeing what appears to be an iOS audio-session issue that occurs only when a phone call happens while the app is in the background. API: AVAudioSession, AVAudioRecorder Background Modes: Audio enabled (UIBackgroundModes = audio) Category: .playAndRecord Microphone permission: granted Expected Behavior If the app is recording audio in the background and a phone call interrupts it: AVAudioSession.interruptionNotification(.began) fires Call ends AVAudioSession.interruptionNotification(.ended) fires App should be able to re-activate its audio session and resume or restart recording Apple documentation suggests this should be supported for background audio apps. Actual Behavior When the app is in the background and phone call is ended: AVAudioSession.interruptionNotification(.ended) does fire Attempting to reactivate the audio session always fails: Error Domain=NSOSStatusErrorDomain Code=560557684 ("!int") "Session activation failed" The session appears to remain permanently “interrupted” Retrying activation (with delays) does not help Recreating AVAudioRecorder does not help Reactivation works only after the app is opened again
Replies
0
Boosts
0
Views
166
Activity
Jan ’26
How to inform Logic Pro that AU view does not have a fixed aspect ratio?
I have an AUv3 that passes all validation and can be loaded into Logic Pro without issue. The UI for the plug in can be any aspect ratio but Logic insists on presenting it in a view with a fixed aspect ratio. That is when resizing, both the height and width are resized. I have never managed to work out what it is I need to do specify to Logic to allow the user to resize width or height independently of each other. Can anyone tell me what I need to specify in the AU code that will inform Logic that the view can be resized from any side of the window/panel?
Replies
0
Boosts
0
Views
231
Activity
Apr ’25