Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.

Audio Documentation

Posts under Audio subtopic

Post · Replies · Boosts · Views · Activity

Convert CoreAudio AudioObjectID to IOUSB LocationID
Is there a recommended way on macOS 26 Tahoe to take a CoreAudio AudioObjectID and use it to look up the underlying USB LocationID?

I previously used the AudioObjectID to query the corresponding DeviceUID with kAudioDevicePropertyDeviceUID. Then I queried for the IOService matching kIOAudioEngineClassName whose kIOAudioEngineGlobalUniqueIDKey property matched the DeviceUID, and loaded kUSBDevicePropertyLocationID from the result.

This fails on macOS 26, because the IO Registry entry for the device belongs to usbaudiod rather than AppleUSBAudioEngine, and usbaudiod does not expose a kIOAudioEngineGlobalUniqueIDKey property (or any other property that maps it to a CoreAudio DeviceUID).

My use case is a piece of audio recording software that allows configuring a set of supported audio devices via USB HID prior to recording. I present the user with a list of CoreAudio devices to use, but without a way to look up the underlying USB LocationID, I cannot guarantee that the configured device matches the selected device (e.g. if the user plugged in two identical microphones).
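For reference, a minimal sketch of the first half of that lookup, the part that still works; error handling is abbreviated:

```swift
import CoreAudio

// A sketch of the AudioObjectID -> DeviceUID step described above. The second
// half (matching the UID against kIOAudioEngineGlobalUniqueIDKey in the
// IORegistry) is the part that breaks on macOS 26.
func deviceUID(for deviceID: AudioObjectID) -> String? {
    var address = AudioObjectPropertyAddress(mSelector: kAudioDevicePropertyDeviceUID,
                                             mScope: kAudioObjectPropertyScopeGlobal,
                                             mElement: kAudioObjectPropertyElementMain)
    var uid: CFString?
    var size = UInt32(MemoryLayout<CFString?>.size)
    let status = withUnsafeMutablePointer(to: &uid) { ptr in
        AudioObjectGetPropertyData(deviceID, &address, 0, nil, &size, ptr)
    }
    return status == noErr ? uid as String? : nil
}
```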
2
0
524
Sep ’25
Is AVAudioPCMFormatFloat32 required for playing a buffer with AVAudioEngine / AVAudioPlayerNode
I have a PCM audio buffer (AVAudioPCMFormatInt16). When I try to play it using AVAudioPlayerNode / AVAudioEngine, an exception is thrown:

"[[busArray objectAtIndexedSubscript:(NSUInteger)element] setFormat:format error:&nsErr]: returned false, error Error Domain=NSOSStatusErrorDomain Code=-10868"

(related thread: https://forums.developer.apple.com/forums/thread/700497?answerId=780530022#780530022)

If I convert the buffer to AVAudioPCMFormatFloat32, playback works. My questions are:

1. Does AVAudioEngine / AVAudioPlayerNode require AVAudioPCMBuffer to be in the Float32 format?
2. Is there a way I can configure it to accept another format instead for my application?
3. If 1 is YES, is this documented anywhere?
4. If 1 is YES, is this required format subject to change at any point?

Thanks! I was looking to watch the "AVAudioEngine in Practice" session video from WWDC 2014 but I can't find it anywhere (https://forums.developer.apple.com/forums/thread/747008).
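For reference, a minimal sketch of the Float32 conversion workaround mentioned above, using AVAudioConverter; it assumes the sample rate is unchanged, so one input buffer fills one output buffer:

```swift
import AVFoundation

// Convert an Int16 PCM buffer to a deinterleaved Float32 buffer of the same
// sample rate and channel count, so AVAudioPlayerNode will accept it.
func makeFloat32Buffer(from int16Buffer: AVAudioPCMBuffer) -> AVAudioPCMBuffer? {
    guard let floatFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                          sampleRate: int16Buffer.format.sampleRate,
                                          channels: int16Buffer.format.channelCount,
                                          interleaved: false),
          let converter = AVAudioConverter(from: int16Buffer.format, to: floatFormat),
          let out = AVAudioPCMBuffer(pcmFormat: floatFormat,
                                     frameCapacity: int16Buffer.frameLength)
    else { return nil }

    var consumed = false
    var error: NSError?
    converter.convert(to: out, error: &error) { _, status in
        // Supply the single input buffer once, then report no more data.
        if consumed { status.pointee = .noDataNow; return nil }
        consumed = true
        status.pointee = .haveData
        return int16Buffer
    }
    return error == nil ? out : nil
}
```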
1
0
1k
Oct ’25
AVAudioSession.outputVolume not reporting correctly in iOS 18+ devices
I’m using the shared instance of AVAudioSession. After activating it with .setActive(true), I observe outputVolume, and it correctly reports the device’s volume.

However, after deactivating the session using .setActive(false), changing the volume, and then reactivating it again, outputVolume returns the previous volume (from before deactivation), not the current device volume. The correct volume is only reported after the user manually changes it again using the physical buttons or Control Center, which triggers the observer.

What I need is a way to retrieve the actual current device volume immediately after reactivating the audio session, even on the second and subsequent activations. Disabling and re-enabling the audio session is essential to how my application functions.

I’ve tested this behavior with my colleagues, and the issue is consistently reproducible on iOS 18.0.1, 18.1, 18.3, 18.5, and 18.6.2. On devices running iOS 17.6.1 and iOS 16.0.3, outputVolume correctly reflects the current volume immediately after calling .setActive(true) multiple times.
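A minimal sketch of the observation setup described above (the deactivation and volume-change steps are omitted):

```swift
import AVFoundation

let session = AVAudioSession.sharedInstance()
try? session.setActive(true)

// Per the report, the .initial value delivered right after a second
// setActive(true) is stale until the user changes the volume again.
// Retain `token` for the lifetime of the observation.
let token = session.observe(\.outputVolume, options: [.initial, .new]) { session, _ in
    print("outputVolume:", session.outputVolume)
}
```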
2
0
190
Sep ’25
Is Call Translation API available for VOIP?
I might have misunderstood the docs, but is Call Translation going to be available for VoIP applications? E.g., in an already-connected VoIP call, could Call Translation be enabled on an iOS 26 device that supports Apple Intelligence? I have tried it myself and it doesn’t look like VoIP is supported, but I would love to confirm this.

Reference: https://developer.apple.com/documentation/callkit/cxsettranslatingcallaction/
1
0
74
Jun ’25
About built-in instrument sounds on Apple devices
Does anyone know how to play a specific instrument sound on an iPhone or iPad when a button on the screen is tapped? I am currently developing a music-learning app and would like to assign single notes or chords to button-like frames on an on-screen keyboard or fretboard so that they sound when tapped. Can this be done with SwiftUI code alone? I recall that there used to be a General MIDI Level 1 instrument playback capability, but the current OS does not seem to offer the same functionality. I would appreciate any advice.
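A sketch of the kind of General MIDI setup the question recalls, using AVAudioUnitSampler; the sound font file name is hypothetical, and the note trigger would live in a SwiftUI Button action rather than in pure SwiftUI:

```swift
import AVFoundation

// Drive an AVAudioUnitSampler with MIDI note events.
let engine = AVAudioEngine()
let sampler = AVAudioUnitSampler()
engine.attach(sampler)
engine.connect(sampler, to: engine.mainMixerNode, format: nil)
try? engine.start()

// Hypothetical bundled sound font; program 0 = Acoustic Grand Piano,
// 0x79 = default melodic bank MSB.
if let url = Bundle.main.url(forResource: "GeneralMIDI", withExtension: "sf2") {
    try? sampler.loadSoundBankInstrument(at: url, program: 0, bankMSB: 0x79, bankLSB: 0)
}

// Play a C major triad, e.g. from a Button action; stopNote ends each note.
for note: UInt8 in [60, 64, 67] {
    sampler.startNote(note, withVelocity: 100, onChannel: 0)
}
```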
1
0
381
Mar ’25
iOS AUv3 extension: no Icon shown in host
Hi, I'm working on an AUv3 project. The app itself displays my icon. However, the AUv3 extension does not display any icon in any host app (AUM, Drambo, etc.). I thought the extension would inherit the app's icon, but that does not appear to be the case. I tried adding the icon as a 1024x1024 file to the extension target and then updating my extension's plist with a CFBundleIconFile key, but no luck either. It must surely be really easy. What am I missing? Thanks in advance for your help!
5
0
135
May ’25
ScreenCaptureKit System Audio Capture Crashes with EXC_BAD_ACCESS
Bug Report: ScreenCaptureKit System Audio Capture Crashes with EXC_BAD_ACCESS

Summary
When using ScreenCaptureKit to capture system audio for extended periods, the application crashes with EXC_BAD_ACCESS in Swift's error-handling runtime. The crash occurs in swift_getErrorValue when trying to process an error from the SCStream delegate method didStopWithError. This appears to be a framework-level issue in ScreenCaptureKit or its underlying ReplayKit implementation.

Environment
- macOS Sonoma 14.6.1
- Swift 5.8
- ScreenCaptureKit framework

Detailed Description
Our application captures system audio using ScreenCaptureKit's audio capture capabilities. After successfully capturing for several minutes (typically after 3-4 segments of 60-second recordings), the application crashes with an EXC_BAD_ACCESS error. The crash happens when the Swift runtime attempts to process an error in the SCStreamDelegate.stream(_:didStopWithError:) method.

The crash consistently occurs in swift_getErrorValue when attempting to access the class of what appears to be a null object. This suggests that the error being passed from the system framework to our delegate method is malformed or contains invalid memory.

Steps to Reproduce
1. Create an SCStream with audio capture enabled
2. Add an audio output to the stream
3. Start capture and write audio data to disk
4. Allow the capture to run for several minutes (3-5 minutes typically triggers the issue)
5. The app will crash with EXC_BAD_ACCESS in swift_getErrorValue

Code Sample

```swift
func stream(_ stream: SCStream, didStopWithError error: Error) {
    print("Stream stopped with error: \(error)") // Crash occurs before this line executes
}

func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of type: SCStreamOutputType) {
    guard type == .audio, sampleBuffer.isValid else { return }
    // Process audio data...
}
```

Expected Behavior
The error should be properly propagated to the delegate method, allowing for graceful error handling and recovery.

Actual Behavior
The application crashes with EXC_BAD_ACCESS when the Swift runtime attempts to process the error in swift_getErrorValue.
Crash Log Details

```
Thread #35, queue = 'com.apple.NSXPCConnection.m-user.com.apple.replayd', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  frame #0: 0x0000000194c3088c libswiftCore.dylib`swift::_swift_getClass(void const*) + 8
  frame #1: 0x0000000194c30104 libswiftCore.dylib`swift_getErrorValue + 40
  frame #2: 0x00000001057fba30 shadow`NewScreenCaptureService.stream(stream=0x0000600002de6700, error=Swift.Error @ 0x000000016b7b5e30) at NEW+ScreenCaptureService.swift:365:15
  frame #3: 0x00000001057fc050 shadow`@objc NewScreenCaptureService.stream(_:didStopWithError:) at <compiler-generated>:0
  frame #4: 0x0000000219ec5ca0 ScreenCaptureKit`-[SCStreamManager stream:didStopWithError:] + 456
  frame #5: 0x00000001ca68a5cc ReplayKit`-[RPScreenRecorder stream:didStopWithError:] + 84
  frame #6: 0x00000001ca696ff8 ReplayKit`-[RPDaemonProxy stream:didStopWithError:] + 224
```

Attempting to print stream._streamQueue in lldb produced only 'id'-unavailability errors from the expression evaluator, so no further state could be inspected there.

Before the crash, we observed this error message in the console:

```
[ERROR] *****SCStream*****RemoteAudioQueueOperationHandlerWithError:1015 Error received from the remote queue -16665
```

Additional Context
- The issue occurs consistently after approximately 3-4 successful 60-second audio segment recordings
- Commenting out our custom segment rotation logic does not prevent the crash
- The crash involves XPC communication with Apple's ReplayKit daemon
- The error appears to be corrupted or malformed when crossing the XPC boundary

Workarounds Attempted
- Added proper thread safety for all published properties using DispatchQueue.main.async
- Implemented more robust error handling in the delegate methods

None of these approaches prevented the crash, since it occurs at the Swift runtime level before our code executes.

Impact
This issue prevents reliable long-duration audio capture using ScreenCaptureKit and significantly limits its usefulness for any application requiring continuous system audio capture for more than a few minutes. It might be related to a macOS bug where the system dialog indicates that the screen is being shared even though nothing is actually being shared; moreover, attempting to stop sharing does nothing.
2
0
661
Mar ’25
Can Logic Pro load an Audio Unit v3 in-process?
After investing more than a week in converting a bunch of audio unit projects into the app + appex + framework structure, they all now load correctly in-process in the demo host app that is part of Xcode's template. However, Logic Pro adamantly refuses to load them in-process.

Does Logic Pro simply never do that, or is there some hint or configuration my plug-ins need to provide to enable it? If it is unsupported, will it be supported in some future version of Logic? The entire point of investing that week was performance, which is moot if it is impossible to test the impact of loading in-process in a real-world usage scenario.
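For context, in-process loading is something the host requests at instantiation time. A sketch of the host-side call, with placeholder component identifiers:

```swift
import AudioToolbox
import AVFoundation

// How a host asks for in-process loading (this is what the Xcode template
// host does); type/subtype/manufacturer here are placeholders.
var desc = AudioComponentDescription(componentType: kAudioUnitType_Effect,
                                     componentSubType: 0x64656D6F,      // placeholder
                                     componentManufacturer: 0x64656D6F, // placeholder
                                     componentFlags: 0,
                                     componentFlagsMask: 0)

AVAudioUnit.instantiate(with: desc, options: [.loadInProcess]) { audioUnit, error in
    // The unit runs in the host's process only if the host passes this option
    // and, on macOS, the extension ships its code in a framework declared
    // via AudioComponentBundle.
    print(audioUnit ?? error ?? "unknown")
}
```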
1
0
676
Jan ’25
Distorted Audio When Recording External Mics With AVCaptureSession and AVAssetWriter
I’m working on a macOS app, written in Swift. My goal is to record audio from an external microphone, e.g., one connected via USB. For this, I’m using an AVCaptureSession and recording its output with an AVAssetWriter. This works perfectly in principle (and reliably with internal microphones, for example).

The problem occurs after my app has successfully completed the first recording and I then want to make additional recordings (which makes me think it might be process-dependent, because it works again after restarting the app). The problem: noisy or distorted-sounding audio files. In addition, the following error message appears in the Console from CoreAudio / its AudioConverter:

Input data proc returned inconsistent 512 packets for 2048 bytes; at 3 bytes per packet, that is actually 682 packets

It is easy to reproduce. The problem occurs even if I don’t configure the AVAssetWriter manually and instead let it receive its audioSettings from an AVOutputSettingsAssistant preset.

I’m running macOS 15.0 (24A335). I’ve filed feedback including a demo project → FB15333298 🎟️

I would greatly appreciate any help! Have a great day, Martin
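For reference, a minimal sketch of the preset-driven settings path described above (writer setup and sample appending omitted):

```swift
import AVFoundation

// Let an AVOutputSettingsAssistant supply the AVAssetWriter's audio settings,
// as described in the post; the preset choice here is illustrative.
let assistant = AVOutputSettingsAssistant(preset: .preset1920x1080)
let audioInput = AVAssetWriterInput(mediaType: .audio,
                                    outputSettings: assistant?.audioSettings)
audioInput.expectsMediaDataInRealTime = true
```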
6
0
879
Feb ’25
Apple Music iOS 26 features on Android
Many users like me use Apple Music on Android, where the app is almost as feature-rich as on iOS. It would be fantastic if the developers could add the new iOS 26 features to the Android app, along with a minor UI refresh. I know it’s challenging to implement Liquid Glass on Android hardware and design, but features like auto-mix, pronunciation, and translation could be added. Kindly consider this request!
1
0
133
Jul ’25
Audio player app is silent if device connected via CarPlay
I have a SwiftUI app (https://youtu.be/VbAfUk_eYl0?si=JxUBh0Bpb-vc1E1U) which I thought was almost ready for release: a manager for airdropped audio files from Logic Pro or other music creation applications. It uses AVAudioEngine and AVAudioPlayerNode to play audio, and the MediaPlayer API to integrate with car audio and similar, all of which works well. It does not currently have an explicit CarPlay integration (and I'm slightly horrified at the amount of work that is going to require).

I had the good or bad luck of getting a loaner car with CarPlay yesterday while mine is being repaired, and lo and behold, when connected to the vehicle via CarPlay, there is no audio output in the vehicle at all. The Now Playing panel correctly shows the information my app provides about the currently playing song; the player node believes it is playing, and the AVAudioSession is configured as it should be. But there is no sound. Obviously I cannot ship it in this state.

I've tried fiddling with the parameters the AVAudioSession is configured with, in case some parameter was preventing audio output, to no avail. Currently:

```swift
var options = AVAudioSession.CategoryOptions()
options.insert(.allowAirPlay)
options.insert(.allowBluetooth)
options.insert(.allowBluetoothA2DP)
try session.setCategory(.playback, mode: .default, options: options)
try? session.setPreferredIOBufferDuration(0.002) // ~96 samples at 44.1kHz
try? session.setPrefersNoInterruptionsFromSystemAlerts(true)
try? session.setPrefersInterruptionOnRouteDisconnect(false)
try session.setActive(true, options: [.notifyOthersOnDeactivation])
```

All diagnostics within the app show the player operating correctly: files are played and flushed, and AVAudioPlayerNodeCompletionCallbacks are called when they should be. But the output is not audible in the vehicle.

I would much prefer to ship this app without full-blown CarPlay integration, but with working audio when connected via CarPlay, and work on full CarPlay integration for the next release. Is there some secret handshake I am just missing to make this work?
1
0
124
Mar ’25
Execution breakpoint when trying to play a music library file with AVAudioEngine
Hi all, I'm working on an audio visualizer app that plays files from the user's music library using MediaPlayer and AVAudioEngine. I'm getting the music library functionality working before the visualizer aspect.

After setting up the engine for file playback, my app inexplicably crashes with EXC_BREAKPOINT (code = 1). Usually this means I'm unwrapping a nil value, but I think I'm handling the optionals correctly with guard statements, and I'm not able to pinpoint where it's crashing. I think it's either in the play function or the setupAudioEngine function. I removed the processAudioBuffer function and my code still crashes the same way, so it's not that.

The device I'm testing on is running iOS 26 beta 3, although my app targets iOS 18 and above. After commenting out code, it seems the app crashes at the scheduleFile call in the play function, but I'm not fully sure.

Here is the setupAudioEngine function:

```swift
private func setupAudioEngine() {
    do {
        try AVAudioSession.sharedInstance().setCategory(.playback, mode: .default)
        try AVAudioSession.sharedInstance().setActive(true)
    } catch {
        print("Audio session error: \(error)")
    }

    engine.attach(playerNode)
    engine.attach(analyzer)
    engine.connect(playerNode, to: analyzer, format: nil)
    engine.connect(analyzer, to: engine.mainMixerNode, format: nil)

    analyzer.installTap(onBus: 0, bufferSize: 1024, format: nil) { [weak self] buffer, _ in
        self?.processAudioBuffer(buffer)
    }
}
```

Here is the play function:

```swift
func play(_ mediaItem: MPMediaItem) {
    guard let assetURL = mediaItem.assetURL else {
        print("No asset URL for media item")
        return
    }

    stop()

    do {
        audioFile = try AVAudioFile(forReading: assetURL)
        guard let audioFile else {
            print("Failed to create audio file")
            return
        }

        duration = Double(audioFile.length) / audioFile.fileFormat.sampleRate

        if !engine.isRunning {
            try engine.start()
        }

        playerNode.scheduleFile(audioFile, at: nil)
        playerNode.play()

        DispatchQueue.main.async { [weak self] in
            self?.isPlaying = true
            self?.startDisplayLink()
        }
    } catch {
        print("Error playing audio: \(error)")
        DispatchQueue.main.async { [weak self] in
            self?.isPlaying = false
            self?.stopDisplayLink()
        }
    }
}
```

Here is a link to my test project if you want to try it out for yourself: https://github.com/aabagdi/VisualMan-example

Thanks!
8
0
653
Jul ’25
[26] audioTimeRange would still be interesting for .volatileResults in SpeechTranscriber
Experimenting with the new SpeechTranscriber, if I do:

```swift
let transcriber = SpeechTranscriber(
    locale: locale,
    transcriptionOptions: [],
    reportingOptions: [.volatileResults],
    attributeOptions: [.audioTimeRange]
)
```

only the final result has audio time ranges, not the volatile results. Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses.

I'm not presenting the volatile text in the UI at all; I was just trying to keep statistics about the non-speech and speech noise levels, so that I can determine when the noise level falls below the noise floor for a while. The goal is to finalize the recording automatically when the noise level indicates that the user has finished speaking.
6
0
697
Nov ’25
iOS 26 HLS Audio Track Display Behavior: EXT-X-MEDIA NAME vs LANGUAGE Attributes
Hello Apple Developer Community,

I am seeking clarification on the intended display behavior of HLS audio tracks in the iOS 26 (or current beta) native player, specifically concerning the NAME and LANGUAGE attributes of the EXT-X-MEDIA tag.

In our HLS manifests, we define alternative audio tracks using EXT-X-MEDIA tags, like so:

```
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-1",DEFAULT=YES,AUTOSELECT=YES,URI="audio_ja.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-2",URI="audio_en.m3u8"
```

Our observation is that when an audio track is selected and its name is displayed in the native iOS media controls (e.g., Control Center or a full-screen video player's UI), the value specified in the NAME attribute ("AUDIO-1", "AUDIO-2") does not seem to be used. Instead, the display appears to derive from the LANGUAGE attribute ("ja", "en"), often showing the system's localized string for that language (e.g., "Japanese", "English").

We would like to understand the official or intended behavior:

1. Is it expected that the iOS native player prioritizes the LANGUAGE attribute (or its localized equivalent) over the NAME attribute when labeling the selected audio track?
2. If this is the intended design, what is the recommended best practice for presenting a custom, human-readable name for audio tracks (beyond the standard language name) in the native iOS UI?
3. Are there specific AVPlayer properties or AVMediaSelectionOption considerations that allow more granular control over this display, or is it entirely managed by the system based on the LANGUAGE attribute?

Any insights or official guidance on this behavior in iOS 26 (and potentially previous versions) would be greatly appreciated. Thank you for your time and assistance.
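For reference, a short sketch of inspecting what AVFoundation reports for each audio option; displayName is typically what the system UI surfaces (the URL is a placeholder):

```swift
import AVFoundation

func dumpAudioOptions() async throws {
    // Placeholder URL; point this at the master playlist above.
    let asset = AVURLAsset(url: URL(string: "https://example.com/master.m3u8")!)
    guard let group = try await asset.loadMediaSelectionGroup(for: .audible) else { return }
    for option in group.options {
        // Compare what the player derives from NAME vs. LANGUAGE.
        print(option.displayName, option.extendedLanguageTag ?? "-")
    }
}
```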
2
0
404
Aug ’25
AVAudioRecorder records silence
We have an application using the PTT framework to record audio messages while the app is backgrounded. We currently use AVAudioRecorder for that purpose. The problem is that one specific user frequently hits an issue: the recorded audio contains only silence. I've checked almost everything I can imagine but haven't found a possible cause.

Conditions:

- AVAudioRecorder uses the following configuration:

```swift
[
    AVEncoderAudioQualityKey: AVAudioQuality.low.rawValue,
    AVFormatIDKey: kAudioFormatMPEG4AAC,
    AVNumberOfChannelsKey: 1,
    AVSampleRateKey: 16000.0
]
```

- The app waits for both didBeginTransmitting and didActivate audioSession from PTChannelManager (the audio session has the playback category at that moment)
- The app then changes the AVAudioSession category to playAndRecord
- The app receives a routeChangeNotification with reason categoryChange and category playAndRecord
- There are no interruption notifications from AVAudioSession during recording
- There are no error notifications from AVAudioRecorder

Any idea what exactly I'm doing wrong? Is there anything else I should check? Thanks in advance.

P.S. Recording audio with an AudioUnit appears to have the same issue, but let's exclude it from the question for now for simplicity.
3
0
397
Mar ’25
Delay in Microphone Input When Talking While Receiving Audio in PTT Framework (Full Duplex Mode)
Context:

I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup.

- I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate.
- I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate.
- When I receive a push in full duplex mode, I set the active participant to the user who is speaking.

Issue:

When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block.

Details:

- The audio session is already active and configured with .playAndRecord.
- The input tap is already installed when the engine is started.
- When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio.

Assumptions / Current Setup:

Because the audio session is active with playAndRecord, I assumed that microphone input would be available immediately, even while receiving audio. However, there seems to be a delay before valid input is delivered to the tap, occurring only when switching from a receive state to simultaneously talking.

Questions:

1. Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine?
2. Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio?
3. Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer?
4. Would using separate engines for input and output improve responsiveness?

I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using the PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately, without the delay seen in my current implementation.

Relevant Code Snippets

Engine setup:

```swift
func setup() {
    let input = audioEngine.inputNode
    do {
        try input.setVoiceProcessingEnabled(true)
    } catch {
        print("Could not enable voice processing \(error)")
        return
    }
    input.isVoiceProcessingAGCEnabled = false

    let output = audioEngine.outputNode
    let mainMixer = audioEngine.mainMixerNode

    audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat)
    audioEngine.connect(beepNode, to: mainMixer, format: outputFormat)
    audioEngine.connect(mainMixer, to: output, format: outputFormat)

    // Initialize converters
    converter = AVAudioConverter(from: inputFormat, to: outputFormat)!
    f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)!

    audioEngine.prepare()
}
```

Input tap installation:

```swift
func installTap() {
    guard AudioHandler.shared.checkMicrophonePermission() else {
        print("Microphone not granted for recording")
        return
    }
    guard !isInputTapped else {
        print("[AudioEngine] Input is already tapped!")
        return
    }

    let input = audioEngine.inputNode
    let microphoneFormat = input.inputFormat(forBus: 0)
    let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)!
    let desiredFormat = outputFormat
    let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate)

    input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in
        guard let self = self else { return }

        // Output buffer: 1920 frames at 16kHz
        guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat,
                                                  frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return }
        outputBuffer.frameLength = outputBuffer.frameCapacity

        let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
            outStatus.pointee = .haveData
            return buffer
        }

        var error: NSError?
        let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock)
        if converterResult != .haveData {
            DebugLogger.shared.print("Downsample error \(converterResult)")
        } else {
            self.handleDownsampledBuffer(outputBuffer)
        }
    }

    isInputTapped = true
}
```
4
0
383
Aug ’25
SpeechTranscriber/SpeechAnalyzer being relatively slow compared to FoundationModel and TTS
So, I've been wondering how fast an offline STT -> ML prompt -> TTS round trip would be. Interestingly, in many tests the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the audio using TTS. E.g.:

```
InteractionStatistics:
- listeningStarted: 21:24:23 4480 2423
- timeTillFirstAboveNoiseFloor: 01.794
- timeTillLastNoiseAboveFloor: 02.383
- timeTillFirstSpeechDetected: 02.399
- timeTillTranscriptFinalized: 04.510
- timeTillFirstMLModelResponse: 04.938
- timeTillMLModelResponse: 05.379
- timeTillTTSStarted: 04.962
- timeTillTTSFinished: 11.016
- speechLength: 06.054
- timeToResponse: 02.578
- transcript: This is a test.
- mlModelResponse: Sure! I'm ready to help with your test. What do you need help with?
```

Here, between my audio input ending and the text-to-speech starting to play (using AVSpeechUtterance), the total response time was 2.5 s. Of that, the SpeechAnalyzer took 2.1 s to finalize the transcript; the FoundationModel took only 0.4 s to respond, and TTS started playing nearly instantly.

I'm already using reportingOptions: [.volatileResults, .fastResults], so it's probably as fast as possible right now? I'm just surprised the STT takes so much longer than the other parts (they're all CoreML-based, aren't they?)
2
0
572
Aug ’25