I have a memory leak when using AVAudioPlayer. I managed to narrow the issue down to a very simple app, whose code I paste at the end.
The memory leak starts immediately when I start playing the sound, but only in the Simulator; on a real iPhone there is no memory leak.
The memory leak in the Simulator looks like this (screenshot omitted); the code is below:
import SwiftUI
import AVFoundation

struct ContentView_Audio: View {
    var sound: AVAudioPlayer?

    init() {
        guard let path = Bundle.main.path(forResource: "cd201", ofType: "mp3") else { return }
        let url = URL(fileURLWithPath: path)
        do {
            try AVAudioSession.sharedInstance().setCategory(.playback, mode: .default, options: [.mixWithOthers])
        } catch {
            return
        }
        do {
            try AVAudioSession.sharedInstance().setActive(true)
        } catch {
            return
        }
        do {
            sound = try AVAudioPlayer(contentsOf: url)
        } catch {
            return
        }
    }

    var body: some View {
        HStack {
            Button {
                playSound()
            } label: {
                ZStack {
                    Circle()
                        .fill(.mint.opacity(0.3))
                        .frame(width: 44, height: 44)
                        .shadow(radius: 8)
                    Image(systemName: "play.fill")
                        .resizable()
                        .frame(width: 20, height: 20)
                }
            }
            .padding()

            Button {
                stopSound()
            } label: {
                ZStack {
                    Circle()
                        .fill(.mint.opacity(0.3))
                        .frame(width: 44, height: 44)
                        .shadow(radius: 8)
                    Image(systemName: "stop.fill")
                        .resizable()
                        .frame(width: 20, height: 20)
                }
            }
            .padding()
        }
    }

    private func playSound() {
        guard sound != nil else { return }
        sound?.volume = 1
        // sound?.numberOfLoops = -1
        sound?.play()
    }

    func stopSound() {
        sound?.stop()
    }
}
I’m using the shared instance of AVAudioSession. After activating it with .setActive(true), I observe the outputVolume, and it correctly reports the device’s volume.
However, after deactivating the session using .setActive(false), changing the volume, and then reactivating it again, the outputVolume returns the previous volume (before deactivation), not the current device volume. The correct volume is only reported after the user manually changes it again using physical buttons or Control Center, which triggers the observer.
What I need is a way to retrieve the actual current device volume immediately after reactivating the audio session, even on the second and subsequent activations.
Disabling and re-enabling the audio session is essential to how my application functions.
I’ve tested this behavior with my colleagues, and the issue is consistently reproducible on iOS 18.0.1, iOS 18.1, iOS 18.3, iOS 18.5 and iOS 18.6.2. On devices running iOS 17.6.1 and iOS 16.0.3, outputVolume correctly reflects the current volume immediately after calling .setActive(true) multiple times.
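For reference, the observation setup is roughly the following (a simplified sketch; the class and method names are illustrative):

import AVFoundation

final class VolumeWatcher {
    private let session = AVAudioSession.sharedInstance()
    private var observation: NSKeyValueObservation?

    func activate() throws {
        try session.setActive(true)
        // On the second and later activations (iOS 18.x), this still reports the
        // volume from before the previous deactivation.
        print("outputVolume after setActive(true):", session.outputVolume)
        observation = session.observe(\.outputVolume, options: [.new]) { _, change in
            // Only fires once the user changes the volume with the buttons or Control Center.
            print("outputVolume changed:", change.newValue ?? 0)
        }
    }

    func deactivate() throws {
        observation = nil
        try session.setActive(false)
    }
}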
Hello,
I have an iOS app that records audio, and it works fine on iPads and iPhones. It asks for microphone permission, and after that, recording works.
I installed the same app on my M3 MacBook via TestFlight since iPad apps are supposed to work without a change that way. The app starts fine and everything, but it never asks for Microphone permission, so I can't record.
Do I need to do something to make this happen? (This is not Mac Catalyst; it's running the arm64 iPhone binary on macOS.)
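For reference, here is roughly how an explicit permission check and request could look (a sketch using AVAudioApplication from recent SDKs; older targets would use AVAudioSession's requestRecordPermission, and the function name is illustrative):

import AVFoundation

// Sketch: explicitly check and request the record permission instead of relying on
// the implicit prompt triggered by starting a recording.
// NSMicrophoneUsageDescription must be present in the Info.plist for the prompt to appear.
func ensureMicrophoneAccess(_ completion: @escaping (Bool) -> Void) {
    switch AVAudioApplication.shared.recordPermission {
    case .granted:
        completion(true)
    case .denied:
        completion(false)
    case .undetermined:
        AVAudioApplication.requestRecordPermission { granted in
            completion(granted)
        }
    @unknown default:
        completion(false)
    }
}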
thanks
I'm writing some camera functionality that uses AVCaptureVideoDataOutput.
I've set it up so that it calls my AVCaptureVideoDataOutputSampleBufferDelegate on a background thread, by making my own dispatch_queue and configuring the AVCaptureVideoDataOutput.
My question is then, if I configure my AVCaptureSession differently, or even stop it altogether, is this guaranteed to flush all pending jobs on my background thread? For example, does [AVCaptureSession stopRunning] imply a blocking call until all pending frame-callbacks are done?
I have a more practical example below, showing how I access a resource owned by the foreground thread from the background thread, and I wonder when/how it's safe to clean up that resource.
I have a setup similar to the following:
// Foreground thread logic
dispatch_queue_t queue = dispatch_queue_create("qt_avf_camera_queue", nullptr);
AVCaptureSession *captureSession = [[AVCaptureSession alloc] init];
setupInputDevice(captureSession); // Connects the AVCaptureDevice...
// Store some arbitrary data to be attached to the frame, stored on the foreground thread
FrameMetaData frameMetaData = ...;
MySampleBufferDelegate *sampleBufferDelegate = [[MySampleBufferDelegate alloc] init];
// Capture frameMetaData by reference in lambda
[sampleBufferDelegate setFrameMetaDataGetter: [&frameMetaData]() { return &frameMetaData; }];
AVCaptureVideoDataOutput *captureVideoDataOutput = [[AVCaptureVideoDataOutput alloc] init];
[captureVideoDataOutput setSampleBufferDelegate:sampleBufferDelegate queue:queue];
[captureSession addOutput:captureVideoDataOutput];
[captureSession startRunning];
[captureSession stopRunning];
// Is it now safe to destroy frameMetaData, or do we need a manual barrier?
And then in MySampleBufferDelegate:
- (void)captureOutput:(AVCaptureOutput *)captureOutput
    didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
    fromConnection:(AVCaptureConnection *)connection
{
    // Invokes the callback set above
    FrameMetaData *frameMetaData = frameMetaDataGetter();
    emitSampleBuffer(sampleBuffer, frameMetaData);
}
Hi everyone,
I noticed that Apple recently added a few new beta sample codes related to video encoding:
Encoding video for low-latency conferencing
Encoding video for live streaming
While experimenting with H.264 encoding, I came across some questions regarding certain configurations:
When I enable kVTVideoEncoderSpecification_EnableLowLatencyRateControl, is it still possible to use kVTCompressionPropertyKey_VariableBitRate? In my tests, I get an error.
It also seems that kVTVideoEncoderSpecification_EnableLowLatencyRateControl cannot be used together with kVTCompressionPropertyKey_ConstantBitRate when encoding H264. Is that expected?
When using kVTCompressionPropertyKey_ConstantBitRate with kVTCompressionPropertyKey_MaxKeyFrameInterval set to 2, the encoder outputs only keyframes, and the frame size keeps increasing, which doesn’t seem like the intended behavior.
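For reference, the low-latency setup I'm experimenting with looks roughly like this (a sketch; the dimensions, bitrate, and the nil output callback are placeholders):

import VideoToolbox

// Sketch: a compression session with low-latency rate control enabled, driven by an
// average bitrate target. Values are illustrative.
var session: VTCompressionSession?
let encoderSpec: [CFString: Any] = [
    kVTVideoEncoderSpecification_EnableLowLatencyRateControl: true
]
let status = VTCompressionSessionCreate(
    allocator: kCFAllocatorDefault,
    width: 1280,
    height: 720,
    codecType: kCMVideoCodecType_H264,
    encoderSpecification: encoderSpec as CFDictionary,
    imageBufferAttributes: nil,
    compressedDataAllocator: nil,
    outputCallback: nil,          // frames would be encoded with the output-handler variant
    refcon: nil,
    compressionSessionOut: &session)
if status == noErr, let session {
    VTSessionSetProperty(session, key: kVTCompressionPropertyKey_RealTime, value: kCFBooleanTrue)
    VTSessionSetProperty(session, key: kVTCompressionPropertyKey_AverageBitRate, value: 2_000_000 as CFNumber)
}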
Regarding the following code from the sample:
let byteLimit = (Double(bitrate) / 8) * 1.5 as CFNumber
let secLimit = Double(1.0) as CFNumber
let limitsArray = [ byteLimit, secLimit ] as CFArray // Each 1 second limit byte as bitrate
err = VTSessionSetProperty(session, key: kVTCompressionPropertyKey_DataRateLimits, value: limitsArray)
This DataRateLimits setting doesn’t seem to have any effect in my tests. Whether I set it or not, the values remain unchanged.
Since the documentation on developer.apple.com/documentation doesn't clearly explain these cases, I'd like to ask if anyone has insights or recommendations about the proper usage of these settings.
Thanks in advance!
In MusicKit Web the playback states are provided as numbers.
For example the playbackStateDidChange event listener will return:
{oldState: 2, state: 3, item:...}
When the state changes from playing (2) to paused (3).
Those are pretty easy to guess, but I'm having a hard time with some of the others: completed, ended, loading, none, paused, playing, seeking, stalled, stopped, waiting.
I cannot find a mapping of states to numbers documented anywhere. I got the above states from an enum in a d.ts file that is often incorrect/incomplete.
Can someone help out pointing to the docs or provide a mapping?
Thanks.
If a new photo is added to the library while the app is not running in the foreground (or has not been opened since the photo was added), but the app has full access to the photo library, can it access and read the new photo? And if the app is not specifically a cloud-syncing app (say, a game app or a beauty camera app), can it still have this capability?
Topic: Media Technologies
SubTopic: Photos & Camera
Tags: Privacy, PhotoKit, Background Tasks, Background Assets
On macOS Sequoia, I'm having the hardest time getting this basic audio output to work correctly. I'm compiling in Xcode using C99, and when I run this, I get audio for a split second, and then nothing, indefinitely.
Any ideas what could be going wrong?
Here's a minimum code example to demonstrate:
#include <AudioToolbox/AudioToolbox.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

#define RENDER_BUFFER_COUNT 2
#define RENDER_FRAMES_PER_BUFFER 128

// mono linear PCM audio data at 48kHz
#define RENDER_SAMPLE_RATE 48000
#define RENDER_CHANNEL_COUNT 1
#define RENDER_BUFFER_BYTE_COUNT (RENDER_FRAMES_PER_BUFFER * RENDER_CHANNEL_COUNT * sizeof(float))

void RenderAudioSaw(float* outBuffer, uint32_t frameCount, uint32_t channelCount)
{
    static bool isInverted = false;
    float scalar = isInverted ? -1.f : 1.f;
    for (uint32_t frame = 0; frame < frameCount; ++frame)
    {
        for (uint32_t channel = 0; channel < channelCount; ++channel)
        {
            // series of ramps, alternating up and down.
            outBuffer[frame * channelCount + channel] = 0.1f * scalar * ((float)frame / frameCount);
        }
    }
    isInverted = !isInverted;
}

AudioStreamBasicDescription coreAudioDesc = { 0 };
AudioQueueRef coreAudioQueue = NULL;
AudioQueueBufferRef coreAudioBuffers[RENDER_BUFFER_COUNT] = { NULL };

void coreAudioCallback(void* unused, AudioQueueRef queue, AudioQueueBufferRef buffer)
{
    // 0's here indicate no fancy packet magic
    AudioQueueEnqueueBuffer(queue, buffer, 0, 0);
}

int main(void)
{
    const UInt32 BytesPerSample = sizeof(float);
    coreAudioDesc.mSampleRate = RENDER_SAMPLE_RATE;
    coreAudioDesc.mFormatID = kAudioFormatLinearPCM;
    coreAudioDesc.mFormatFlags = kLinearPCMFormatFlagIsFloat | kLinearPCMFormatFlagIsPacked;
    coreAudioDesc.mBytesPerPacket = RENDER_CHANNEL_COUNT * BytesPerSample;
    coreAudioDesc.mFramesPerPacket = 1;
    coreAudioDesc.mBytesPerFrame = RENDER_CHANNEL_COUNT * BytesPerSample;
    coreAudioDesc.mChannelsPerFrame = RENDER_CHANNEL_COUNT;
    coreAudioDesc.mBitsPerChannel = BytesPerSample * 8;

    coreAudioQueue = NULL;
    OSStatus result;
    // most of the 0 and NULL params here are for compressed sound formats etc.
    result = AudioQueueNewOutput(&coreAudioDesc, &coreAudioCallback, NULL, 0, 0, 0, &coreAudioQueue);
    if (result != noErr)
    {
        assert(false && "AudioQueueNewOutput failed!");
        abort();
    }

    for (int i = 0; i < RENDER_BUFFER_COUNT; ++i)
    {
        uint32_t bufferSize = coreAudioDesc.mBytesPerFrame * RENDER_FRAMES_PER_BUFFER;
        result = AudioQueueAllocateBuffer(coreAudioQueue, bufferSize, &(coreAudioBuffers[i]));
        if (result != noErr)
        {
            assert(false && "AudioQueueAllocateBuffer failed!");
            abort();
        }
    }

    for (int i = 0; i < RENDER_BUFFER_COUNT; ++i)
    {
        RenderAudioSaw(coreAudioBuffers[i]->mAudioData, RENDER_FRAMES_PER_BUFFER, RENDER_CHANNEL_COUNT);
        coreAudioBuffers[i]->mAudioDataByteSize = coreAudioBuffers[i]->mAudioDataBytesCapacity;
        AudioQueueEnqueueBuffer(coreAudioQueue, coreAudioBuffers[i], 0, 0);
    }

    AudioQueueStart(coreAudioQueue, NULL);
    sleep(10); // some time to hear the audio
    AudioQueueStop(coreAudioQueue, true);
    AudioQueueDispose(coreAudioQueue, true);
    return 0;
}
I am trying to debug the AAX version of my plugin (MIDI effect) on Pro Tools, but I am getting the following error (Mac console) when attempting to load it:
dlsym cannot find symbol g_dwILResult in CFBundle etc..
I used Xcode 16.4 to build the plugin.
Has anybody come across the same or a similar message?
Best,
Achillefs
Axart Labs
Hi all,
with my app ScreenFloat, you can record your screen, along with system- and microphone audio.
Those two audio feeds are recorded into separate audio tracks in order to individually remove or edit them later on.
Now, these recordings you create with ScreenFloat can be drag-and-dropped to other apps instantly. So far, so good, but some apps, like Slack, or VLC, or even websites like YouTube, do not play back multiple audio tracks, just one.
So what I'm trying to do is, on dragging the video recording file out of ScreenFloat, instantly baking together the two individual audio tracks into one, and offering that new file as the drag and drop file, so that all audio is played in the target app.
But it's slow. I mean, it's actually quite fast, but for drag and drop, it's slow.
My approach is this:
1. "Bake together" the two audio tracks into a one-track m4a audio file using AVMutableAudioMix and AVAssetExportSession (sketched below).
2. Take the video track, add the new audio file as an audio track to it, and render that out using AVAssetExportSession.
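Roughly, step 1 looks like this (a simplified sketch, not the exact production code; error handling, the audio mix parameters, and the URLs are placeholders):

import AVFoundation

// Sketch of step 1: export the recording's audio tracks into a single .m4a.
// With no explicit AVMutableAudioMix, the enabled audio tracks are mixed together;
// a mix can be attached to adjust per-track volumes.
func bakeAudioTracks(of movieURL: URL, to outputURL: URL,
                     completion: @escaping (Error?) -> Void) {
    let asset = AVURLAsset(url: movieURL)
    guard let export = AVAssetExportSession(asset: asset,
                                            presetName: AVAssetExportPresetAppleM4A) else {
        completion(NSError(domain: "AudioBake", code: -1))
        return
    }
    export.outputURL = outputURL
    export.outputFileType = .m4a
    export.exportAsynchronously {
        completion(export.error)
    }
}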
For a quick benchmark with a 3'40'' movie, step 1 takes ~1.7 seconds and step 2 adds another ~1.5 seconds, so we're at ~3.2 seconds. That's an eternity for a drag and drop, where the user might cancel if there's no immediate feedback.
I could also do it in one step, but then I couldn't use the AV*Passthrough preset, and that makes it take around 32 seconds then, because I assume it touches the video data (which is unnecessary in this case, so I think the two-step approach here is the fastest).
So, my question is, is there a faster way?
The best idea I can come up with right now is, when initially recording the screen with system- and microphone audio as separate tracks, to also record both of them into a third, muted, "hidden" track I could use later on, basically eliminating the need for step one and just ripping the two single audio tracks out of the movie and only have the video and the "hidden" track (then unmuted), but I'd still have a ~1.5 second delay there. Also, there's the processing and data overhead (basically doubling the movie's audio data).
All this would be great for an export operation (where one expects it to take a little time), but for a drag-and-drop operation, it's not ideal.
I've discarded the idea of doing a promise file drag, because many apps do not accept those, and I want to keep wide compatibility with all sorts of apps.
I'd appreciate any ideas or pointers.
Thank you kindly,
Matthias
I'm developing a visionOS app. I want to know how to play spatial audio other than through RealityKit. And on iOS or macOS, how can I play spatial audio without RealityKit?
FairPlay-Protected HLS Files Not Transferred via Quick Start
I have an iOS app that downloads HLS files, which are protected by FairPlay. These files are stored locally, and their locations are managed using Core Data. When playing these tracks, I use AVURLAsset to access the stored file paths.
Recently, a client upgraded to a new iPhone and used Quick Start to transfer data from his old device. While all other app data was successfully transferred, including Core Data records and UserDefaults, the actual HLS files were missing. As a result, the app retained metadata about the downloaded content, but the files themselves were gone, causing playback failures.
Does Quick Start exclude certain types of locally stored files, especially DRM-protected HLS downloads, or is the issue related to how FairPlay-protected content is handled during the transfer of locally stored files?
Topic: Media Technologies
SubTopic: Streaming
Tags: FairPlay Streaming, HTTP Live Streaming, AVFoundation
Some users reported that their images are not loading correctly in our app. After a lot of debugging we identified the following:
This only happens when the app is built for Mac Catalyst, not on iOS, iPadOS, or "real" macOS (AppKit).
The images in question have unusual color spaces. We observed the issue for uRGB and eciRGB v2.
Those images are rendered correctly in Photos and Preview on all platforms.
When displaying the image inside of a UIImageView or in a SwiftUI Image, they render correctly.
The issue only occurs when loading the image via Core Image.
When comparing the different Core Image render graphs between AppKit (working) and Catalyst (faulty) builds, they look identical—except for the result.
(Render graph screenshots for Mac (AppKit) and Catalyst omitted.)
Something seems to be off when Core Image tries to load an image with foreign color space in Catalyst.
We identified a workaround: By using a CGImageDestination to transcode the image using the kCGImageDestinationOptimizeColorForSharing option, Image I/O will convert the image to sRGB (or similar) and Core Image is able to load the image correctly. However, one potentially loses fidelity this way.
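For completeness, the workaround looks roughly like this (a sketch; keeping the source's image type and writing to in-memory data are illustrative choices):

import Foundation
import ImageIO
import UniformTypeIdentifiers

// Sketch: re-encode the image with Image I/O, letting
// kCGImageDestinationOptimizeColorForSharing convert it to a color space suitable
// for sharing (sRGB or similar) before handing it to Core Image.
func transcodeForSharing(_ sourceURL: URL) -> Data? {
    guard let source = CGImageSourceCreateWithURL(sourceURL as CFURL, nil) else { return nil }
    let type = CGImageSourceGetType(source) ?? (UTType.png.identifier as CFString)
    let data = NSMutableData()
    guard let destination = CGImageDestinationCreateWithData(data as CFMutableData, type, 1, nil) else {
        return nil
    }
    let options: [CFString: Any] = [kCGImageDestinationOptimizeColorForSharing: true]
    CGImageDestinationAddImageFromSource(destination, source, 0, options as CFDictionary)
    return CGImageDestinationFinalize(destination) ? (data as Data) : nil
}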
Or might there be a better workaround?
Topic: Media Technologies
SubTopic: Photos & Camera
Tags: Image I/O, Photos and Imaging, Core Image, Core Graphics
Safari is supposed to support animated AVIF images since version 16, but the ones I've tested perform very poorly, even on an M4 Mac Mini running Sequoia 15.1.1.
I believe Safari delegates decoding to the operating system itself, so this issue also happens in Live Preview in the Finder when I try to preview a file.
Sample file here: https://s3.us-west-2.amazonaws.com/cdn.paintera.org/test/sample.avif
322KB file, 5 seconds long, 12fps
This plays perfectly in Chrome on macOS, but is slow and laggy in Safari and Live Preview (it takes about 6.5 seconds to finish the 5-second video).
Does anyone know how to fix this or workaround this issue?
Topic: Media Technologies
SubTopic: Video
My current app implements a custom video player, based on a AVSampleBufferRenderSynchronizer synchronising two renderers:
an AVSampleBufferDisplayLayer receiving decoded CVPixelBuffer-based video CMSampleBuffers,
and an AVSampleBufferAudioRenderer receiving decoded lpcm-based audio CMSampleBuffers.
The AVSampleBufferRenderSynchronizer is started when the first image (in presentation order) is decoded and enqueued, using avSynchronizer.setRate(_ rate: Float, time: CMTime), with rate = 1 and time the presentation timestamp of the first decoded image.
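For clarity, the player structure is roughly the following (a sketch; decoding and sample-buffer enqueueing are omitted):

import AVFoundation

// Sketch of the described setup: one synchronizer driving a video display layer and an
// audio renderer, started at the PTS of the first decoded video frame.
final class SampleBufferPlayer {
    let synchronizer = AVSampleBufferRenderSynchronizer()
    let displayLayer = AVSampleBufferDisplayLayer()
    let audioRenderer = AVSampleBufferAudioRenderer()

    init() {
        synchronizer.addRenderer(displayLayer)
        synchronizer.addRenderer(audioRenderer)
    }

    // Called once the first video sample (in presentation order) has been enqueued.
    func start(atFirstVideoPTS firstPTS: CMTime) {
        synchronizer.setRate(1.0, time: firstPTS)
    }
}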
Presentation timestamps of video and audio sample buffers are consistent, and on most streams, the audio and video are correctly synchronized.
However on some network streams, on iOS, the audio and video aren't synchronized, with a time difference that seems to increase with time.
On the other hand, with the same player code and network streams on macOS, the synchronization always works fine.
This reminds me of something I've read, about cases where an AVSampleBufferRenderSynchronizer could not synchronize audio and video, causing them to run with independent and potentially drifting clocks, but I cannot find it again.
So, any help / hints on this sync problem will be greatly appreciated! :)
I've got a web app built with MusicKit that displays a list of songs.
I have player controls for play, pause, skip next, skip, previous, toggle shuffle and set repeat mode.
All of these work by using the music instance.
The play button, when nothing is playing and nothing is in the queue, will enqueue all the tracks and start playing with the below, for example:
await music.setQueue({ songs, startPlaying: true });
I've implemented a progress slider based on feedback from the "playbackProgressDidChange" listener.
Now, how in the world can I set the volume? This seems like it should be simple, but I am at a complete loss here.
The docs say:
"The volume of audio playback, which is set directly on the HTMLMediaElement as the HTMLMediaElement.volume property. This value ranges between 0, which would be muting the audio, and 1, which would be the loudest possible."
Given that all my controls work off the music instance, I don't understand how I can do that.
In this video from WWDC 2022, music web components are touched on briefly. These are also documented very sparsely. The volume docs are here.
For the life of me, I can't even get the volume web component to display in the UI.
It appears that MusicKit Web is hobbled compared to the native implementation, but surely adjusting volume shouldn't be that hard right?
I'd appreciate any insight on how to do this, including how to get web components to work (in a Next JS app).
Thanks.
The same H.265 FairPlay-encrypted content can be played on all Apple devices except the A1625.
Clear (unencrypted) H.265 content does play on the A1625.
The question is: will this model (A1625) support H.265 FairPlay-encrypted content?
A ticket was created here:
https://discussions.apple.com/thread/255658006?sortBy=best
Topic: Media Technologies
SubTopic: Streaming
Tags: FairPlay Streaming, Media, Video, HTTP Live Streaming
In iOS 26 (Developer Beta), the AVCaptureMetadataOutputObjectsDelegate no longer receives callbacks when metadataOutput.metadataObjectTypes = [.face] is set. On earlier iOS versions the issue does not occur. Interestingly, face detection works if I set the sessionPreset to .medium, but not with .high — except on the iPhone 16 Pro Max, where it works regardless.
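For reference, the configuration is roughly the following (a sketch; session and input setup, plus the delegate implementation, are omitted, and the function name is illustrative):

import AVFoundation

// Sketch: face metadata detection with AVCaptureMetadataOutput.
// With .high the delegate stops receiving callbacks on the iOS 26 beta; with .medium it works.
func configureFaceMetadata(on session: AVCaptureSession,
                           delegate: AVCaptureMetadataOutputObjectsDelegate) {
    session.sessionPreset = .high
    let output = AVCaptureMetadataOutput()
    guard session.canAddOutput(output) else { return }
    session.addOutput(output)
    output.setMetadataObjectsDelegate(delegate, queue: .main)
    // metadataObjectTypes must be set after the output has been added to the session.
    output.metadataObjectTypes = [.face]
}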
Hi everyone,
We are working on a prototype app for Apple Vision Pro that is similar in functionality to Omegle or Chatroulette, but exclusively for Vision Pro owners.
The core idea is:
– a matching system where one user connects to another through a virtual persona;
– real-time video and audio transmission;
– time limits for sessions with the ability to extend them;
– users can skip a match and move on to the next one.
We have explored WebRTC and Twilio, but unfortunately, they don’t fit our use case.
Question:
What alternative services or SDKs are available for implementing real-time video/audio communication on Vision Pro that would work with this scenario?
Has anyone encountered a similar challenge and can recommend which technologies or tools to use?
Thanks in advance!
In iOS 26 beta 5, AVAudioSessionCategoryOptionAllowBluetooth is marked as deprecated "in iOS 8", even though the option was not deprecated at all in iOS 18.6. I think this is a mistake in the annotation and the deprecation actually starts in iOS 26. Am I right?
It seems the substitute for this option is "AVAudioSessionCategoryOptionAllowBluetoothHFP". The documentation does not make clear whether the behaviour is exactly the same or whether any difference should be expected. Has anyone used this option in iOS 26? Should I expect any difference from the current behaviour of "AVAudioSessionCategoryOptionAllowBluetooth"?
Thank you.