Mixing ScreenCaptureKit audio with microphone audio

Question

tarmac94 OP

Created Feb ’24

Hi,

I'm new to AVAudioEngine (and macOS programming in general).

I'm trying to mix microphone audio with ScreenCaptureKit audio using AVAudioEngine, without playing it back. I've created an AVAudioPlayerNode and am scheduling buffers on it in my SCStream handler:

                playerNode.scheduleBuffer(samples)

and have connected the playerNode to the mainMixerNode.

        audioEngine.connect(audioEngine.inputNode, to: audioEngine.mainMixerNode, format: micFormat)
        audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: format)

The problem is that mainMixerNode plays the audio through the speakers, creating a feedback loop. How can I prevent the mixer output from being played back?

Also: is this the best way of mixing microphone input with some other input? I ran into AVAudioEngine's manual rendering mode, which seems like the way to go for mixing audio without playing it back. However, I couldn't figure out how to connect microphone input to the AVAudioEngine in manual rendering mode.

Answer 1

eddiewangyw OP

5d

I ran into exactly this problem when building an audio pipeline that mixes system audio (via ScreenCaptureKit) with microphone input for real-time speech processing. The core issue is that mainMixerNode is connected to outputNode by default, which routes everything to speakers. You have two approaches:

1. Use manual rendering mode. In manual rendering mode, AVAudioEngine does not play back to hardware; you pull rendered buffers on your own schedule. Enable manual rendering, attach a player node for your SCK audio, connect it to the main mixer, then call renderOffline(_:to:) to pull mixed audio on demand. The catch: because the engine is detached from the audio hardware in this mode, inputNode does not capture from the microphone on macOS. The workaround is to capture mic samples separately (via AVCaptureSession or a tap on a separate realtime engine), then schedule those buffers into a second AVAudioPlayerNode.
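
A minimal sketch of the manual rendering approach, assuming both sources have already been converted to AVAudioPCMBuffer. The helper names (makeOfflineMixer, renderMixed) and the 4096-frame maximum are made up for illustration; only the AVAudioEngine calls are real API.

```swift
import AVFoundation

// Hypothetical helper: an engine in offline manual rendering mode with two
// player nodes, one for SCK audio and one for mic audio captured elsewhere.
func makeOfflineMixer(format: AVAudioFormat,
                      maxFrames: AVAudioFrameCount = 4096) throws
    -> (engine: AVAudioEngine, sckPlayer: AVAudioPlayerNode, micPlayer: AVAudioPlayerNode) {
    let engine = AVAudioEngine()
    let sckPlayer = AVAudioPlayerNode()
    let micPlayer = AVAudioPlayerNode()

    engine.attach(sckPlayer)
    engine.attach(micPlayer)
    engine.connect(sckPlayer, to: engine.mainMixerNode, format: format)
    engine.connect(micPlayer, to: engine.mainMixerNode, format: format)

    // Manual rendering must be enabled while the engine is stopped.
    try engine.enableManualRenderingMode(.offline,
                                         format: format,
                                         maximumFrameCount: maxFrames)
    try engine.start()
    sckPlayer.play()
    micPlayer.play()
    return (engine, sckPlayer, micPlayer)
}

// Hypothetical helper: pulls up to `frames` frames of mixed audio.
// Returns nil if the buffer cannot be allocated or the render did not succeed.
func renderMixed(from engine: AVAudioEngine,
                 frames: AVAudioFrameCount) throws -> AVAudioPCMBuffer? {
    guard let out = AVAudioPCMBuffer(
        pcmFormat: engine.manualRenderingFormat,
        frameCapacity: engine.manualRenderingMaximumFrameCount) else { return nil }
    let status = try engine.renderOffline(min(frames, out.frameCapacity), to: out)
    return status == .success ? out : nil
}
```

You would call sckPlayer.scheduleBuffer(_:) from the SCStream handler and micPlayer.scheduleBuffer(_:) from the mic capture callback, then call renderMixed whenever you want the next chunk. Offline rendering runs as fast as you pull, so keeping the two sources aligned in time is up to the caller; this sketch does not handle that.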

2. Stay in realtime mode and mute the output. Keep the engine in realtime mode but prevent playback by setting mainMixerNode.outputVolume = 0. Then install a tap on mainMixerNode to capture the mixed audio without speaker feedback. If that tap also comes back silent, mix into an intermediate AVAudioMixerNode placed before mainMixerNode and tap that node instead. I tried disconnecting mainMixerNode from outputNode entirely, but on some macOS versions (13.x specifically) this causes the engine to stop pulling audio from its inputs. Setting the volume to 0 is more reliable across macOS 13–15. For the sample rate mismatch between SCK output (typically 48 kHz) and mic input (sometimes 44.1 kHz), let the mixer handle the conversion: connect each source in its native format and set the mixer output format to your target rate.
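
A rough sketch of the realtime approach. One deliberate deviation, and it is an assumption rather than something tested here: the tap sits on an intermediate mixer (mixBus) instead of mainMixerNode, so the captured audio is taken upstream of the zeroed volume. The function name, the onMixed callback, and the 4096-frame tap size are hypothetical.

```swift
import AVFoundation

// Hypothetical helper: mixes the mic and an SCK player node in realtime
// and hands the mix to `onMixed` without audible playback.
// `sckFormat` is the format of the buffers you schedule on `sckPlayer`.
func startRealtimeMix(engine: AVAudioEngine,
                      sckPlayer: AVAudioPlayerNode,
                      sckFormat: AVAudioFormat,
                      onMixed: @escaping (AVAudioPCMBuffer, AVAudioTime) -> Void) throws {
    // Intermediate mixer (assumption, not part of the original answer).
    let mixBus = AVAudioMixerNode()
    engine.attach(sckPlayer)
    engine.attach(mixBus)

    // Connect each source in its native format; the mixer converts rates.
    let micFormat = engine.inputNode.outputFormat(forBus: 0)
    engine.connect(engine.inputNode, to: mixBus, format: micFormat)
    engine.connect(sckPlayer, to: mixBus, format: sckFormat)

    // The format on this connection is the mix's output format.
    engine.connect(mixBus, to: engine.mainMixerNode, format: sckFormat)

    // Keep the output connected so the engine keeps pulling, but silence it.
    engine.mainMixerNode.outputVolume = 0

    // Capture the mix before it reaches the muted main mixer.
    mixBus.installTap(onBus: 0, bufferSize: 4096, format: nil) { buffer, time in
        onMixed(buffer, time)
    }

    try engine.start()
    sckPlayer.play()
}
```

This needs microphone permission and a working input device, and the tap block is called on a non-main thread, so onMixed should hand the buffer off rather than do heavy work inline.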
