Metal

Why slower with larger threadgroup memory?

I'm implementing optimized matmul on metal: https://github.com/crynux-ai/metal-matmul/blob/main/metal/1_shared_mem.metal I notice that performance is significantly different with different threadgroup memory set in [computeEncoder setThreadgroupMemoryLength] All other lines are exactly same, the only difference is this parameter. Matmul performance is roughly 250 GFLops if I set 32768 (max bytes allowed on this M1 Max), but 400 GFLops if I set 8192. Why does this happen? How can I optimize it?

Graphics & Games Metal

2

0

126

Apr ’25

Physics bug in WWE 2K25 with GPTK2.1

The game physics work as expected using GTPK 2.0 using Crossover 24 or Whisky. However, using GPTK 2.1 with Crossover 25, the player and camera physics misbehave. See https://www.reddit.com/r/WWEGames/comments/1jx9mph/the_siamese_elbow/ and https://www.reddit.com/r/WWEGames/comments/1jx9ow4/camera_glitch/ Full video also linked in the Reddit post. I have also submitted this bug via the feedback assistant.

Graphics & Games Metal Game Porting Toolkit

2

0

196

Apr ’25

WWDC25 Metal & game technologies group lab

Hello, Thank you for attending today’s Metal & game technologies group lab at WWDC25! We were delighted to answer many questions from developers and energized by the community engagement. We hope you enjoyed it and welcome your feedback. We invite you to carry on the conversation here, particularly if your question appeared in Slido and we were unable to answer it during the lab. If your question received feedback let us know if you need clarification. You may want to ask your question again in a different lab e.g. visionOS tomorrow. (We realize that this can be confusing when frameworks interoperate) We have a lot to learn from each other so let’s get to Q&A and make the best of WWDC25! 😃 Looking forward to your questions posted in new threads.

2

0

387

Jul ’25

MetalFX upscaler/denoiser and instant changes

Hi, What's the best way to handle drastic changes in scene charateristics with the new MTLFXTemporalDenoisedScaler? Let's say a visible object of the scene radically changes its material properties. I can modify the albedo and roughness textures consequently. But I suspect the history will be corrupted. Blending visual information between the new frame and the previous ones might be a nonsense. I guess the problem should be the same when objects appear or disappear instantly. Is the upsacler manage these events for us (by lowering blending), or should we use the reactive or the denoise strength mask or something like that to handle them?

Graphics & Games Metal MetalFX

2

0

134

Jul ’25

Unable to package in UE5.6

Im new in the Mac area but for sure not UE. Windows is a long process to packaging but it could be done. All the documentation for Epic and from the internet is basically non existent with exactly how to package a project within UE. I have Xcode installed which makes sense, agreed to terms and install for MacOS, I've been able to make a project for several weeks now and want to package for a test run for my friends to play on Windows. Now I just get this in the log: UATHelper: Packaging (Mac): ERROR: Failed to finalize the .app with Xcode. Check the log for more information UATHelper: Packaging (Mac): Trace written to file /Users/rileysleger/Library/Logs/Unreal Engine/LocalBuildLogs/UBA-ProjectNightTerror-Mac-Development.uba with size 12.6kb UATHelper: Packaging (Mac): Total time in Unreal Build Accelerator local executor: 8.12 seconds UATHelper: Packaging (Mac): Result: Failed (OtherCompilationError) UATHelper: Packaging (Mac): Total execution time: 9.71 seconds PackagingResults: Error: Failed to finalize the .app with Xcode. Check the log for more information UATHelper: Packaging (Mac): Took 9.77s to run dotnet, ExitCode=6 UATHelper: Packaging (Mac): UnrealBuildTool failed. See log for more details. (/Users/rileysleger/Library/Logs/Unreal Engine/LocalBuildLogs/UBA-ProjectNightTerror-Mac-Development.txt) UATHelper: Packaging (Mac): AutomationTool executed for 0h 0m 10s UATHelper: Packaging (Mac): AutomationTool exiting with ExitCode=6 (6) UATHelper: Packaging (Mac): RunUAT ERROR: AutomationTool was unable to run successfully. Exited with code: 6 PackagingResults: Error: AutomationTool was unable to run successfully. Exited with code: 6 PackagingResults: Error: Unknown Error This absolutely makes no sense to me. Anyone have ideas?

Graphics & Games Metal Xcode macOS

2

0

276

Jul ’25

CustomMetalView sample uses deprecated functions - update?

The sample code here, has code like: // Create a display link capable of being used with all active displays cvReturn = CVDisplayLinkCreateWithActiveCGDisplays(&_displayLink); But that function's doc says it's deprecated and to use NSView/NSWindow/NSScreen displayLink instead. That returns CADisplayLink, not CVDisplayLink. Also the documentation for that displayLink method is completely empty. I'm not sure if I'm supposed to add it to run loop, or what, after I get it. It would be nice to get an updated version of this sample project and/or have some documentation in NSView.displayLink

Graphics & Games Metal

2

0

363

Jul ’25

Problem running Unreal 5.6

I am using Unreal Engine 5.6 on a MacBook Pro with an M3 chip and macOS 15.5. I’ve installed Xcode and accepted the license, but Unreal is not detecting the latest Metal Shader Standard (Metal v3.0). The maximum version Unreal sees is Metal v2.4, even though the hardware and OS should support Metal 3.0. I’ve also run sudo xcode-select -s /Applications/Xcode.app and accepted the license via Terminal. Is there anything in Xcode settings, SDK availability, or system permissions that could be preventing access to Metal 3.0 features?"

Graphics & Games Metal Xcode

2

0

486

Jul ’25

Metal IR reference

Hello! I'm developing a GPU (shader) language, where I aim to target multiple backends with a common frontend. I wanted to avoid having to round trip through Metal, and go straight to IR just like I have with SPIRV, in order to have a fast and efficient compilation process. I've been looking for a reference page where I can read about Metals IR, and as far as I'm aware, it exists, but I can't seem to find it anywhere. Furthermore, if such a reference is available, is there also a toolkit where I can run validation on the output IR, and perhaps even run optimizations, much like spv-tools for SPIRV? Any help would be appreciated! Thanks, Gustav

Graphics & Games Metal Metal Metal Performance Shaders Compiler

2

0

285

Jul ’25

Warning code: 00000006 will affect background survival

Our APP has integrated 3D function, in order to reduce the memory occupation of the APP in the background, we will uninstall the 3D after the APP enters the background. However, the uninstall also causes problems. When the uninstall process is executed in the background, the app will briefly trigger the background GPU rendering error warning with the error warning code: OGPUMetalError: Insufficient Permission (to submit GPU work from background) (00000006:kIOGPUCommandBufferCallbackErrorBackgroundExecutionNotPermitted) Execution of the command buffer was aborted due to an error during execution. Insufficient Permission (to submit GPU Work from background) (00000006: kIOGPUCommandBufferCallbackErrorBackgroundExecutionNotPermitted) excuse me this warning system will tighten APP permissions background? For example, limit or shorten the background survival time of the APP. In addition, will the background refresh function fail, resulting in the failure of Bluetooth Ibeacon activation?

Graphics & Games Metal

2

0

255

Aug ’25

Float64 (Double Precision) Support on MPS with PyTorch on Apple Silicon?

Hi everyone, This project uses PyTorch on an Apple Silicon Mac (M1/M2/etc.), and the goal is to use the MPS backend for GPU acceleration, notes Apple Developer. However, the workflow depends on Float64 (double-precision) floating-point numbers for certain computations, notes PyTorch Forums. The error "Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead" has been encountered, notes GitHub. It seems that the MPS backend doesn't currently support Float64 for direct GPU computation. Questions for the community: Are there any known workarounds or best practices for handling Float64-dependent operations when using the MPS backend with PyTorch? For those working with high-precision tasks on Apple Silicon, what strategies are being used to balance performance with the need for Float64? Offloading to the CPU is an option, and it's of interest to know if there are any specific techniques or libraries within the Apple ecosystem that could streamline this process while aiming for optimal performance. Any insights, tips, or experiences would be appreciated. Thanks in advance, Jonaid MacBook Pro M3 Max

Graphics & Games Metal

2

1

403

Oct ’25

iPhone limited to 60hz frame rate

Just wondering if anyone knows what it will take to hit greater than 60hz when targeting iPhone. If I set the preferredFramesPerSecond of an MTKView to 120, it works on the iPad, but on iPhone it never goes over 60hz, even with a simple hello triangle sample app... is this a limitation of targeting iPhone?

Graphics & Games Metal

2

0

174

Sep ’25

Metal HUD Display Value Range

Can't seem to get the Metal HUD to display value range's (pre 26 Tahoe). The documented environment variable MTL_HUD_SHOW_VALUE_RANGE doesn't seem to work. https://developer.apple.com/documentation/xcode/monitoring-your-metal-apps-graphics-performance#Display-the-value-range-of-metrics Anyone having any luck?

Graphics & Games Metal

2

0

318

Sep ’25

Pink screen on MTLCommandBuffer.presentDrawable.

I rewrote my graphics pipeline to use Load/Store better for clearing and don't care cases. All my tests pass, and in the Metal debugger, all the draw calls succeed. But when I present drawables (before [commandBuffer commit]) I only get a pink screen. I've tried everything I can think of: making sure the pixel formats are the same for the back buffer as my render targets, etc. But it's still pink. Could you point me in the right direction so I can fix this, or help describe why it's pink. That would be really helpful. Thank you, Brian Hapgood

Graphics & Games Metal

2

0

381

Sep ’25

After updating CAMetalLayer.drawableSize, [CAMetalLayer nextDrawable:] frequently takes ~1s

I have a bare-bones Metal app setup where I attach a CAMetalLayer to a window that inherits from a NSWindow with a custom delegate. Everything else is vanilla. I'm also using metal-cpp and metal shader converter. I'm running into a issue where the application runs fine in the beginning, but once I resize the window, it starts hitching. It turns out that [CAMetalLayer nextDrawable:] frequently (but not always) takes around a full second (plus or minus a few milliseconds) to return once drawableSize has been updated. I've tried setting allowsNextDrawableTimeout to false which doesn't work; it returns a valid drawable after a second instead of nil. Setting displaySyncEnabled to false reduces the likelihood of this happening to around 50% from 90%+ but does not eliminate it. Setting maximumDrawableCount to 2 or 3 does not seem to make a difference. By dumping the resource IDs of the returned textures I've noticed something interesting: Before resizing, the layer seems to shuffle between 2 textures or at least 2 resource IDs, but after resizing it starts to create new textures for each returned drawable. Occasionally it seems to reuse a previous resource ID, but it does not seem to have anything to do with whether the method returns quickly or not. Why does this happen, and how can I fix it? Should I create a new CAMetalLayer when resizing the window instead of updating drawableSize?

Graphics & Games Metal Metal Core Animation metal-cpp Metal Shader Converter

3

0

770

Jan ’25

some functions from NS::Bundle in metal-cppnot

*** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[NSBundle allFrameworks]: unrecognized selector sent to instance NS::Bundle* Bundle = NS::Bundle::mainBundle(); Bundle->allFrameworks(); call to allFrameworks() and allBundles() will throw exception, but other functions work well.

Graphics & Games Metal Foundation Metal metal-cpp

3

0

619

Feb ’25

Xcode Playground - The LLDB RPC server has crashed.

I am trying to learn Metal development on my MacBook Pro M1 Pro (Sequoia 15.3.1) on Xcode Playground, but when I write these two lines of code: import Metal let device = MTLCreateSystemDefaultDevice()! I get the error The LLDB RPC server has crashed. Any ideas as to what I can do to solve this? I have rebooted the machine and reinstalled Xcode...

Graphics & Games Metal Metal LLDB

3

0

507

Mar ’25

MetalFx

Recently, I adopted MetalFX for Upscale feature. However, I have encountered a persistent build failure for the iOS Simulator with the error message, 'MetalFX is not available when building for iOS Simulator.' To address this, I modified the MetalFX.framework status to 'Optional' within Build Phases > Link Binary With Libraries, adding the linker option (-weak_framework). Despite this adjustment, the build process continues to fail. Furthermore, I observed that the MetalFX sample application provided by Apple, specifically the one found at https://developer.apple.com/documentation/metalfx/applying-temporal-antialiasing-and-upscaling-using-metalfx, also fails to build for the iOS Simulator target. Has anyone encountered this issue?

Graphics & Games Metal MetalFX

3

0

768

Mar ’25

iOS Metal system delayed one Vsync period to really display the frame on the screen

View Layout Add the following views in a view controller: Label View A, with a subview of the same size: MTKView A View B, with a subview of the same size: MTKView B Refresh Rates of Each View The label view refreshes at 60fps (driven by CADisplayLink). MTKView A and B refresh at 15fps. MTKView Implementation Details The corresponding CAMetalLayer's maximumDrawableCount is set to 2, changed to double buffering. The scheduling mechanism is modified; drawing is not driven by the internal loop but is done manually. The draw call is triggered immediately upon receiving a frame. self.metalView.enableSetNeedsDisplay = NO; self.metalView.paused = YES; A new high-priority queue is created for drawing, instead of handling it on the main queue. MTKView Latency Tracking The GPU completion time T1 is observed through the addCompletedHandler callback of the CommandBuffer. The presentation time T2 of the frame is observed through the addPresentedHandler callback of the currentDrawable in MTKView. Testing shows that T2 - T1 > 16.6ms (the Vsync period at 60Hz). This means that after the GPU rendering in MTLView is finished, the frame is not actually displayed at the next Vsync instruction but only at the Vsync instruction after that. I believe there is an extra 16.6ms of latency here, which I want to eliminate by adjusting the rendering mechanism. Observation from Instruments From Instruments, the Surface presentation aligns with the above test results. After the Metal encoder finishes, the Surface in Display switches only after the next-next Vsync instruction. See the image in the link for details. Questions According to a beginner's understanding, after MTKView's GPU rendering is finished, the next Vsync instruction should officially display (make it visible). However, this is not what is observed. Does the subview MTKView need to wait for another Vsync cycle to be drawn to the actual display buffer? The label updates its text at 60fps, so the entire interface should be displayed at 60fps. Is the content of MTKView not synchronized when the display happens? Explanation of the Reasoning Behind Some MTKView Code Details Changing from the default triple buffering to double buffering helps reduce the latency introduced by rendering. Not using MTKView's own scheduling mechanism but using manual triggering of the draw method is because MTKView's own scheduling mechanism is driven by CADisplayLink. Therefore, if a frame falls within a Vsync window, it needs to wait for the next Vsync window to trigger the draw operation, which introduces waiting latency.

Graphics & Games Metal Metal

3

0

519

3w

Bug Report - Incorrect trackingAreaIdentifier in visionOS 26 Hover Effect Sample Code

Description: In the official visionOS 26 Hover Effect sample code project , I encountered an issue where the event.trackingAreaIdentifier returned by onSpatialEvent does not reset as expected. Steps to Reproduce: Select an object with trackingAreaID = 6 in the sample app. Look at a blank space (outside any tracking area) and perform a pinch gesture . Expected Behavior: The event.trackingAreaIdentifier should return 0 when interacting with a non-tracking area. Actual Behavior: The event.trackingAreaIdentifier still returns 6, even after restarting the app or killing the process. This persists regardless of where the pinch gesture is performed

Graphics & Games Metal Reality Composer Pro visionOS

3

0

276

Jul ’25

CAMetalLayer nextDrawable crash

Hi , My application meet below crash backtrace at very low repro rate from the public users, i do not see it relate to a specific iOS version or iPhone model. The last code line from my application is calling CAMetalLayer nextDrawable API. I did some basic studying, suppose it may relate to the wrong CAMetaLayer configuration, like frame property w or h <= 0.0 bounds property w or h <= 0.0 drawableSize w or h <= 0.0 or w or h > max value (like 16384) Not sure my above thinking is right or not? Will the UIView which my CAMetaLayer attached will cause such nextDrawable crash or not ? Thanks a lot Main Thread - Crashed libsystem_kernel.dylib __pthread_kill libsystem_c.dylib abort libsystem_c.dylib __assert_rtn Metal MTLReportFailure.cold.1 Metal MTLReportFailure Metal _MTLMessageContextEnd Metal -[MTLTextureDescriptorInternal validateWithDevice:] AGXMetalA13 0x245b1a000 + 4522096 QuartzCore allocate_drawable_texture(id<MTLDevice>, __IOSurface*, unsigned int, unsigned int, MTLPixelFormat, unsigned long long, CAMetalLayerRotation, bool, NSString*, unsigned long) QuartzCore get_unused_drawable(_CAMetalLayerPrivate*, CAMetalLayerRotation, bool, bool) QuartzCore CAMetalLayerPrivateNextDrawableLocked(CAMetalLayer*, CAMetalDrawable**, unsigned long*) QuartzCore -[CAMetalLayer nextDrawable] SpaceApp -[MetalRender renderFrame:] MetalRenderer.mm:167 SpaceApp -[FrameBuffer acceptFrame:] VideoRender.mm:173 QuartzCore CA::Display::DisplayLinkItem::dispatch_(CA::SignPost::Interval<(CA::SignPost::CAEventCode)835322056>&) QuartzCore CA::Display::DisplayLink::dispatch_items(unsigned long long, unsigned long long, unsigned long long) QuartzCore CA::Display::DisplayLink::dispatch_deferred_display_links(unsigned int) UIKitCore _UIUpdateSequenceRun UIKitCore schedulerStepScheduledMainSection UIKitCore runloopSourceCallback CoreFoundation __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ CoreFoundation __CFRunLoopDoSource0 CoreFoundation __CFRunLoopDoSources0 CoreFoundation __CFRunLoopRun CoreFoundation CFRunLoopRunSpecific GraphicsServices GSEventRunModal UIKitCore -[UIApplication _run] UIKitCore UIApplicationMain

Graphics & Games Metal Metal

3

0

363

Jul ’25

Post

Replies

Boosts

Views

Activity