Discuss spatial computing on Apple platforms and how to design and build an entirely new universe of apps and games for Apple Vision Pro.

All subtopics
Posts under Spatial Computing topic

Post

Replies

Boosts

Views

Activity

Combining ARKit Face Tracking with High-Resolution AVCapture and Perspective Rendering on Front Camera
Subject: Combining ARKit Face Tracking with High-Resolution AVCapture and Perspective Rendering on Front Camera Message: Hello Apple Developer Community, We’re developing an application using the front camera that requires both real-time ARKit face tracking/guidance and the capture of high-resolution still images via AVCaptureSession. Our goal is to leverage ARKit’s depth and face data to render a captured image from another perspective post-capture, maintaining high image quality. Our Approach: Real-Time ARKit Guidance: Utilize ARKit (e.g., ARFaceTrackingConfiguration) for continuous face tracking, depth, and scene understanding to guide the user in real time. High-Resolution Capture Transition: At the moment of capture, we plan to pause the ARKit session and switch to an AVCaptureSession to take a high-resolution image. We assume that for a front-facing image, the subject’s face is directly front-on, and the relative pose between the face and camera remains the same during the transition. The only variation we expect is a change in distance. Our intention is to minimize the delay between the last ARKit frame and the high-res capture to maintain temporal consistency, assuming that aside from distance, the face-camera relative pose remains unchanged. Post-Processing Perspective Rendering: Using the last ARKit face data (depth, pose, and landmarks) along with the high-resolution 2D image, we aim to render the scene from another perspective. We want to correct the perspective of the 2D image using SceneKit or RealityKit, leveraging the collected ARKit scene information to achieve a natural, high-quality rendering from a different viewpoint. The rendering should match the quality of a normally captured high-resolution image, adjusting for the difference in distance while using the stored ARKit data to correct perspective. Our Questions: Session Transition Best Practices: What are the recommended best practices to seamlessly pause ARKit and switch to a high-resolution AVCapture session on the front camera How can we minimize user movement or other issues during this brief transition, given our assumption that the face-camera pose remains largely consistent except for distance changes? Data Integration for Perspective Rendering: How can we effectively integrate stored ARKit face, depth, and pose data with the high-res image to perform accurate perspective correction or rendering from another viewpoint? Given that we assume the relative pose is constant except for distance, are there strategies or APIs to leverage this assumption for simplifying the perspective transformation? Perspective Correction with SceneKit/RealityKit: What techniques or workflows using SceneKit or RealityKit are recommended for correcting the perspective of a captured 2D image based on ARKit scene data? How can we use these frameworks to render the high-resolution image from an alternative perspective, while maintaining image quality and fidelity? 4. Pitfalls and Guidelines: What common pitfalls should we be aware of when combining ARKit tracking data with high-res capture and post-processing for perspective rendering? Are there performance considerations, recommended thresholds for acceptable temporal consistency, or validation techniques to ensure the ARKit data remains applicable at the moment of high-res capture? We appreciate any advice, sample code references, or documentation pointers that could assist us in implementing this workflow effectively. Thank you!
2
0
753
Jan ’25
Pass Video/ Frames to a Shader Graph?
Wondering if this is even possible without using CVImageBuffer and passing each frame as an image which I imagine will be very expensive. Have a PoC of a shader graph that applies a radial zoom effect to an image. In RealityKit I'm passing the image as a resource: if let textureResource = try? await TextureResource(named: "fuji") { let value = MaterialParameters.Value.textureResource(textureResource) try? material.setParameter(name: "MyImage", value: value) model.model?.materials = [material] } Thanks in advance
0
0
103
Apr ’25
Is it possible to create a sentence hover effect in Vision Pro?
I want a sentence custom hover effect, not a button. I want a hover effect when you look at one sentence out of many sentences. So I searched for reference videos https://youtu.be/DftRTx1oX6E , https://developer.apple.com/videos/play/wwdc2023/10110/ on apple youtube and visionOS documentation. But I haven't gotten anywhere near my wish feature yet. I respectfully request someone to help me. :)
1
0
438
Jan ’25
Reality kit Entities Appearing to Lag in a Full or Progressive Style Immersive Space When Opened with Environment Turned On
PLATFORM AND VERSION Vision OS Development environment: Xcode 16.2, macOS 15.2 Run-time configuration: visionOS 2.3 (On Real Device, Not simulator) Please someone confirm I'm not crazy and this issue is actually out of my control. Spent hours trying to fix my app and running profiles because thought it was an issue related to my apps performance. Finally considered chance it was issue with API itself and made sample app to isolate problem, and it still existed in it. The issue is when a model entity moves around in a full space that was launched when the system environment immersion was turned up before opening it, the entities looks very choppy as they move around. If you take off the headset while still in the space, and put it back on, this fixes it and then they move smoothly as they should. In addition, you can also leave the space, and then turn the system environment immersion all the way down before launching the full space again, this will also make the entity moves smoothly as it should. If you launch a mixed immersion style instead of a full immersion style, this issue never arrises. The issue only arrises if you launch the space with either a full style, or progressive style, while the system immersion level is turned on. STEPS TO REPRODUCE https://github.com/nathan-707/ChoppyEntitySample Open my test project, its a small, modified vision os project template that shows it clearly. otherwise: create immersive space with either full or progressive immersion style. setup a entity in kinematic mode, apply a velocity to it to make it pass over your head when the space appears. if you opened the space while the Apple Vision Pros system environment was turned up, the entity will look choppy. if you take the headset off while in the same space, and put it back on, it will fix the issue and it will look smooth. alternatively if you open the space with the system immersion environment all the way down, you will also not run into the issue. Again, issue also does not happen if space launched is in mixed style.
1
0
544
Jan ’25
RealityView camera feed not shown
I have two RealityView: ParentView and When click the button in ParentView, ChildView will be shown as full screen cover, but the camera feed in ChildView will not be shown, only black screen. If I show ChildView directly, it works with camera feed. Please help me on this issue? Thanks. import RealityKit import SwiftUI struct ParentView: View{ @State private var showIt = false var body: some View{ ZStack{ RealityView{content in content.camera = .virtual let box = ModelEntity(mesh: MeshResource.generateSphere(radius: 0.2),materials: [createSimpleMaterial(color: .red)]) content.add(box) } Button("Click here"){ showIt = true } } .fullScreenCover(isPresented: $showIt){ ChildView() .overlay( Button("Close"){ showIt = false }.padding(20), alignment: .bottomLeading ) } .ignoresSafeArea(.all) } } import ARKit import RealityKit import SwiftUI struct ChildView: View{ var body: some View{ RealityView{content in content.camera = .spatialTracking } } }
0
0
161
Jun ’25
Request: More Fine-Grained Control in Object Capture (PhotogrammetrySession)
Hi Apple Team and Developers, First of all, I’d like to express my appreciation for the incredible results achieved using PhotogrammetrySession. I’ve been developing a portrait scanning app using Object Capture, and in many tests—especially with human models—I’ve found the reconstructed body surfaces are remarkably smooth and clean, often outperforming tools like Metashape and RealityCapture in terms of aesthetic results. However, I’ve encountered some challenges when working with complex areas like long hair overlapping the face. For instance, with female models where strands of hair partially occlude the face, the resulting mesh tends to merge the hair and facial geometry. This leads to distorted or “melted” facial features, likely due to ambiguity in the geometry estimation phase. Feature Suggestion: Would it be possible to allow developers to supply two versions of the input images: • One version (original) for texture generation • A pre-processed version (e.g., contrast-enhanced or CLAHE filtered) to guide mesh reconstruction only This would give us the flexibility to enhance edge features or shadow detail without affecting the final texture appearance. In other photogrammetry pipelines, applying image enhancement selectively before dense reconstruction improves geometry quality in low-contrast areas. Question: Is there any plan to support this kind of two-path workflow in future versions of PhotogrammetrySession? Or perhaps expose more intermediate stages or tunable parameters to developers? Also, any hints on what we can expect from WWDC 2025 regarding improvements to Object Capture or related vision/3D technologies? Thanks again for this powerful API. Looking forward to hearing insights from the team and other developers. Warm regards, KitCheng
0
0
79
May ’25
What is the first reliable position of the apple vision pro device?
In several visionOS apps, we readjust our scenes to the user's eye level (their heads). But, we have encountered issues whereby the WorldTrackingProvider returns bad/incorrect positions for the first x number of frames. See below code which you can copy paste in any Immersive Space. Relaunch the space and observe the numberOfBadWorldInfos value is inconsistent. a. what is the most reliable way to get the devices's position? b. is this indeed a bug? c. are we using worldInfo improperly? d. as a workaround, in our apps we set to 10 the number of frames to let pass before using worldInfo, should we set our threshold differently? import ARKit import Combine import OSLog import SwiftUI import RealityKit import RealityKitContent let SUBSYSTEM = Bundle.main.bundleIdentifier! struct ImmersiveView: View { let logger = Logger(subsystem: SUBSYSTEM, category: "ImmersiveView") let session = ARKitSession() let worldInfo = WorldTrackingProvider() @State var sceneUpdateSubscription: EventSubscription? = nil @State var deviceTransform: simd_float4x4? = nil @State var numberOfBadWorldInfos = 0 @State var isBadWorldInfoLoged = false var body: some View { RealityView { content in try? await session.run([worldInfo]) sceneUpdateSubscription = content.subscribe(to: SceneEvents.Update.self) { event in guard let pose = worldInfo.queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) else { return } // `worldInfo` does not return correct values for the first few frames (exact number of frames is unknown) // - known SO: https://stackoverflow.com/questions/78396187/how-to-determine-the-first-reliable-position-of-the-apple-vision-pro-device deviceTransform = pose.originFromAnchorTransform if deviceTransform!.columns.3.y < 1.6 { numberOfBadWorldInfos += 1 logger.warning("\(#function) \(#line) deviceTransform.columns.3.y \(deviceTransform!.columns.3.y), numberOfBadWorldInfos \(numberOfBadWorldInfos)") } else { if !isBadWorldInfoLoged { logger.info("\(#function) \(#line) deviceTransform.columns.3.y \(deviceTransform!.columns.3.y), numberOfBadWorldInfos \(numberOfBadWorldInfos)") } isBadWorldInfoLoged = true // stop logging. } } } } }
3
0
122
May ’25
How to add visual thickness to a glass background view
Hi guys, In visionOS, when using a ZStack decorated with .glassBackgroundEffect(), you can see the 3D glass background from the front, but when viewed from the side, the view appears to have no thickness. However, I noticed that in an app built by Apple, when viewing a glass background view from the side, it appears to have thickness. I tried adding .frame(depth:) to a glass background view, but it appears as two separate layers spaced by the depth value. My question is: Is there a view modifier that adds visual thickness to a glass background view, as shown in the picture? Or, if not, how should I write a custom view modifier to achieve this effect? Thanks!
0
0
101
May ’25
Is it Possible to Place a 3D Model at the Exact Position of a QR Code in AR with ARKit ?
I'm working on an iOS app using ARKit and RealityKit where I scan QR codes and want to place 3D models at the exact position of the QR code in the real world. Is it possible to accurately place a 3D model at the exact position of a QR code in AR using ARKit and RealityKit? Specifically, I want the model to appear at the precise location where the QR code is detected, rather than just somewhere in the AR space. If this is possible, could you point me in the right direction or recommend the best approach to achieve this? Thank you for your help!
0
0
107
May ’25
RealityKit System update and timing
Hi, I'm playing now with hand tracking. I want to get position of hand inside a system update function. I was not sure if transform I'm getting from hand attached AnchorEntity (with trackingMode: .predicted) would give same results as handAnchors(at:) from hand tracking provider, so I started to read them both and compare. For handAnchors i tried using context.scene.timebase.sourceTimebase!.sourceClock!.time.seconds and CACurrentMediaTime() as timestamp source. They seem to use exactly same clock, so that doesn't matter, but: for some reason update handler is always called twice with same context.deltaTime, but first time the query finds 0 entities, second time it finds them all. The query is the standard EntityQuery(where: .has(MyComponent.self)) and in update (matching: Self.query, updatingSystemWhen: .rendering). Here's part of logs: System update called, entity count: 0, dt: 0.01000458374619484, absTime: 4654.222593541 System update called, entity count: 11, dt: 0.01000458374619484, absTime: 4654.22262525 System update called, entity count: 0, dt: 0.009999999776482582, absTime: 4654.249390875 System update called, entity count: 11, dt: 0.009999999776482582, absTime: 4654.249425 accounting for the double update calling I started to calculate time delta of absolute time between calls and they're most of the time much bigger, or much smaller than advertised by system's context.deltaTime, only sometimes they kind of match, for example: system: (dt: 0.01000458374619484) scene : (dt: 0.021419291667371) (absTime: 4654.222628125001) and the very next call system: (dt: 0.010009 166784584522) scene : (dt: 0.0013097083328830195) (absTime: 4654.223937833334) but sometimes system: (dt: 0.009999999776482582) scene : (dt: 0.009 112249999816413) (absTime: 4654.351299 166668) Shouldn't those be more or less equal, or am I missing something? In the end it seems that getting hand position from AnchorEntity and with handAnchors(at:) gives kind of same results, but at different time points, so I'd love to understand what's the correct way to use them and why time flows differently :). --Edit-- P.S. Had to put spaces everywhere in logs between "9" and "1", otherwise post was blocked due to "sensitive content" :D
2
0
122
May ’25
xform called "Scene" breaks animations on Quicklook starting with iOS15
Hello, We discovered that a bunch of our old animated models were no longer animated on iOS15 and onwards. After a few days of playing spot the difference between usda files I noticed that all the broken models had an xform called "Scene". Lo and behold, changing the name of that xform fixed the issue on all the models. Even lowercase "scene" makes the animations work again. Is "Scene" a reserved keyword or something? What other keywords do we need to avoid so we can create more robust USDZ files? I'm surprised this issue isn't more widespread considering Blender wraps models in a "Scene" node. At the drive link below you can find two animated cube USDZs. The only difference is the name of one of the xforms. The one with a "Scene" xform is not animated in quicklook (replicated on iPhone 13 iOS v15.2, iPhone 13 iOS v 18.3, and various devices on Browserstack including iPhone 16 iOS v18.3). https://drive.google.com/drive/folders/1dch1WaM9O6mbHy29S6NGWgnSHkZkPiBf?usp=sharing
1
0
365
Mar ’25
Convert entity to image
VisionPro 开发,XCode,我想在窗口中找到一个显示模型的图片。这个模型可以改变它的材料,它不是唯一的形象,它正在改变。如何将此模型转换为图像
1
0
74
Apr ’25
ManipulationComponent in both parent and child entities
Hello, In my project, I have attached a ManipulationComponent to Entity A and as expected, I'm able interact with it using the built-in gestures. I have another Entity B which is a child of A that I would like to interact with as well, so I attempted to add a ManipulationComponent to B. However, no gestures seem to be registered on B; I can still interact with A but B cannot be interacted with despite having ManipulationComponents on both entities. So I'm wondering if I'm just doing something wrong, if this is an issue with the ManipulationComponent, or if this is a limitation of the API. Attached is the code used to add the ManipulationComponent to an Entity and it was done on both A and B: let mc = ManipulationComponent() model.components.set(mc) var boxShape = ShapeResource.generateBox(width: 0.25, height: 0.05, depth: 0.25) boxShape = boxShape.offsetBy(translation: simd_float3(0, -0.05, -0.25)) ManipulationComponent.configureEntity(model, collisionShapes: [boxShape]) if var mc = model.components[ManipulationComponent.self] { mc.releaseBehavior = .stay mc.dynamics.inertia = .low model.components.set(mc) } I am using visionOS 26.0; let me know if there's any additional information needed.
1
0
361
Oct ’25
Template Project Entity Overlapping and Sticking Issues
Hello, There are three issues I am running into with a default template project + additional minimal code changes: the Sphere_Left entity always overlaps the Sphere_Right entity. when I release the Sphere_Left entity, it does not remain sticking to the Sphere_Right entity when I release the Sphere_Left entity, it distances itself from the Sphere_Right entity When I manipulate the Sphere_Right entity, these above 3 issues do not occur: I get a correct and expected behavior. These issues are simple to replicate: Create a new project in XCode Choose visionOS -> App, then click Next Name your project, and leave all other options as defaults: Initial Scene: Window, Immersive Space Renderer: RealityKit, Immersive Space: Mixed, then click Next Save you project anywhere... Replace the entire ImmersiveView.swift file with the below code. Run. Try to manipulate the left sphere, you should get the same issues I mentioned above If you restart the project, and manipulate only the right sphere, you should get the correct expected behaviors, and no issues. I am running this in macOS 26, XCode 26, on visionOS 26, all released lately. ImmersiveView Code: // // ImmersiveView.swift // import OSLog import SwiftUI import RealityKit import RealityKitContent struct ImmersiveView: View { private let logger = Logger(subsystem: "com.testentitiessticktogether", category: "ImmersiveView") @State var collisionBeganUnfiltered: EventSubscription? var body: some View { RealityView { content in // Add the initial RealityKit content if let immersiveContentEntity = try? await Entity(named: "Immersive", in: realityKitContentBundle) { content.add(immersiveContentEntity) // Add manipulation components setupManipulationComponents(in: immersiveContentEntity) collisionBeganUnfiltered = content.subscribe(to: CollisionEvents.Began.self) { collisionEvent in Task { @MainActor in handleCollision(entityA: collisionEvent.entityA, entityB: collisionEvent.entityB) } } } } } private func setupManipulationComponents(in rootEntity: Entity) { logger.info("\(#function) \(#line) ") let sphereNames = ["Sphere_Left", "Sphere_Right"] for name in sphereNames { guard let sphere = rootEntity.findEntity(named: name) else { logger.error("\(#function) \(#line) Failed to find \(name) entity") assertionFailure("Failed to find \(name) entity") continue } ManipulationComponent.configureEntity(sphere) var manipulationComponent = ManipulationComponent() manipulationComponent.releaseBehavior = .stay sphere.components.set(manipulationComponent) } logger.info("\(#function) \(#line) Successfully set up manipulation components") } private func handleCollision(entityA: Entity, entityB: Entity) { logger.info("\(#function) \(#line) Collision between \(entityA.name) and \(entityB.name)") guard entityA !== entityB else { return } if entityB.isAncestor(of: entityA) { logger.debug("\(#function) \(#line) \(entityA.name) already under \(entityB.name); skipping reparent") return } if entityA.isAncestor(of: entityB) { logger.info("\(#function) \(#line) Skip reparent: \(entityA.name) is an ancestor of \(entityB.name)") return } reparentEntities(child: entityA, parent: entityB) entityA.components[ParticleEmitterComponent.self]?.burst() } private func reparentEntities(child: Entity, parent: Entity) { let childBounds = child.visualBounds(relativeTo: nil) let parentBounds = parent.visualBounds(relativeTo: nil) let maxEntityWidth = max(childBounds.extents.x, parentBounds.extents.x) let childPosition = child.position(relativeTo: nil) let parentPosition = parent.position(relativeTo: nil) let currentDistance = distance(childPosition, parentPosition) child.setParent(parent, preservingWorldTransform: true) logger.info("\(#function) \(#line) Set \(child.name) parent to \(parent.name)") child.components.remove(ManipulationComponent.self) logger.info("\(#function) \(#line) Removed ManipulationComponent from child \(child.name)") if currentDistance > maxEntityWidth { let direction = normalize(childPosition - parentPosition) let newPosition = parentPosition + direction * maxEntityWidth child.setPosition(newPosition - parentPosition, relativeTo: parent) logger.info("\(#function) \(#line) Adjusted position: distance was \(currentDistance), now \(maxEntityWidth)") } } } fileprivate extension Entity { func isAncestor(of other: Entity) -> Bool { var current: Entity? = other.parent while let node = current { if node === self { return true } current = node.parent } return false } } #Preview(immersionStyle: .mixed) { ImmersiveView() .environment(AppModel()) }
8
0
395
Sep ’25