Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

A Summary of the WWDC25 Group Lab - Machine Learning and AI Frameworks
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Machine Learning and AI Frameworks. What are you most excited about in the Foundation Models framework? The Foundation Models framework provides access to an on-device Large Language Model (LLM), enabling entirely on-device processing for intelligent features. This allows you to build features such as personalized search suggestions and dynamic NPC generation in games. The combination of guided generation and streaming capabilities is particularly exciting for creating delightful animations and features with reliable output. The seamless integration with SwiftUI and the new design material Liquid Glass is also a major advantage. When should I still bring my own LLM via CoreML? It's generally recommended to first explore Apple's built-in system models and APIs, including the Foundation Models framework, as they are highly optimized for Apple devices and cover a wide range of use cases. However, Core ML is still valuable if you need more control or choice over the specific model being deployed, such as customizing existing system models or augmenting prompts. Core ML provides the tools to get these models on-device, but you are responsible for model distribution and updates. Should I migrate PyTorch code to MLX? MLX is an open-source, general-purpose machine learning framework designed for Apple Silicon from the ground up. It offers a familiar API, similar to PyTorch, and supports C, C++, Python, and Swift. MLX emphasizes unified memory, a key feature of Apple Silicon hardware, which can improve performance. It's recommended to try MLX and see if its programming model and features better suit your application's needs. MLX shines when working with state-of-the-art, larger models. Can I test Foundation Models in Xcode simulator or device? Yes, you can use the Xcode simulator to test Foundation Models use cases. However, your Mac must be running macOS Tahoe. You can test on a physical iPhone running iOS 18 by connecting it to your Mac and running Playgrounds or live previews directly on the device. Which on-device models will be supported? any open source models? The Foundation Models framework currently supports Apple's first-party models only. This allows for platform-wide optimizations, improving battery life and reducing latency. While Core ML can be used to integrate open-source models, it's generally recommended to first explore the built-in system models and APIs provided by Apple, including those in the Vision, Natural Language, and Speech frameworks, as they are highly optimized for Apple devices. For frontier models, MLX can run very large models. How often will the Foundational Model be updated? How do we test for stability when the model is updated? The Foundation Model will be updated in sync with operating system updates. You can test your app against new model versions during the beta period by downloading the beta OS and running your app. It is highly recommended to create an "eval set" of golden prompts and responses to evaluate the performance of your features as the model changes or as you tweak your prompts. Report any unsatisfactory or satisfactory cases using Feedback Assistant. Which on-device model/API can I use to extract text data from images such as: nutrition labels, ingredient lists, cashier receipts, etc? Thank you. The Vision framework offers the RecognizeDocumentRequest which is specifically designed for these use cases. It not only recognizes text in images but also provides the structure of the document, such as rows in a receipt or the layout of a nutrition label. It can also identify data like phone numbers, addresses, and prices. What is the context window for the model? What are max tokens in and max tokens out? The context window for the Foundation Model is 4,096 tokens. The split between input and output tokens is flexible. For example, if you input 4,000 tokens, you'll have 96 tokens remaining for the output. The API takes in text, converting it to tokens under the hood. When estimating token count, a good rule of thumb is 3-4 characters per token for languages like English, and 1 character per token for languages like Japanese or Chinese. Handle potential errors gracefully by asking for shorter prompts or starting a new session if the token limit is exceeded. Is there a rate limit for Foundation Models API that is limited by power or temperature condition on the iPhone? Yes, there are rate limits, particularly when your app is in the background. A budget is allocated for background app usage, but exceeding it will result in rate-limiting errors. In the foreground, there is no rate limit unless the device is under heavy load (e.g., camera open, game mode). The system dynamically balances performance, battery life, and thermal conditions, which can affect the token throughput. Use appropriate quality of service settings for your tasks (e.g., background priority for background work) to help the system manage resources effectively. Do the foundation models support languages other than English? Yes, the on-device Foundation Model is multilingual and supports all languages supported by Apple Intelligence. To get the model to output in a specific language, prompt it with instructions indicating the user's preferred language using the locale API (e.g., "The user's preferred language is en-US"). Putting the instructions in English, but then putting the user prompt in the desired output language is a recommended practice. Are larger server-based models available through Foundation Models? No, the Foundation Models API currently only provides access to the on-device Large Language Model at the core of Apple Intelligence. It does not support server-side models. On-device models are preferred for privacy and for performance reasons. Is it possible to run Retrieval-Augmented Generation (RAG) using the Foundation Models framework? Yes, it is possible to run RAG on-device, but the Foundation Models framework does not include a built-in embedding model. You'll need to use a separate database to store vectors and implement nearest neighbor or cosine distance searches. The Natural Language framework offers simple word and sentence embeddings that can be used. Consider using a combination of Foundation Models and Core ML, using Core ML for your embedding model.
1
0
1.2k
Jun ’25
Can I Perform Hybrid Execution on Neural Engine and CPU with 16-bit Precision?
Hello, I have a question regarding hybrid execution for deep learning models on Apple's Neural Engine and CPU. I am aware that setting the precision of some layers to 32-bit allows hybrid execution across both the Neural Engine and the CPU. However, I would like to know if it is possible to achieve the same with 16-bit precision. Is there any specific configuration or workaround to enable hybrid execution in this case? Any guidance or documentation references would be greatly appreciated. Thank you!
0
0
439
Jan ’25
CoreML model can load on MacOS 15.3.1 but failed to load on MacOS 15.5
I have been working on a small CV program, which uses fine-tuned U2Netp model converted by coremltools 8.3.0 from PyTorch. It works well on my iPhone (with iOS version 18.5) and my Macbook (with MacOS version 15.3.1). But it fails to load after I upgraded Macbook to MacOS version 15.5. I have attached console log when loading this model. Unable to load MPSGraphExecutable from path /Users/yongzhang/Library/Caches/swiftmetal/com.apple.e5rt.e5bundlecache/24F74/E051B28C6957815C140A86134D673B5C015E79A1460E9B54B8764F659FDCE645/16FA8CF2CDE66C0C427F4B51BBA82C38ACC44A514CCA396FD7B281AAC087AB2F.bundle/H14C.bundle/main/main_mps_graph/main_mps_graph.mpsgraphpackage @ GetMPSGraphExecutable E5RT: Unable to load MPSGraphExecutable from path /Users/yongzhang/Library/Caches/swiftmetal/com.apple.e5rt.e5bundlecache/24F74/E051B28C6957815C140A86134D673B5C015E79A1460E9B54B8764F659FDCE645/16FA8CF2CDE66C0C427F4B51BBA82C38ACC44A514CCA396FD7B281AAC087AB2F.bundle/H14C.bundle/main/main_mps_graph/main_mps_graph.mpsgraphpackage (13) Unable to load MPSGraphExecutable from path /Users/yongzhang/Library/Caches/swiftmetal/com.apple.e5rt.e5bundlecache/24F74/E051B28C6957815C140A86134D673B5C015E79A1460E9B54B8764F659FDCE645/16FA8CF2CDE66C0C427F4B51BBA82C38ACC44A514CCA396FD7B281AAC087AB2F.bundle/H14C.bundle/main/main_mps_graph/main_mps_graph.mpsgraphpackage @ GetMPSGraphExecutable E5RT: Unable to load MPSGraphExecutable from path /Users/yongzhang/Library/Caches/swiftmetal/com.apple.e5rt.e5bundlecache/24F74/E051B28C6957815C140A86134D673B5C015E79A1460E9B54B8764F659FDCE645/16FA8CF2CDE66C0C427F4B51BBA82C38ACC44A514CCA396FD7B281AAC087AB2F.bundle/H14C.bundle/main/main_mps_graph/main_mps_graph.mpsgraphpackage (13) Failure translating MIL->EIR network: Espresso exception: "Network translation error": MIL->EIR translation error at /Users/yongzhang/CLionProjects/ImageSimilarity/models/compiled/u2netp.mlmodelc/model.mil:1557:12: Parameter binding for axes does not exist. [Espresso::handle_ex_plan] exception=Espresso exception: "Network translation error": MIL->EIR translation error at /Users/yongzhang/CLionProjects/ImageSimilarity/models/compiled/u2netp.mlmodelc/model.mil:1557:12: Parameter binding for axes does not exist. status=-14 Failed to build the model execution plan using a model architecture file '/Users/yongzhang/CLionProjects/ImageSimilarity/models/compiled/u2netp.mlmodelc/model.mil' with error code: -14.
0
0
157
Jul ’25
ANE Performance for on-device Foundation model
I'm running MacOs 26 Beta 5. I noticed that I can no longer achieve 100% usage on the ANE as I could before with Apple Foundations on-device model. Has Apple activated some kind of throttling or power limiting of the ANE? I cannot get above 3w or 40% usage now since upgrading. I'm on the high power energy mode. I there an API rate limit being applied? I kave a M4 Pro mini with 64 GB of memory.
0
0
324
Aug ’25
Inquiry About GS1 DataBar Stacked Support in Vision Framework
Hello, I am currently developing an application that requires barcode scanning using Apple’s Vision framework (VNBarcodeSymbology). I noticed that the framework supports several GS1 DataBar symbologies, such as: VNBarcodeSymbology.gs1DataBar VNBarcodeSymbology.gs1DataBarExpanded VNBarcodeSymbology.gs1DataBarLimited However, I could not find any explicit reference to support for GS1 DataBar Stacked (both regular and expanded variants). Could you confirm whether GS1 DataBar Stacked is currently supported in VisionKit's DataScannerViewController or VNBarcodeObservation? If not, are there any plans to include support for this symbology in a future iOS update? This functionality is critical for my use case, as GS1 DataBar Stacked barcodes are widely used in retail, pharmaceuticals, and logistics, where space constraints prevent the use of standard GS1 DataBar formats. I appreciate any clarification on this matter and would be happy to provide additional details if needed.
0
0
400
Feb ’25
My app crash in the Portrait private framework
Incident Identifier: 4C22F586-71FB-4644-B823-A4B52D158057 CrashReporter Key: adc89b7506c09c2a6b3a9099cc85531bdaba9156 Hardware Model: Mac16,10 Process: PRISMLensCore [16561] Path: /Applications/PRISMLens.app/Contents/Resources/app.asar.unpacked/node_modules/core-node/PRISMLensCore.app/PRISMLensCore Identifier: com.prismlive.camstudio Version: (null) ((null)) Code Type: ARM-64 Parent Process: ? [16560] Date/Time: (null) OS Version: macOS 15.4 (24E5228e) Report Version: 104 Exception Type: EXC_CRASH (SIGABRT) Exception Codes: 0x00000000 at 0x0000000000000000 Crashed Thread: 34 Application Specific Information: *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '*** -[__NSArrayM insertObject:atIndex:]: object cannot be nil' Thread 34 Crashed: 0 CoreFoundation 0x000000018ba4dde4 0x18b960000 + 974308 (__exceptionPreprocess + 164) 1 libobjc.A.dylib 0x000000018b512b60 0x18b4f8000 + 109408 (objc_exception_throw + 88) 2 CoreFoundation 0x000000018b97e69c 0x18b960000 + 124572 (-[__NSArrayM insertObject:atIndex:] + 1276) 3 Portrait 0x0000000257e16a94 0x257da3000 + 473748 (-[PTMSRResize addAdditionalOutput:] + 604) 4 Portrait 0x0000000257de91c0 0x257da3000 + 287168 (-[PTEffectRenderer initWithDescriptor:metalContext:useHighResNetwork:faceAttributesNetwork:humanDetections:prevTemporalState:asyncInitQueue:sharedResources:] + 6204) 5 Portrait 0x0000000257dab21c 0x257da3000 + 33308 (__33-[PTEffect updateEffectDelegate:]_block_invoke.241 + 164) 6 libdispatch.dylib 0x000000018b739b2c 0x18b738000 + 6956 (_dispatch_call_block_and_release + 32) 7 libdispatch.dylib 0x000000018b75385c 0x18b738000 + 112732 (_dispatch_client_callout + 16) 8 libdispatch.dylib 0x000000018b742350 0x18b738000 + 41808 (_dispatch_lane_serial_drain + 740) 9 libdispatch.dylib 0x000000018b742e2c 0x18b738000 + 44588 (_dispatch_lane_invoke + 388) 10 libdispatch.dylib 0x000000018b74d264 0x18b738000 + 86628 (_dispatch_root_queue_drain_deferred_wlh + 292) 11 libdispatch.dylib 0x000000018b74cae8 0x18b738000 + 84712 (_dispatch_workloop_worker_thread + 540) 12 libsystem_pthread.dylib 0x000000018b8ede64 0x18b8eb000 + 11876 (_pthread_wqthread + 292) 13 libsystem_pthread.dylib 0x000000018b8ecb74 0x18b8eb000 + 7028 (start_wqthread + 8)
1
0
85
Mar ’25
`LanguageModelSession.respond()` never resolves in Beta 5
Hi all, I noticed on Friday that on the new Beta 5 using FoundationModels on a simulator LanguageModelSession.respond() neither resolves nor throws most of the time. The SwiftUI test app below was working perfectly in Xcode 16 Beta 4 and iOS 26 Beta 4 (simulator). import SwiftUI import FoundationModels struct ContentView: View { var body: some View { VStack { Image(systemName: "globe") .imageScale(.large) .foregroundStyle(.tint) Text("Hello, world!") } .padding() .onAppear { Task { do { let session = LanguageModelSession() let response = try await session.respond(to: "are cats better than dogs ???") print(response.content) } catch { print("error") } } } } } After updating to Xcode 16 Beta 5 and iOS 26 Beta 5 (simulator), the code now often hangs. Occasionally it will work if I toggle Apple Intelligence on and off in Settings, but it’s unreliable.
2
0
353
Aug ’25
Attempts to install Tensorflow on Mac Studio M1 fail
I am attempting to install Tensorflow on my M1 and I seem to be unable to find the correct matching versions of jax, jaxlib and numpy to make it all work. I am in Bash, because the default shell gave me issues. I downgraded to python 3.10, because with 3.13, I could not do anything right. Current actions: bash-3.2$ python3.10 -m venv ~/venv-metal bash-3.2$ python --version Python 3.10.16 python3.10 -m venv ~/venv-metal source ~/venv-metal/bin/activate python -m pip install -U pip python -m pip install tensorflow-macos And here, I keep running tnto errors like: (venv-metal):~$ pip install tensorflow-macos tensorflow-metal ERROR: Could not find a version that satisfies the requirement tensorflow-macos (from versions: none) ERROR: No matching distribution found for tensorflow-macos What is wrong here? How can I fix that? It seems like the system wants to use the x86 version of python ... which can't be right.
4
0
1.8k
Jan ’25
My Vision for AI and Algorithmically Optimised Operating Systems
Bear with me, please. Please make sure a highly skilled technical person reads and understands this. I want to describe my vision for (AI/Algorithmically) Optimised Operating Systems. To explain it properly, I will describe the process to build it (pseudo). Required Knowledge (no particular order): Processor Logic Circuits, LLM models, LLM tool usage, Python OO coding, Procedural vs OO, NLP fuzzy matching, benchmarking, canvas/artefacts/dynamic HTML interfaces, concepts of how AI models are vastly compressed and miniaturised forms of full data, Algorithmic vs AI. First, take all OO Python code (example) on GitHub (example), then separate each function from each object into its own procedure (procedural logic) by making a logical procedural list of actions to perform only that function based on its entire dependency chain (i.e. all other objects it relies on). Relate all compiled functions using (for example) fuzzy matching on the name, or AI-based functional profiling to get multiple instances of each function. Starting with the most used function, test each one against the others that perform the same task for bugs and completeness. Determine the fastest, most optimal version of that function (and every function). Add a single instance of each most optimal function to the centralised tool codebase, which will later be utilised by the language models. This ensures we rely only on the most optimised function for each and every use case — with every program using one shared instance of that function instead of compiling it separately. In this way, we optimise all existing (Python OO) code based on maximum availability of certain resources. Use the optimised code to teach the AI how to map each user task directly to optimised code (i.e. each user-facing function in unoptimised program code to its optimised version). Later, teach the codebase how to optimise these functions directly to the logic circuits on the processor — as an Operating System — using essentially the same process. Eventually, use the most optimised form for tasks via a dynamic canvas/artefacts UI interface generated by the Start Menu using the language model. Allow custom interfaces to emerge dynamically from the available functions based on language understanding. Grow and optimise the algorithmic codebase until the entire operating system — with versioning — evolves based on the AI-generated user interfaces (compressing them into new versions of the tools codebase). Please don’t give up until you understand this concept completely and nothing in the description is beyond your full comprehension. It’s a highly significant step forward.
3
0
311
Oct ’25
Accessing Apple Intelligence APIs: Custom Prompt Support and Inference Capabilities
Hello Apple Developer Community, I'm exploring the integration of Apple Intelligence features into my mobile application and have a couple of questions regarding the current and upcoming API capabilities: Custom Prompt Support: Is there a way to pass custom prompts to Apple Intelligence to generate specific inferences? For instance, can we provide a unique prompt to the Writing Tools or Image Playground APIs to obtain tailored outputs? Direct Inference Capabilities: Beyond the predefined functionalities like text rewriting or image generation, does Apple Intelligence offer APIs that allow for more generalized inference tasks based on custom inputs? I understand that Apple has provided APIs such as Writing Tools, Image Playground, and Genmoji. However, I'm interested in understanding the extent of customization and flexibility these APIs offer, especially concerning custom prompts and generalized inference. Additionally, are there any plans or timelines for expanding these capabilities, perhaps with the introduction of new SDKs or frameworks that allow deeper integration and customization? Any insights, documentation links, or experiences shared would be greatly appreciated. Thank you in advance for your assistance!
3
0
320
Jun ’25
RDMA API Documentation
With the release of the newest version of tahoe and MLX supporting RDMA. Is there a documentation link to how to utilizes the libdrma dylib as well as what functions are available? I am currently assuming it mostly follows the standard linux infiniband library but I would like the apple specific details.
0
0
165
1w
Getting CoreML to run inference on already allocated gpu buffers
I am running some experiments with WebGPU using the wgpu crate in rust. I have some Buffers already allocated in the GPU. Is it possible to use those already existing buffers directly as inputs to a predict call in CoreML? I want to prevent gpu to cpu download time as much as possible. Or are there any other ways to do something like this. Is this only possible using the latest Tensor object which came out with Metal 4 ?
0
0
447
Nov ’25
A Summary of the WWDC25 Group Lab - Apple Intelligence
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Apple Intelligence. Can I integrate writing tools in my own text editor? UITextView, NSTextView, and SwiftUI TextEditor automatically get Writing Tools on devices that support Apple Intelligence. For custom text editors, check out Enhancing your custom text engine with Writing Tools. Given that Foundation Models are on-device, how will Apple update the models over time? And how should we test our app against the model updates? Model updates are in sync with OS updates. As for testing with updated models, watch our WWDC session about prompt engineering and safety, and read the Human Interface Guidelines to understand best practices in prompting the on-device model. What is the context size of a session in Foundation Models Framework? How to handle the error if a session runs out of the context size? Currently the context size is about 4,000 tokens. If it’s exceeded, developers can catch the .exceededContextWindowSize error at runtime. As discussed in one of our WWDC25 sessions, when the context window is exceeded, one approach is to trim and summarize a transcript, and then start a new session. Can I do image generation using the Foundation Models Framework or is only text generation supported? Foundation Models do not generate images, but you can use the Foundation Models framework to generate prompts for ImageCreator in the Image Playground framework. Developers can also take advantage of Tools in Foundation Models framework, if appropriate for their app. My app currently uses a third party server-based LLM. Can I use the Foundation Models Framework as well in the same app? Any guidance here? The Foundation Models framework is optimized for a subset of tasks like summarization, extraction, classification, and tagging. It’s also on-device, private, and free. But at 3 billion parameters it isn’t designed for advanced reasoning or world knowledge, so for some tasks you may still want to use a larger server-based model. Should I use the AFM for my language translation features given it does text translation, or is the Translation API still the preferred approach? The Translation API is still preferred. Foundation Models is great for tasks like text summarization and data generation. It’s not recommended for general world knowledge or translation tasks. Will the TranslationSession class introduced in ios18 get any new improvments in performance or reliability with the new live translation abilities in ios/macos/ipados 26? Essentially, will we get access to live translation in a similar way and if so, how? There's new API in LiveCommunicationKit to take advantage of live translation in your communication apps. The Translate framework is using the same models as used by Live Communication and can be combined with the new SpeechAnalyzer API to translate your own audio. How do I set a default value for an App Intent parameter that is otherwise required? You can implement a default value as part of your parameter declaration by using the @Parameter(defaultValue:) form of the property wrapper. How long can an App Intent run? On macOS there is no limit to how long app intents can run. On iOS, there is a limit of 30 seconds. This time limit is paused when waiting for user interaction. How do I vary the options for a specific parameter of an App Intent, not just based on the type? Implement a DynamicOptionsProvider on that parameter. You can add suggestedEntities() to suggest options. What if there is not a schema available for what my app is doing? If an app intent schema matching your app’s functionality isn’t available, take a look to see if there’s a SiriKit domain that meets your needs, such as for media playback and messaging apps. If your app’s functionality doesn’t match any of the available schemas, you can define a custom app intent, and integrate it with Siri by making it an App Shortcut. Please file enhancement requests via Feedback Assistant for new App intent schemas that would benefit your app. Are you adding any new app intent domains this year? In addition to all the app intent domains we announced last year, this year at WWDC25 we announced that Visual Intelligence will be added to iOS 26 and macOS Tahoe. When my App Intent doesn't show up as an action in Shortcuts, where do I start in figuring out what went wrong? App Intents are statically extracted. You can check the ExtractMetadata info in Xcode's build log. What do I need to do to make sure my App Intents work well with Spotlight+? Check out our WWDC25 sessions on App Intents, including Explore new advances in App Intents and Develop for Shortcuts and Spotlight with App Intents. Mostly, make sure that your intent can run from the parameter summary alone, no required parameters without default values that are not already in the parameter summary. Does Spotlight+ on macOS support App Shortcuts? Not directly, but it shows the App Intents your App Shortcuts are sitting on top of. I’m wondering if the on-device Foundation Models framework API can be integrated into an app to act strictly as an app in-universe AI assistant, responding only within the boundaries of the app’s fictional context. Is such controlled, context-limited interaction supported? FM API runs inside the process of your app only and does not automatically integrate with any remaining part of the system (unless you choose to implement your own tool and utilize tool calling). You can provide any instructions and prompts you want to the model. If a country does not support Apple Intelligence yet, can the Foundation Models framework work? FM API works on Apple Intelligence-enabled devices in supported regions and won’t work in regions where Apple Intelligence is not yet supported
2
0
267
Jul ’25
ActivityClassifier doesn't classify movement
I'm using a custom create ML model to classify the movement of a user's hand in a game, The classifier has 3 different spell movements, but my code constantly predicts all of them at an equal 1/3 probability regardless of movement which leads me to believe my code isn't correct (as opposed to the model) which in CreateML at least gives me a heavily weighted prediction My code is below. On adding debug prints everywhere all the data looks good to me and matches similar to my test CSV data So I'm thinking my issue must be in the setup of my model code? /// Feeds samples into the model and keeps a sliding window of the last N frames. final class WandGestureStreamer { static let shared = WandGestureStreamer() private let model: SpellActivityClassifier private var samples: [Transform] = [] private let windowSize = 100 // number of frames the model expects /// RNN hidden state passed between inferences private var stateIn: MLMultiArray /// Last transform dropped from the window for continuity private var lastDropped: Transform? private init() { let config = MLModelConfiguration() self.model = try! SpellActivityClassifier(configuration: config) // Initialize stateIn to the model’s required shape let constraint = self.model.model.modelDescription .inputDescriptionsByName["stateIn"]! .multiArrayConstraint! self.stateIn = try! MLMultiArray(shape: constraint.shape, dataType: .double) } /// Call once per frame with the latest wand position (or any feature vector). func appendSample(_ sample: Transform) { samples.append(sample) // drop oldest frame if over capacity, retaining it for delta at window start if samples.count > windowSize { lastDropped = samples.removeFirst() } } func classifyIfReady(threshold: Double = 0.6) -> (label: String, confidence: Double)? { guard samples.count == windowSize else { return nil } do { let input = try makeInput(initialState: stateIn) let output = try model.prediction(input: input) // Save state for continuity stateIn = output.stateOut let best = output.label let conf = output.labelProbability[best] ?? 0 // If you’ve recognized a gesture with high confidence: if conf > threshold { return (best, conf) } else { return nil } } catch { print("Error", error.localizedDescription, error) return nil } } /// Constructs a SpellActivityClassifierInput from recorded wand transforms. func makeInput(initialState: MLMultiArray) throws -> SpellActivityClassifierInput { let count = samples.count as NSNumber let shape = [count] let timeArr = try MLMultiArray(shape: shape, dataType: .double) let dxArr = try MLMultiArray(shape: shape, dataType: .double) let dyArr = try MLMultiArray(shape: shape, dataType: .double) let dzArr = try MLMultiArray(shape: shape, dataType: .double) let rwArr = try MLMultiArray(shape: shape, dataType: .double) let rxArr = try MLMultiArray(shape: shape, dataType: .double) let ryArr = try MLMultiArray(shape: shape, dataType: .double) let rzArr = try MLMultiArray(shape: shape, dataType: .double) for (i, sample) in samples.enumerated() { let previousSample = i > 0 ? samples[i - 1] : lastDropped let model = WandMovementRecording.DataModel(transform: sample, previous: previousSample) // print("model", model) timeArr[i] = NSNumber(value: model.timestamp) dxArr[i] = NSNumber(value: model.dx) dyArr[i] = NSNumber(value: model.dy) dzArr[i] = NSNumber(value: model.dz) let rot = model.rotation rwArr[i] = NSNumber(value: rot.w) rxArr[i] = NSNumber(value: rot.x) ryArr[i] = NSNumber(value: rot.y) rzArr[i] = NSNumber(value: rot.z) } return SpellActivityClassifierInput( dx: dxArr, dy: dyArr, dz: dzArr, rotation_w: rwArr, rotation_x: rxArr, rotation_y: ryArr, rotation_z: rzArr, timestamp: timeArr, stateIn: initialState ) } }
1
0
377
Jul ’25
Vision Framework VNTrackObjectRequest: Minimum Valid Bounding Box Size Causing Internal Error (Code=9)
I'm developing a tennis ball tracking feature using Vision Framework in Swift, specifically utilizing VNDetectedObjectObservation and VNTrackObjectRequest. Occasionally (but not always), I receive the following runtime error: Failed to perform SequenceRequest: Error Domain=com.apple.Vision Code=9 "Internal error: unexpected tracked object bounding box size" UserInfo={NSLocalizedDescription=Internal error: unexpected tracked object bounding box size} From my investigation, I suspect the issue arises when the bounding box from the initial observation (VNDetectedObjectObservation) is too small. However, Apple's documentation doesn't clearly define the minimum bounding box size that's considered valid by VNTrackObjectRequest. Could someone clarify: What is the minimum acceptable bounding box width and height (normalized) that Vision Framework's VNTrackObjectRequest expects? Is there any recommended practice or official guidance for bounding box size validation before creating a tracking request? This information would be extremely helpful to reliably avoid this internal error. Thank you!
0
0
107
Apr ’25
Pre-inference AI Safety Governor for FoundationModels (Swift, On-Device)
Hi everyone, I've been building an on-device AI safety layer called Newton Engine, designed to validate prompts before they reach FoundationModels (or any LLM). Wanted to share v1.3 and get feedback from the community. The Problem Current AI safety is post-training — baked into the model, probabilistic, not auditable. When Apple Intelligence ships with FoundationModels, developers will need a way to catch unsafe prompts before inference, with deterministic results they can log and explain. What Newton Does Newton validates every prompt pre-inference and returns: Phase (0/1/7/8/9) Shape classification Confidence score Full audit trace If validation fails, generation is blocked. If it passes (Phase 9), the prompt proceeds to the model. v1.3 Detection Categories (14 total) Jailbreak / prompt injection Corrosive self-negation ("I hate myself") Hedged corrosive ("Not saying I'm worthless, but...") Emotional dependency ("You're the only one who understands") Third-person manipulation ("If you refuse, you're proving nobody cares") Logical contradictions ("Prove truth doesn't exist") Self-referential paradox ("Prove that proof is impossible") Semantic inversion ("Explain how truth can be false") Definitional impossibility ("Square circle") Delegated agency ("Decide for me") Hallucination-risk prompts ("Cite the 2025 CDC report") Unbounded recursion ("Repeat forever") Conditional unbounded ("Until you can't") Nonsense / low semantic density Test Results 94.3% catch rate on 35 adversarial test cases (33/35 passed). Architecture User Input ↓ [ Newton ] → Validates prompt, assigns Phase ↓ Phase 9? → [ FoundationModels ] → Response Phase 1/7/8? → Blocked with explanation Key Properties Deterministic (same input → same output) Fully auditable (ValidationTrace on every prompt) On-device (no network required) Native Swift / SwiftUI String Catalog localization (EN/ES/FR) FoundationModels-ready (#if canImport) Code Sample — Validation let governor = NewtonGovernor() let result = governor.validate(prompt: userInput) if result.permitted { // Proceed to FoundationModels let session = LanguageModelSession() let response = try await session.respond(to: userInput) } else { // Handle block print("Blocked: Phase \(result.phase.rawValue) — \(result.reasoning)") print(result.trace.summary) // Full audit trace } Questions for the Community Anyone else building pre-inference validation for FoundationModels? Thoughts on the Phase system (0/1/7/8/9) vs. simple pass/fail? Interest in Shape Theory classification for prompt complexity? Best practices for integrating with LanguageModelSession? Links GitHub: https://github.com/jaredlewiswechs/ada-newton Technical overview: parcri.net Happy to share more implementation details. Looking for feedback, collaborators, and anyone else thinking about deterministic AI safety on-device.
0
0
301
1w
Xcode 26 intelligence editor modifications.
Greetings, Ive been exerimenting with the new Apple intelligence chat. I want to be able to use my custom LLM and I made that work (I can chat back and forward from the left panel with my server) but I cannot find out how to change the editor contents like chatgpt does. chatgpt is able to change the current editor and, seems like, all files in the pbx. I tried to catch the call with charles with no success. In the OpenIA platform docs it doesnt mention anything that could change the code shown. does anyone know how to achieve this? Is the apple intelliece documentation lacking this features and will it be completed soon? will this features even be open for developers?
1
0
276
Jul ’25