Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

A Summary of the WWDC25 Group Lab - Machine Learning and AI Frameworks
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Machine Learning and AI Frameworks. What are you most excited about in the Foundation Models framework? The Foundation Models framework provides access to an on-device Large Language Model (LLM), enabling entirely on-device processing for intelligent features. This allows you to build features such as personalized search suggestions and dynamic NPC generation in games. The combination of guided generation and streaming capabilities is particularly exciting for creating delightful animations and features with reliable output. The seamless integration with SwiftUI and the new design material Liquid Glass is also a major advantage. When should I still bring my own LLM via CoreML? It's generally recommended to first explore Apple's built-in system models and APIs, including the Foundation Models framework, as they are highly optimized for Apple devices and cover a wide range of use cases. However, Core ML is still valuable if you need more control or choice over the specific model being deployed, such as customizing existing system models or augmenting prompts. Core ML provides the tools to get these models on-device, but you are responsible for model distribution and updates. Should I migrate PyTorch code to MLX? MLX is an open-source, general-purpose machine learning framework designed for Apple Silicon from the ground up. It offers a familiar API, similar to PyTorch, and supports C, C++, Python, and Swift. MLX emphasizes unified memory, a key feature of Apple Silicon hardware, which can improve performance. It's recommended to try MLX and see if its programming model and features better suit your application's needs. MLX shines when working with state-of-the-art, larger models. Can I test Foundation Models in Xcode simulator or device? Yes, you can use the Xcode simulator to test Foundation Models use cases. However, your Mac must be running macOS Tahoe. You can test on a physical iPhone running iOS 18 by connecting it to your Mac and running Playgrounds or live previews directly on the device. Which on-device models will be supported? any open source models? The Foundation Models framework currently supports Apple's first-party models only. This allows for platform-wide optimizations, improving battery life and reducing latency. While Core ML can be used to integrate open-source models, it's generally recommended to first explore the built-in system models and APIs provided by Apple, including those in the Vision, Natural Language, and Speech frameworks, as they are highly optimized for Apple devices. For frontier models, MLX can run very large models. How often will the Foundational Model be updated? How do we test for stability when the model is updated? The Foundation Model will be updated in sync with operating system updates. You can test your app against new model versions during the beta period by downloading the beta OS and running your app. It is highly recommended to create an "eval set" of golden prompts and responses to evaluate the performance of your features as the model changes or as you tweak your prompts. Report any unsatisfactory or satisfactory cases using Feedback Assistant. Which on-device model/API can I use to extract text data from images such as: nutrition labels, ingredient lists, cashier receipts, etc? Thank you. The Vision framework offers the RecognizeDocumentRequest which is specifically designed for these use cases. It not only recognizes text in images but also provides the structure of the document, such as rows in a receipt or the layout of a nutrition label. It can also identify data like phone numbers, addresses, and prices. What is the context window for the model? What are max tokens in and max tokens out? The context window for the Foundation Model is 4,096 tokens. The split between input and output tokens is flexible. For example, if you input 4,000 tokens, you'll have 96 tokens remaining for the output. The API takes in text, converting it to tokens under the hood. When estimating token count, a good rule of thumb is 3-4 characters per token for languages like English, and 1 character per token for languages like Japanese or Chinese. Handle potential errors gracefully by asking for shorter prompts or starting a new session if the token limit is exceeded. Is there a rate limit for Foundation Models API that is limited by power or temperature condition on the iPhone? Yes, there are rate limits, particularly when your app is in the background. A budget is allocated for background app usage, but exceeding it will result in rate-limiting errors. In the foreground, there is no rate limit unless the device is under heavy load (e.g., camera open, game mode). The system dynamically balances performance, battery life, and thermal conditions, which can affect the token throughput. Use appropriate quality of service settings for your tasks (e.g., background priority for background work) to help the system manage resources effectively. Do the foundation models support languages other than English? Yes, the on-device Foundation Model is multilingual and supports all languages supported by Apple Intelligence. To get the model to output in a specific language, prompt it with instructions indicating the user's preferred language using the locale API (e.g., "The user's preferred language is en-US"). Putting the instructions in English, but then putting the user prompt in the desired output language is a recommended practice. Are larger server-based models available through Foundation Models? No, the Foundation Models API currently only provides access to the on-device Large Language Model at the core of Apple Intelligence. It does not support server-side models. On-device models are preferred for privacy and for performance reasons. Is it possible to run Retrieval-Augmented Generation (RAG) using the Foundation Models framework? Yes, it is possible to run RAG on-device, but the Foundation Models framework does not include a built-in embedding model. You'll need to use a separate database to store vectors and implement nearest neighbor or cosine distance searches. The Natural Language framework offers simple word and sentence embeddings that can be used. Consider using a combination of Foundation Models and Core ML, using Core ML for your embedding model.
1
0
1.2k
Jun ’25
Attempts to install Tensorflow on Mac Studio M1 fail
I am attempting to install Tensorflow on my M1 and I seem to be unable to find the correct matching versions of jax, jaxlib and numpy to make it all work. I am in Bash, because the default shell gave me issues. I downgraded to python 3.10, because with 3.13, I could not do anything right. Current actions: bash-3.2$ python3.10 -m venv ~/venv-metal bash-3.2$ python --version Python 3.10.16 python3.10 -m venv ~/venv-metal source ~/venv-metal/bin/activate python -m pip install -U pip python -m pip install tensorflow-macos And here, I keep running tnto errors like: (venv-metal):~$ pip install tensorflow-macos tensorflow-metal ERROR: Could not find a version that satisfies the requirement tensorflow-macos (from versions: none) ERROR: No matching distribution found for tensorflow-macos What is wrong here? How can I fix that? It seems like the system wants to use the x86 version of python ... which can't be right.
4
0
1.8k
Jan ’25
NLModel won't initialize in MessageFilterExtension
i'm trying to create an NLModel within a MessageFilterExtension handler. The code works fine in the main app, but when I try to use it in the extension it fails to initialize. Just this doesn't even work and gets the error below. Single line that fails. SMS_Classifier is the class xcode generated for my model. This line works fine in the main app. let mlModel = try SMS_Classifier(configuration: MLModelConfiguration()).model Error Unable to locate Asset for contextual word embedding model for local en. MLModelAsset: load failed with error Error Domain=com.apple.CoreML Code=0 "initialization of text classifier model with model data failed" UserInfo={NSLocalizedDescription=initialization of text classifier model with model data failed} Any ideas?
3
1
1.1k
Jan ’25
Create ML how to handle polygon annotations?
I have images, and I annotated with polygon, actually simple trapezoid, so 4 points. I have been trying and trying but can't get Create ML to work. I am trying Object Detection. I am not a real programmer so really would greatly appreciate some guidance to help to get this model created. I think I made a Detectron2 model, and tried to get that converted into a mlmodel I need for xcode but had troubles there also. thank you. { "annotation": "IMG_1803.JPG", "annotations": [ { "label": "court", "coordinates": { "x": [ 187, 3710, 2780, 929 ], "y": [ 1689, 1770, 478, 508 ] } } ] },
2
0
732
Jan ’25
Help Needed: SwiftUI View with Camera Integration and Core ML Object Recognition
Hi everyone, I'm working on a SwiftUI app and need help building a view that integrates the device's camera and uses a pre-trained Core ML model for real-time object recognition. Here's what I want to achieve: Open the device's camera from a SwiftUI view. Capture frames from the camera feed and analyze them using a Create ML-trained Core ML model. If a specific figure/object is recognized, automatically close the camera view and navigate to another screen in my app. I'm looking for guidance on: Setting up live camera capture in SwiftUI. Using Core ML and Vision frameworks for real-time object recognition in this context. Managing navigation between views when the recognition condition is met. Any advice, code snippets, or examples would be greatly appreciated! Thanks in advance!
1
0
735
Jan ’25
How to Retrieve VisualLookUp Results (e.g., Object Name) in VisionKit?
Hi everyone, I'm working on an iOS app that uses VisionKit and I'm exploring the .visualLookUp feature. Specifically, I want to extract the detailed information that Visual Look Up provides after identifying an object in an image (e.g., if the object is a flower, retrieve its name; if it’s a clothing tag, get the tag's content).
1
0
625
Jan ’25
BarcodeObservation Orientation
Hi, I'm working with vision framework to detect barcodes. I tested both ean13 and data matrix detection and both are working fine except for the QuadrilateralProviding values in the returned BarcodeObservation. TopLeft, topRight, bottomRight and bottomLeft coordinates are rotated 90° counter clockwise (physical bottom left of data Matrix, the corner of the "L" is returned as the topLeft point in observation). The same behaviour is happening with EAN13 Barcode. Did someone else experienced the same issue with orientation? Is it normal behaviour or should we expect a fix in next releases of the Vision Framework?
4
0
619
Jan ’25
Can I Perform Hybrid Execution on Neural Engine and CPU with 16-bit Precision?
Hello, I have a question regarding hybrid execution for deep learning models on Apple's Neural Engine and CPU. I am aware that setting the precision of some layers to 32-bit allows hybrid execution across both the Neural Engine and the CPU. However, I would like to know if it is possible to achieve the same with 16-bit precision. Is there any specific configuration or workaround to enable hybrid execution in this case? Any guidance or documentation references would be greatly appreciated. Thank you!
0
0
439
Jan ’25
What is experimentalMLE5EngineUsage?
@property (assign,nonatomic) long long experimentalMLE5EngineUsage; //@synthesize experimentalMLE5EngineUsage=_experimentalMLE5EngineUsage - In the implementation block What is it, and why would disabling it fix NMS for a MLProgram? Is there anyway to signal this flag from model metadata? Is there anyway to signal or disable from a global, system-level scope? It's extremely easy to reproduce, but do not know how to investigate the drastic regression between toggling this flag let config = MLModelConfiguration() config.setValue(1, forKey: "experimentalMLE5EngineUsage")
0
1
669
Jan ’25
Resize Image Playground sheet
When using the imagePlaygroundSheet modifier in SwiftUI, the system presets an image playground in a fixed size. Especially on macOS, this modal is rather small and doesn't utilize the available space efficiently. Is there a way to make the modal bigger, or allow the user to resize the dialog? I tried presentationDetents, but this would need to be applied to the content of the sheet, which is provided by the system... I guess this question applies to other system-provided sheets like the photo picker as well.
2
0
758
Jan ’25
About VisionKit DataScannerViewController
Hi I'm having a problem with DataScannerViewController, I'm using the volume barcode scanning feature in my app, prior to that I was using an AVCaptureDevice with the UltraWideAngle set. After discovering DataScannerViewController, we planned to replace the previous obsolete code with DataScannerViewController, all together it was ok, when I want to set the ultra wide angle, I don't know how to start. I tried to get the minZoomFactor and I realized that I get 0.0 I tried to set zoomFactor to 1.0 and I found that he is not valid Note: func dataScannerDidZoom(_ dataScanner: DataScannerViewController), when I try to get the minZoomFactor, set the zoomFactor in this proxy method, I find that it is valid! What should I do next, I want to use only DataScannerViewController and implement ultra wide angle Thanks a lot.
1
0
639
Jan ’25
How to confirm whether CreatML is training
I am currently training a Tabular Classification model in CreatML. The dataset comprises 30 features, including 1,000,000 training data points and 1,000,000 verification data points. Could you please estimate the approximate training time for an M4Max MacBook Pro? During the training process, CreatML has been displaying the “Processing” status, but there is no progress bar. I would like to ascertain whether the training is still ongoing, as I have often suspected that it has ceased.
1
0
653
Jan ’25
Using Core ML in a .swiftpm file
Hi everyone, I've been struggling for a few weeks to integrate my Core ML Image Classifier model into my .swiftpm project, and I’m hoping someone can help. Here’s what I’ve done so far: I converted my .mlmodel file to .mlmodelc manually via the terminal. In my Package.swift file, I tried both "copy" and "process" options for the resource. The issues I’m facing: When using "process", Xcode gives me the error: "multiple resources named 'coremldata.bin' in target 'AppModule'." When using "copy", the app runs, but the model doesn’t work, and the terminal shows: "A valid manifest does not exist at path: .../Manifest.json." I even tried creating a Manifest.json manually to test, but this led to more errors, such as: "File format version must be in the form of major.minor.patch." "Failed to look up root model." To check if the problem was specific to my model, I tested other Core ML models in the same setup, but none of them worked either. I feel stuck and unsure of how to resolve these issues. Any guidance or suggestions would be greatly appreciated. Thanks in advance! :)
2
2
1.2k
Jan ’25
Create ML Trouble Loading CSV to Train Word Tagger With Commas in Training Data
I'm using Numbers to build a spreadsheet that I'm exporting as a CSV. I then import this file into Create ML to train a word tagger model. Everything has been working fine for all the models I've trained so far, but now I'm coming across a use case that has been breaking the import process: commas within the training data. This is a case that none of Apple's examples show. My project takes Navajo text that has been tokenized by syllables and labels the parts-of-speech. Case that works... Raw text: Naaltsoos yídéeshtah. Tokens column: Naal,tsoos, ,yí,déesh,tah,. Labels column: NObj,NObj,Space,Verb,Verb,VStem,Punct Case that breaks... Raw text: óola, béésh łigaii, tłʼoh naadą́ą́ʼ, wáin, akʼah, dóó á,shįįh Tokens column with tokenized text (commas quoted): óo,la,",", ,béésh, ,łi,gaii,",", ,tłʼoh, ,naa,dą́ą́ʼ,",", ,wáin,",", ,a,kʼah,",", ,dóó, ,á,shįįh (Create ML reports mismatched columns) Tokens column with tokenized text (commas escaped): óo,la,\,, ,béésh, ,łi,gaii,\,, ,tłʼoh, ,naa,dą́ą́ʼ,\,, ,wáin,\,, ,a,kʼah,\,, ,dóó, ,á,shįįh (Create ML reports mismatched columns) Tokens column with tokenized text (commas escape-quoted): óo,la,\",\", ,béésh, ,łi,gaii,\",\", ,tłʼoh, ,naa,dą́ą́ʼ,\",\", ,wáin,\",\", ,a,kʼah,\",\", ,dóó, ,á,shįįh (record not detected by Create ML) Tokens column with tokenized text (commas escape-quoted): óo,la,"","", ,béésh, ,łi,gaii,"","", ,tłʼoh, ,naa,dą́ą́ʼ,"","", ,wáin,"","", ,a,kʼah,"","", ,dóó, ,á,shįįh (Create ML reports mismatched columns) Labels column: NSub,NSub,Punct,Space,NSub,Space,NSub,NSub,Punct,Space,NSub,Space,NSub,NSub,Punct,Space,NSub,Punct,Space,NSub,NSub,Punct,Space,Conj,Space,NSub,NSub Sample From Spreadsheet Solution Needed It's simple enough to escape commas within CSV files, but the format needed by Create ML essentially combines entire CSV records into single columns, so I'm ending up needing a CSV record that contains a mixture of commas to use for parsing and ones to use as character literals. That's where this gets complicated. For this particular use case (which seems like it would frequently arise when training a word tagger model), how should I properly escape a comma literal?
6
0
814
Jan ’25
CreateML
I'm trying to use the Spatial model to perform Object Tracking on a .usdz file that I create. After loading the file, which I can view correctly in the console, I start the training. Initially, I notice that the disk usage on my PC increases. After several GB, the usage stops, but the training progress remains for hours at 0.00% with the message "About 8hr." How can I understand what the issue is? Has anyone else experienced the same problem? Thanks Diego
1
1
634
Jan ’25
develop app in Europe using image playground
Hi, I want to develop an app which makes use of Image Playground. However, I am located in Europe which makes it impossible for me as Image Playground is not available for me. Even if I would like to distribute the app in the US. Nor the simulator, nor a physical device will always return that support for ImagePlayground is not supported (@Environment(.supportsImagePlayground) private var supportsImagePlayground) How to set my environment such that I can test the feature in my iOS application
1
0
697
Jan ’25
CoreML Model Instantiation Crashes
Some of my users are experiencing crashes on instantiation of a CoreML model I've bundled with my app. I haven't been able to reproduce the crash on any of my devices. Crashes happen across all iOS 18 releases. Seems like something internal in CoreML is causing an issue. Full stack trace: 6646631296fb42128ddc340b2d4322f7-symbolicated.crash
1
0
553
Jan ’25
CreatML stop training
It appears that there is a size limit when training the Tabular Classification model in CreatML. When the training data is small, the training process completes smoothly after a specified period. However, as the data volume increases, the following issues occur: initially, the training process indicates that it is in progress, but after approximately 24 hours, it is automatically terminated after an hour. I am certain that this is not a manual termination by myself or others, but rather an automatic termination by the machine. This issue persists despite numerous attempts, and the only message displayed is “Training Canceled.” I would appreciate it if someone could explain the reason behind this behavior and provide a solution. Thank you for your assistance.
1
0
606
Jan ’25