Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

Posts under Machine Learning & AI topic

Post · Replies · Boosts · Views · Created

CoreML: Model loading utilities
Hello, we find that models sometimes load very fast (<< 1 second) and sometimes encounter very long load times (>> 120 seconds). During such slow loads the model is being compiled. We would greatly appreciate the ability to check cache validity via CoreML and determine ahead of time that we are about to encounter a long load, so that we can mitigate it and provide a good user experience. A secondary issue: sometimes the cache is corrupted (typically the .mpsgraphpackage, yielding Metal cold asserts). This produces load failures and OS errors that persist between launches, and we have to manually delete the cached CoreML assets (~/Library/..../my-app/...). A CoreML API for clearing caches, and hardening of the load paths against asserts, would be appreciated.
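There is no public CoreML API today for inspecting or clearing this cache, so as a stopgap the load itself can at least be timed to detect a probable recompile. A minimal sketch, assuming an already-compiled .mlmodelc URL and an arbitrary 5-second threshold:

    import CoreML
    import Foundation

    // Time the load; a long load very likely means the model was (re)compiled rather
    // than served from the cache, so the UI can switch to a progress state.
    func loadModelReportingSlowLoad(at compiledModelURL: URL) async throws -> MLModel {
        let start = Date()
        let model = try await MLModel.load(contentsOf: compiledModelURL, configuration: MLModelConfiguration())
        let elapsed = Date().timeIntervalSince(start)
        if elapsed > 5 { // threshold is a guess; tune per model and device
            print("Slow model load took \(elapsed) s - likely a cache miss or recompile")
        }
        return model
    }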
1 reply · 0 boosts · 118 views · Jun ’25

Various On-Device Frameworks API & ChatGPT
Posting a follow-up question after the WWDC 2025 Machine Learning & AI Frameworks group lab on June 12. Regarding the on-device APIs of any of the AI frameworks (Foundation Models, the Vision framework, etc.): is there a response condition or path where the API outsources its input to ChatGPT if the user has allowed this, the way Siri does? Ignore this if the answer is no; otherwise, is this handled behind the scenes or by the developer?
0 replies · 0 boosts · 261 views · Jun ’25

Foundation Models Adaptors for Generable output?
Is it possible to train an Adaptor for the Foundation Models to produce Generable output? If so what would the response part of the training data need to look like? Presumably, under the hood, the model is outputting JSON (or some other similar structure) that can be decoded to a Generable type. Would the response part of the training data for an Adaptor need to be in that structured format?
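For reference, the inference-time side of guided generation looks roughly like this (the type and guides below are invented for illustration); whether the adapter's training responses must mirror the underlying structured format is exactly the open question:

    import FoundationModels

    @Generable
    struct TripSummary {
        @Guide(description: "A short title for the trip")
        var title: String
        @Guide(description: "Trip length in days")
        var days: Int
    }

    // let session = LanguageModelSession()
    // let response = try await session.respond(to: "Summarize a weekend in Kyoto",
    //                                           generating: TripSummary.self)
    // response.content is a TripSummary decoded from the model's structured output.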
2 replies · 0 boosts · 202 views · Jun ’25

What is the Foundation Models support for basic math?
I am experimenting with Foundation Models in my time-tracking app to analyze users' tracked events, but I am finding that the model struggles with even basic computation of time, specifically converting from seconds to hours and minutes. To give just one example, when I prompt: "Convert 3672 seconds to hours, minutes, and seconds. Don't include the calculations in the resulting output", I get this: "3672 seconds is equal to 1 hour, 0 minutes, and 36 seconds". Which is clearly wrong - it should be 1 hour, 1 minute, and 12 seconds. Another issue I saw a lot is that seconds were treated as minutes, or the hours were just completely off. What can I do to make the support for math better? Or is this just something that the model is not meant to be used for?
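One option, not an official recommendation, is to keep the arithmetic out of the model entirely (for example behind a tool call) and only let it phrase the result; the conversion itself is deterministic:

    // 3672 s -> 1 h, 1 m, 12 s
    func hoursMinutesSeconds(from totalSeconds: Int) -> (hours: Int, minutes: Int, seconds: Int) {
        (totalSeconds / 3600, (totalSeconds % 3600) / 60, totalSeconds % 60)
    }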
1 reply · 0 boosts · 195 views · Jun ’25

Guardrail configuration options?
Is anything configurable for LanguageModelSession.Guardrails besides the default? I'm prototyping a camping app, and it's constantly slamming into guardrail errors when I use the new foundation model interface. Any subjects relating to fishing, survival, etc. won't generate. For example the prompt "How can I kill deer ticks using a clothing treatment?" returns a generation error. The results that I get are great when it works, but so far the local model sessions are extremely unreliable.
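No public guardrail configuration appears to exist as of the current seed; about the most an app can do is catch the failure and degrade gracefully. A sketch, assuming guardrail violations surface as a LanguageModelSession.GenerationError (verify the exact cases against the SDK you're on):

    import FoundationModels

    func respondSafely(to prompt: String) async -> String {
        let session = LanguageModelSession()
        do {
            return try await session.respond(to: prompt).content
        } catch let error as LanguageModelSession.GenerationError {
            // Guardrail violations land here alongside other generation failures.
            return "The on-device model declined this request: \(error.localizedDescription)"
        } catch {
            return "Generation failed: \(error.localizedDescription)"
        }
    }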
2 replies · 2 boosts · 230 views · Jun ’25

Failing to run SystemLanguageModel inference with custom adapter
Hi, I have trained a basic adapter using the adapter training toolkit. I am trying a very basic example of loading it and running inference with it, but am getting the following error:

    Passing along InferenceError::inferenceFailed::loadFailed::Error Domain=com.apple.TokenGenerationInference.E5Runner Code=0 "Failed to load model: ANE adapted model load failure: createProgramInstanceWithWeights:modelToken:qos:baseModelIdentifier:owningPid:numWeightFiles:error:: Program load new instance failure (0x170006)." UserInfo={NSLocalizedDescription=Failed to load model: ANE adapted model load failure: createProgramInstanceWithWeights:modelToken:qos:baseModelIdentifier:owningPid:numWeightFiles:error:: Program load new instance failure (0x170006).} in response to ExecuteRequest

Any ideas / direction? For testing I am including the .fmadapter file inside the app bundle. This is where I load it:

    @State private var session: LanguageModelSession? // = LanguageModelSession()

    func loadAdapter() async throws {
        if let assetURL = Bundle.main.url(forResource: "qasc---afm---4-epochs-adapter", withExtension: "fmadapter") {
            print("Asset URL: \(assetURL)")
            let adapter = try SystemLanguageModel.Adapter(fileURL: assetURL)
            let adaptedModel = SystemLanguageModel(adapter: adapter)
            session = LanguageModelSession(model: adaptedModel)
            print("Loaded adapter and updated session")
        } else {
            print("Asset not found in the main bundle.")
        }
    }

This seems to work fine, as I get to the log "Loaded adapter and updated session". However, when the inference code below runs I get the aforementioned error:

    func sendMessage(_ msg: String) {
        self.loading = true
        if let session = session {
            Task {
                do {
                    let modelResponse = try await session.respond(to: msg)
                    DispatchQueue.main.async {
                        self.response = modelResponse.content
                        self.loading = false
                    }
                } catch {
                    print("Error: \(error)")
                    DispatchQueue.main.async {
                        self.loading = false
                    }
                }
            }
        }
    }
3 replies · 0 boosts · 217 views · Jun ’25

Train adapter with tool calling
The documentation on adapter training lacks any details about training on a dataset that includes tool calling. And the page about tool calling itself only explains how to use it from Swift, without any internal details useful for training. The question is: what should the schema look like for including tool calling in the dataset?
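For comparison, the Swift runtime side of a tool looks roughly like this (names are invented; how the equivalent schema should appear in adapter training data is the unanswered part):

    import FoundationModels

    struct FindCampsites: Tool {
        let name = "findCampsites"
        let description = "Finds campsites near a location"

        @Generable
        struct Arguments {
            @Guide(description: "The place to search near")
            var location: String
        }

        func call(arguments: Arguments) async throws -> ToolOutput {
            // A real implementation would query a data source here.
            ToolOutput("Two campsites found near \(arguments.location).")
        }
    }

    // let session = LanguageModelSession(tools: [FindCampsites()])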
1 reply · 0 boosts · 241 views · Jun ’25

Data used for MLX fine-tuning
The WWDC25 video "Explore large language models on Apple silicon with MLX" talks about using your own data to fine-tune a large language model, but it doesn't explain what kind of data can be used; it just shows the command to run and how to point it to the data folder. Can I use PDFs, Word documents, or Markdown files to train the model? Are there any code examples on GitHub that demonstrate how to do this?
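For what it's worth, the mlx-lm LoRA tooling generally consumes plain-text JSONL rather than PDFs or Word documents, so those files would need converting to text first. The exact keys and file names below (train.jsonl / valid.jsonl, each line a JSON object with a "text" field) are an assumption to verify against the mlx-examples documentation:

    {"text": "Q: What is our refund window? A: Customers can return items within 30 days."}
    {"text": "Q: Do we ship internationally? A: Yes, to most countries in Europe and Asia."}

Training is then pointed at the folder containing those files with something like: mlx_lm.lora --model <base-model> --train --data <data-folder>.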
2 replies · 0 boosts · 193 views · Jun ’25

FoundationModels Content Sanitizer Blocking Legitimate Text Processing
I'm developing a macOS application using the FoundationModels framework (LanguageModelSession) and encountering issues with the content sanitizer blocking legitimate text input.

Issue Description: The content sanitizer is flagging text strings that contain certain substrings, even when they represent legitimate technical content. For example: F_SEEL_SEX1S.wav (sE Electronics SEX1S microphone model), technical product identifiers, serial numbers and version codes.

Broader Concern: The content sanitizer appears to be applying restrictions that seem inappropriate for user-owned content. Even if a filename were something like "human sex.wav", users should have the right to process their own legitimate files on their own devices without content filtering interference.

Error Messages:

    SensitiveContentSettings: Sanitizer model found unsafe content in value
    FoundationModels.LanguageModelSession.GenerationError error 2

Questions:
1. Is there a way to disable content sanitization for processing user-owned content?
2. What's the recommended approach for applications that need to handle arbitrary user text?
3. Are there APIs to process personal content without filtering restrictions?

Environment: macOS 26.0, FoundationModels framework, LanguageModelSession.

Any guidance would be appreciated.
1 reply · 0 boosts · 315 views · Jun ’25

RecognizeDocumentsRequest for receipts
Hi, I'm trying to use the new RecognizeDocumentsRequest from the Vision framework to read a receipt. It looks very promising, being able to read paragraphs and lines and detect data. So far, unfortunately, it seems to read every line on the receipt as a paragraph, and when there is more space on one line it creates two paragraphs. Is there perhaps an Apple engineer who knows if this is expected behaviour, or should I file a Feedback for this?

Code setup:

    let request = RecognizeDocumentsRequest()
    let observations = try await request.perform(on: image)

    guard let document = observations.first?.document else {
        return
    }

    for paragraph in document.paragraphs {
        print(paragraph.transcript)
        for data in paragraph.detectedData {
            switch data.match.details {
            case .phoneNumber(let data):
                print("Phone: \(data)")
            case .postalAddress(let data):
                print("Postal: \(data)")
            case .calendarEvent(let data):
                print("Calendar: \(data)")
            case .moneyAmount(let data):
                print("Money: \(data)")
            case .measurement(let data):
                print("Measurement: \(data)")
            default:
                continue
            }
        }
    }

See the attached image as an example of a receipt I'd like to parse. The top three lines are the name, street, and postal code + city; these all come through as separate paragraphs. Checking detectedData does see the street (2nd line) as a PostalAddress, but not the complete address. Might that be a locale thing, since it's a Dutch address? And lower on the receipt it also sees the block with "Pomp 1 95 Ongelood" and the lines below it as separate paragraphs, first picking up the left side and after that the right side. So it comes out something like this:

    * Pomp 1 Volume Prijs € TOTAAL
    * BTW Netto 21.00 % 95 Ongelood 41,90 l 1.949/ 1 81.66 € 14.17 67.49
3 replies · 1 boost · 477 views · Jun ’25

FoundationModels not supported on Mac Catalyst?
I'd love to add a feature based on FoundationModels to the Mac Catalyst version of my iOS app. Unfortunately, I get an error when importing FoundationModels: No such module 'FoundationModels'. The documentation says Mac Catalyst is supported: https://developer.apple.com/documentation/foundationmodels. I can create iOS builds using the FoundationModels framework without issues. Hope this will be fixed soon!

Config: Xcode 26.0 beta (17A5241e), macOS 26.0 beta (25A5279m), 15-inch M4 MacBook Air (2025).
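Until the Catalyst SDK issue is resolved, a minimal workaround sketch is to gate the feature at compile time so the Catalyst target keeps building (the cleanedUp function is just a placeholder):

    #if canImport(FoundationModels)
    import FoundationModels

    func cleanedUp(_ text: String) async throws -> String {
        let session = LanguageModelSession()
        return try await session.respond(to: "Tidy up this text: \(text)").content
    }
    #else
    func cleanedUp(_ text: String) async throws -> String {
        // FoundationModels isn't importable in this build (e.g. the current Catalyst beta).
        text
    }
    #endif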
2 replies · 1 boost · 294 views · Jun ’25

SpeechAnalyzer / AssetInventory and preinstalled assets
While testing the “Bringing advanced speech-to-text capabilities to your app” sample app, which demonstrates the use of the iOS 26 SpeechAnalyzer, I noticed that the language model for the English locale was presumably already downloaded. Checking the documentation for AssetInventory, I found that the language model can indeed be preinstalled on the system. Can someone from the dev team share more about which assets are preinstalled by the system? For example, can we safely assume that the English language model will almost certainly be preinstalled by the OS if the phone is set to an English locale?
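Preinstallation probably shouldn't be assumed for any particular locale; the safer pattern, as I read the sample code, is to always ask AssetInventory and only download when it reports something missing. A sketch (API names recalled from the sample, so double-check them against the SDK):

    import Speech

    func ensureAssetsInstalled(for transcriber: SpeechTranscriber) async throws {
        // A non-nil request means at least one required asset still has to be fetched;
        // nil means everything is already on device, whether preinstalled by the OS
        // or downloaded earlier.
        if let request = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) {
            try await request.downloadAndInstall()
        }
    }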
1 reply · 0 boosts · 202 views · Jun ’25

Is there anywhere to get precompiled WhisperKit models for Swift?
If I try to dynamically load WhisperKit's models, as below, the download never occurs. No error or anything. And at the same time I can still get to the huggingface.co hosting site without any headaches, so it's not a blocking issue.

    let config = WhisperKitConfig(
        model: "openai_whisper-large-v3",
        modelRepo: "argmaxinc/whisperkit-coreml"
    )

So I have to default to the tiny model, as seen below. I have tried so many ways, using ChatGPT and others, to build the models on my Mac, but with too many failures, because I have never dealt with builds like that before. Are there any hosting sites that have the models (small, medium, large) already built, where I can download them and just bundle them into my project? I've wasted quite a large amount of time trying to get this done.

    import Foundation
    import WhisperKit

    @MainActor
    class WhisperLoader: ObservableObject {
        var pipe: WhisperKit?

        init() {
            Task {
                await self.initializeWhisper()
            }
        }

        private func initializeWhisper() async {
            do {
                Logging.shared.logLevel = .debug
                Logging.shared.loggingCallback = { message in
                    print("[WhisperKit] \(message)")
                }

                let pipe = try await WhisperKit() // defaults to "tiny"
                self.pipe = pipe
                print("initialized. Model state: \(pipe.modelState)")

                guard let audioURL = Bundle.main.url(forResource: "44pf", withExtension: "wav") else {
                    fatalError("not in bundle")
                }

                let result = try await pipe.transcribe(audioPath: audioURL.path)
                print("result: \(result)")
            } catch {
                print("Error: \(error)")
            }
        }
    }
0 replies · 0 boosts · 100 views · Jun ’25

SpeechTranscriber time indexes - detect pauses?
I'm experimenting with the new SpeechTranscriber in macOS/iOS 26, transcribing speech from a prerecorded mp4 file. Speed and quality are amazing! I've told the transcriber to include time indexes. Each run is always exactly one word, which can be very useful. When I look at the indexes, the end of one run is always identical to the start of the next run, even if there's a pause. I'd like to identify pauses, perhaps to generate something like phrases for subtitling. With each run of text flowing into the next I can't do this, other than by using punctuation - which might be rather rough. Any suggestions on detecting pauses, or getting that kind of metadata from the transcriber? Here's a short sample, showing each run with the start, end, and characters in the run:

    105.9 --> 107.04 I
    107.04 --> 107.16 think
    107.16 --> 108.0 more
    108.0 --> 108.42 lighting
    108.42 --> 108.6 is
    108.6 --> 108.72 definitely
    108.72 --> 109.2 needed,
    109.2 --> 109.92 downtown.
    109.98 --> 110.4 My
    110.4 --> 110.52 only
    110.52 --> 110.7 question
    110.7 --> 111.06 is,
    111.06 --> 111.48 poll
    111.48 --> 111.78 five,
    111.78 --> 111.84 that
    111.84 --> 112.08 you're
    112.08 --> 112.38 increasing
    112.38 --> 112.5 the
    112.5 --> 113.34 50,000?
    113.4 --> 113.58 Where
    113.58 --> 113.88 exactly
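Interestingly, the sample itself does show tiny gaps (about 0.06 s) right after "downtown." and "50,000?", so one possible approach is to compare each run's start to the previous run's end and break a phrase whenever the gap exceeds a small threshold. A minimal sketch with a made-up Run type and an arbitrary threshold:

    import Foundation

    struct Run {
        let start: TimeInterval
        let end: TimeInterval
        let text: String
    }

    // Group runs into phrases wherever the silence between consecutive runs exceeds minPause.
    func phrases(from runs: [Run], minPause: TimeInterval = 0.05) -> [[Run]] {
        var result: [[Run]] = []
        var current: [Run] = []
        for run in runs {
            if let previous = current.last, run.start - previous.end >= minPause {
                result.append(current)
                current = []
            }
            current.append(run)
        }
        if !current.isEmpty { result.append(current) }
        return result
    }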
0 replies · 0 boosts · 173 views · Jun ’25

InferenceError referencing context length in FoundationModels framework
I'm experimenting with downloading an audio file of spoken content, using the Speech framework to transcribe it, then using FoundationModels to clean up the formatting to add paragraph breaks and such. I have this code to do that cleanup:

    private func cleanupText(_ text: String) async throws -> String? {
        print("Cleaning up text of length \(text.count)...")
        let session = LanguageModelSession(instructions: "The content you read is a transcription of a speech. Separate it into paragraphs by adding newlines. Do not modify the content - only add newlines.")
        let response = try await session.respond(to: .init(text), generating: String.self)
        return response.content
    }

The content length is about 29,000 characters. And I get this error:

    InferenceError::inferenceFailed::Failed to run inference: Context length of 4096 was exceeded during singleExtend.

Is 4096 a reference to a max input length? Or is this a bug? This is running on an M1 iPad Air, with iPadOS 26 Seed 1.
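Assuming the 4096 refers to the model's context window in tokens (which a ~29,000-character prompt will far exceed), one workaround sketch is to split the transcript into pieces well under the limit and clean each piece in its own session. The 8,000-character chunk size is a guess, and a real implementation would split on sentence boundaries rather than at arbitrary offsets:

    import FoundationModels

    private func cleanupLongText(_ text: String) async throws -> String {
        let instructions = "The content you read is a transcription of a speech. Separate it into paragraphs by adding newlines. Do not modify the content - only add newlines."
        var cleanedChunks: [String] = []
        var remaining = Substring(text)
        while !remaining.isEmpty {
            let chunk = String(remaining.prefix(8_000))
            remaining = remaining.dropFirst(chunk.count)
            let session = LanguageModelSession(instructions: instructions)
            let response = try await session.respond(to: chunk)
            cleanedChunks.append(response.content)
        }
        return cleanedChunks.joined(separator: "\n")
    }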
5 replies · 0 boosts · 365 views · Jun ’25

Apple Intelligence stuck at 100% on macOS 26 Beta 1
Hello, I'm unable to develop for Apple Intelligence on my Mac Studio (M1 Max) running macOS 26 beta 1. The models get downloaded, and I can also verify that they exist in /System/Library/AssetsV2/, however the download progress remains stuck at 100%. Checking console logs shows the process generativeexperiencesd reporting the following:

My device region and language is set to English (India). Things I've already tried:
- Changing language and region to English (US)
- Reinstalling macOS
- Trying with a different ISP via hotspot
6 replies · 10 boosts · 501 views · Jun ’25