Not finding a lot on the Swift Assist technology announced at WWDC 2024. Does anyone know the latest status? Also, currently I use OpenAI's macOS app and its 'Work With...' functionality to assist with Xcode development, and this is okay, certainly saves copying code back and forth, but it seems like AI should be able to do a lot more to help with Xcode app development.
I guess I'm looking at what people are doing with AI in Visual Studio, Cline, Cursor and other IDEs and tools like those and feel a bit left out working in Xcode. Please let me know if there are AI tools or techniques out there you use to help with your Xcode projects.
Thanks in advance!
Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
Keep getting error :
I have tried Picker for File, Photo Library , both same results .
Debugging the resize for 360x360 but still facing this error.
The model I'm trying to implement is created with CreateMLComponents
The process is from example of WWDC 2022 Banana Ripeness , I have used index for each .jpg .
Prediction Failed: The VNCoreMLTransform request failed
Is there some possible way to solve it or is error somewhere in training of model ?
I've checked on pypi.org and it appears to only have arm64 packages, has x86 with AMD been deprecated?
I have integrated Apple’s Foundation Model into my iOS application. As known, Foundation Model is only supported starting from iOS 26 on compatible devices. To maintain compatibility with older iOS versions, I wrapped the API calls with the condition if #available(iOS 26, *).
The application works normally on an iPad running iOS 18 and on a Mac running macOS 26. However, when running the same build on a MacBook Air M1 (macOS 15) through iPad app compatibility, the app crashes immediately upon launch.
The main issue is that I cannot debug directly on macOS 15, since the app can only be built on macOS 26 with Xcode beta. I then have to distribute it via TestFlight and download it on the MacBook Air M1 for testing. This makes identifying the detailed cause of the crash very difficult and time-consuming.
Nevertheless, I have confirmed that the crash is caused by the Foundation Model APIs.
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
While building an app with large language model inferencing on device, I got gibberish output. After carefully examining every detail, I found it's caused by the fused scaledDotProductAttention operation. I switched back to the discrete operations and problem solved. To reproduce the bug, please check https://github.com/zhoudan111/MPSGraph_SDPA_bug
Topic:
Machine Learning & AI
SubTopic:
General
I'm a bit new to the LLM stuff and with Foundation Models. My understanding is that there is a token limit of around 4K.
I want to process the contents of files which may be quite large. I first tried going the Tool route but that didn't work out so I then tried manually chunking the text to keep things under the limit.
It mostly works except that every now and then it'll exceed the limit. This happens even when the chunks are less than 100 characters. Instructions themselves are about 500 characters but still overall, well below 1000 characters per prompt, all told, which, in my limited understanding, should not result in 4K tokens being parsed.
Any ideas on what is going on here?
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
I'm using Xcode 26 Beta 5 and get errors on any generation I try, however harmless, when wrapped in the #Playground macro.
#Playground {
let session = LanguageModelSession()
let topic = "pandas"
let prompt = "Write a safe and respectful story about (topic)."
let response = try await session.respond(to: prompt)
Not seeing any issues on simulator or device. Anyone else seeing this or have any ideas?
Thanks for any help!
Version 26.0 beta 5 (17A5295f)
macOS 26.0 Beta (25A5316i)
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Encountered a few times when the answer get "stuck" (I am now at beta 6).
This is an example.
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
It seems like there was an undocumented change that made Transcript.init(entries: [Transcript.Entry] initializer private, which broke my application, which relies on (manual) reconstruction of Transcript entries.
Worked fine on beta 1, on beta 2 there's this error
dyld[72381]: Symbol not found: _$s16FoundationModels10TranscriptV7entriesACSayAC5EntryOG_tcfC
Referenced from: <44342398-591C-3850-9889-87C9458E1440> /Users/mika/experiments/apple-on-device-ai/fm
Expected in: <66A793F6-CB22-3D1D-A560-D1BD5B109B0D> /System/Library/Frameworks/FoundationModels.framework/Versions/A/FoundationModels
Is this a part of an API transition, if so -
Apple, please update your documentation
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
I'm seeing this error a lot in my console log of my iPhone 15 Pro (Apple Intelligence enabled):
com.apple.modelcatalog.catalog sync: connection error during call: Error Domain=NSCocoaErrorDomain Code=4099 "The connection to service named com.apple.modelcatalog.catalog was invalidated: failed at lookup with error 159 - Sandbox restriction." UserInfo={NSDebugDescription=The connection to service named com.apple.modelcatalog.catalog was invalidated: failed at lookup with error 159 - Sandbox restriction.} reached max num connection attempts: 1
Are there entitlements / permissions I need to enable in Xcode that I forgot to do?
Code example
Here's how I'm initializing the language model session:
private func setupLanguageModelSession() {
if #available(iOS 26.0, *) {
let instructions = """
my instructions
"""
do {
languageModelSession = try LanguageModelSession(instructions: instructions)
print("Foundation Models language model session initialized")
} catch {
print("Error creating language model session: \(error)")
languageModelSession = nil
}
} else {
print("Device does not support Foundation Models (requires iOS 26.0+)")
languageModelSession = nil
}
}
Hey,
I've been trying to write an AI agent for OpenAI's GPT-5, but using the @Generable Tool types from the FoundationModels framework, which is super awesome btw!
I'm having trouble implementing the tool calling, though. When I receive a tool call from the OpenAI api, I do the following:
Find the tool in my [any Tool] array via the tool name I get from the model
if let tool = tools.first(where: { $0.name == functionCall.name }) {
// ...
}
Parse the arguments of the tool call via GeneratedContent(json:)
let generatedContent = try GeneratedContent(json: functionCall.arguments)
Pass the tool and arguments to a function that calls tool.call(arguments: arguments) and returns the tool's output type
private func execute<T: Tool>(_ tool: T, with generatedContent: GeneratedContent) async throws -> T.Output {
let arguments = try T.Arguments.init(generatedContent)
return try await tool.call(arguments: arguments)
}
Up to this point, everything is working as expected. However, the tool's output type is any PromptRepresentable and I have no idea how to turn that into something that I can encode and send back to the model. I assumed there might be a way to turn it into a GeneratedContent but there is no fitting initializer.
Am I missing something or is this not supported? Without a way to return the output to an external provider, it wouldn't really be possible to use FoundationModels Tool type I think. That would be unfortunate because it's implemented so elegantly.
Thanks!
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hi,
I have trained a basic adapter using the adapter training toolkit. I am trying a very basic example of loading it and running inference with it, but am getting the following error:
Passing along InferenceError::inferenceFailed::loadFailed::Error Domain=com.apple.TokenGenerationInference.E5Runner Code=0 "Failed to load model: ANE adapted model load failure: createProgramInstanceWithWeights:modelToken:qos:baseModelIdentifier:owningPid:numWeightFiles:error:: Program load new instance failure (0x170006)." UserInfo={NSLocalizedDescription=Failed to load model: ANE adapted model load failure: createProgramInstanceWithWeights:modelToken:qos:baseModelIdentifier:owningPid:numWeightFiles:error:: Program load new instance failure (0x170006).} in response to ExecuteRequest
Any ideas / direction?
For testing I am including the .fmadapter file inside the app bundle. This is where I load it:
@State private var session: LanguageModelSession? // = LanguageModelSession()
func loadAdapter() async throws {
if let assetURL = Bundle.main.url(forResource: "qasc---afm---4-epochs-adapter", withExtension: "fmadapter") {
print("Asset URL: \(assetURL)")
let adapter = try SystemLanguageModel.Adapter(fileURL: assetURL)
let adaptedModel = SystemLanguageModel(adapter: adapter)
session = LanguageModelSession(model: adaptedModel)
print("Loaded adapter and updated session")
} else {
print("Asset not found in the main bundle.")
}
}
This seems to work fine as I get to the log Loaded adapter and updated session. However when the below inference code runs I get the aforementioned error:
func sendMessage(_ msg: String) {
self.loading = true
if let session = session {
Task {
do {
let modelResponse = try await session.respond(to: msg)
DispatchQueue.main.async {
self.response = modelResponse.content
self.loading = false
}
} catch {
print("Error: \(error)")
DispatchQueue.main.async {
self.loading = false
}
}
}
}
}
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hey everyone,
Is it possible to generate XML using the “Generable” macro of the Foundation Model Framework?
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
When I try to run visionOS 26 beta 2 on my device the app crashes on Launch:
dyld[904]: Symbol not found: _$s16FoundationModels10TranscriptV7entriesACSayAC5EntryOG_tcfC
Referenced from: <A71932DD-53EB-39E2-9733-32E9D961D186> /private/var/containers/Bundle/Application/53866099-99B1-4BBD-8C94-CD022646EB5D/VisionPets.app/VisionPets.debug.dylib
Expected in: <F68A7984-6B48-3958-A48D-E9F541868C62> /System/Library/Frameworks/FoundationModels.framework/FoundationModels
Symbol not found: _$s16FoundationModels10TranscriptV7entriesACSayAC5EntryOG_tcfC
Referenced from: <A71932DD-53EB-39E2-9733-32E9D961D186> /private/var/containers/Bundle/Application/53866099-99B1-4BBD-8C94-CD022646EB5D/VisionPets.app/VisionPets.debug.dylib
Expected in: <F68A7984-6B48-3958-A48D-E9F541868C62> /System/Library/Frameworks/FoundationModels.framework/FoundationModels
dyld config: DYLD_LIBRARY_PATH=/usr/lib/system/introspection DYLD_INSERT_LIBRARIES=/usr/lib/libLogRedirect.dylib:/usr/lib/libBacktraceRecording.dylib:/usr/lib/libMainThreadChecker.dylib:/usr/lib/libViewDebuggerSupport.dylib:/System/Library/PrivateFrameworks/GPUToolsCapture.framework/GPUToolsCapture
Symbol not found: _$s16FoundationModels10TranscriptV7entriesACSayAC5EntryOG_tcfC
Referenced from: <A71932DD-53EB-39E2-9733-32E9D961D186> /private/var/containers/Bundle/Application/53866099-99B1-4BBD-8C94-CD022646EB5D/VisionPets.app/VisionPets.debug.dylib
Expected in: <F68A7984-6B48-3958-A48D-E9F541868C62> /System/Library/Frameworks/FoundationModels.framework/FoundationModels
dyld config: DYLD_LIBRARY_PATH=/usr/lib/system/introspection DYLD_INSERT_LIBRARIES=/usr/lib/libLogRedirect.dylib:/usr/lib/libBacktraceRecording.dylib:/usr/lib/libMainThreadChecker.dylib:/usr/lib/libViewDebuggerSupport.dylib:/System/Library/PrivateFrameworks/GPUToolsCapture.framework/GPUToolsCapture
Message from debugger: Terminated due to signal 6
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hello!
I have a swift program that tracks the location of a ball (through the back camera). It seems to be working fine, but the only issue is the run time, particularly my concatenate, normalize, and argmax functions, which are meant to be a 1 to 1 copy of the PyTorch argmax function and the following python lines:
imgs = np.concatenate((img, img_prev, img_preprev), axis=2)
imgs = imgs.astype(np.float32)/255.0
imgs = np.rollaxis(imgs, 2, 0)
inp = np.expand_dims(imgs, axis=0) # used to pass into model
However, I need my program to run in real time and in an ideal world, I want it to run way under real time. Below is a run down of the run times that result from my code:
Starting model inference
Setup took: 0.0 seconds
Resize took: 0.03741896152496338 seconds
Concatenation took: 0.3359949588775635 seconds
Normalization took: 0.9906361103057861 seconds
Model prediction took: 0.3425499200820923 seconds
Argmax took: 28.17007803916931 seconds
Postprocess took: 0.054128050804138184 seconds
Model inference took 29.934185028076172 seconds
Here are the concatenateBuffers, normalizeBuffers, and argmax functions that I use:
func concatenateBuffers(_ buffers: [CVPixelBuffer?]) -> CVPixelBuffer? {
guard buffers.count == 3, let first = buffers[0] else { return nil }
let width = CVPixelBufferGetWidth(first)
let height = CVPixelBufferGetHeight(first)
let targetChannels = 9
var concatenated: CVPixelBuffer?
let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue] as CFDictionary
CVPixelBufferCreate(kCFAllocatorDefault, width, height, kCVPixelFormatType_32BGRA, attrs, &concatenated)
guard let output = concatenated else { return nil }
CVPixelBufferLockBaseAddress(output, [])
defer { CVPixelBufferUnlockBaseAddress(output, []) }
guard let outputData = CVPixelBufferGetBaseAddress(output) else { return nil }
let outputPtr = UnsafeMutablePointer<UInt8>(OpaquePointer(outputData))
// Lock all input buffers at once
buffers.forEach { buffer in
guard let buffer = buffer else { return }
CVPixelBufferLockBaseAddress(buffer, .readOnly)
}
defer {
buffers.forEach { CVPixelBufferUnlockBaseAddress($0!, .readOnly) }
}
// Process each input buffer
for (frameIdx, buffer) in buffers.enumerated() {
guard let buffer = buffer,
let inputData = CVPixelBufferGetBaseAddress(buffer) else { continue }
let inputPtr = UnsafePointer<UInt8>(OpaquePointer(inputData))
let bytesPerRow = CVPixelBufferGetBytesPerRow(buffer)
let totalPixels = width * height
// Process all pixels in one go for this frame
for i in 0..<totalPixels {
let y = i / width
let x = i % width
let inputOffset = y * bytesPerRow + x * 4
let outputOffset = i * targetChannels + frameIdx * 3
// BGR order to match numpy
outputPtr[outputOffset] = inputPtr[inputOffset + 2] // B
outputPtr[outputOffset + 1] = inputPtr[inputOffset + 1] // G
outputPtr[outputOffset + 2] = inputPtr[inputOffset] // R
}
}
return output
}
func normalizeBuffer(_ buffer: CVPixelBuffer?) -> MLMultiArray? {
guard let input = buffer else { return nil }
let width = CVPixelBufferGetWidth(input)
let height = CVPixelBufferGetHeight(input)
let channels = 9
CVPixelBufferLockBaseAddress(input, .readOnly)
defer { CVPixelBufferUnlockBaseAddress(input, .readOnly) }
guard let inputData = CVPixelBufferGetBaseAddress(input) else { return nil }
let shape = [1, NSNumber(value: channels), NSNumber(value: height), NSNumber(value: width)]
guard let output = try? MLMultiArray(shape: shape, dataType: .float32) else { return nil }
let inputPtr = inputData.assumingMemoryBound(to: UInt8.self)
let bytesPerRow = CVPixelBufferGetBytesPerRow(input)
let ptr = UnsafeMutablePointer<Float>(OpaquePointer(output.dataPointer))
let totalSize = width * height
for c in 0..<channels {
for idx in 0..<totalSize {
let h = idx / width
let w = idx % width
let inputIdx = h * bytesPerRow + w * channels + c
ptr[c * totalSize + idx] = Float(inputPtr[inputIdx]) / 255.0
}
}
return output
}
func argmax(_ array: MLMultiArray) -> MLMultiArray? {
let shape = array.shape.map { $0.intValue }
guard shape.count == 3,
shape[0] == 1,
shape[1] == 256,
shape[2] == 230400 else {
return nil
}
guard let output = try? MLMultiArray(shape: [1, NSNumber(value: 230400)], dataType: .int32) else { return nil }
let ptr = UnsafePointer<Float>(OpaquePointer(array.dataPointer))
let outputPtr = UnsafeMutablePointer<Int32>(OpaquePointer(output.dataPointer))
let channelSize = 230400
for pos in 0..<230400 {
var maxValue = -Float.infinity
var maxIndex: Int32 = 0
for channel in 0..<256 {
let value = ptr[channel * channelSize + pos]
if value > maxValue {
maxValue = value
maxIndex = Int32(channel)
}
}
outputPtr[pos] = maxIndex
}
return output
}
Are there any glaring areas of inefficiencies that can be reduced to allow for under real time processing whilst following the same logic as found in the python code exactly? Would using Obj-C speed things up for some reason? Are there any tools I can use so I don't have to write these functions myself?
Additionally, in the classes init, function, I tried to check the compute units being used since I feel 0.34 seconds for a singular model prediction is also far too long, but no print statements are showing for some reason:
init() {
guard let loadedModel = try? BallTrackerModel() else {
fatalError("Could not load model")
}
let config = MLModelConfiguration()
config.computeUnits = .all
guard let configuredModel = try? BallTrackerModel(configuration: config) else {
fatalError("Could not configure model")
}
self.model = configuredModel
print("model loaded with compute units \(config.computeUnits.rawValue)")
}
Thanks!
I’ve been testing silent Siri engagement via typing on iOS 18 and also on iOS 26 beta 1 and beta 2. While normal typing works perfectly in type-to-Siri mode, I’ve noticed that swipe-to-type gestures don’t work within Siri’s input field. Interestingly, you still feel the usual haptic feedback associated with swipe typing, but no text appears in the Siri text box. Swipe-to-type continues to work flawlessly in other apps like Messages and Notes, so this seems to be an issue specific to Siri’s typing input handler in these betas. Hopefully, it will be fixed in the next release because swipe typing is essential to my silent Siri workflow.
Topic:
Machine Learning & AI
SubTopic:
Core ML
It appears that there is a size limit when training the Tabular Classification model in CreatML. When the training data is small, the training process completes smoothly after a specified period. However, as the data volume increases, the following issues occur: initially, the training process indicates that it is in progress, but after approximately 24 hours, it is automatically terminated after an hour. I am certain that this is not a manual termination by myself or others, but rather an automatic termination by the machine. This issue persists despite numerous attempts, and the only message displayed is “Training Canceled.” I would appreciate it if someone could explain the reason behind this behavior and provide a solution. Thank you for your assistance.
Is it possible to expose a custom VirtIO device to a Linux guest running inside a VM — likely using QEMU backed by Hypervisor.framework. The guest would see this device as something like /dev/npu0, and it would use a kernel driver + userspace library to submit inference requests.
On the macOS host, these requests would be executed using CoreML, MPSGraph, or BNNS. The results would be passed back to the guest via IPC.
Does the macOS allow this kind of "fake" NPU / GPU
I'm experimenting with downloading an audio file of spoken content, using the Speech framework to transcribe it, then using FoundationModels to clean up the formatting to add paragraph breaks and such. I have this code to do that cleanup:
private func cleanupText(_ text: String) async throws -> String? {
print("Cleaning up text of length \(text.count)...")
let session = LanguageModelSession(instructions: "The content you read is a transcription of a speech. Separate it into paragraphs by adding newlines. Do not modify the content - only add newlines.")
let response = try await session.respond(to: .init(text), generating: String.self)
return response.content
}
The content length is about 29,000 characters. And I get this error:
InferenceError::inferenceFailed::Failed to run inference: Context length of 4096 was exceeded during singleExtend..
Is 4096 a reference to a max input length? Or is this a bug?
This is running on an M1 iPad Air, with iPadOS 26 Seed 1.
I’m building an app that generates images based on text input from a specific text field. However, I’m encountering a problem:
For short prompts like "a cat and a dog", the entire string is sent to the Image Playground, even when I use the extracted method. For longer inputs, the behavior is inconsistent. Sometimes it extracts keywords correctly, but other times it doesn’t extract anything at all.
Since my app relies on generating images based on the extracted keywords, this inconsistency negatively impacts the user experience in my app. How can I make sure that keywords are always extracted from the input string?
Button("Generate", systemImage: "apple.intelligence") {
isPresented = true
}
.imagePlaygroundSheet(isPresented: $isPresented, concepts: [ImagePlaygroundConcept.extracted(from: text, title: textTitle)]) { url in
imageURL = url
}
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence