Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

Local Agentic AI on Mac using MLX: issues solved with Gemma-4
I was keen to try local models with Xcode agents after watching the WWDC 2026 session "Run Local agentic AI on the Mac using MLX": https://developer.apple.com/videos/play/wwdc2026/232/
I ran into a few issues while following the three setup steps shown in the session, so I put together a small project with the workarounds I used: https://github.com/jdhark-com/opencode_mlx_bridge/
I needed to use a different model from the one demonstrated. The main issue I hit was that running:
mlx_lm.server --model mlx-community/gemma-4-e4b-it-4bit
failed with:
ValueError: Received 126 parameters not in model
The workaround in start_xcode_server.py is to load the model with strict=False, which resolved the issue for me. The prompt config in opencode.json also really helped to get more verbose feedback from the model. Hopefully this helps anyone else trying to get a local MLX model working as an Xcode agent.
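For readers wondering what strict=False changes, here is a minimal sketch in plain Python with no MLX dependency. The function load_weights and the parameter names are made up for illustration and are not the mlx_lm API; the sketch only assumes the usual meaning of the flag, namely that a strict loader rejects checkpoint entries the model does not declare, while a non-strict loader skips them.

```python
def load_weights(model_params, checkpoint, strict=True):
    """Copy checkpoint values into model_params (toy stand-in, not mlx_lm)."""
    # Checkpoint entries that the model does not declare.
    extra = sorted(set(checkpoint) - set(model_params))
    if strict and extra:
        raise ValueError(f"Received {len(extra)} parameters not in model")
    loaded = dict(model_params)
    for name, value in checkpoint.items():
        if name in loaded:
            loaded[name] = value
    return loaded, extra


# Hypothetical parameter names, chosen only for the example.
model_params = {"layers.0.weight": None, "layers.0.bias": None}
checkpoint = {
    "layers.0.weight": [1.0, 2.0],
    "layers.0.bias": [0.5],
    "vision_tower.weight": [9.0],  # not declared by the model
}

# Non-strict loading succeeds and reports what was skipped.
loaded, skipped = load_weights(model_params, checkpoint, strict=False)
```

Non-strict loading drops the extra parameters without complaint, so it is probably worth checking which ones were skipped before trusting the model's output.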
Replies: 0, Boosts: 1, Views: 79, Activity: 5d
Unions and @Schemas
Hello, I'm still working on my @addToAlbum schema implementation, and I'm exploring how multiple entities could be "destinations" for the intent. I considered using a @UnionValue for this, but I'm running into compiler difficulties trying to get a @UnionValue to conform to @AppEntity(schema: .photos.album). Am I out of luck on a unionized "target" for the add-to-album intent?
Replies: 0, Boosts: 0, Views: 31, Activity: 5d
Visual Intelligence for VisionOS in 3rd Party Apps
During the keynote, we saw an amazing example of Siri using Visual Intelligence to identify items in the user's physical space and make inferences based on their size. Do 3rd party apps have the ability to perform the same, or similar, actions?

For example: The user loads a photo of an item or product and clicks a button that says 'Find Item In My Space'. Apple Intelligence is then used to analyze the user's surroundings and notify the user whether the item is present, along with some positional or physical context. The response is shown in the user interface as text: "This item is in your room, 1 meter to your right."

Goal: For privacy reasons, developers currently cannot access the passthrough camera on Apple Vision Pro to run AI/ML vision processing models on it. If Apple Intelligence can look through the camera for the developer, in a privacy-preserving, isolated black box, without providing the image texture to the developer in any way, the user can make use of Visual Intelligence features based on their physical surroundings without sacrificing their privacy.

Purpose: Visual Intelligence is a key feature that exemplifies the benefits of Spatial Computing, and examples like the one shown in the Keynote are a perfect use-case for the medium. Since Siri now has this capability, users will come to expect that all apps across visionOS will be able to perform the same kinds of actions. Developers don't generally want or need direct access to the images of a user's surroundings, and having a local/private method of processing these requests is ideal both for developers concerned with data privacy management and for users concerned with developers having too much access to their surroundings. Wearable devices with cameras are a foundational accelerator for users adopting AI in useful ways in their daily lives. It is the most natural way to communicate with AI about what is relevant to you at any given time, it removes the friction and difficulty of manually scanning good data for AI inferencing, and it brings purpose to wearing this class of device every day. As these devices become more common and capable, data privacy becomes even more important. Users will need reassurance that the devices they choose to wear will only have access to observe their surroundings when they choose to allow it, while retaining the capability to use the powerful features that make them worthwhile.

Accessibility: Visual Intelligence is an extremely powerful accessibility tool (for example, for individuals who have low vision) and can meaningfully improve quality of life. Developers can design various applications beyond Siri AI with very specific inferencing capabilities powered by AI. The future of Visually Intelligent apps should have intentional, unique purposes that users can choose to incorporate in their lives. This will not be a one-size-fits-all Visual Intelligence approach, and it will require specific design, training, and development to create meaningful capabilities.

If this is already possible, amazing! Any resources to learn more would be greatly appreciated. If this is not yet possible, please let us know what we can do to encourage Apple to consider it. Thank you.
Replies: 2, Boosts: 0, Views: 89, Activity: 5d
Adapter Problem - compatibleAdapterNotFound
Hello. I have a problem with a FoundationModels adapter delivered as an Apple-hosted managed asset pack via TestFlight. I created an adapter that works fine locally on a real device when I create the model via (fileURL: URL), but I cannot create a model using Background Assets when the adapter is downloaded via TestFlight. Every time I try to get the adapter, its creation is interrupted by the compatibleAdapterNotFound error. I created the .aar archive with this command:
xcrun ba-package foundation-models package --adapter-path aurelius1.fmadapter --asset-pack-id fmadapter-aurelius1-9799725 --output-path ./aurelius1.aar --platforms iOS --on-demand\
After that, I replaced "OnDemand": null with "OnDemand": {} in the manifest so that Transporter could send my archive to App Store Connect. I followed all the recommendations in this topic: https://origin-devforums.apple.com/forums/thread/823148 but unfortunately without success. I would appreciate any help in solving this problem. Here is the code that I use in my app -
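The manifest edit described above could be scripted roughly as follows. This is only a sketch: the post does not show the manifest layout, so the fragment below (including the assetPackID and downloadPolicy keys) is hypothetical, and the helper simply replaces any "OnDemand": null it finds, wherever it sits in the JSON.

```python
import json


def patch_on_demand(node):
    """Recursively replace "OnDemand": null with "OnDemand": {}."""
    if isinstance(node, dict):
        return {
            key: ({} if key == "OnDemand" and value is None else patch_on_demand(value))
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [patch_on_demand(item) for item in node]
    return node


# Hypothetical manifest fragment; the real manifest layout is not shown in the post.
manifest_text = '{"assetPackID": "example-pack", "downloadPolicy": {"OnDemand": null}}'
patched = patch_on_demand(json.loads(manifest_text))
patched_text = json.dumps(patched, indent=2)
```

Whether this manifest change has any bearing on the compatibleAdapterNotFound error is not established in the post; the sketch only automates the edit that was made by hand.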
Replies: 5, Boosts: 0, Views: 206, Activity: 5d
backDeploy SystemLanguageModel.tokenCount
SystemLanguageModel.contextSize is back-deployed, but SystemLanguageModel.tokenCount is not. The custom adapter toolkit ships with a ~2.7 MB tokenizer with a vocabulary of ~150,000 entries, but the LICENSE.rtf permits its use only for training LoRAs. Is it possible to back-deploy tokenCount, or for Apple to permit the use of the tokenizer.model for counting tokens? This is important for avoiding context overflow errors.
Replies: 1, Boosts: 1, Views: 735, Activity: 5d
Autocorrection and predictive text support for additional Cyrillic languages
Hello Apple Keyboard / Internationalization team, I would like to ask about autocorrection and predictive text support for additional Cyrillic-based languages, especially Kazakh, Kyrgyz, Chuvash, and Ingush. These languages use Cyrillic scripts with their own letters, spelling rules, and word-frequency patterns. When users type in these languages, Russian-based autocorrection or missing language-specific correction can produce incorrect suggestions or replacements. My questions are: Are there plans to expand autocorrection and predictive text support for more Cyrillic-based languages? Is there a recommended way for developers or language communities to provide dictionaries, word-frequency lists, corpora, or other linguistic data to help improve autocorrection? Should this type of request be submitted through Feedback Assistant, Developer Forums, or another Apple channel? I have corpus-based frequency data and language resources for multiple Cyrillic-based languages and would be happy to share them if useful. Thank you. Ali Kuzhuget
Replies: 1, Boosts: 3, Views: 60, Activity: 5d
New Siri AI and indexing stuck
The new Siri AI and indexing have been stuck for over 65 hours. I've tried everything: hard restarting my phone, putting it in airplane mode, running the diagnostics, but nothing helps. I also installed the iOS 27 beta on my iPad Air 13-inch (M3) and the same thing happened there; I've been waiting over 24 hours now. Does anybody know how to resolve this? I am tired of waiting and waiting.
Replies: 0, Boosts: 0, Views: 128, Activity: 5d
Approaching Custom VST GUI Automation: Combining local Vision OCR with the new FoundationModels framework for screen-grounding
Hello everyone,

I’m working on a project to automate software controls inside non-standard macOS applications, specifically custom-drawn audio plugins (like the Roland TR-909 VST).

The Challenge: These VST interfaces do not expose their buttons, knobs, or dials via the standard macOS Accessibility tree (NSAccessibility / event taps). Because they are custom-rendered, standard automation tools are blind to them.

My Current Hybrid Approach: I am combining two of Apple's local machine learning technologies to solve this without sending data to the cloud:

Step 1: Text-Based Layout Mapping (Vision Framework)
I capture a screenshot of the targeted window using Quartz Window Services and run a local VNRecognizeTextRequest to extract coordinates for all text labels. This works exceptionally well for text buttons like "OPTION" or "ABOUT".

Step 2: Contextual & Non-Text Element Interpretation (FoundationModels Framework)
For controls that lack text labels (such as blank step sequencer buttons, parameter knobs, or toggle light states), I pass the screenshot as an Attachment into the new local LanguageModelSession. I ask the model to ground coordinates relative to the text landmarks mapped in Step 1.

Here is a simplified snippet of how I am feeding the visual context into the local model:

    import Foundation
    import FoundationModels
    import Cocoa

    func analyzePluginInterface(cgImage: CGImage) async {
        guard SystemLanguageModel.default.isAvailable else {
            print("Local model not downloaded or available.")
            return
        }

        let instructions = """
            You are a screen-aware assistant.
            Your job is to locate GUI controls on a custom 1024x802 VST window.
            """

        let session = LanguageModelSession(instructions: instructions)

        do {
            let response = try await session.respond {
                "Look at this screenshot of the VST window."
                Attachment(cgImage)
                "Locate the blank step-sequencer buttons located below the instrument channel labels."
                "What are the center coordinates (X, Y) for the first active step?"
            }
            print("Model Grounding Output: \(response.content)")
        } catch {
            print("Inference failed: \(error)")
        }
    }

My Questions for the Community:

Performance & Latency: The local LanguageModelSession.respond call takes several seconds to run on device. For real-time DAW automation, this is a bottleneck. Has anyone experimented with using a custom LoRA adapter or a smaller model profile to speed up spatial coordinate inference?

Coordinate Stability: Multimodal models can sometimes hallucinate coordinates (bounding box values). What strategies are you using to constrain the model output to precise pixel boundaries on varying display scaling configurations (Retina vs non-Retina)?

Alternative Solutions: Are there newer on-device vision APIs (perhaps in CoreML or Vision) that are better suited for bounding-box grounding of abstract graphics (like dials/knobs) than a general language model session?

Would love to hear how others are approaching screen-aware GUI interpretation with these new frameworks! Thanks!
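On the display-scaling part of the coordinate question, the arithmetic can be sketched independently of any framework. The sketch below is in Python for brevity and makes two assumptions: that text boxes arrive in Vision's normalized form (0 to 1, origin at the lower-left corner), and that the screenshot was captured at a known backing scale factor (2.0 on a typical Retina display). The function names are hypothetical. Clamping to the window bounds is one simple way to discard coordinates that a model places outside the window.

```python
def bbox_center_in_points(bbox, image_w_px, image_h_px, scale):
    """Center of a normalized, lower-left-origin box, in top-left-origin points.

    bbox is (x, y, width, height) with every value in the range 0..1.
    """
    x, y, w, h = bbox
    cx_px = (x + w / 2) * image_w_px
    cy_px = (1 - (y + h / 2)) * image_h_px  # flip the vertical axis
    # Pixels divided by the backing scale factor gives points.
    return cx_px / scale, cy_px / scale


def clamp_to_window(point, window_w_pt, window_h_pt):
    """Force a point to lie inside the window rectangle."""
    px, py = point
    return (min(max(px, 0.0), window_w_pt), min(max(py, 0.0), window_h_pt))


# A 1024x802-point window captured at 2x yields a 2048x1604-pixel image.
center = bbox_center_in_points((0.25, 0.5, 0.5, 0.25), 2048, 1604, scale=2.0)
safe = clamp_to_window((1100.0, -5.0), 1024.0, 802.0)
```

Keeping every stored landmark in points and converting to pixels only at capture time means the same landmark map can be reused on Retina and non-Retina displays.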
Replies: 0, Boosts: 0, Views: 44, Activity: 5d
Siri to be interoperable with Copilot’s version control systems
Thank the elders for their knowledge and teachings. Is there a consensus regarding Siri’s utilization for the agentic and/or Copilot version control systems? For example, the Copilot within the Edge browser, the standalone app, the Xbox Copilot, and the M365 Copilot app. Does the team have a standardized approach for the 'start' feature that can be prompted whilst utilizing Copilot’s build and generate capabilities? Thank you all and best regards.
Replies: 1, Boosts: 0, Views: 20, Activity: 5d
siri waitlist
How are some people getting access within 10 minutes while I've been waiting for over a week? In fact, it's been 48 hrs since I joined the waitlist. This is honestly ridiculous. I've met all the requirements, enabled every necessary setting, and still have no access. Meanwhile, others are getting approved almost instantly. If the rollout is based on a waitlist, then it should be handled fairly and consistently. Waiting this long for a feature update is extremely frustrating, especially when there has been little to no communication about the delay.
Replies: 0, Boosts: 0, Views: 36, Activity: 5d
Use different model in foundation model
Hi everyone, I’m working with the WWDC26 Foundation Models framework and would like to know how to precisely control which model is used. Specifically: On-Device: How can I force SystemLanguageModel() to use AFM 3 Core Advanced (the 20B sparse multimodal variant) instead of automatically falling back to the 3B Core? Is there an API to query or explicitly specify the on-device model variant? Private Cloud Compute (PCC): When using PrivateCloudComputeLanguageModel(), how can I ensure it uses AFM 3 Cloud Pro instead of the regular Cloud model? Does setting ContextOptions.reasoningLevel = .deep guarantee the Pro model, or is it still determined automatically by the backend? So far I can only check model.capabilities, but there’s no clear way to confirm which exact model variant is actually running. Are there more granular APIs, DynamicProfile modifiers, or Instruments methods to achieve precise control? Any insights, official documentation, or WWDC session references would be greatly appreciated!
Replies: 1, Boosts: 0, Views: 128, Activity: 5d
SpotlightSearchTool arguments: description vs. JSON Schema mismatch → “Failed to parse generated content”
Using SpotlightSearchTool with a custom LanguageModel backend (Apple’s ChatCompletionsLanguageModel from foundation-models-utilities, pointed at an OpenAI-compatible server), every tool call fails with ToolCallError → "Failed to parse generated content." The model follows the tool’s documented "Call format" and emits { root, modelComposition, … }. But the generated parameters schema (FullArguments) requires { "query": { "type": "search", "value": { root, modelComposition, … } } }. Query is a QueryType union and a search must be wrapped in DiscriminatedSearch. Wrapping the args manually makes it parse and search correctly. So the description omits the query + type:"search" envelope the schema demands, which makes the tool uninvokable by any model that follows the documentation (it presumably works only with the on-device model trained on the real format). Is this a known issue / intended? Anyone gotten SpotlightSearchTool working with a non-Apple model? Secondary: CoreSpotlightSource.fetchAttributes seems to have no effect on returned attributes. kMDItemDescription only comes back when the in-query fetchAttributes requests it. Bug or expected?
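The manual wrapping mentioned above can be sketched as a small shim applied to the model's tool-call arguments before they are parsed. It is written in Python on the assumption that it runs on the OpenAI-compatible server side; the function name is hypothetical, the argument values are placeholders, and only the key names (query, type, value, root, modelComposition) come from the description above.

```python
def wrap_search_arguments(flat_args):
    """Wrap flat tool arguments in the query/type envelope the schema expects."""
    if "query" in flat_args:
        return flat_args  # already wrapped, leave untouched
    return {"query": {"type": "search", "value": flat_args}}


# Placeholder values; only the key names come from the post.
flat = {"root": "placeholder", "modelComposition": "placeholder"}
wrapped = wrap_search_arguments(flat)
```

The early return keeps the shim idempotent, so a model that already emits the full envelope is not wrapped twice.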
Replies: 1, Boosts: 0, Views: 52, Activity: 5d
MLX, MLX LM, MLX LM Server -> Is there a bootstrap repo?
There's an MLX, an MLX LM, and an MLX LM Server mentioned. Is there a bootstrap GitHub repo out there that can be used to quickly set up an example of this without the hassle of manual setup, a kind of bootstrap for "us mere mortals"? And how feasible is it to use these on an M3 Pro with 18 GB of memory? Can the workload be split between a local M3 Pro and a Tailscale-linked M2 Pro with 36 GB of memory? Do both need to be on macOS 27 for it to work?
Replies: 1, Boosts: 0, Views: 66, Activity: 5d
S5 - Specific Siri Security Situation in Slovakia
Dears, I have reached out to Apple Research and Apple Security, but this is NOT really for them. This is a developer topic! Apple Research and Security are trying to find malicious code, bugs, etc., but what I am witnessing is different and much deeper in the code. Apple Intelligence in Slovakia is much more limited than in other countries: a specific security configuration due to EU regulations, in combination with Siri NOT being able to speak or understand Slovak. At a low level, this combination, with a small PUSH with good timing, makes the devices completely strip themselves of all security and trust certifications. What follows is a blank reinstall processed completely from scratch, where the attacker only prepares the "CORRECT" files and information and all the work is done by the Apple system itself! The result is complete domination of the hardware using the NPU (ANE) chip, which does the whole job. And I mean each pixel, sound, connection, etc. What is MOST ALARMING is that, due to the proud declaration of customer data privacy, this is the exact spot where, if something like this happens, Apple will NOT be able to see it. The customer is then in an extreme situation where he knows that the devices, accounts, keychain, bank account, each app, each picture or sound, everything is compromised, but online help and the retailers fall short here, and furthermore Apple DOES NOT HAVE AN OFFICE in Slovakia. The only thing left is the contracted service (repair) shops, which are capable of performing a DFU Restore, which does NOT help. I have requested a DFU Restore approximately 15 times in the last 9 months. Once you turn the device on and only pick the language, there is a GLITCH and you know this is back again. A very quick and not too detailed description of the process: it is a very silent and extremely sophisticated takeover without an obvious crash at the beginning, using various tools, which I can describe and present examples of. 
One variation is HTML code, a DOM which is recursive, calling functions and cancelling. Too many functions with offset, which results in a graphics freeze, overload, or similar. The object itself is not frozen and it is carefully prepared! It will mostly copy and clone the target and NEST inside without knowing. What happens here is that this recursive DOM was applied and therefore the SHUTDOWN MONITOR LOG occurred. This also froze the mds index, which blocked the mounting and unmounting of volumes. This is of course carefully instrumented not to raise any attention. The same structure can be used in any code, any language, a PDF; it can be nested in a wallpaper or a standard image, a library, anywhere. I can provide proof and a functional script. The install log shows: Untracked client connected - RemoteManagement, which REINSTALLED the OS. After that, launchd skips almost all tasks on the next run. After this volume-mounting block, the system will not restart as standard; instead it is forced to boot as early as possible, which starts with PKI TRUST and SIRI UNDERSTANDING. The PKI TRUST is manipulated and prepared, and Siri is not called by the system as Apple Intelligence. So with a reinstalled and carefully prepared OS, launchd having skipped most tasks at the start, and without proper encryption, there is a direct open path to Siri and her ASR HAMMERING. I have personally visited almost 10 different electronics shops and checked the console on each MacBook that was free to try. In each of them these four Protocol logs were exactly the same! But after that, a brutal iPhone reinstall, and even a reinstall over Lockdown Mode, will follow. I can also provide logs and information. And there is a SIMPLE LOGIC PARADOX with HUGE impact: any document can be signed by Apple in a second. That is how the PKI TRUST was manipulated without any problem. That is also extremely important. I can present this, but I must know that somebody is listening... 
otherwise the only way is the press. Apple Research and Security are blind here and I simply cannot get any answer. If you know anybody in Slovakia, tell them to go and check this out! Please get this information to somebody who could just check it. This is probably the largest supply chain attack ever, and all it takes is a phone call to iStores in Slovakia so they can check for you. From what I can see, an update is now being prepared for Siri. It is based on Ruby, but mostly Nokogiri and Gumbo. It will be presented as an 8-bit range training for a local LLM, as super fast, but really it will be a combination of a Hohner electric piano from the 70s with 8-bit sound, which will use DTrace and its ROOT privileges. The sound is a square frequency which can be used to hide communication or something we don't know yet. And it does not matter anymore: with a direct connection to GitHub or just the internet, any code can be signed and stored anywhere. The codename is ELECTRA; from what I know, this tag was used for a jailbreak of Siri in the past. So I believe this will be the final act. Is there somebody I can speak to about this? No generic mails. THX Mike
Replies: 0, Boosts: 0, Views: 28, Activity: 5d
Python 3.13 macOS wheel for coreai-core
Will there be a wheel published on pypi.org for Python 3.13 on macOS? There is a 3.13 wheel for Linux, but not macOS.
Replies: 0, Boosts: 0, Views: 26, Activity: 5d
New Siri AI wait list
It's been 3 days since I requested Siri AI, and I'm still on the waitlist. This is disappointing.
Replies: 2, Boosts: 1, Views: 114, Activity: 5d
A question about new iOS 27 Siri and Apple Intelligence
I have a question about the new Siri on iOS 27. I'm developing an app where people can order different audio products, such as speakers and earphones. Can I integrate it with the new Siri so that customers can place an order through Siri and have her complete it?
Replies
0
Boosts
0
Views
51
Activity
5d
Speech generation by the new Foundation Model
During the Keynote (at 30m:20s) Craig Federighi mentions the second, "even more powerful version of our on-device model" and that this model lets supported products understand and generate speech. Is there any public API for generating speech using this model?
Replies
0
Boosts
0
Views
29
Activity
5d
Siri waitlist
How are some people getting access within 10 minutes while I've been waiting for over a week? In fact, it's been 48 hrs since I joined the waitlist. This is honestly ridiculous. I've met all the requirements, enabled every necessary setting, and still have no access. Meanwhile, others are getting approved almost instantly. If the rollout is based on a waitlist, then it should be handled fairly and consistently. Waiting this long for a feature update is extremely frustrating, especially when there has been little to no communication about the delay.
Replies
0
Boosts
0
Views
36
Activity
5d
Use different model in foundation model
Hi everyone,

I’m working with the WWDC26 Foundation Models framework and would like to know how to precisely control which model is used. Specifically:

On-Device: How can I force SystemLanguageModel() to use AFM 3 Core Advanced (the 20B sparse multimodal variant) instead of automatically falling back to the 3B Core? Is there an API to query or explicitly specify the on-device model variant?

Private Cloud Compute (PCC): When using PrivateCloudComputeLanguageModel(), how can I ensure it uses AFM 3 Cloud Pro instead of the regular Cloud model? Does setting ContextOptions.reasoningLevel = .deep guarantee the Pro model, or is it still determined automatically by the backend?

So far I can only check model.capabilities, but there’s no clear way to confirm which exact model variant is actually running. Are there more granular APIs, DynamicProfile modifiers, or Instruments methods to achieve precise control?

Any insights, official documentation, or WWDC session references would be greatly appreciated!
Replies
1
Boosts
0
Views
128
Activity
5d
SpotlightSearchTool arguments: description vs. JSON Schema mismatch → “Failed to parse generated content”
Using SpotlightSearchTool with a custom LanguageModel backend (Apple’s ChatCompletionsLanguageModel from foundation-models-utilities, pointed at an OpenAI-compatible server), every tool call fails with ToolCallError → "Failed to parse generated content." The model follows the tool’s documented "Call format" and emits { root, modelComposition, … }. But the generated parameters schema (FullArguments) requires { "query": { "type": "search", "value": { root, modelComposition, … } } }. Query is a QueryType union and a search must be wrapped in DiscriminatedSearch. Wrapping the args manually makes it parse and search correctly. So the description omits the query + type:"search" envelope the schema demands, which makes the tool uninvokable by any model that follows the documentation (it presumably works only with the on-device model trained on the real format). Is this a known issue / intended? Anyone gotten SpotlightSearchTool working with a non-Apple model? Secondary: CoreSpotlightSource.fetchAttributes seems to have no effect on returned attributes. kMDItemDescription only comes back when the in-query fetchAttributes requests it. Bug or expected?
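The manual wrapping described above can be sketched as a small pre-processing step. This is a minimal illustration that assumes only what the post reports: the model emits a flat object, and the schema expects it nested under query with type "search". The names wrapSearchArguments and WrapError are hypothetical, and where such a step would hook into a custom LanguageModel backend is not shown because that depends on the backend.

```swift
import Foundation

/// Hypothetical error for input that is not a JSON object.
enum WrapError: Error { case notAnObject }

/// Hypothetical helper: wraps the flat arguments a model emitted
/// (e.g. { root, modelComposition, ... }) into the envelope the post
/// reports the schema as requiring:
/// { "query": { "type": "search", "value": { ...flat arguments... } } }
func wrapSearchArguments(_ flatJSON: Data) throws -> Data {
    let object = try JSONSerialization.jsonObject(with: flatJSON)
    guard let flat = object as? [String: Any] else {
        throw WrapError.notAnObject
    }
    // Assumption: a top-level "query" key means the arguments are already
    // wrapped, so they are passed through unchanged.
    if flat["query"] != nil { return flatJSON }

    let wrapped: [String: Any] = [
        "query": ["type": "search", "value": flat]
    ]
    return try JSONSerialization.data(withJSONObject: wrapped, options: [.sortedKeys])
}

// Usage with placeholder values (the real value types are not given in the post).
let flatArgs = Data(#"{"root":"placeholder","modelComposition":"placeholder"}"#.utf8)
let wrappedArgs = try! wrapSearchArguments(flatArgs)
```

Passing already-wrapped input through unchanged keeps the step safe to apply to every tool call, including ones from a model that does emit the full envelope.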
Replies
1
Boosts
0
Views
52
Activity
5d
Spoken Locale Exposure (Dynamic Language Routing)
Does the App Intents framework expose the user's active spoken Siri locale (e.g., ja-JP, fr-FR) directly within the perform() context, or must the extension rely on the system's global locale setting? If a user switches Siri's language dynamically, how is that locale string propagated to the intent execution block?
Replies
1
Boosts
0
Views
51
Activity
5d
Siri AI
I don’t think this Siri waitlist is normal. I am on an iPhone 16 Pro Max in the US, set to English, and I’ve been on the new Siri waitlist for more than 48 hours. Is this a bug?
Replies
0
Boosts
0
Views
63
Activity
5d
MLX, MLX LM, MLX LM Server -> Is there a bootstrap repo?
There's an MLX, an MLX LM, and an MLX LM Server mentioned. Is there a bootstrap GitHub repo out there that can be used to quickly set up a working example without the hassle of manual setup, a kind of bootstrap for "us mere mortals"? And how feasible is it to use these on an M3 Pro with 18 GB of memory? Can these be bounced between a local M3 Pro and a Tailscale-linked M2 Pro with 36 GB of memory? Do both need to be on macOS 27 for it to work?
Replies
1
Boosts
0
Views
66
Activity
5d
S5 - Specific Siri Security Situation in Slovakia
Dears,

I have reached out to Apple Research and Apple Security, but this is NOT really for them. This is a developer topic! Apple Research and Security are trying to find malicious code, bugs, etc., but what I am witnessing is different and goes much deeper into the code.

Apple Intelligence in Slovakia is much more limited than in other countries: a specific security configuration due to EU regulations, in combination with Siri NOT being able to speak or understand Slovak. At a low level, this combination, with a small PUSH with good timing, makes the devices completely strip themselves of all security and trust certifications. What follows is a blank reinstall processed completely from scratch, where the attacker only prepares the "CORRECT" files and information and all the work is done by the Apple system itself! The result is complete domination of the hardware using the NPU (ANE) chip, which does all the work. And I mean each pixel, sound, connection, etc.

What is MOST ALARMING is that, due to the proud declaration of customer data privacy, this is the exact spot where, if something like this happens, Apple will NOT be able to see it. The customer is then in an extreme situation, where he knows that the devices, accounts, keychain, bank account, each app, each picture or sound... everything is compromised, but online help and the retailers fall short here, and furthermore Apple DOESN'T HAVE AN OFFICE in Slovakia. The only thing left is the contracted service (repair) shops, which are capable of performing a DFU Restore, which does NOT help. I have requested a DFU Restore approx. 15 times in the last 9 months. Once you turn the device on and only pick the language, there is a GLITCH and you know this is back again.

A very quick and not too detailed description of the process: it is a very silent and extremely sophisticated takeover without an obvious crash at the beginning, using various tools, which I can describe and present examples of.

One variation is HTML code, a DOM which is recursive, calling functions and cancelling them. Too many functions with an offset, which results in a graphics freeze, overload, or similar. The object itself is not frozen and it is carefully prepared! It will mostly copy and clone the target and NEST inside without it knowing.

What happens here is that this recursive DOM was applied and therefore the SHUTDOWN MONITOR LOG occurred. This also froze the mds index, which blocked the mounting and unmounting of volumes. This is of course carefully instrumented so as not to raise any attention. The same structure can be used in any code, any language, a PDF; it can be nested in a wallpaper or a standard image, a library, anywhere... I can provide proof and a functional script.

The install log is showing - Untracked client connected - RemoteManagement, which REINSTALLED the OS. After that, launchd skips almost all tasks on the next run. After this volume-mounting block, the system will not restart as standard; instead it is forced to boot as early as possible, which starts with PKI TRUST and SIRI UNDERSTANDING. The PKI TRUST is manipulated and prepared, and Siri is not called by the system as Apple Intelligence. So, with a reinstalled and carefully prepared OS, launchd having skipped most tasks at the start, and without proper encryption, there is a direct open path to Siri and her ASR HAMMERING.

I have personally visited almost 10 different electronics shops and checked the console on each MacBook that was free to try. In each of them these four Protocol logs were exactly the same! But after that, a brutal iPhone reinstall, and even a reinstall over Lockdown Mode, will follow. I can also provide logs and information.

And there is a SIMPLE LOGIC PARADOX with HUGE impact: any document can be signed by Apple in a second. That is how the PKI TRUST was manipulated without any problem. That is also extremely important. I can present this, but I must know that somebody is listening; otherwise the only way is the press. Apple Research and Security are blind here and I simply cannot get any answer. If you know anybody in Slovakia, tell them to go and check this out! Please get this information to somebody who could just check it. This is probably the largest supply chain attack ever, and all it takes is a phone call to the iStores in Slovakia so they can check for you.

From what I can see, an update is now being prepared for Siri. It is based on Ruby, but mostly Nokogiri and Gumbo. It will be presented as 8-bit range training for a local LLM, as super fast, but really it will be a combination of a Hohner electric piano from the 70s with 8-bit sound which will use DTrace and its ROOT privileges. The sound is a square frequency which can be used to hide communication or something we don't know yet. And it does not matter anymore. With a direct connection to GitHub or just the internet, any code can be signed and stored anywhere. The codename is ELECTRA; from what I know, this tag was used for a jailbreak of Siri in the past. So I believe this will be the final act.

Is there somebody I can speak to about this? No generic mails.

THX
Mike
Replies
0
Boosts
0
Views
28
Activity
5d