Inside the M4 Apple Neural Engine, Part 1: Reverse Engineering

233 points - yesterday at 5:11 PM

Source
  • LatencyKills

    today at 3:38 PM

    I worked on the Xcode team for years and know the lengths Apple goes to make this stuff difficult to figure out.

    I just wanted to say that you’ve done an excellent job, and I am looking forward to the 3rd installment.

      • RetpolineDrama

        today at 5:58 PM

        >I worked on the Xcode team for years

        Why did you guys remove the ability to detach the console and move it to another window?

          • estimator7292

            today at 6:25 PM

            [flagged]

    • blobbers

      today at 9:56 PM

      Can someone help me understand when these neural engines kick in in open source software?

      I typically use python ML libraries like lightgbm, sklearn, xgboost etc.

      I also use numpy for large correlation matrices, covariance etc.

      Are these operations accelerated? Is there a simple way to benchmark?

      I see a lot of benchmarks on what look like C functions, but today in my jobs I rely on higher level libraries. I don't know if they perform any better on apple HW, and unless they have a flag like use_ane I'm inclined to think they don't do better.

      Of course chatgpt suggested I benchmark an Intel Mac vs. newer apple silicon. Thanks chatgpt, there's a reason people still hate AI.

        • zozbot234

          today at 10:11 PM

          > when these neural engines kick in in open source software?

          It mostly doesn't because NPUs are bespoke and vendor-specific (which incents neglect by software devs working on open source numerics and ML/AI infrastructure), and the Apple ANE is no exception. Part of this effort is most likely about fixing that for the specific case of the Apple ANE.
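
On the "simple way to benchmark" question, a minimal sketch for the NumPy case: time `np.corrcoef` and print which BLAS/LAPACK the NumPy build links against. The helper name `time_corrcoef` and the matrix sizes are made up for illustration. As far as I know, the BLAS backends NumPy uses on Apple silicon (Accelerate or OpenBLAS) run on the CPU, so this measures CPU-side linear algebra and says nothing about the ANE, consistent with the point above.

```python
import time
import numpy as np


def time_corrcoef(n_rows=2000, n_cols=500, repeats=3, seed=0):
    """Return (best_seconds, correlation_matrix) for np.corrcoef on random data."""
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((n_rows, n_cols))
    best = float("inf")
    corr = None
    for _ in range(repeats):
        start = time.perf_counter()
        # Columns are the variables, so the result is (n_cols, n_cols).
        corr = np.corrcoef(data, rowvar=False)
        best = min(best, time.perf_counter() - start)
    return best, corr


if __name__ == "__main__":
    np.show_config()  # prints the BLAS/LAPACK libraries this NumPy build uses
    seconds, corr = time_corrcoef()
    print(f"corrcoef on 2000x500: best of 3 = {seconds:.4f} s, shape {corr.shape}")
```

Running the same script on two machines (or two NumPy builds) gives a rough like-for-like comparison; it is a wall-clock sketch, not a rigorous benchmark.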

            • blobbers

              today at 10:52 PM

              Part of which effort? The Reverse engineering is so it can be used blog article?

              I just think: great it seems like I'm paying for a hardware accelerator that makes Siri go faster. And I use siri on my laptop exactly 0 times in the last infinite years.

      • Octoth0rpe

        today at 3:13 PM

        Part 2 has benchmarks: https://maderix.substack.com/p/inside-the-m4-apple-neural-en...

        6.6 TFLOPS/W, plus the ability to completely turn off when not in use, so 0W at idle.

          • AceJohnny2

            today at 10:53 PM

            But not 38 TOPS that Apple claims, with the weak explanation of

            > Apple’s “38 TOPS INT8” is computed as 19 TFLOPS FP16 × 2, following the industry convention of counting INT8 operations as 2× the FP16 rate. But the hardware doesn’t actually execute INT8 operations twice as fast.

            Why would Apple follow that convention when the hardware explicitly doesn't? It seems like a more straight-faced lie than I expect from Apple.
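
The arithmetic behind the quoted claim, as a tiny sketch. The two figures (19 TFLOPS FP16, 38 TOPS INT8) and the 2× convention come from the quote above; the function name `marketed_int8_tops` is hypothetical.

```python
# Figures quoted from the article: 19 TFLOPS FP16, marketed as 38 TOPS INT8.
FP16_TFLOPS = 19.0
INT8_CONVENTION_FACTOR = 2  # convention: count INT8 ops at 2x the FP16 rate


def marketed_int8_tops(fp16_tflops, factor=INT8_CONVENTION_FACTOR):
    """Headline INT8 TOPS derived from an FP16 rate by the 2x convention."""
    return fp16_tflops * factor


print(marketed_int8_tops(FP16_TFLOPS))  # 38.0
```

The 38 is therefore a derived number: if the hardware does not run INT8 faster than FP16, as the article reports, the measured INT8 rate would stay near the FP16 figure.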

        • notepad0x90

          today at 7:47 PM

          I've been guilty of this myself, but every other comment here is like "What about <insert something unrelated to the topic but related to apple>".

          • eleventyseven

            today at 3:31 PM

            > Throughout this series, “we” refers to maderix (human) and Claude Opus 4.6 (by Anthropic) working as a pair. The reverse engineering, benchmarking, and training code were developed collaboratively

            Sure, "collaboratively." Why would I ever trust a vibe coded analysis? How do I, a non expert in this niche, know that Opus isn't pulling a fast one on both of us? LLMs write convincing bullshit that even fools experts. Have you manually verified each fact in this piece? I doubt it. Thanks for the disclaimer, it saved me from having to read it.

              • Anonbrit

                today at 4:19 PM

                Humans also write endless amounts of convincing bullshit, and have done since time immemorial. False papers and faked results have been a growing scourge in academia before LLMs were a thing, and that's just counting the intentional fraud - the reproducibility crisis in science, especially medical and psychological science, affects even the best designed and well intentioned of studies.

                Humans also make mistakes and assumptions while reverse engineering, so it will always need more engineers to go through the results, test things

                • withinboredom

                  today at 3:59 PM

                  Claude likes to hide bad benchmarks from you, so it will show you where you are clearly winning. You even see some weird benchmarks in the article.

              • zozbot234

                today at 7:46 PM

                Much of this information we already knew the very basics of from documentation of the M1/M2 ANE as accessed via bare-metal from Asahi Linux, but it's nice to see confirmation and it being explored in further depth. Note that according to OP Parts 1/2 for very large matmuls CoreML adds little to no overhead compared to the lower-level interface, so there seems to be plenty of scope for supporting ANE for prefill in local AI frameworks. Decode is generally memory-bandwidth limited unless context is very large, and the ANE requires special handling (converting from matmul to 1x1 convolution as described here is wasteful of memory bandwidth, as is potentially dequantizing to INT8/FP16 in memory) so it's less of a clear win.
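
For anyone unfamiliar with the matmul-to-1x1-convolution trick mentioned above, a minimal NumPy sketch of the numerical equivalence. This is only an illustration in plain NumPy, not the article's code or any ANE API; the function name `matmul_as_1x1_conv` and the (batch, channels, height, width) layout are assumptions chosen to mirror a channels-first convolution.

```python
import numpy as np


def matmul_as_1x1_conv(x, w):
    """Compute x @ w by expressing it as a 1x1 convolution, channels-first.

    x: (tokens, c_in), w: (c_in, c_out) -> result: (tokens, c_out)
    """
    tokens, c_in = x.shape
    c_out = w.shape[1]
    # Channels-first "image": (batch=1, channels=c_in, height=1, width=tokens).
    image = x.T.reshape(1, c_in, 1, tokens)
    # 1x1 kernel: (c_out, c_in, 1, 1).
    kernel = w.T.reshape(c_out, c_in, 1, 1)
    # A 1x1 convolution is a per-position dot product over input channels.
    out = np.einsum("bchw,oc->bohw", image, kernel[:, :, 0, 0])
    # Back to the usual (tokens, c_out) layout.
    return out.reshape(c_out, tokens).T


rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3))
w = rng.standard_normal((3, 4))
print(np.allclose(matmul_as_1x1_conv(x, w), x @ w))
```

The transposes and reshapes on the way in and out are the kind of extra data movement the comment above calls wasteful of memory bandwidth.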

                • behnamoh

                  today at 4:26 PM

                  It's insane that the source code of ANE is not available even to the MLX team, possibly one of the reasons Awni (MLX project head) left Apple.

                    • mathisfun123

                      today at 4:29 PM

                      [flagged]

                        • behnamoh

                          today at 4:30 PM

                          Yes I haven't worked at a hardware company, nothing to be ashamed of!

                            • timcobb

                              today at 5:27 PM

                              I'm not op but I don't think op meant to shame, I understand the construction "tell me you're... without telling me" as a way to highlight that something is unexpected to people who haven't done something, that is that something is particularly unintuitive without some special experience.

                                • webdevver

                                  today at 5:41 PM

                                  he did a reddit (cringe) and now must be punished for it (the text becomes an absolutely fucking unreadable shade of light grey)

                              • webdevver

                                today at 5:40 PM

                                actually, it really is not necessarily a 'hardware company' thing. ive been in 'hardware companies' where the rtl was just as available for viewing as the rest of the firmware/software.

                                in big hardware companies, things start getting siloed, but that probably has more to do with big companies (seemingly invariably) operating as a union of fiefdoms (dunbar-number-ification?)

                                • mathisfun123

                                  today at 6:50 PM

                                  > It's insane that the source code of ANE is not available even to the MLX team

                                  no it's not insane - it's completely mundane policy. that's my point - that you're calling something out as insane with exactly zero experience (which is the actually insane thing...).

                                    • 9dev

                                      today at 9:05 PM

                                      on that line of argument, nobody would have ever called out the emperor for not wearing any clothes, civilians would not go to peace protests, and nobody would ever improve things by looking at something from another angle.

                                        • mathisfun123

                                          today at 9:09 PM

                                          This is a completely asinine take - you're not observing the emperor with no clothes here - you're completely outside the kingdom hypothesizing that the emperor has no clothes. To wit: you don't actually know that the ANE "source" isn't available to MLX. Hint: it probably is but there's just red tape involved.

                      • GeekyBear

                        today at 4:22 PM

                        The recent news is that Apple is supposedly replacing the Core ML framework with an updated version that will make it easier to integrate third party LLMs into your apps.

                        > the company is also planning a few other software-based AI upgrades, including a new framework called Core AI. The idea is to replace the long-existing Core ML with something a bit more modern.

                        https://www.bloomberg.com/news/newsletters/2026-03-01/apple-...

                        • love2read

                          today at 2:41 PM

                          This article was clearly written by a human (and AI) but still has a few "LLMisms" such as:

                          - The key insight - [CoreML] doesn't XXX. It YYY.

                          With that being said, this is a highly informative article that I enjoyed thoroughly! :)

                          The article links to their own Github repo: https://github.com/maderix/ANE

                            • walthamstow

                              today at 3:06 PM

                              We've got about a year before so many people are interacting with LLMs on a daily basis that its style starts to reverse infect human speech and writing

                                • gogopromptless

                                  today at 10:24 PM

                                  It's already happened to me. I've started to have dreams where instead of some sort of interpersonal struggle the entire dream is just a chatbot UI viewport and I'm arguing with an LLM streaming the responses in. Which is super trippy when I become aware it's a dream. In the old days I'd dream about playing chess against myself and lose, which was quite a bizarre feeling because my brain was running both players. But that's totally normal compared to having my brain pretend to be an LLM inside a dream.

                                  • baxtr

                                    today at 5:24 PM

                                    Great insight – Would you like to try and identify some specific "AI-isms" that you've noticed creeping into your own writing or your colleagues' emails lately?

                                    • pixl97

                                      today at 3:26 PM

                                      This said, there were people that talked like this before LLMs, it didn't develop this whole cloth.

                                        • pcrh

                                          today at 6:21 PM

                                          The article above doesn't read well, at all.

                                          It's not my subject, but it reads as a list of things. There's little exposition.

                                            • dylan604

                                              today at 10:38 PM

                                              Gawd Damn LISTICLES!!!! And all of those articles that list in bullet points at the top of the article the summary of the article. And all of those people saying they don't want to read exposition, just give me the bullet points.

                                          • DrScientist

                                            today at 4:27 PM

                                            Exactly. LLMs are mimics.

                                            People seem to be going around pointing out that people talk like parrots, when in reality it's parrots talk like people.

                                              • pixl97

                                                today at 5:02 PM

                                                I mean, it's both.

                                                Did you develop your own whole language at any point to describe the entire world? No, you, me, and society mimic what is around us.

                                                Humans have the advantage, at least at this point, of being a continuous learning device so we adapt and change with the language use around us.

                                        • Angostura

                                          today at 3:12 PM

                                          My honest take? You're probably right

                                            • sholladay

                                              today at 4:19 PM

                                              You are absolutely right.

                                              Here is why you are correct:

                                              - I see what you did there.

                                              - You are always right.

                                      • rafram

                                        today at 4:38 PM

                                        Also the Prior Art section, which has telltale repetition of useless verbs like "documenting," "providing insight into," and "confirming" on each line. This was definitely AI-written, at least in part.

                                          • tzs

                                            today at 7:39 PM

                                            Below are the items from that section. How should they be written to not look like an AI?

                                            > hollance/neural-engine — Matthijs Hollemans’ comprehensive community documentation of ANE behavior, performance characteristics, and supported operations. The single best existing resource on ANE.

                                            > mdaiter/ane — Early reverse engineering with working Python and Objective-C samples, documenting the ANECompiler framework and IOKit dispatch.

                                            > eiln/ane — A reverse-engineered Linux driver for ANE (Asahi Linux project), providing insight into the kernel-level interface.

                                            > apple/ml-ane-transformers — Apple’s own reference implementation of transformers optimized for ANE, confirming design patterns like channel-first layout and 1×1 conv preference.

                                    • mattlangston

                                      today at 2:49 PM

                                      The future is bright for software engineers.

                                      The big takeaway isn't reverse engineering the ANE per se, but what Manjeet could do with his software engineering skills when accelerated by AI.

                                      This is a good example of the present state of software engineering. Not future state - present state.

                                      • grey-area

                                        today at 6:52 PM

                                        If only they could fix the iOS autocomplete, which is getting worse with every iteration.

                                        • giancarlostoro

                                          today at 5:06 PM

                                          Reverse Engineering with AI is only going to get better. I have seen some crazy things friends of mine have done with Claude alone. Let's just say SaaS isn't the only industry that could one day suffer.

                                          • mayhemducks

                                            today at 5:06 PM

                                            I never realized just how much hardware engineering Apple dedicated to enabling people to type faster with their thumbs!

                                            • kamranjon

                                              today at 3:28 PM

                                              I have always wondered if the neural engine could be used for training - pretty excited for part 3 of this to see if the juice is actually worth the squeeze

                                                • juancn

                                                  today at 5:20 PM

                                                  In principle most if not all inference hardware should be usable for training.

                                                  Efficiency is the question.

                                              • daoistmonk

                                                today at 4:13 PM

                                                Tangential: Is anyone doing something similar to accelerate the support matrix of Linux on anything higher than M2?

                                                • msie

                                                  today at 5:14 PM

                                                  I remember the good old days when Apple was desperate for developers and produced great documentation and there were a lot of great 3rd-party books too. You can't just give out awards in hopes that someone will make that great app.

                                                    • pstuart

                                                      today at 5:43 PM

                                                      Yeah, the Inside Macintosh guides were epic.

                                                  • ericol

                                                    today at 6:45 PM

                                                    > human intuition driving the exploration

                                                    This, a thousand times this.

                                                    For me, what AI brings is augmented humans. Just as we don't calculate on paper anymore, what is the reason for doing things by hand when a machine is X times better?

                                                    Want to code by hand, as artisans of old? Suit yourself.

                                                    I, for one, love the smell of burning chrome.

                                                      • pklausler

                                                        today at 6:49 PM

                                                        If "AI" were doing anything more than repeating content from the web without attribution, I might agree with you.

                                                    • FL33TW00D

                                                      today at 4:50 PM

                                                      Unreadable Claude slop

                                                      • techpulse_x

                                                        today at 3:00 PM

                                                        [dead]

                                                        • poszlem

                                                          today at 1:51 PM

                                                          Genuine question, not trying to throw shade or anything, but are those cores actually useful with the state of apple intelligence being what it is?

                                                            • rahkiin

                                                              today at 1:57 PM

                                                              They are also used by ML models that are deeply integrated in macos and ios without you knowing. Like object and text detection in images.

                                                                • geerlingguy

                                                                  today at 3:15 PM

                                                                  And help in Photos, Final Cut Pro, and other apps.

                                                                  • willis936

                                                                    today at 3:52 PM

                                                                    I wish they would (or wouldn't if they are) hook it up to the ios keyboard.

                                                                • dagmx

                                                                  today at 3:27 PM

                                                                  If you strip away the branding, Apple has and continues to ship a ton of algorithms that likely use the ANE and end users can use CoreML to do the same.

                                                                  Just some things that people will likely take for granted that IIRC Apple have said use the ANE or at least would likely benefit from it: object recognition, subject extraction from images and video, content analysis, ARKit, spam detection, audio transcription.

                                                                    • sroussey

                                                                      today at 4:23 PM

                                                                      Don’t forget FaceID and many of the image manipulation features.

                                                                      And while everyone else went to more powerful giant LLMs, Apple moved most of Siri from the cloud to your device. Though they do use both (which you can see when Siri corrects itself during transcription—you get the local Siri version corrected later by the cloud version).

                                                                  • stetrain

                                                                    today at 2:42 PM

                                                                    Apple's OSes run a lot of local ML models for many tasks that aren't branded as Apple Intelligence, and they have done so for many years now.

                                                                    • llm_nerd

                                                                      today at 2:34 PM

                                                                      https://dennisforbes.ca/blog/microblog/2026/02/apple-neural-...

                                                                        • malshe

                                                                          today at 5:42 PM

                                                                          This is a nice article. Thanks for sharing.

                                                                      • esafak

                                                                        today at 2:24 PM

                                                                        You can convert your own ML models to MLX to use them; Apple Intelligence is not the only application.

                                                                          • nullstyle

                                                                            today at 2:39 PM

                                                                            MLX does not run on NPUs AFAIK; just gpu and cpu. You have to use CoreML to officially run code on the neural engine.

                                                                              • mirsadm

                                                                                today at 2:41 PM

                                                                                Even then there is no transparency on how it decides what runs on the ANE/GPU etc

                                                                                  • sroussey

                                                                                    today at 4:24 PM

                                                                                    Correct. OS level stuff gets first priority, so you can’t count on using it.

                                                                                      • znagengast

                                                                                        today at 6:32 PM

                                                                                        Turns out third party actually gets priority for ANE