\

Advanced Compilers: The Self-Guided Online Course

194 points - today at 11:04 AM

Source
  • titzer

    today at 2:16 PM

    The section on dynamic compilers is more or less all about trace compilation. Generally, trace compilation is a dead end and has been abandoned repeatedly. The more important concepts here are type feedback and speculation and deoptimization, as well as making fast compilers and tiering.

    The course overall looks good, and it's great that so much is available online, so well done, Adrian.

      • samps

        today at 2:29 PM

        Thanks, Ben. I admit I mostly think tracing is just a mind-expanding concept to learn about, even if history has proven it’s not very practical as an organizing principle. But as you say, I’d love to offer more context on β€œwhat actually seems to work” industrially.

          • titzer

            today at 3:05 PM

            Yeah, it is conceptually interesting. I like giving students perspective and in 770 (https://www.cs.cmu.edu/~wasm/cs17-770/fall2025/) I might spend half or less of a lecture on tracing and I don't pull punches on how I think it ends up not really working well in real systems. It's a good opportunity to talk about program behavior and the cost/benefit of speculation.

            We spend a lot more time on type feedback, ICs, and deoptimization which are the more universal concepts that can be applied to multiple different compiler designs.

        • jlebar

          today at 3:49 PM

          > Generally, trace compilation is a dead end and has been abandoned repeatedly.

          JAX is a tracing compiler!

          (I know, I know, it sits in an extremely different part of the problem space than TraceMonkey or LuaJIT. Still.)

            • titzer

              today at 5:50 PM

              Interesting. I think numerical computing is a narrow enough domain where programs have very well-behaved control flow, which avoids most of the problems of trace compilation. Loops over branchy code, which are really common in general programs, are very difficult to make work well with tracing.

              Numerical programs being very stable in terms of control is what enables GPU parallelization and loop optimizations in the long tradition of Fortran compilers. Optimizations like loop tiling, interchange, strip mining, etc aren't going to be easy to do with trace compilation.

              Anyway my comment was more directed toward trace compilation in the context of dynamic languages, and there I think it's pretty well established it only works well for small programs.

                • achierius

                  today at 6:02 PM

                  ML compilers in particular go beyond even the level of stability you would expect from numerical programs. Due to how the SIMT model of thread/warp divergence works, the hardware heavily punishes unstable branches. E.g. if you have 32 threads taking a branch then recoalescing on a barrier -- if they all go the same direction then they can go down the execution pipe as a single bundle, but if 1 takes it while 31 don't, then that's 2x the ex-pipe usage by default (and if you have e.g. a computed-branch, performance goes out the window). Consequently, the whole stack is built around the expectation of stable control flow, even to the detriment of performance (from a local perspective).

                  ML frameworks even take advantage of this to compute, ahead-of-time, how much memory will be used at different points in the program graph, and thereafter schedule memcpy's to make space as necessary. Of course this only works for well-behaved program classes, but e.g. most LLM architectures fit into that category. Interestingly MoE models don't, since they require data-dependent control flow, thus the recent push towards accommodating dynamism in frameworks (like JAX, which until ~recently couldn't handle it at all).

              • zipy124

                today at 4:48 PM

                and PyPy right?

            • jcranmer

              today at 3:13 PM

              The TraceMonkey paper was on my qual reading list, and my quals happened to be around the time TraceMonkey was ripped out of SpiderMonkey. I was talking with one of the developers at the time (I think it was Jason Orendorff?), who said that tracing just doesn't work out, but there was limited circumstances where he thought it might work out... but I've completely forgotten what those circumstances were.

              • abecedarius

                today at 3:44 PM

                Has LuaJIT been superseded?

                • yu3zhou4

                  today at 5:46 PM

                  PyTorch?

                  • giancarlostoro

                    today at 2:25 PM

                    Got any other recommended resources on building compilers?

                • j2kun

                  today at 2:32 PM

                  I'm a bit confused about what makes this course "advanced." Most of the topics (dead code elimination, data flow, dominator analysis, SSA form) seem like they belong in a first course on compilers.

                    • jcranmer

                      today at 3:01 PM

                      Well, course numbers are regular enough that you can look up what the "intro compilers" course is: https://www.cs.cornell.edu/courses/cs4120/2026sp/?schedule

                      The short answer is that compilers is basically broken up into two courses, with the first course largely being the minimum necessary to build a compiler (lexing, parsing, codegen, register allocation), and the second course being how to build an optimizing compiler.

                        • ebiederm

                          today at 6:07 PM

                          The academic literature on register allocation is scary.

                          First is presented a linear time optimal algorithm for graph coloring then it is claimed better can be done by a O(N^2) algorithm that uses a heuristic.

                          I do believe the dragon book got caught with the emperor's new register allocator and the literature hasn't really recovered yet.

                          • j2kun

                            today at 5:52 PM

                            Looks like there is quite a bit of overlap with regards to the optimizing parts between these two courses. I guess it's switching from the dragon book to academic papers that makes it advanced.

                        • mamcx

                          today at 3:21 PM

                          I have read TONS of material about it*, and none of that is part of the majority of that!

                          In fact, the "backend" be compiler or interpreter is nearly always left as "exercise to reader".

                          You can't imagine how much is left to be discovered, from how make a closure, track environment, do pattern matching, memory representation, etc.

                          EVERYTHING interesting is something you need to look for.

                          P.D: This only one of the years:https://gist.githubusercontent.com/mamcx/e1743571b9a1ea163a7...

                          • ferguess_k

                            today at 2:35 PM

                            I think a lot of the non-professionals start with parsing and do not get exposed to backend. I have read two books about interpreters/compilers and they don't touch the backend very much.

                            Maybe this is introductory for backend?

                              • DonaldPShimoda

                                today at 3:05 PM

                                That's part of it. I think another part is that it seems like the students are asked to read the papers behind a lot of the concepts, and academic literature is not generally very accessible to undergrads. (Not that they can't read it, but without someone guiding you through at least the first few papers, it can be a frustrating experience for many.)

                            • vkazanov

                              today at 4:47 PM

                              What is advanced then? Good coverage of dce, data flow, ssa, intruction selection and reg alloc is actually like 98% of the backend.

                                • j2kun

                                  today at 5:53 PM

                                  Perhaps polyhedral optimization, tiling, scalar evolution, vectorization...

                                  I guess garbage collection is pretty advanced in the syllabus.

                          • awesomeMilou

                            today at 6:04 PM

                            Is there also a self guided course for "basic compilers", before stepping into an advanced level?

                              • runevault

                                today at 6:06 PM

                                I'd say check out Crafting Interpreters [1]. It has 2 parts, the first in Java for doing a treewalk Interpreter in Java before going farther with a version written in C.

                                1. https://craftinginterpreters.com/

                            • GL26

                              today at 3:52 PM

                              Saw a podcast that talked about the rust compiler, which apparently included machine learning algorithms at some points to determine whether or not you had code that could crash your system

                                • afdbcreid

                                  today at 5:43 PM

                                  I've never heard about that and I'm pretty sure it's incorrect (although "machine learning" is a wide term), do you have a source for that?

                              • gaze

                                today at 3:14 PM

                                I'm super curious what alexia massalin is up to these days, besides collecting microunity patent royalties