CUDA Books

191 points - yesterday at 12:52 PM

  • somethingsome

    yesterday at 7:39 PM

    Having read or at least skimmed most of those books, I think the best intro is 'CUDA Programming: A Developer's Guide to Parallel Computing with GPUs'

    Programming Massively Parallel Processors: A Hands-on Approach is not really good in my opinion: many small mistakes and confusing sentences (even when you know CUDA).

    CUDA by Example: An Introduction to General-Purpose GPU Programming is too simple and abstracts away too much of the architecture.

    Next year I'm planning to start writing a CUDA book that starts by engineering the hardware and works up to optimization on that hardware (which is basically an NVIDIA card), including all the main algorithms (except for graphs).

    I'm already teaching the course in this way at uni, and it is quite successful among students.

      • iamcreasy

        today at 3:52 AM

        Interesting, thanks for sharing.

        What makes CUDA Programming: A Developer's Guide to Parallel Computing with GPUs better among its peers?

        • boomzilla

          today at 4:15 AM

          How about this guide:

          https://docs.nvidia.com/cuda/cuda-programming-guide/pdf/cuda...

          • Aurornis

            today at 3:02 AM

            Very valuable comment. Thank you.

            I always appreciate book lists like this one, but having a small targeted list is more practical for those of us with limited reading time.

            • KnuthIsGod

              today at 6:51 AM

              Thank you, that is very useful advice!

              • bobmarleybiceps

                yesterday at 10:15 PM

                I really wish there were better alternatives to PMPP... It's by far the most up-to-date book, but I totally agree the writing is sort of bad and some of the code examples are straight up incorrect.

                So tl;dr, you have at least one person who would pay for a better book :-)

                • synergy20

                  yesterday at 8:19 PM

                  The first book was published in 2012. Is it too outdated?

                    • somethingsome

                      yesterday at 10:12 PM

                      Not really. Hardware didn't really change that much. Of course you won't find Tensor or ray-tracing cores, but you will get a very solid grasp of GPU programming and the CUDA language (which didn't change that much either), and then you can easily learn those more modern things from blog posts or even, at worst, ChatGPT.

                        • jpgvm

                          today at 8:59 AM

                          Yeah pretty much this.

                          I would separate the knowledge into maybe 3 distinct buckets.

                          The baseline: device/host boundary, SIMT programming etc.

                          The intermediate: kernel architecture, CUDA graph vs persistent kernels, warp specialisation/divergence avoidance techniques etc.

                          The advanced: architecture specifics, so tcgen05, TMA, SMEM/HBM, memory-throughput vs compute biases in various arch implementations, GEMM, FMHA, all the tricks that make modern fused kernels very fast. I would also bucket most GPUDirect RDMA/GPU NetIO/friends here.

                          The baseline hasn't changed much and probably won't, the intermediate knowledge has also remained pretty reliably stable for ~10 years with only things like graphs changing stuff. Tile might become more relevant than it is today but for now CUDA, cuBLAS, friends are where it's worth investing knowledge.
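
                          To make the "baseline" bucket concrete, here is a minimal sketch of the SIMT indexing idea, emulated on the CPU in plain Python. It is not real CUDA and runs nothing on a GPU; the names `vec_add_kernel` and `launch` are hypothetical, and the sequential loops only stand in for threads that a GPU would run in parallel.

```python
# CPU-only emulation of how a 1-D CUDA launch maps threads to array elements.
# Nothing here touches a GPU; `vec_add_kernel` and `launch` are hypothetical names.

def vec_add_kernel(block_idx, block_dim, thread_idx, a, b, out):
    # Same index arithmetic a CUDA kernel would use:
    #   i = blockIdx.x * blockDim.x + threadIdx.x
    i = block_idx * block_dim + thread_idx
    # Bounds guard: the last block may contain threads past the end of the data.
    if i < len(out):
        out[i] = a[i] + b[i]


def launch(kernel, n, block_dim, *args):
    # Round the grid size up so every element is covered by some thread.
    grid_dim = (n + block_dim - 1) // block_dim
    for block_idx in range(grid_dim):        # on a GPU, blocks run in any order
        for thread_idx in range(block_dim):  # on a GPU, these run in parallel
            kernel(block_idx, block_dim, thread_idx, *args)
    return grid_dim


n = 10
a = [float(i) for i in range(n)]       # "host" input data
b = [10.0 * i for i in range(n)]
out = [0.0] * n                        # stands in for a device output buffer
grid_dim = launch(vec_add_kernel, n, 4, a, b, out)
```

                          With 10 elements and 4 threads per block, the launch needs 3 blocks, and the guard keeps the 2 surplus threads in the last block from writing out of bounds.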

              • wces

                today at 3:55 AM

                This is a highly condensed video of all the important concepts in CUDA from Stephen Jones, one of the CUDA architects: https://www.youtube.com/watch?v=QQceTDjA4f4

                Understand everything he talks about and you understand CUDA.

                • dahart

                  yesterday at 7:28 PM

                  Regarding the section on Python and high-level CUDA, anyone interested should maybe first take a peek at Warp, which I’m guessing is too new to have a book yet. Warp lets you write CUDA kernels directly in Python, and it’s a breeze to get started. https://github.com/nvidia/warp

                • juvoly

                  yesterday at 6:05 PM

                  Increasingly (for instance ADSP podcast [1]) those in nvidia's inner circle are advocating against writing your own CUDA kernels. (Unless that's your full time job at nvidia, that is).

                  [1] https://adspthepodcast.com/2024/08/30/Episode-197.html

                    • halJordan

                      yesterday at 7:57 PM

                      That would be cool, but NVIDIA released Blackwell and still has not released unbroken kernels for sm120. sm120 is not the data center GPU, so it doesn't get the same love. My point, unfortunately, is that we can't depend on NVIDIA to do the right thing.

                      • dahart

                        yesterday at 7:24 PM

                        It’s not about whether you work at Nvidia. Avoid writing CUDA kernels if there are higher level libraries that do what you need. Do write CUDA kernels if you want to learn how, or if you need the low level control, or to micro-optimize. Being able to fuse kernels to avoid memory traffic or get better specialization is also a reason to reach for raw CUDA. Just consider what’s the right tool for the job…
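
                        As a rough illustration of why fusing kernels cuts memory traffic, here is a back-of-the-envelope sketch in plain Python. It assumes an idealised model with no cache reuse between kernels, and the two elementwise operations (`scale_shift`, `relu`) and the problem size are hypothetical choices for the example.

```python
# Idealised model of global-memory traffic for two elementwise operations.
# Assumption: nothing is reused from cache between separate kernel launches.

def traffic_bytes(n_elements, bytes_per_element, reads, writes):
    # Each read or write touches every element exactly once.
    return n_elements * bytes_per_element * (reads + writes)


def scale_shift(v, scale=2.0, shift=-3.0):
    return scale * v + shift


def relu(v):
    return v if v > 0 else 0.0


n = 1 << 20   # hypothetical problem size: 1,048,576 elements
elem = 4      # bytes per float32

# Unfused: kernel 1 reads x and writes tmp; kernel 2 reads tmp and writes y.
unfused = traffic_bytes(n, elem, reads=1, writes=1) * 2
# Fused: one kernel reads x and writes y; the intermediate stays in a register.
fused = traffic_bytes(n, elem, reads=1, writes=1)

# Both versions compute the same result on a small sample.
x = [-2.0, 0.0, 1.5, 4.0]
tmp = [scale_shift(v) for v in x]            # intermediate buffer (unfused)
y_unfused = [relu(t) for t in tmp]
y_fused = [relu(scale_shift(v)) for v in x]  # no intermediate buffer
```

                        Under these assumptions the fused version moves half as many bytes for the same output, which matters most when a kernel is limited by memory bandwidth rather than compute.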

                          • saagarjha

                            today at 3:45 AM

                            I don't think writing CUDA is a good way to do this tbh

                              • nnevatie

                                today at 8:35 AM

                                To do what? If you need the highest-performance GPU kernels on NVIDIA hardware, using CUDA is the way to go.

                        • drnick1

                          yesterday at 9:45 PM

                          That advice seems like nonsense. It's like saying avoid C because you can use Python, or avoid writing a graphics engine because you can license Unreal.

                            • pjmlp

                              today at 7:50 AM

                              Not at all, the advice is like use SDL or Raylib instead of writing your framebuffer blitter in inline Assembly to call from C.

                                • lacedeconstruct

                                  today at 8:39 AM

                                  I bet you will learn a lot doing that though

                                    • pjmlp

                                      today at 8:44 AM

                                      Depends on whether the purpose is learning or actually delivering something in the same amount of time.

                                      Each one has their place.

                          • bobmarleybiceps

                            yesterday at 10:36 PM

                            can very much agree about not writing stuff like reductions yourself, unless you have good reason to. but this sort of feels like another "implement everything with <nvidia stuff> and you'll have a great time!! (but also coincidentally get locked in even more to Nvidia hardware)"
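
                            For anyone wondering what "a reduction" involves, here is a minimal sketch of the tree-reduction pattern, emulated sequentially in plain Python. It is only an illustration under simplifying assumptions: `block_reduce_sum` is a hypothetical name, the list stands in for one block's shared memory, and a tuned library version (CUB, for example) handles many details this ignores.

```python
# Sequential emulation of a single-block tree reduction (sum).
# `block_reduce_sum` is a hypothetical name; a real kernel would run the
# inner loop as parallel threads with a barrier between strides.

def block_reduce_sum(values):
    # Scratch buffer standing in for per-block shared memory.
    scratch = list(values)
    n = len(scratch)

    # Pad with zeros to a power of two so the stride halves cleanly.
    size = 1
    while size < n:
        size *= 2
    scratch += [0] * (size - n)

    stride = size // 2
    while stride > 0:
        # On a GPU, each i would be one thread, and a barrier such as
        # __syncthreads() would separate one stride from the next.
        for i in range(stride):
            scratch[i] += scratch[i + stride]
        stride //= 2

    # After the last step the total sits in element 0.
    return scratch[0]


total = block_reduce_sum(range(1, 101))
```

                            The pattern takes a logarithmic number of steps instead of a linear one, but getting it fast on real hardware involves choices (bank conflicts, warp-level primitives, multi-block combination) that are easy to get wrong, which is the argument for using a library unless you have a good reason not to.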

                        • SkiFreeWin3

                          today at 1:27 AM

                          I wish the README had a solid “what cool things you can do with this” right at the top.

                          In this day and age when programming is so accessible, why not have a more tempting pitch than just book titles categorized by difficulty.

                          • chrsw

                            yesterday at 5:41 PM

                            "AI Systems Performance Engineering" might deserve a mention, even though it's not strictly CUDA.

                            • saagarjha

                              today at 3:46 AM

                              Probably worth noting that writing performant kernels for modern Nvidia hardware looks almost nothing like what the books from 2012 are going to teach you. You can read them for fun if you'd like but they're basically irrelevant.

                              • zparky

                                yesterday at 6:04 PM

                                I liked going through https://www.olcf.ornl.gov/cuda-training-series/ for an intro and some fundamentals.

                                  • lacedeconstruct

                                    yesterday at 8:32 PM

                                    Going through books after this one was a breeze

                                • pwython

                                  yesterday at 6:33 PM

                                  First one I clicked on is 404: Programming Massively Parallel Processors: A Hands-on Approach (3rd Edition) https://www.cambridge.org/core/books/programming-in-parallel...

                                    • synergy20

                                      yesterday at 8:12 PM

                                      the newest is 4th ed i think

                                        • cdavid

                                          today at 2:39 AM

                                          A fifth edition has been out recently: https://shop.elsevier.com/books/programming-massively-parall...

                                          I started learning about GPUs and CUDA from this book recently, and I agree the writing is confusing and the code examples have errors. However, it is still a nice reference on many types of algorithms for heterogeneous memory devices, and it helped me better understand some patterns for CPUs.

                                  • fwx

                                    yesterday at 11:03 PM

                                    Does anyone know of any good resources for the newer paradigms like cuTile?

                                    • brcmthrowaway

                                      yesterday at 8:19 PM

                                      Any good MOOCs on Parallel programming/NVIDIA?

                                    • phoronixrly

                                      yesterday at 4:10 PM

                                      In an age when your company mandates you to raise your productivity right now with hundreds of percentage points using LLMs, how do you find an excuse to sit down and read a book?

                                        • pjmlp

                                          today at 7:52 AM

                                          As always, on private time, if available; otherwise wait until the LLM connection breaks down.

                                          • q8zd3

                                            yesterday at 4:21 PM

                                            It feels like a dirty secret, doesn't it?

                                              • phoronixrly

                                                yesterday at 5:59 PM

                                                Yeah, corps don't want you to know how to code, they want you to be a prompter...

                                                  • canyp

                                                    today at 12:35 AM

                                                    Sometimes I squeeze in an hour or so a day to read. Living on the edge, looking for the next dopamine hit.

                                            • signa11

                                              today at 5:28 AM

                                              not on company time ?

                                              • mohamedkoubaa

                                                yesterday at 10:53 PM

                                                Anthropunk

                                                • fileeditview

                                                  yesterday at 4:49 PM

                                                  Don't you read while your agents are doing all the work for you? /s

                                                    • hartator

                                                      yesterday at 4:50 PM

                                                      Or make your agents do the reading for you!
