
So, you want to design your own language? (2017)

98 points - today at 5:44 AM

  • librasteve

    today at 7:16 AM

    I would like to see Raku (https://raku.org) at least mentioned in the list of languages to be aware of. Why?

      - Raku has built-in Grammars, so it is a great place to do early iteration of your parser
      - Raku is objects and type classes all the way down (as explained here https://gist.github.com/raiph/849a4a9d8875542fb86df2b2eda89296 )
      - RakuAST development is well advanced (use v6.e.PREVIEW) with the Slangify module to accelerate development of sub languages (Slangs)
    
    Here is a Raku implementation of Brainfuck to whet the appetite https://github.com/alabamenhu/PolyglotBrainfuck/blob/main/li...

      • adzm

        today at 8:07 AM

        For some reason https://gist.github.com/raiph/849a4a9d8875542fb86df2b2eda892... wasn't a link in your comment but it was a great read

        • xfeeefeee

          today at 7:51 AM

          For those unaware, Raku is the evolution of Perl 6, basically. It's honestly a beautiful and seductive language. At the same time it terrifies me.

            • librasteve

              today at 8:02 AM

              The main idea of renaming from Perl6 to Raku was to allow this beautiful and seductive new language to escape the black hole gravity well formed by the collapse of the Perl star. Seems like Raku is stuck inside the Perl event horizon for ever, with no hope of reputational escape.

                • silisili

                  today at 8:25 AM

                  Great analogy, and similar to how I saw things play out.

                  IIRC Perl 6 wanted to expand or morph into something better, spent a ton of time on it, and the community in general rejected it hard.

                  So now we have this dangling language that's shunned by its own community, regardless of its merits. Weird place to be in.

                    • librasteve

                      today at 8:53 AM

                      lol

                      Well Raku is not shunned by the very warm and welcoming Raku Community … https://raku.org/community

      • procaryote

        today at 8:45 AM

        I think you should probably start by asking yourself if you should design a new language. Most new languages fall in the bucket of low-value innovation that is instant tech debt for anyone who tries to use it for real.

        Even the successful ones are often pointless variations on a theme. Ruby, Perl & Python don't all need to exist, for example, as they essentially do the same thing, about as poorly. Now that Python has won, we should just drop the others.

          • 000ooo000

            today at 10:53 AM

            You've assumed there's only one reason for designing a language and based your opinion around that, which makes it shallow and not terribly convincing.

              • procaryote

                today at 12:39 PM

                Playing around and learning is great of course, as long as the language doesn't escape the lab

            • hnlmorg

              today at 8:52 AM

              Terrible advice.

              Different languages excel at different things. There shouldn’t be a “one size fits all” otherwise we’d be writing software in FORTRAN and assembly.

              And designing a language is a good exercise if purely from an academic perspective. Eg you learn how to write parsers, and a bunch of PL theory that we take for granted when just being a consumer of a programming language.

              Not everything needs to be done with global domination in mind.

                • procaryote

                  today at 12:49 PM

                  Building one for fun or to learn is great.

                  The bad thing is the uncanny valley. Popular enough to fragment the niche and add tech debt, not big enough to win and defragment the niche, not innovative enough to make any real positive difference beyond personal tastes.

                    • hnlmorg

                      today at 1:26 PM

                      That seems like a silly reason not to do something educational.

                      Build something for fun. Build it for yourself. If people want to use it then they’ll use it. But more likely they won’t.

                      The only thing I would advise against is building something expecting other people to use it.

                      As I said elsewhere, not enough people these days build things for their own pleasure without any expectation nor desire for it to be used by anyone else. Sometimes just doing something for yourself is its own reward.

                      And it’s far more likely that DIY projects will teach skills that you can then directly contribute to established projects with, rather than DIY projects fragmenting those established communities.

                  • iberator

                    today at 9:36 AM

                    FYI:

                    I started programming assembly in 2025 for 6502 and Z80 CPUs and believe me: it is fun and IMO actually easier than, let's say, learning Haskell or JS from scratch.

                    Assemblers with macros are amazingly simple.

                      • hnlmorg

                        today at 9:47 AM

                        You’re missing my point. I’m not criticising FORTRAN nor assembly. I’m saying that people wouldn’t have created C, Java, Python, Pascal, BASIC, and so on and so forth if everyone said “why bother creating a new language when we already have something perfectly good here”

                    • IshKebab

                      today at 9:53 AM

                      You missed his point. He's not saying "why bother with a new language if this one is fine", he's saying "why bother with a very similar language if this one is fine".

                      I think that's fair. Even if you are just doing a hobby language there are plenty of unexplored niches, e.g. that compile-to-shell language I've forgotten the name of.

                        • hnlmorg

                          today at 10:05 AM

                          I didn’t miss his point. Back in the 70s, many of the new languages were just a subtle variation on the previous one. It’s only later they evolved into something distinctive.

                          Which is why I said we’d still be using FORTRAN.

                          Languages that start out radically different don’t tend to gain momentum. Whereas languages that are familiar tend to grow and introduce new ideas.

                          Nothing is invented in a vacuum.

                          Also, I completely disagree that a hobby project needs to be innovative. Sometimes people create things just because they can. And it’s a good thing too, because otherwise we wouldn’t have half the open source software available to us today, much of which was originally intended for personal use, including Linux.

                          The problem these days is we’re so brainwashed by stories of unicorn start ups pumped with VC money that now everyone thinks every hobby needs to have a viable business plan underneath. It’s like people have forgotten how to play for fun.

                          So people should go out and create new programming languages. The worst that would happen is they learn to be a better programmer in their day to day language.

                  • immibis

                    today at 10:36 AM

                    You should, just so you'll know how compilers and languages work. It doesn't have to be good.

                • hongbo_zhang

                  today at 8:25 AM

                  https://forum.rescript-lang.org/t/introducing-moonbit-and-a-... I happened to write a post sharing my experience building the ReScript and MoonBit languages

                  • sandruso

                    today at 9:40 AM

                    That's a nice summary of the space and how large it is. My recommendation is to just start with a math expression parser and evaluator. You can start with Pratt parsing, but I would even recommend converting infix to reverse Polish notation using a stack.

                    Adding constructs like IF or variables is the natural next step, but by then you will have code in place and an idea of where to put them and how to approach it.
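
The infix-to-reverse-Polish route mentioned above can be sketched as follows. This is a minimal illustration of the shunting-yard algorithm, assuming only the four basic left-associative operators and parentheses; the function names (`tokenize`, `to_rpn`, `eval_rpn`, `evaluate`) are made up for this example and are not from any library.

```python
import operator

# Precedence table for the four basic (left-associative) operators.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}


def tokenize(text):
    """Split an expression into floats, operators, and parentheses."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit() or ch == ".":
            j = i
            while j < len(text) and (text[j].isdigit() or text[j] == "."):
                j += 1
            tokens.append(float(text[i:j]))
            i = j
        elif ch in PRECEDENCE or ch in "()":
            tokens.append(ch)
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens


def to_rpn(tokens):
    """Shunting-yard: reorder infix tokens into reverse Polish notation."""
    output, stack = [], []
    for tok in tokens:
        if isinstance(tok, float):
            output.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced ')'")
            stack.pop()  # discard the matching "("
        else:
            # Left-associative: pop operators of equal or higher precedence.
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
    while stack:
        op = stack.pop()
        if op == "(":
            raise ValueError("unbalanced '('")
        output.append(op)
    return output


def eval_rpn(rpn):
    """Evaluate an RPN token list using a value stack."""
    stack = []
    for tok in rpn:
        if isinstance(tok, float):
            stack.append(tok)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(APPLY[tok](left, right))
    return stack[0]


def evaluate(text):
    return eval_rpn(to_rpn(tokenize(text)))
```

Variables would slot into `tokenize` and `eval_rpn` as a new token kind plus an environment dictionary, which is roughly the "code in place" the comment refers to.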

                    I learned a lot about JVM runtime, how Zig is parsing itself, how Lua represents values... Too many good rabbit holes to fall in.

                    • gf000

                      today at 8:58 AM

                      > Java and C#, for being enterprisey

                      I believe there is far more interesting stuff to learn about these languages. The whole category of runtimes could have been mentioned, for instance, since it can directly affect the language design itself (e.g. having GC vs some language feature for managing memory, an open vs closed world model, having an async feature in the language vs letting the runtime handle it, etc.).

                      • jokoon

                        today at 9:46 AM

                        I started making a language, and I took many shortcuts.

                        I just parse my language, translate it to C, and use C compiler errors.

                        I don't add new semantics, I just add many things like strings, map, etc to make it usable and fast.

                        I don't know if it's a good idea and how difficult this will be.
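
For readers curious what "translate it to C and use C compiler errors" can look like, here is a minimal sketch. The two-statement toy grammar, the `translate` function, and the file name `input.toy` are all assumptions invented for this example; they are not jokoon's language. The one real C feature it leans on is the `#line` directive, which makes the C compiler report errors against the original source line.

```python
def translate(source):
    """Translate a toy language to C source text.

    Toy grammar (an assumption for this sketch):
        let NAME = EXPR
        print EXPR
    EXPR is passed through unchanged, so C's own rules apply to it.
    """
    body = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        # #line sets the line number the C compiler reports for the
        # next line, so diagnostics point back at the toy source.
        body.append(f'#line {lineno} "input.toy"')
        if line.startswith("let "):
            name, _, expr = line[4:].partition("=")
            body.append(f"    long {name.strip()} = {expr.strip()};")
        elif line.startswith("print "):
            body.append(f'    printf("%ld\\n", (long)({line[6:].strip()}));')
        else:
            raise SyntaxError(f"line {lineno}: unknown statement {line!r}")
    return "\n".join(
        ["#include <stdio.h>", "", "int main(void) {", *body,
         "    return 0;", "}"]
    ) + "\n"


if __name__ == "__main__":
    print(translate("let x = 2 + 3\nprint x * 2\n"))
```

The generated text can then be handed to any C compiler; a typo inside an expression surfaces as an ordinary C diagnostic attributed to `input.toy`.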

                          • lukan

                            today at 9:51 AM

                            "I don't know if it's a good idea and how difficult this will be."

                            It is a great idea, if you want to learn about languages!

                            (But if money is your goal, you may want to reconsider)

                            • shakna

                              today at 10:53 AM

                              Chicken Scheme compiles to C, using a method that ended up as a maths paper (Cheney on the MTA).

                              It's a valid approach.

                                • 0x3444ac53

                                  today at 1:59 PM

                                  One day I aspire to be able to fully comprehend Cheney on the MTA. I kinda get it? But I've never learned C, and never had to slog through manual memory management, so it's a little lost on me

                              • Rohansi

                                today at 9:51 AM

                                Nothing wrong with that - some others do it too. You can even use TCC to do quick test builds and only use Clang/GCC for release builds.

                            • didierbreedt

                              today at 7:51 AM

                              I’m waiting for an LLM-focused language. We’re already seeing that AI is better with strongly typed languages. If we think about how an agent can ensure correctness as instructed by a human, as the priority, things could get interesting. The question is, will humans actually be able to make sense of it? Do we need to?

                                • suddenlybananas

                                  today at 7:58 AM

                                  How could an LLM learn a programming language sufficiently well unless there is already a large corpus of human-written examples of that language?

                                    • nrhrjrjrjtntbt

                                      today at 8:21 AM

                                      LLM could generate such a corpus, right? With feedback mechanisms such as side by side tests.

                                        • tbossanova

                                          today at 8:26 AM

                                          So… llm learns from a corpus it has created?

                                            • hnlmorg

                                              today at 9:02 AM

                                              It’s basically called “reinforcement learning” and it’s a common technique for machine learning.

                                              You provide a goal as a big reward (eg test passing), and smaller rewards for any particular behaviours you want to encourage, and then leave the machine to figure out the best way to achieve those rewards through trial and error.

                                              After a few million attempts, you generally either have a decent result, or more data around additional weights you need to apply before reiterating on the training.
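
A rough sketch of the reward shape described above: one large reward for the goal (all tests passing) and smaller ones for behaviours worth encouraging. The `Attempt` fields and every weight here are arbitrary assumptions chosen for illustration, not values from any real training setup.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """Outcome of one generated program (hypothetical fields)."""
    compiled: bool
    tests_passed: int
    tests_total: int
    runtime_seconds: float


def reward(attempt, time_budget=1.0):
    """Score an attempt: small shaping rewards plus one big goal reward."""
    score = 0.0
    if attempt.compiled:
        score += 0.1  # small reward: the program at least compiles
    if attempt.tests_total:
        # Small reward proportional to the fraction of tests passing.
        score += 0.4 * attempt.tests_passed / attempt.tests_total
        if attempt.tests_passed == attempt.tests_total:
            score += 10.0  # the big reward: the goal itself
            if attempt.runtime_seconds <= time_budget:
                score += 0.5  # bonus for finishing within the budget
    return score
```

Nothing in a function like this prevents hardcoded answers, so a score this simple would still need additional checks.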

                                                • suddenlybananas

                                                  today at 9:09 AM

                                                  How do you define the goal? This kind of de novo neural program synthesis is a very hard problem.

                                                    • hnlmorg

                                                      today at 9:40 AM

                                                      Defining the goal is the easy part: as I said in my OP, the goal is unit tests passing.

                                                      It’s the other weights that are harder. You might want execution speed to be one metric. But how do you add weights to prevent cheating (eg hardcoding the results)? Or use of anti-patterns like global variables? (For example. Though one could argue that scoped variables aren’t something an AI-first language would need)

                                                      This is where the human feedback part comes into play.

                                                      It’s definitely not an easy problem. But it’s still more pragmatic than having a human curate the corpus. Particularly considering the end goal (no pun intended) is having an AI-first programming language.

                                                      I should close off by saying that I’m very skeptical that there’s any real value in an AI-first PL, so all of this is just a thought experiment rather than something I’d advocate.

                                                        • macleginn

                                                          today at 10:46 AM

                                                          With such learning your model needs to be able to provide some kind of solution or at least approximate it right off the bat. Otherwise it will keep producing random sequences of tokens and will not learn anything ever because there will be nothing in its output to reward, so no guidance.

                                                            • hnlmorg

                                                              today at 11:27 AM

                                                              I don’t agree it needs to provide a solution off the bat. But I do agree there are some initial weights you need to define.

                                                              With an AI-first language, I suspect the primitives would be more similar to assembly or WASM than to something human-readable like Rust or Python. So the pre-training preparation would be a little easier, since there would be fewer syntax errors due to parser constraints.

                                                              I’m not suggesting this would be easy though haha. I think it’s a solvable problem but that doesn’t mean it’s easy.

                                                      • nrhrjrjrjtntbt

                                                        today at 9:26 AM

                                                        1. Choose set of code challenges (generate them, leetcode, AOC etc.)

                                                        2. LLM generates a Python solution and a separate Python test (as in, the Python test calls the code as a black-box process so it can test non-Python code)

                                                        3. Agent using skills etc. tries to write new language let's call it Shark.

                                                        4. Run Shark code against test. If fails use agentic flows to correct until test passes.

                                                        5. Now have list of challenges, working code (maybe not beautiful) for training.

                                                        A bit of human spot checking may not go amiss!
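
The black-box test in step 2 can be sketched with the standard `subprocess` module. This is only an illustration: `run_case` and `pass_rate` are names invented here, and because no "Shark" interpreter exists, the demo at the bottom uses the Python interpreter itself as a stand-in command.

```python
import subprocess
import sys


def run_case(command, stdin_text, expected_stdout, timeout=5.0):
    """Run `command` as a separate process and compare its stdout."""
    try:
        result = subprocess.run(
            command, input=stdin_text, capture_output=True,
            text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False  # treat a hang as a failed case
    return (result.returncode == 0
            and result.stdout.strip() == expected_stdout.strip())


def pass_rate(command, cases):
    """Fraction of (stdin, expected stdout) cases the program passes."""
    passed = sum(run_case(command, given, want) for given, want in cases)
    return passed / len(cases)


if __name__ == "__main__":
    # Stand-in for a hypothetical `shark solution.shark` command.
    program = [sys.executable, "-c",
               "print(sum(int(x) for x in input().split()))"]
    print(pass_rate(program, [("1 2 3\n", "6\n"), ("10 20\n", "30\n")]))
```

Because the harness only sees stdin, stdout, and the exit code, the same cases can check the Python reference solution and the new-language solution alike.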

                                                • nrhrjrjrjtntbt

                                                  today at 8:29 AM

                                                  Yes. The learning comes from running tests on the program and ensuring they pass. So running as an agent. Tests and compiler give hard feedback; that's the data outside the model that it learns from.

                                                  I think modern RLHF schemes have models that train LLMs. LLMs teaching each other isn't new.

                                                  My knowledge is limited, just based on a read of https://huyenchip.com/2023/05/02/rlhf.html though.

                                                    • suddenlybananas

                                                      today at 9:09 AM

                                                      RLHF

                                  • BaudouinVH

                                    today at 11:55 AM

                                    Reading the headline my first thought was another kind of language : the linguistic language (English, Spanish, French, Esperanto etc.)

                                    How does one create a new spoken/written language ?

                                      • 0x3444ac53

                                        today at 2:00 PM

                                        Conlangs are really cool!! You could always learn toki pona :)

                                        • christophilus

                                          today at 1:49 PM

                                          Start a tribe and isolate yourself from the world for a millennium or so.

                                          • jimktrains2

                                            today at 1:30 PM

                                            Look up conlangs (constructed languages).

                                            https://zompist.com/kit.html

                                            https://conlang.org/resources/

                                        • fjfaase

                                          today at 7:55 AM

                                          Interesting page. The latest language I designed is a stack-based intermediate language for a C compiler. Not really intended for human usage, but readable in the sense that you can compare it with the original C code.

                                            • iberator

                                              today at 9:45 AM

                                              Ha! I just wrote stack based RISC cpu architecture with assembler and now thinking about implementing my own FORTH like lang (niche stack based programming language) compiler.

                                              Fun

                                                • fjfaase

                                                  today at 1:00 PM

                                                  Great. Are you going to open source it? If so, let me know. You can find my email address on the website mentioned in my profile.

                                          • zkmon

                                            today at 7:10 AM

                                            What do these languages compile to? What's the build pipeline and runtime context?

                                            • cyco130

                                              today at 10:35 AM

                                              No mention of INTERCAL!

                                                • nemetroid

                                                  today at 1:17 PM

                                                  PLEASE mention INTERCAL!

                                              • imvetri

                                                today at 7:14 AM

                                                This talks about programming language.

                                                The right question is how to design a linguistic language common to both computers and humans.

                                                  • exe34

                                                    today at 8:18 AM

                                                    Marain

                                                • DeathArrow

                                                  today at 7:14 AM

                                                  I had some thoughts about designing a new language. However it's a huge undertaking and I don't know the answers to some questions:

                                                  1. Is there a need for the programming language?

                                                  2. If the answer to the previous question is yes, can I find enough people to help and enough resources?

                                                  3. If the answer to the previous question is yes, can we release an MVP in a reasonable amount of time?

                                                  4. If the answer to the previous question is yes, what is the chance it will gather a reasonable amount of users?

                                                  There are literally tons of programming languages that didn't make it. I wouldn't want to waste my own and other people's resources.

                                                    • artpar

                                                      today at 7:43 AM

                                                      I made a language for using in another project, so I'll answer your questions

                                                      https://www.npmjs.com/package/wang-lang

                                                      - this new language looks and behaves exactly like JavaScript, except it doesn't have "eval" and "new Function", so it is CSP safe. That's the only difference. I wanted to execute dynamically generated code in a Chrome extension

                                                      - an LLM did most of the work of creating a nearley grammar and associated interpreter (the whole thing is bundled, nearley is not a final dependency); elaborate tests make this quite sane to handle

                                                      - took me about 1 week in total for the initial MVP to try out, and since then I have been fixing bugs and inconsistencies with JavaScript behavior, about 1 day a month of effort

                                                      - mostly 0

                                                      The only reason to create it was that I couldn't find something similar, and it was low effort thanks to the LLM

                                                      I also created another even smaller DSL you can say

                                                      https://www.npmjs.com/package/free-text-json-parser

                                                      It parses json embedded in plain text
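
As a rough idea of what "parsing JSON embedded in plain text" involves, here is a small sketch using only the standard library. It is not how the linked package works; `extract_json` is a name invented here, and the approach (try `json.JSONDecoder.raw_decode` at every `{` or `[`) is just one simple way to do it.

```python
import json


def extract_json(text):
    """Return every JSON object or array found inside free text."""
    decoder = json.JSONDecoder()
    found, i = [], 0
    while i < len(text):
        if text[i] in "{[":
            try:
                # raw_decode parses one JSON value starting at index i
                # and returns it with the index just past its end.
                value, end = decoder.raw_decode(text, i)
            except json.JSONDecodeError:
                i += 1  # not valid JSON here; keep scanning
                continue
            found.append(value)
            i = end
        else:
            i += 1
    return found


if __name__ == "__main__":
    print(extract_json('Result: {"a": 1, "b": [1, 2]} and also [3, 4] done'))
```

Scanning character by character is quadratic in the worst case, which is fine for short messages but worth revisiting for large inputs.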

                                                        • yokljo

                                                          today at 8:54 AM

                                                          Nice. I built something basically just like this for work for the same reason last year. It only took a few hours though, because I just used Acorn [0] to parse my JS, then directly evaluated the AST. It also had an iteration limit and other configurable limits so I can eval stuff in the browser without crashing the tab. I did not use an LLM.

                                                          [0]: https://github.com/acornjs/acorn
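
The "parse, then directly evaluate the AST, with an iteration limit" idea can be sketched in a few dozen lines. Note the swap: the comment used Acorn on JavaScript, while this sketch uses Python's standard `ast` module on a tiny subset of Python syntax. The `Interpreter` class, `StepLimitExceeded`, and the supported subset are all assumptions made for illustration.

```python
import ast
import operator

BINARY_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
}


class StepLimitExceeded(Exception):
    """Raised when a program uses more steps than allowed."""


class Interpreter:
    """Walks the AST directly; never calls eval() or exec()."""

    def __init__(self, max_steps=10_000):
        self.max_steps = max_steps
        self.steps = 0
        self.env = {}

    def run(self, source):
        result = None
        for stmt in ast.parse(source).body:
            result = self.exec_stmt(stmt)
        return result  # value of the last statement, if it was an expression

    def tick(self):
        # Count every visited node so runaway loops are cut off.
        self.steps += 1
        if self.steps > self.max_steps:
            raise StepLimitExceeded(f"more than {self.max_steps} steps")

    def exec_stmt(self, node):
        self.tick()
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            self.env[node.targets[0].id] = self.eval_expr(node.value)
            return None
        if isinstance(node, ast.Expr):
            return self.eval_expr(node.value)
        if isinstance(node, ast.While):
            while self.eval_expr(node.test):
                for child in node.body:
                    self.exec_stmt(child)
            return None
        raise ValueError(f"unsupported statement: {type(node).__name__}")

    def eval_expr(self, node):
        self.tick()
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return self.env[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            left = self.eval_expr(node.left)
            right = self.eval_expr(node.right)
            return BINARY_OPS[type(node.op)](left, right)
        if (isinstance(node, ast.Compare) and len(node.ops) == 1
                and isinstance(node.ops[0], ast.Lt)):
            return self.eval_expr(node.left) < self.eval_expr(node.comparators[0])
        raise ValueError(f"unsupported expression: {type(node).__name__}")
```

Anything outside the whitelisted node types is rejected, so the supported language is whatever the interpreter chooses to implement rather than whatever the host can run.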

                                                            • artpar

                                                              today at 8:57 AM

                                                              This is exactly what I wanted and couldn't find. I ended up creating one along with an interpreter (so slightly easier than walk and execute)

                                                          • Hendrikto

                                                            today at 8:06 AM

                                                            > safe

                                                            > llm did most of the work

                                                            > it was low effort

                                                            I really wouldn’t trust its supposed safety.

                                                              • artpar

                                                                today at 8:26 AM

                                                                "CSP safe" has a particular meaning associated with it. It's not a "safe" language, whatever that is. The Chrome Web Store team is okay with it, and it serves my purpose. If you have submitted extensions to Google Chrome, then you know that any sign of "eval" or "new Function" in the code will lead to rejection.

                                                                  • procaryote

                                                                    today at 8:50 AM

                                                                    So what you want is a linter, not a language

                                                                      • artpar

                                                                        today at 8:55 AM

                                                                        I want to execute dynamically generated javascript looking code in chrome extension without using eval or new function. basically eval without actually using eval.

                                                                        linter would help me find and avoid usages of eval.

                                                        • lifthrasiir

                                                          today at 7:24 AM

                                                          "Just for fun" is always a valid default option. Though many authors don't stop there ;-)

                                                          • yokljo

                                                            today at 7:37 AM

                                                            I think most popular languages were started as an experiment in some feature, or to solve a specific problem someone had. Those are good reasons to make a language. I see no reason to make a language just to take attention away from other existing languages. Instead, make a language so you can understand how to make languages. It is 100% doable by one person. It's fun and educational.

                                                            • chistev

                                                              today at 7:25 AM

                                                              Sometimes it might just be a fun project to push yourself. Maybe such a complex undertaking can't be fun indeed lol

                                                                • DeathArrow

                                                                  today at 7:32 AM

                                                                  My idea of fun is to release something people will use. I have more fun if I work on something useful. For me it is less about the journey than the end goal.

                                                                  I love working on software, architecture, design but only if I see some use.

                                                                  Of course, for other people, the journey is more interesting than the destination and they have fun hacking stuff just for the sake of it. They discover things and learn new stuff they wouldn't have learned otherwise. And this is a path at least as valid as the other.

                                                              • ModernMech

                                                                today at 7:37 AM

                                                                I'll try to answer your questions best I can.

                                                                1. Yes, as long as there are new machines that need programming, new programming languages will be needed. Today's top languages were built for the machines of the 1970s, 80s, and 90s. Tomorrow's languages will be built for machines of today and tomorrow. As Alan Kay put it, if you want to invent a new language, first invent the machine of the future and then build a language for it.

                                                                2. No, you cannot. First of all, PL devs are cats, it's very difficult collecting them without financial compensation. So if your plan is to post a language and hope that people will come help you, you'll likely be disappointed. The problem is that everyone else interested in building PLs has their own itch to scratch, and they're not going to scratch yours without some compensation.

                                                                You might think "Well I can just raise money to do this", and you would be wrong. First, it's very hard to raise money for PLs. Usually you have to have some sort of cred to do it. I know of only 3 projects to have raised VC money for a PL project, and they each had some success before they had done so: Chris Granger (Light Table), Paul Biggar (CircleCI), and Chris Lattner (Swift/LLVM). Granger's project Eve raised $2M and ran out of money after 3 years; Biggar's project Dark also raised money, then fired all the devs when he realized he was burning cash too fast, then he slow-burned development for years, then he gave up and handed development over to someone else; and Lattner raised almost $100M for Mojo, which is probably going to end much the same way as Eve and Dark, but I wish them the best.

                                                                Anyway, the point is that you personally (no offense) don't have the profile to raise $100M like Lattner. $2M is not enough for a PL project. Lattner is keeping Mojo closed source for now because there's no good answer for how they're going to make enough money as an open source language to justify raising $100M.

                                                                And the reason it's so hard to raise money is because there's no money to be made. No one pays for PLs. No one pays for PL dev tools. They have to be open source or they're rejected by the dev community. The only ones these days who can reasonably pay for all of this with no potential revenue stream are giant corporations, who use the lang as a hook into their ecosystem.

                                                                3. Even though the answer is no, you yourself can still get an MVP off the ground in a pretty reasonable amount of time. It's never been easier to make a PL. The problem with PLs is building them is kind of like measuring the coastline; language projects are fractals -- there's an infinite amount of detail you can work on in any given direction. It's very easy for a language project to become a language + editor project, and it's easy for that to turn into language + editor + operating system if you're not disciplined. Plenty of PL devs have fallen into that trap.

                                                                4. Rounds to 0% chance. You'll be lucky if you build something that even you will use. Rather, most PL devs end up working on their language in some other language, because working on languages is what they want to do!

                                                                That said, it's still important to write languages that you understand no one will use. First, it allows you to try new things that may be good but unpopular. If PL devs only did what was popular with devs, PLs would go nowhere as a field.

                                                                Consider the so-called "Hornet's nest" of programming languages [1], which is the tightly related cluster of imperative programming languages that have been the most researched and used over the last 50 years. There is a vaaaaaaaast design space outside that nest, begging for more language development. No one will use most of them, but it's important to understand what those languages might look like to maybe find some new ideas that work.

                                                                Also "didn't make it" is kind of an unfair judgement. Gaining popularity doesn't have to be a goal. In fact, it shouldn't be a goal if you want to have any fun at all. There's an infinite amount of work to be done, and if you're not doing it for you, you won't get far at all. That's really the only way to fail at this.

                                                                Good luck!

                                                                [1] https://tomasp.net/techdims/#footer=index,navigation;left=ca...

                                                                  • tov_objorkin

                                                                    today at 10:03 AM

                                                                    Bold plus, making PLs is a lifestyle, not a business. Most PLs clone each other and absorb features. The only difference is QOL and tooling. Users expect to have a full set of batteries, an IDE/LSP, jobs, OOP style, and minimal effort to learn. Being popular contradicts the idea of pushing the boundaries and shifting paradigms.

                                                            • today at 7:30 AM

                                                              • dash2

                                                                today at 6:51 AM

                                                                This is bad and reads like AI slop. Try the "programming language checklist" instead.

                                                                  • andai

                                                                    today at 7:01 AM

                                                                    I can't comment on the quality (I don't know anything about PL design) but the page is older than LLMs.

                                                                    http://web.archive.org/web/20170506182606/https://cs.lmu.edu...

                                                                    I have to say that the 2017 version is a lot less intimidating though.

                                                                    • raincole

                                                                      today at 7:01 AM

                                                                      N=1: Skimmed through it and found nothing screaming AI.

                                                                      • almosthere

                                                                        today at 7:09 AM

                                                                        I think at this point we have to assume all articles are attempts to make some LLM better, so they're spewing this shit out and if it gets a genuine reaction, we just gave the LLM a rf-cookie. And if someone says "crap LLM post" then it gets punished.

                                                                        • ModernMech

                                                                          today at 7:01 AM

                                                                          It's worth it for this SnekQL mascot alone: https://github.com/jennashea/snekql/blob/master/docs/images/...