Running Python code in a sandbox with MicroPython and WASM

33 points - today at 2:15 PM

Source
  • fzysingularity

    today at 6:03 PM

    What’s your experience with Monty? Been looking at it for one of our environments and it seems very promising.

      • simonw

        today at 6:08 PM

        I've tried it out a bit - it does look solid and it has a good team behind it.

        It's a subset of Python though (much more so than MicroPython). That's fine for LLMs, since they can easily work around any limitations, but it does mean you can't use a lot of existing Python code with it.

        I'm also a little bit nervous about the safety. It's a fresh implementation in Rust, which means plenty of possibilities for edge-case security bugs. The thing I like about WebAssembly is that there's a robust, well-tested sandbox already - better for defense in depth.

        I certainly wouldn't bet against Monty though! It may well prove itself to be a great solution for this.

    • incognito124

      today at 5:07 PM

      If you're interested in not reinventing the sandbox for LLMs, consider Judge0: https://judge0.com/

      I have absolutely no relation to the project except for the fact that I went to the same Uni as the creator.

        • simonw

          today at 6:04 PM

          That one looks pretty good - it's been around since 2016, I'm surprised I haven't encountered it before.

          It's not quite right for what I'm after because you can't just "pip install" it on multiple platforms.

          • era86

            today at 5:15 PM

            I'm using judge0 for a Leetcode-clone I'm working on. Never thought of using it in the context of LLMs, though.

        • hmokiguess

          today at 5:33 PM

          Super tangential comment, but glad to see I'm not the only one who sends typos to sessions and still gets good results.

          Was reading your https://chatgpt.com/share/6a1e2a5c-58b8-8328-ba1c-0e6aadb0a0... and noticed the "my on Python tools" instead of "my own Python tools" (apologies for the grammar police)

          This stuff always makes me anxious for no good reason, because of the underlying tokenizer and the stochastic-parrot prediction that runs everything. It makes me wonder whether I should rerun the prompt with the typo corrected, or accept the token tax of the model having to translate my intention.

            • simonw

              today at 6:09 PM

              Yeah, I'm very loose with my prompting now - I can usually tell from the reasoning traces if it correctly interpreted any typos.

              If it looks like it didn't, I hit "stop" and then edit and resubmit my prompt.

          • theanonymousone

            today at 2:17 PM

            P.S. I was casually searching for "sandboxed Python" for an experiment I'm working on, and reached this article that was published "today". Very nice coincidence! Thanks.

            • tmaly

              today at 4:15 PM

              I am trying to think of a use case for this.

              I was thinking the client-side WASM version would be useful as a platform for beginners to practice a subset of Python in.

              I can't really think of any good WASI use cases.

                • andrewaylett

                  today at 5:20 PM

                  Running arbitrary untrusted code safely is pretty easy nowadays, so long as the code is written in JavaScript and you want to run it in a browser. It's only a little harder if the code is written in another language but targets WASM and browser APIs, or if you want to run your WASM inside Node.js, and there's even good support for running Python in a browser or Node.

                  Once you get away from running in a JS environment or away from code that's written with the intention of running in a WASM sandbox, if you don't want to have to modify the code for your environment then you're going to start having problems. This looks like a good step for anyone wanting to run arbitrary Python outside of a browser environment.

                  • theanonymousone

                    today at 4:53 PM

                    For me it is a tool I make available to an LLM so that it can provide correct answers to a certain category of questions, instead of hallucinating nonsense.

                    • roywiggins

                      today at 5:16 PM

                      The idea is to expose it as a tool to your LLM agent so it can run calculations on its own initiative.
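
                      A rough sketch of what that wiring could look like, assuming a JSON-schema style tool description of the kind many LLM APIs accept. Every name here (`RUN_PYTHON_TOOL`, `handle_tool_call`, `arithmetic_only`) is made up for illustration. The backend is a deliberately tiny stand-in that only evaluates arithmetic via `ast`; it is not the MicroPython/WASM sandbox from the article, which would be plugged in at the same spot.

```python
import ast
import operator
from typing import Callable

# Hypothetical tool description; the field names are an assumption, not any
# specific vendor's API.
RUN_PYTHON_TOOL = {
    "name": "run_python",
    "description": "Run a short Python snippet in a sandbox and return its result.",
    "input_schema": {
        "type": "object",
        "properties": {"code": {"type": "string"}},
        "required": ["code"],
    },
}

# Operators the stand-in backend is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def arithmetic_only(code: str) -> str:
    """Stand-in backend: evaluates a single arithmetic expression, nothing else.

    A real setup would hand `code` to a sandboxed interpreter here instead.
    """

    def ev(node: ast.AST):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        raise ValueError("only + - * / on numbers is allowed")

    return str(ev(ast.parse(code, mode="eval").body))


def handle_tool_call(call: dict, sandbox: Callable[[str], str]) -> dict:
    """Route one tool call from the model to the sandbox and package the result."""
    if call.get("name") != RUN_PYTHON_TOOL["name"]:
        return {"ok": False, "output": f"unknown tool: {call.get('name')!r}"}
    args = call.get("input")
    code = args.get("code") if isinstance(args, dict) else None
    if not isinstance(code, str):
        return {"ok": False, "output": "missing 'code' string"}
    try:
        return {"ok": True, "output": sandbox(code)}
    except Exception as exc:
        # Report failures back to the model rather than crashing the agent loop.
        return {"ok": False, "output": f"{type(exc).__name__}: {exc}"}
```

                      The dispatcher only cares that the backend takes a string and returns a string, so swapping `arithmetic_only` for a real sandboxed interpreter shouldn't require touching `handle_tool_call`.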
