Launch HN: Cardboard (YC W26) – Agentic video editor

96 points - yesterday at 6:38 PM


Hey HN - we're Saksham and Ishan, and we’re building Cardboard (https://www.usecardboard.com). It lets you go from raw footage to an edited video by describing what you want in natural language. There’s a demo video at https://www.usecardboard.com/share/fUN2i9ft8B46, and you can try the product out at https://demo.usecardboard.com (no login required!)

People sit on mountains of raw assets - product walkthroughs, customer interviews, travel videos, screen recordings, changelogs, etc. - that could become testimonials, ads, vlogs, launch videos, etc.

Instead, they sit in cloud storage or on hard drives because getting to a first cut takes hours: scrubbing through the raw footage manually, arranging clips in the correct sequence, syncing music, exporting, uploading to cloud storage to share, collecting feedback on WhatsApp/iMessage/Slack, and then redoing the whole thing until everyone is happy.

We grew up together and have been friends for 15 years. Saksham creates content on socials with ~250K views/month and kept hitting the wall where editing took longer than creating. Ishan was producing launch videos for HackerRank's all-hands demo days and spent most of his time on cuts and sequencing rather than storytelling. We both felt that while tools like Premiere Pro and DaVinci are powerful, they have a steep learning curve and involve lots of manual labor.

So we built Cardboard. You tell it to "make a 60s recap from this raw footage" or "cut this into a 20s ad" or "beat-sync this to the music I just added" and it proposes a first draft on the timeline that you can refine further.

We built a custom hardware-accelerated renderer on WebCodecs / WebGL2, there’s no server-side rendering, no plugins, everything runs in your browser (client-side). Video understanding tasks go through a series of cloud VLMs + traditional ML models, and we use third-party foundation models for agent orchestration. End users can pick the orchestration model from a dropdown.

We've shipped 13 releases since November (https://www.usecardboard.com/changelog). The editor handles multi-track timelines with keyframe animations, shot detection, beat sync via percussion detection, voiceover generation, voice cloning, background removal, multilingual captions that are spatially aware of subjects in frame, and Premiere Pro/DaVinci/FCP XML exports so you can move projects into your existing tools if you want.
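For readers wondering what "beat sync via percussion detection" can look like in code, here is a minimal TypeScript sketch. It is an illustration under stated assumptions, not Cardboard's actual pipeline: it assumes mono PCM samples, treats a sudden jump in short-window energy as a percussive onset, and snaps cut times to the nearest onset. The names `detectOnsets` and `snapToBeats`, and the default frame size and thresholds, are hypothetical.

```typescript
// Hypothetical sketch: energy-based onset detection plus cut snapping.
// Frame size, ratio, and minimum energy are illustrative defaults only.

/** Returns onset times (seconds) where frame energy jumps past `ratio` times the previous frame. */
function detectOnsets(
  samples: Float32Array,
  sampleRate: number,
  frameSize = 1024,
  ratio = 4,
  minEnergy = 1e-4,
): number[] {
  const onsets: number[] = [];
  let prevEnergy = 0;
  for (let start = 0; start + frameSize <= samples.length; start += frameSize) {
    // Mean squared amplitude of this frame.
    let energy = 0;
    for (let i = start; i < start + frameSize; i++) {
      const s = samples[i] ?? 0;
      energy += s * s;
    }
    energy /= frameSize;
    // An onset is a frame that is both audible and much louder than the last one.
    if (energy > minEnergy && energy > ratio * prevEnergy) {
      onsets.push(start / sampleRate);
    }
    prevEnergy = energy;
  }
  return onsets;
}

/** Moves each cut to the nearest beat if that beat is within `maxShift` seconds. */
function snapToBeats(cuts: number[], beats: number[], maxShift: number): number[] {
  return cuts.map((cut) => {
    let nearest = cut;
    let bestDist = Infinity;
    for (const beat of beats) {
      const dist = Math.abs(beat - cut);
      if (dist < bestDist) {
        bestDist = dist;
        nearest = beat;
      }
    }
    return bestDist <= maxShift ? nearest : cut;
  });
}
```

With beats at 0.25 s and 0.75 s, `snapToBeats([0.2, 0.5, 0.8], [0.25, 0.75], 0.1)` returns `[0.25, 0.5, 0.75]`: the first and last cuts move onto a beat, while the middle cut is too far from either beat and stays put.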

Where we're headed next: real-time collaboration (video git) to avoid inefficient feedback loops, and eventually a prediction engine that learns your editing patterns and suggests the next low-entropy actions - similar to how Cursor's tab completion works, but for timeline actions.
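One way to read "low entropy actions" is: only suggest a next step when the predicted distribution over possible actions is sharply peaked. The TypeScript sketch below illustrates that reading and nothing more. The prediction engine is a roadmap item, so the function names, the count-based input, and the 1-bit threshold are all assumptions made for illustration.

```typescript
// Hypothetical sketch: suggest the next timeline action only when the
// distribution over candidate actions has low Shannon entropy.

/** Shannon entropy, in bits, of a probability distribution. */
function entropyBits(probs: number[]): number {
  let h = 0;
  for (const p of probs) {
    if (p > 0) h -= p * Math.log2(p);
  }
  return h;
}

/**
 * Given how often each action followed the current context in the past,
 * returns the most frequent action if the distribution is confident enough,
 * otherwise null (no suggestion).
 */
function suggestNext(
  counts: Record<string, number>,
  maxEntropyBits = 1,
): string | null {
  const entries = Object.entries(counts);
  const total = entries.reduce((sum, [, c]) => sum + c, 0);
  if (total === 0) return null;
  const probs = entries.map(([, c]) => c / total);
  if (entropyBits(probs) > maxEntropyBits) return null;
  let best: string | null = null;
  let bestCount = -1;
  for (const [action, c] of entries) {
    if (c > bestCount) {
      best = action;
      bestCount = c;
    }
  }
  return best;
}
```

For example, if "trim" followed the current context 9 times out of 10, the entropy is well under 1 bit and `suggestNext` returns `"trim"`; if four actions were equally common (2 bits), it returns `null` and stays quiet.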

We believe that video creation tools today are stuck where developer tools were in the early 2000s: local-first, zero collaboration with really slow feedback loops.

Here are some videos that we made with Cardboard:

- https://www.usecardboard.com/share/YYsstWeWE9KI
- https://www.usecardboard.com/share/nyT9oj93sm1e
- https://www.usecardboard.com/share/xK9mP2vR7nQ4

We would love to hear your thoughts/feedback.

We'll be in the comments all day :)

  • hbardigital

    today at 2:43 AM

    I'm currently building something in the generative AI space and am struggling with pricing. With your fixed price monthly plans, how do you deal with power users who might be blowing through more than $60/month worth of tokens? Do you eat the cost and hope the margins average out? Or have you optimized enough where that's not really a concern?

    • flyingcircus3

      yesterday at 10:15 PM

      Here's an agent skill that lets you do similar things: https://skills.sh/remotion-dev/skills/remotion-best-practice...

      https://www.remotion.dev/docs/ai/claude-code

        • ishandeveloper

          today at 12:01 AM

          We've played around with this and honestly have a lot of respect for what the Remotion team has built. Fun fact, I tinkered with it back in 2021 when they made those GitHub Wrapped videos, it was one of those projects that made me think differently about video on the web :) Cardboard is a bit different though, aimed at non-developers who want to edit raw footage through natural language without writing any code. Motion graphics is on the roadmap and Remotion would hopefully be a natural fit when we get there.

          Cool to see the space evolving from so many directions! :)

      • 1024core

        yesterday at 10:57 PM

        For your example videos that you made with Cardboard: can you also put up the raw material that went into those videos? Just looking at the output doesn't tell me anything. :thanks:!

          • ishandeveloper

            yesterday at 11:19 PM

            Sure! Will share the raw material for all the videos.

            For some of the examples we shared though, we've created sample projects right within the product itself. They contain the raw assets and the exact prompts used to create the videos. You can try them out directly at https://demo.usecardboard.com and see the whole process!

          • popalchemist

            yesterday at 9:58 PM

            Impressive UI. I assume you must be doing some kind of RAG + audio/video transcription on all the media. What RAG architecture did you go with?

              • sxmawl

                yesterday at 10:48 PM

                we've found more success with directions similar to what claude code took. maybe it's closer to hybrid+agentic RAG

                • newbeeguy

                  yesterday at 10:52 PM

                      Firefox is not supported ...
                  
                  But why?

                    • ishandeveloper

                      yesterday at 11:17 PM

                      Totally fair question. I've actually been a longtime Gecko/Firefox user myself, so this one stings a bit.

                      The short answer: Firefox doesn't support the File System Access API (https://caniuse.com/?search=File+System+Access+API).

                      We made a deliberate decision to go client-first. Video editing happens entirely in your browser without us uploading your entire footage on our end. No bandwidth costs for you, no storing your raw video on our servers. The File System Access API is what makes that possible, and unfortunately Firefox just doesn't have it yet.

                      It's not a forever thing though. For cloud-based projects where files live on our end anyway, Firefox support is very much on the roadmap. But for the local-first editing flow, our hands are a bit tied until Mozilla ships it.

                      Hope that makes sense, and fingers crossed Firefox adds support soon!
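A minimal TypeScript sketch of the kind of feature check the reply above implies. This is an assumption about how such a check could be written, not Cardboard's actual code. `showOpenFilePicker` and `showDirectoryPicker` are the real picker entry points of the File System Access API; the helper name `supportsLocalFilePickers` and the `PickerScope` type are hypothetical.

```typescript
// Hypothetical helper: reports whether the local-file pickers of the
// File System Access API are present on a given global scope.
type PickerScope = {
  showOpenFilePicker?: unknown;
  showDirectoryPicker?: unknown;
};

function supportsLocalFilePickers(scope: PickerScope): boolean {
  return (
    typeof scope.showOpenFilePicker === "function" &&
    typeof scope.showDirectoryPicker === "function"
  );
}

// In a browser, the real global would be passed in, for example:
//   supportsLocalFilePickers(globalThis as PickerScope)
// An app could then fall back to an upload-based flow when this is false.
```

Taking the scope as a parameter keeps the check testable outside a browser and makes the fallback decision explicit at the call site.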

              • moralestapia

                yesterday at 7:42 PM

                This is amazing (I'll add you on LinkedIn).

                I recently started making videos for a loved one that lives far away, I started using CapCut and this is the kind of thing I was thinking "I wish it did that".

                I'll definitely try it out. Congrats!

                  • sxmawl

                    yesterday at 7:58 PM

                    that's really cool!

                    lmk if i can help in any way :)

                • barefootford

                  yesterday at 8:36 PM

                  Really impressive work guys! It seems like YC has funded a few companies attacking this but I think you all might have the best approach so far. Behind the scenes is the agent just editing using text/annotated timelines? I feel like the move is probably text for roughcut/narrative, then a vlm for digesting the initial roughcut, then adding broll and fixing timing issues. Feel free to steal my FCP xml generator. https://github.com/barefootford/buttercut

                    • sxmawl

                      yesterday at 9:07 PM

                      happy that you liked our approach! also, i think it's a better idea to just give the agent these tools and let it figure out its course of action than to give it a specific workflow to follow - it seems like the world keeps reminding us of the bitter lesson [http://www.incompleteideas.net/IncIdeas/BitterLesson.html] more frequently these days

                      will definitely check the XML exports, ty :)

                        • barefootford

                          today at 1:49 AM

                          Theoretically I agree, but practically without guidance agents aren't really able to edit video ATM. Without hand holding Claude will just call ffmpeg and look at a few frames.

                            • sxmawl

                              today at 2:27 AM

                              yeah we just ask the user a lot more questions to begin with

                  • calebm

                    yesterday at 7:30 PM

                    This seems like a great idea. Tools like video editors (and CAD) often impose a big learning curve - there is a big differential between "I want to do X" and actually knowing all the right buttons to press to do X. Good luck.

                      • sxmawl

                        yesterday at 7:35 PM

                        appreciate your support!

                    • WaylonKenning

                      yesterday at 8:59 PM

                      Funnily, this was an issue for myself so I built an open source AI video editor - https://github.com/waylonkenning/aidirector

                      Cardboard looks really well polished, well done!

                        • sxmawl

                          yesterday at 9:17 PM

                          damn that's really cool, you ship fast!

                      • michaelevensen

                        yesterday at 9:22 PM

                        Love this idea! I built something similar last year https://www.usecrossfade.com and know how difficult this is to get right - I'm rooting for you guys!

                          • ishandeveloper

                            yesterday at 9:31 PM

                            Thank you! You're right, there are so many subtle things to get right, appreciate the kind words. Crossfade's landing page looks slick btw!

                              • michaelevensen

                                yesterday at 9:34 PM

                                Thanks! Yeah, it can just quickly spiral into this massive product when you take video editing which has a base level of features you sort of expect and add on a whole new paradigm like AI-assisted. But really like your approach!

                        • jimmis

                          yesterday at 8:31 PM

                          Excited to see AI integrations into more non-text-related applications (coding, spreadsheets, proofreading etc). As someone who only occasionally needs to edit videos for product / feature reels, I'd happily ask an AI to "sync the narration to the video, cut away irrelevant footage, and add transitions". The convenience of being able to automate simple, repeatable tasks in creative software via AI is something that gets overshadowed a lot by the agentic coding discussions. I can only imagine the nightmare it would be for a tool like Premiere to integrate effective AI features, so new AI-in-mind tools really feel like a necessity.

                          Great website and good luck!

                            • sxmawl

                              yesterday at 8:59 PM

                              you understood well what we are building. non-text domains certainly have additional challenges and we're working on making it reliable without a learning curve.

                              also, appreciate the kind words on the site — give Cardboard a spin next time you need a product reel!

                          • moinism

                            yesterday at 9:12 PM

                            Wow! congrats on the launch guys. client-side rendering is incredible, really. I saw your product somewhere and have it as an open tab in my chrome for ~2 weeks :D

                            I also saw another YC company, Mosaic, doing something similar. But your approach of chat-based editing is a lot closer to what I'm building. Shameless plug: I'm also working on a chat-based media processor. https://chatoctopus.com

                            But you guys are way ahead! will be looking at you for inspiration.

                              • sxmawl

                                yesterday at 9:33 PM

                                mosaic's approach is also v fresh. curious about the flow after a user q/a with an asset in chatoctopus?

                                and ig it's time to revisit that chrome tab :)

                            • rd

                              yesterday at 7:36 PM

                              Who do you think your target customer is? Curious to know if you think the money is in short form, traditional YouTube videos, or even movie studios one day.

                              Great website btw. The onboarding was very pleasing

                                • sxmawl

                                  yesterday at 7:56 PM

                                  there's value in all the categories you mentioned — we're not focusing on feature filmmakers right now.

                                  target customers usually fall under one of these - marketers / creators / founders

                              • joshribakoff

                                yesterday at 8:56 PM

                                Very cool idea. If your product is about video, please fix your video players. I cannot even seek on my touch screen.

                                  • ishandeveloper

                                    yesterday at 9:24 PM

                                    my bad, I didn't test it enough on touch devices. Just pushed a fix, appreciate you flagging it!

                                    • sxmawl

                                      yesterday at 9:13 PM

                                      ah, ty for notifying about the mobile player. on it!

                                  • RobotToaster

                                    yesterday at 8:25 PM

                                    The 10gb file size is going to be limiting for anyone shooting prores or raw.

                                      • sxmawl

                                        yesterday at 8:28 PM

                                        yeah, i agree. we're actively working on bumping that up. it was 5GB last week

                                        for now, an intermediate solution is to splice and upload.

                                      • deklesen

                                        yesterday at 7:54 PM

                                        Nice demo experience!

                                          • sxmawl

                                            yesterday at 8:10 PM

                                            ty!

                                        • danieltk76

                                          yesterday at 8:10 PM

                                          We use Cardboard at Vulnetic and it is an incredible product. The founders are easily accessible, and it has definitely made it easier to film feature update videos. I can't recommend them enough.

                                            • sxmawl

                                              yesterday at 8:14 PM

                                              glad i'm able to help, i really enjoy working with you!

                                          • regus

                                            yesterday at 11:02 PM

                                            What is the story behind the name?

                                              • ishandeveloper

                                                yesterday at 11:30 PM

                                                haha, good question.

                                                My co-founder and I met in high school, and we wanted the name to carry a sense of craft. Cardboard was always that material in school projects that was firm enough to hold structure but malleable enough to build almost anything out of. That balance of structure and flexibility felt like a good metaphor for what we're building.

                                                Also we just thought it was a cool name and bought a bunch of domains... https://cardboard.mov is one of my favorites :)

                                            • telesilla

                                              yesterday at 10:09 PM

                                              Helpful for those who care less about the craft and more about a quick outcome. Werner Herzog said that he watches his footage a few times, takes extensive notes then edits based on his notes. That's how he crafts such extraordinary, once-in-a-lifetime stories. But for those who are working on commercial or home movies, why not use AI to build a narrative? It can be like throwing dice and the outcome could be OK. Maybe even good.

                                              Regardless, having a tool that knows the content of your footage is a huge time saver. Good luck with the product.

                                                • ishandeveloper

                                                  yesterday at 11:50 PM

                                                  I totally resonate with you. Craft takes time, and that's completely valid. We're not focused on filmmakers right now, though we'd love to have them eventually.

                                                  That's also why we built a full editor alongside the agentic experience. Use AI where it helps, like finding the right shot or removing silences, and do the rest manually. And if you'd rather finish in your editor of choice, we support XML export for Premiere, DaVinci, etc.

                                                  And agreed, there's really no substitute for the kind of intentionality Herzog brings to his work :)

                                              • popalchemist

                                                yesterday at 10:56 PM

                                                As a professional video editor (short-form and feature films) I've always thought realtime collaboration on a timeline makes no sense. Editors' decisions can be mutually destructive / conceptually incompatible.

                                                  • ishandeveloper

                                                    yesterday at 11:42 PM

                                                    Fair point. What we mean by collaboration is closer to how Figma works. From our user interviews, video creation almost always involves multiple people but in different ways: screenwriters, marketers, designers, directors reviewing the edit and sharing feedback.

                                                    The value might not be co-editing the timeline, it's making the feedback / iteration loops faster.

                                                • jhatemyjob

                                                  yesterday at 9:04 PM

                                                  > We built a custom hardware-accelerated renderer on WebCodecs / WebGL2, there’s no server-side rendering, no plugins, everything runs in your browser (client-side).

                                                  Aight imma head out. Holy moly.

                                                    • sxmawl

                                                      yesterday at 9:14 PM

                                                      haha xD

                                                  • adboio

                                                    yesterday at 9:16 PM

                                                    LET'S GOOOOOOO excellent product friends

                                                      • sxmawl

                                                        yesterday at 9:34 PM

                                                        ty ty!

                                                    • TimCTRL

                                                      yesterday at 8:41 PM

                                                      $60...eh

                                                        • ishandeveloper

                                                          yesterday at 11:25 PM

                                                          Totally fair reaction! Here's our honest thinking behind it.

                                                          We deliberately avoided credits/usage-based pricing because as founders using this in our own creative workflow, we hate the cognitive load that comes with it.

                                                          If I don't like a voiceover/variation, I should have the freedom to regenerate it until I'm happy without thinking about whether it's "worth" a credit.

                                                          That said, we could be wrong! Genuinely curious what you think would feel fair?