
Show HN: I built an ISP infrastructure emulator from scratch with a custom vBNG

35 points - today at 1:38 PM


Demo: https://aether.saphal.me
GitHub: https://github.com/saphalpdyl/Aether

Aether is a multi-BNG (Broadband Network Gateway) ISP infrastructure lab, built almost from scratch, that emulates IPoE IPv4 subscriber management end-to-end. It runs a Python-based vBNG with RADIUS AAA, per-subscriber traffic shaping, and traffic simulation, all emulated on Containerlab. It's also my first personal networking project, built over roughly a month.

Motivations behind the project

I'm a CS sophomore. About three years ago, as an intern, I was assigned to build an OSS/BSS platform for a regional ISP by myself. Referencing demo.splynx.com, I developed most of the BSS side (bookkeeping, accounting, inventory management), but on the networking side I only managed to install and set up RADIUS. I didn't have anyone to mentor me or ask questions to, so I gave up.

Three years later, I decided to try cracking it again. This project is meant to serve as a learning reference for anyone who's been in that same position, i.e. staring at closed-source vendor stacks without proper guidance. This is absolutely not production-grade, but I hope it gives someone a place to start.

Architecture overview

The core component, the BNG, runs on an event-driven architecture where state changes are passed around as messages to avoid handling mutexes and locks. The session manager is the sole owner of the session state. To keep it clean and predictable, the BNG never accepts external input directly. The one exception is the Go RADIUS CoA daemon, which passes CoA messages in via IPC sockets. Everything the BNG produces (events, session snapshots) gets pushed to Redis Streams, where the bng-ingestor picks them up, processes them, and persists them.
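
Roughly, the pattern looks like this. This is a simplified sketch, not Aether's actual code: the class name, message shapes, and the bng:events stream key are made up, and it assumes a local Redis server and the redis-py package.

    import json
    import queue
    import threading

    import redis  # pip install redis; assumes a Redis server on localhost

    class SessionManager:
        # Sole owner of session state. All mutations arrive as messages
        # on an in-process queue, so no locks around the state dict.
        def __init__(self, bus, stream):
            self.bus = bus
            self.stream = stream
            self.sessions = {}  # session_id -> state

        def run(self):
            while True:
                msg = self.bus.get()  # the only thread that touches state
                if msg["type"] == "dhcp_bound":
                    self.sessions[msg["session_id"]] = {"ip": msg["ip"], "state": "active"}
                elif msg["type"] == "coa_disconnect":  # e.g. relayed from the Go CoA daemon
                    self.sessions.pop(msg["session_id"], None)
                # Every state change is published for the bng-ingestor to persist
                self.stream.xadd("bng:events", {"payload": json.dumps(msg)})
                self.bus.task_done()

    bus = queue.Queue()
    mgr = SessionManager(bus, redis.Redis())
    threading.Thread(target=mgr.run, daemon=True).start()
    bus.put({"type": "dhcp_bound", "session_id": "sub-42", "ip": "100.64.0.7"})
    bus.join()  # block until the message has been handled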

Simulation and meta-configs

I generate traffic through a simulator node that mounts the host's Docker socket and runs docker exec commands on selected hosts. The topology.yaml that Containerlab uses to define the network topology grows quickly as more BNGs and access nodes are added, so a simpler meta-config, aether.config.yaml, is consumed by the configuration pipeline to generate topology.yaml and the other files (nginx.conf, kea-dhcp.conf, RADIUS clients.conf, etc.).
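
The traffic generation is just docker exec against the mounted socket; via the Python Docker SDK it looks roughly like this (the container name and iperf3 target here are made up):

    import docker  # pip install docker

    client = docker.from_env()  # talks to the mounted /var/run/docker.sock
    host = client.containers.get("clab-aether-subscriber1")  # hypothetical node name
    exit_code, output = host.exec_run("iperf3 -c 10.0.0.1 -t 10")
    print(exit_code, output.decode())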

Known Limitations

- Multiple veth hops through the emulated topology add significant overhead. Profiling with iperf3 (-P 10 -t 10, 9500 MTU, 24 vCPUs) shows BNG→upstream at ~24 Gbit/s, but host→BNG→upstream drops to ~3.5 Gbit/s. The 9500 MTU also isn't representative of real ISP deployments, and reintroducing the actual network caps my local throughput at ~1.6 Gbit/s.
- The circuit ID format (1/0/X) is non-standard; I simplified it for clarity.
- No iBGP or VLAN support.
- No IPv6 support. I wanted to target IPv4 networks from the start to avoid spreading too wide without depth.

Nearly everything I know about networking (except for some AWS material) I learned building this. A lot was figured out on the fly, so experienced engineers will likely spot questionable decisions in the codebase. I'd genuinely appreciate that feedback.

Questions

- Currently, the circuit a user connects on is arbitrarily chosen by the demo user. In a real system with thousands of circuits, it would be hard to assess which circuit a customer might connect on. When adding a new customer to a service, how does the operator decide, based on the customer's location, which circuit to provision the service on?

  • yjftsjthsd-h

    today at 6:17 PM

    Forgive my ignorance, this isn't my strong suit. Am I correct in understanding that this is mostly a simulation layer for the actual physical network, but that you're mostly(?) running off-the-shelf software on top? So this is running the same software that you'd use for a real ISP network, just without having to actually provision all the hardware? Or is part of the actual network management custom as well?

      • saphalpdyl

        today at 7:58 PM

        Hello. Containerlab gives me the virtual network topology (links through veth pairs, containers, etc.). The actual BNG control plane (authentication, authorization, session handling, traffic shaping, event streaming, etc.) is written by me. So it's less running off-the-shelf software on virtualized hardware, and more writing the software myself and running it on virtualized hardware.

        At some point, I did use Nokia SR Linux as my access node + relay, but I had issues with configuration and Option 82, so I later wrote one myself.
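
        For reference, Option 82 (DHCP Relay Agent Information, RFC 3046) is just a TLV wrapping sub-options; a relay inserting a circuit ID builds something like this (illustrative byte-level sketch, not my relay's actual code):

            def build_option82(circuit_id: bytes) -> bytes:
                # Sub-option 1 = Agent Circuit ID (RFC 3046)
                sub = bytes([1, len(circuit_id)]) + circuit_id
                # Option 82 itself: code byte, total length, sub-options
                return bytes([82, len(sub)]) + sub

            print(build_option82(b"1/0/3").hex())  # using the simplified 1/0/X format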

    • john_strinlai

      today at 6:50 PM

      this looks pretty interesting! i plan to take a closer look after work, but thought i would mention it now: it may be worth a look through the NANOG (North American Network Operators Group) archives (https://nanog.org/nanog-mailing-list/list-archives/) for information around your question if you haven't, and/or posting your question to the NANOG mailing list. there are many very friendly people who have experience running ISPs of all sizes.

      (or whichever operators group best fits your area. i only subscribe to NANOG, so can't speak to the activity/friendliness of the other groups. you can find a pretty comprehensive list here: https://nanog.org/resources/organizations-our-community/)

      • saphalpdyl

        today at 5:45 PM

        I recently found out about NetBox, which could act as the authoritative source of truth for the network topology and replace the majority of aether.config.yaml. In Splynx, I did not see any mention of an external solution; it seems they have their own stack for that.

        A better, more UX-friendly implementation would be NetBox + aether.config.yaml -> configuration pipeline -> topology.yaml + <other generated files>.
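
        The expansion step itself is conceptually simple; roughly like this (a sketch with invented aether.config.yaml field names, and the real pipeline also emits the nginx/Kea/RADIUS configs):

            import yaml  # pip install pyyaml

            meta = yaml.safe_load(open("aether.config.yaml"))  # e.g. {"bngs": 2, "access_nodes": 3}

            topo = {"name": "aether", "topology": {"nodes": {}, "links": []}}
            for i in range(meta["bngs"]):
                topo["topology"]["nodes"][f"bng{i}"] = {"kind": "linux", "image": "aether/vbng"}
            for i in range(meta["access_nodes"]):
                topo["topology"]["nodes"][f"access{i}"] = {"kind": "linux", "image": "aether/access"}
                topo["topology"]["links"].append(
                    {"endpoints": [f"access{i}:eth1", f"bng{i % meta['bngs']}:eth{i + 1}"]}
                )

            yaml.safe_dump(topo, open("topology.yaml", "w"), sort_keys=False)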

        • nonameiguess

          today at 6:26 PM

          I feel like you were done dirty. When I was in grad school 12 years ago, our networking classes used Mininet to simulate networks on a single host. It's mostly meant for developing SDN systems, but it probably would have met your needs, and it supports way more.

          On the other hand, building even a tiny subset yourself from scratch is a great way to learn. I made a very poor man's VM image builder for Hyper-V years back because Packer didn't have a builder for it at the time, and that was a pretty interesting experience. I finally grokked the Windows object model, and even though I still don't use it, I at least no longer jeer at PowerShell.

          I'm interested in the answer to your question, too, but as a customer of an ISP. I don't work for one. I was the first owner of my house, and when they hooked me into their network, whoever did it messed up my neighbors badly, putting them on the wrong circuit and bleeding noise into adjacent neighborhoods. For three years, complaint calls would get our network cut by third-party contractors with no warning, and then we'd have to call and get it reconnected. I don't know how they're supposed to do it, but I know it can cause quite a mess when they do it wrong.
