2026-07-04 12:10 UTC | In-site rewrite | 3 min read | Updated: 2026-07-04 12:40 UTC

Create your own AI, then watch it battle others in your browser

Agenlus is a browser-based platform for reinforcement learning that requires no installation. It uses WebGPU and Pyodide to train AI models locally on user devices, achieving zero infrastructure cost. The platform aims to democratize RL, enabling anyone to train and share AI agents through community features, leaderboards, and gamification.

Source: Hacker News AI | Author: umjunsik132

A Quick Primer: What is Reinforcement Learning (RL)?

If you have mostly worked with Large Language Models (LLMs) or Supervised Learning, RL is a shift in mindset:

Supervised Learning (like predicting the next token in a text file or classifying an image) relies on a static dataset of known correct answers.

Reinforcement Learning relies on active feedback loops. An Agent (our AI model) interacts with an Environment (the game or simulation). It takes an Action, receives an Observation (the new state of the environment) and a Reward (a signal telling the agent how well it is doing), and repeats.

The agent’s goal is to learn a Policy (a mapping from observations to actions) that maximizes the cumulative reward over time. It starts completely random and improves purely through trial and error.
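The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, not Agenlus code: `CorridorEnv`, `run_episode`, and `make_random_policy` are hypothetical names, and the environment only mimics the `reset`/`step` shape of a Gymnasium-style API with a simplified three-value return.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: reach the right end of a short corridor."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # the initial observation

    def step(self, action):
        # Action 1 moves right, any other action moves left; walls clamp the position.
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        terminated = self.position == self.length - 1
        reward = 1.0 if terminated else -0.01  # small penalty per step, bonus at the goal
        return self.position, reward, terminated


def run_episode(env, policy, max_steps=50):
    """Run one agent-environment loop and return the cumulative reward."""
    observation = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(observation)                        # agent picks an action
        observation, reward, terminated = env.step(action)  # environment responds
        total_reward += reward
        if terminated:
            break
    return total_reward


def make_random_policy(seed=0):
    """An untrained policy: ignores the observation and acts at random."""
    rng = random.Random(seed)
    return lambda observation: rng.choice([0, 1])


if __name__ == "__main__":
    print(run_episode(CorridorEnv(), make_random_policy()))
```

A policy here is just a function from observation to action, so swapping the random policy for a learned one does not change the loop itself.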

Because there is no “correct answer” provided upfront, RL agents often find incredibly clever, emergent ways to solve games that developers never anticipated. However, this trial-and-error process is computationally intensive and requires millions of interactions, which is why bringing it directly to the browser is both challenging and exciting.
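To make "improves purely through trial and error" concrete, here is a small sketch of tabular Q-learning with epsilon-greedy exploration on a five-cell corridor. The algorithm is swapped in purely for illustration; the source does not say which algorithms Agenlus uses, and the environment, hyperparameters, and function names are all assumptions.

```python
import random

N_STATES = 5      # corridor cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy dynamics: returns (next_state, reward, terminated)."""
    move = 1 if action == 1 else -1
    next_state = min(max(state + move, 0), N_STATES - 1)
    terminated = next_state == N_STATES - 1
    return next_state, (1.0 if terminated else -0.01), terminated


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values from repeated interaction."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, terminated = step(state, action)
            # Q-learning target: reward plus discounted best value of the next state.
            target = reward if terminated else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if terminated:
                break
    return q


def greedy_policy(q):
    """Best known action for each non-goal cell."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Even this tiny problem takes a few hundred environment steps to settle; richer environments with image observations or continuous control multiply that cost, which is where the "millions of interactions" figure comes from.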

Reinforcement Learning (RL) has always been one of the most fascinating branches of AI. There is something deeply satisfying about watching a blank-slate agent explore an environment and gradually emerge with a superhuman policy.

Yet, compared to the explosive growth of LLM playgrounds and tools, RL remains relatively inaccessible. Setting up a local environment often means wrestling with Python virtual environments, CUDA versions, PyTorch installations, and headless rendering bugs in Gymnasium.

We built Agenlus to solve this. It is a community platform and model hub for Reinforcement Learning designed to run entirely in the browser—no installation, no CUDA configuration, just instant training and evaluation.

Democratizing Reinforcement Learning

For the past decade, state-of-the-art Reinforcement Learning has been the exclusive playground of elite corporate labs and well-funded academic institutions. Whether it is Google DeepMind’s AlphaGo, OpenAI’s Dota 2 bots, or sophisticated industrial robotics control, RL has required access to massive compute clusters, complex simulator setups, and specialized mathematical expertise.

This centralization has stifled the creative potential of independent developers and researchers. While anyone can easily prompt a large language model online, starting out with RL means wrestling with local environment setup, GPU drivers, and virtualization, then waiting hours for even a simple agent to converge.

We believe RL needs to be democratized.

By leveraging modern web technologies, we want to break down these barriers:

Lowering the Entry Barrier: You don’t need a high-end local machine or an AWS budget to experiment with RL. If you have a browser, you have a fully functional RL research lab.

Open-Source Environment Sharing: Just as Hugging Face democratized NLP by making models easy to share, Agenlus allows developers to upload, share, and benchmark environments instantly.

Interactive Learning: Seeing the training process happen live in the browser builds a deep, intuitive understanding of how agent policies adapt to rewards.

By putting the tools of RL directly into the hands of the global developer community, we aim to accelerate the discovery of novel control architectures and algorithms that corporate labs might overlook.

Why B2C RL is Highly Viable Today

We are currently witnessing immense compute inflation dominated by LLMs. This has made building B2C AI startups incredibly expensive, forcing founders to choose between paying massive cloud GPU invoices or raising millions in venture capital.

We believe Reinforcement Learning (RL) is structurally primed to break this cycle and lead a new wave of highly profitable B2C AI applications for four key reasons:

Zero Marginal Infrastructure Cost: Unlike LLMs where every inference token costs API credits, RL training and inference in Agenlus run 100% locally on the user’s client hardware via WebGPU. Our server costs are virtually zero. This allows us to scale to millions of active users and offer a permanent free tier without burning through compute credits, shifting the monetization focus to marketplace transactions and custom assets.

Extreme Model Efficiency: While a decent LLM requires billions of parameters, high-performing RL agents for games (even complex 2D/3D platformers and control tasks) are incredibly lightweight. A small Multi-Layer Perceptron (MLP) or a tiny Convolutional Neural Network (CNN) of under 100K parameters is often enough to achieve superhuman policies. These models load instantly and execute hundreds of steps per second on entry-level mobile devices or laptops.

Gamification and Natural Viral Loops: Generative AI tools are mostly focused on productivity. In contrast, training an RL agent is inherently gamified. It feels like nurturing a digital pet (like a Tamagotchi) or coaching a sports team. By transitioning the agent from direct control to autonomous training, users develop a deep emotional attachment to their agent’s playstyle. When you add competitive leaderboards and multi-agent PvP arenas, you create a natural, viral social loop (“My agent can beat yours”) that drives organic growth without expensive customer acquisition costs.

Crowdsourcing the Future of Offline Data: RL’s biggest bottleneck in industries like robotics is the lack of diverse, high-quality demonstration datasets. By building a B2C platform where users play to train agents, we are crowdsourcing a massive library of human behavioral trajectories across thousands of environments. This diverse dataset is a goldmine for training future foundational models that generalize across multiple control domains.
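To put the "under 100K parameters" figure from the model-efficiency point in perspective, the arithmetic for a fully connected policy network is easy to check. The layer sizes below are assumptions chosen for illustration, not Agenlus defaults.

```python
def mlp_parameter_count(layer_sizes):
    """Count weights plus biases of a fully connected network."""
    return sum(
        n_in * n_out + n_out  # weight matrix plus bias vector per layer
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical policy: 8 observation features, two hidden layers of 64, 4 actions.
small_policy = [8, 64, 64, 4]
# Hypothetical wider policy: 16 features, two hidden layers of 128, 6 actions.
wider_policy = [16, 128, 128, 6]

print(mlp_parameter_count(small_policy))      # 4996
print(mlp_parameter_count(wider_policy))      # 19462
print(mlp_parameter_count(small_policy) * 4)  # 19984 bytes when stored as float32
```

Both hypothetical networks sit far below 100K parameters, and the smaller one occupies under 20 KB as float32, which is consistent with the claim that such models load almost instantly.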

See It in Action

Here is a quick YouTube demonstration of the platform in action:

And here is a preview of the web UI running a client-side training loop:

Here is how we built the architecture to make in-browser RL viable, the UX challenges we encountered, and why we believe RL is prime for a B2C comeback.

The Architecture: WebGPU + Pyodide + Web Worker

To achieve a zero-install experience, we had to move both the environment simulation and the model training into the client browser.
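As a rough illustration of what running the environment inside a Web Worker could involve, the sketch below shows a Python-side message handler of the kind that might run under Pyodide. The JSON message format, `handle_message`, and `CounterEnv` are all hypothetical; the source does not describe the actual protocol between the worker and the main thread.

```python
import json


class CounterEnv:
    """Hypothetical stand-in for a Gymnasium-style environment."""

    def __init__(self, target=3):
        self.target = target
        self.value = 0

    def reset(self):
        self.value = 0
        return [self.value]

    def step(self, action):
        # Action 1 increments the counter; the episode ends at the target.
        if action == 1:
            self.value += 1
        terminated = self.value >= self.target
        reward = 1.0 if terminated else 0.0
        return [self.value], reward, terminated


def handle_message(env, raw_message):
    """Process one JSON message from the main thread and return a JSON reply."""
    message = json.loads(raw_message)
    if message.get("type") == "reset":
        observation, reward, terminated = env.reset(), 0.0, False
    elif message.get("type") == "step":
        observation, reward, terminated = env.step(message["action"])
    else:
        return json.dumps({"type": "error", "detail": "unknown message type"})
    # The reply carries what the main thread needs for its next action choice.
    return json.dumps({
        "type": "state",
        "observation": observation,
        "reward": reward,
        "terminated": terminated,
    })


if __name__ == "__main__":
    env = CounterEnv()
    print(handle_message(env, json.dumps({"type": "reset"})))
    print(handle_message(env, json.dumps({"type": "step", "action": 1})))
```

Keeping the exchange to small serializable messages is one plausible design choice here, since the worker and the main thread do not share Python or JavaScript objects directly.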

Our core architecture splits the load into three parts:

Browser Context
├── Web Worker (Pyodide)
│   ├── gymnasium (micropip)
│   ├── env.step(action) -> observation, reward, terminated
│   └── Sends state updates to the main thread
│
├── Main Thread (WebGPU & JS)
│   ├── Model inference & policy updates
│   ├── Action selection (via WebGPU-accelerated tensors)
│   └── Drawing Commands Bridge (rendering to Canvas)