Choosing the right tool for each moderation task

Content moderation is one of those problems that looks simple from the outside. You have content. You want to stop the harmful stuff.
Just point an AI at it and let it run.

Anyone who has actually tried to moderate an online platform at scale knows it does not work that way. Finding an AI model that can detect harmful content is the easy part. Building a system that does it consistently, at volume, in real time, without breaking your budget or compromising user privacy is a different challenge entirely. Most AI moderation systems are not designed for that. They are designed for demos. Amanda is Aiba’s AI content moderation platform, and it was built for that reality.

Why single-model moderation falls short

Large language models are genuinely impressive at moderation tasks. They understand context, interpret policy nuance, and catch patterns that simpler tools miss entirely. In controlled conditions they perform well, but production is not a controlled condition.

When you push all your moderation decisions through a large model at scale, three problems surface quickly:

  • Cost scales with volume. Every message, every post, every username check costs money.
    On a busy platform, that adds up fast.
  • Latency compounds. Large models take longer to respond than lightweight checks.
    In real-time workflows, even small delays create friction that users feel.
  • Most decisions are not complicated.
    Routing obvious violations through a powerful reasoning model when a pattern check would catch them instantly is expensive and unnecessary.

There is also a privacy dimension. When all your content flows through an external model, the question of where that model runs, who operates it, and what data is retained becomes critical. For platforms serving younger audiences or operating under GDPR, control over that data is a requirement that has to be designed in from the start.

This is why building a moderation system that works in production is more complex than simply connecting an AI model to your platform. Many teams eventually face the question of whether to build that infrastructure internally or adopt a system designed for it. Our guide on buying vs building a content moderation system explores the tradeoffs.

Aiba’s toolbox

The right tool for every decision

Deterministic Filtering

The first tool is deterministic filtering: a blocklist combined with pattern recognition. It handles the clear cases: known slurs, banned phrases, obvious violations. When something new appears in your community, say a coded term your users start using to evade detection, you can add it to the blocklist immediately and it is filtered from that moment forward. No retraining, no waiting.

This layer is deliberately simple. Fast, cheap, and fully auditable. That is exactly what it is supposed to be.
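
To make that concrete, a deterministic layer can be as plain as a term set plus a few compiled patterns. The sketch below is illustrative; the terms, patterns, and function name are placeholders, and in a real deployment they would live in configuration that moderators can update on the spot:

    import re

    # Placeholder term list and patterns; a real deployment would load these
    # from configuration that moderators can update without a code change.
    BLOCKED_TERMS = {"examplebadword", "anotherbannedphrase"}
    BLOCKED_PATTERNS = [re.compile(r"free\s+gift\s*cards?", re.IGNORECASE)]

    def deterministic_check(text: str) -> bool:
        """Return True if the text contains a known term or matches a pattern."""
        lowered = text.lower()
        if any(term in lowered for term in BLOCKED_TERMS):
            return True
        return any(pattern.search(text) for pattern in BLOCKED_PATTERNS)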

Small Language Models (SLM)

The second tool is small language models built and hosted by Aiba. Because they run on Aiba’s own infrastructure, your content never leaves a controlled environment. They are fine-tuned for specific moderation tasks: username screening, message filtering, conversation-level analysis. That specialisation is what makes them efficient enough to run at high volume without cutting corners on accuracy.

A small model trained on general text will not hold up against the specific vocabulary, slang, and patterns found in gaming communities or social platforms. Aiba’s models are trained for those environments, and that focus is what makes the difference.
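
For illustration, here is roughly what calling a task-specific classifier looks like with an off-the-shelf library. The model identifier and label names are placeholders; this is not Aiba's hosted API, just a sketch of the shape of the task:

    from transformers import pipeline

    # Placeholder model identifier; Aiba's task-specific models run on their
    # own infrastructure and are not a public checkpoint.
    username_screen = pipeline("text-classification",
                               model="your-org/username-screening-model")

    result = username_screen("xX_definitely_fine_name_Xx")[0]
    # Label names depend on how the classifier was trained.
    if result["label"] == "violation" and result["score"] >= 0.9:
        print("flag username for review")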

Large Language Models (LLM)

The third and most capable tool is the large language model layer.
It is reserved for the decisions that genuinely require it. Large language models bring a depth of contextual understanding that lighter tools cannot replicate. But in Aiba's architecture, they also serve a second role: teachers.

Through a process called distillation, these powerful models train Aiba’s smaller, faster models to improve over time. The large model’s judgment gets baked into the small model’s behaviour, so you get sharp, context-aware decisions at the speed and cost of a lightweight system. In practice, that means your moderation gets more accurate over time without your infrastructure costs going up.
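
The idea is easier to see in code. A common way to implement distillation is to train the small model against the large model's probability distribution rather than against hard labels. The loss below is the standard soft-target formulation, shown as an illustration rather than Aiba's actual training pipeline:

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        """Soft-target distillation loss: the student is pushed to match the
        teacher's probability distribution over moderation labels."""
        soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
        student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
        # KL divergence between the two distributions, scaled by T^2 so the
        # gradient magnitude stays comparable across temperatures.
        return F.kl_div(student_log_probs, soft_targets,
                        reduction="batchmean") * temperature ** 2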

The system behind the system

Aiba has spent a significant amount of time studying how moderation teams work: where decisions get made, where backlogs pile up, where reviewers burn out, and where the system buckles under pressure. That understanding shapes the routing logic that connects all the tools.

Most content is resolved at the first layer and never needs to go further. A large share of what remains is handled by the small language models. Only a small fraction reaches the LLM layer. The result is a system that can handle the full volume of a busy platform without burning through infrastructure costs or creating bottlenecks.

The real engineering lies in the orchestration.

That means designing routing logic that decides, accurately and in real time, which layer handles each piece of content. It means fine-tuning models for each customer’s platform, community, and policy context. And it means managing inference cost, latency, privacy, and quality at the same time, at scale. That orchestration is what turns a set of capable models into a system you can actually rely on. It is the difference between a convincing demo and something that can protect a real platform.
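
In simplified form, that routing is a cascade: cheap checks first, escalation only when confidence is low. The sketch below assumes illustrative helper callables and a made-up confidence threshold; it is not Amanda's actual interface:

    from typing import Callable, Tuple

    def moderate(
        content: str,
        deterministic_check: Callable[[str], bool],
        slm_classify: Callable[[str], Tuple[str, float]],
        llm_review: Callable[[str], str],
        slm_threshold: float = 0.9,
    ) -> str:
        """Route one piece of content through progressively heavier layers."""
        # Layer 1: blocklist and pattern checks, effectively free.
        if deterministic_check(content):
            return "block"
        # Layer 2: a task-specific small model handles most remaining traffic.
        label, confidence = slm_classify(content)
        if confidence >= slm_threshold:
            return label
        # Layer 3: only low-confidence, context-heavy cases reach the large model.
        return llm_review(content)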

If you are building or rethinking your moderation infrastructure, we would be glad to show you how Amanda works in practice. Not a pitch, just a walkthrough of the system and an honest conversation about whether it fits.