Best practices for scalable enforcement that balance fairness, safety, and compliance
-
IntouchCX Team
Imagine one of your platform’s users posts an AI-generated video of a politician that looks eerily real. It spreads fast, shared by more users each hour, sowing misinformation about the candidate’s statements and policies just days before an election. Your team scrambles to respond, and even if a rebuttal goes out quickly, the damage is already underway. Should the video have been flagged earlier? Should it be taken down or just labeled? And who decides?
These are the real-world questions facing every trust and safety team today.
At the Marketplace Risk conference in Austin, Alexandra Popken, VP of Trust & Safety at WebPurify, an IntouchCX company, unpacked the practical challenges and solutions for enforcing content policies at scale without sacrificing fairness, safety, or compliance.
1. Start with principles, not just policies
“We are not arbiters of truth,” Alex points out. “Our focus should be on problematic content that has been widely and, in some cases, scientifically debunked.”
Too often, she says, companies dive into enforcement without first defining their values. Alex believes platforms need to start by being honest about their mission. What is your risk tolerance? What are your operational limitations? These are the questions to ask first.
She also recommends acknowledging internal constraints, such as limited moderator coverage in certain languages or regions. Transparency within your organization helps avoid unrealistic expectations and misaligned priorities.
Ultimately, Alex advocates a harm-based approach, one that focuses on the highest-impact, most directly harmful content. “Whether it’s misinformation shared maliciously or accidentally, the impact is the same,” Alex notes. The goal should be to mitigate harm, not determine intent.
2. Use a harms-based framework for policy scope
Once you’ve established your internal principles, how do you determine which types of misinformation are actually enforceable? Alex suggests defining a set of North Star Criteria: a decision-making framework that helps you decide what content is in or out of scope for enforcement.
Her recommended North Star Criteria consist of three core questions:
- Is the content confirmed to be false or misleading?
- Does it pose a high potential for physical, financial, or societal harm?
- Is it likely to reach a wide audience quickly, amplifying its impact?
This framework is an especially useful benchmark when new misinformation narratives emerge, as it lets your trust and safety team assess those narratives through a consistent lens. For example, a misleading post about vaccine ingredients may meet all three criteria, while a regional rumor about a product recall may not. Having these criteria upfront helps teams prioritize enforcement resources more efficiently and reduce ambiguity for moderators.
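To make this concrete, here is a minimal sketch of how the three questions might be encoded as a simple in-or-out-of-scope check. The field names, threshold logic, and example inputs are our own illustration, not part of Alex’s framework.

```python
from dataclasses import dataclass

# Illustrative sketch only: field names and example data are assumptions made
# for this post, not an official implementation of the North Star Criteria.

@dataclass
class MisinfoAssessment:
    confirmed_false: bool        # Q1: confirmed false or misleading?
    high_harm_potential: bool    # Q2: potential for physical, financial, or societal harm?
    likely_to_go_viral: bool     # Q3: likely to reach a wide audience quickly?

def in_scope_for_enforcement(assessment: MisinfoAssessment) -> bool:
    """Return True when all three North Star questions are answered 'yes'."""
    return (
        assessment.confirmed_false
        and assessment.high_harm_potential
        and assessment.likely_to_go_viral
    )

# Example from the text: a misleading vaccine-ingredient post may meet all three
# criteria, while a regional product-recall rumor may not.
vaccine_post = MisinfoAssessment(True, True, True)
recall_rumor = MisinfoAssessment(False, False, False)
print(in_scope_for_enforcement(vaccine_post))   # True  -> prioritize enforcement
print(in_scope_for_enforcement(recall_rumor))   # False -> out of scope
```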
3. Leverage a cross-platform analysis
To refine what your team focuses on, Alex recommends conducting a cross-platform analysis, or in layman’s terms, a review of how other platforms are enforcing similar policies. The goal here isn’t simply to copy your competitors but to better understand patterns and shared priorities.
“It turns a nebulous policy area into something enforceable,” she says. By examining what Facebook, YouTube, TikTok, and others flag or remove, for example, you can identify the misinformation categories that are most actionable and high-risk.
Alex shares several enforcement themes that consistently rise to the top across platforms:
- Civic disruption: misinformation about elections or census processes.
- Climate denial: content that contradicts established scientific consensus on climate change.
- Health misinformation: false health claims that could lead to injury or death.
- Mass atrocity denial: conspiracy theories that rewrite or deny historical tragedies.
- Synthetic media: deepfakes or AI-manipulated content designed to deceive.
A cross-platform analysis gives your team a shared vocabulary and a stronger reference point when building enforcement priorities. It also helps with executive alignment when you can show your approach is grounded in industry norms.
4. Build scalable workflows (not just policies)
Policies are the what. Workflows are the how.
Alex explains that it’s not enough to just write a good policy. If moderators don’t have clear, actionable guidance, then enforcement of that policy across your platform becomes inconsistent and ultimately biased against certain users. That’s why symptom-based decision trees — simple yes/no flowcharts for different violation types — are hugely important.
“You want to remove as much subjectivity as possible from the process,” Alex says. These workflows empower moderators to act confidently and consistently, especially under time pressure. Moderators often have to make decisions within seconds.
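As a rough illustration, a symptom-based decision tree can be represented as nothing more than nested yes/no branches that end in an action. The questions, action labels, and tree shape below are hypothetical, invented for this sketch rather than taken from any platform’s actual workflow.

```python
# Hypothetical yes/no decision tree a moderation tool might walk through.
# Question wording, action names, and structure are illustrative assumptions.

DECISION_TREE = {
    "question": "Does the content make a factual claim that has been widely debunked?",
    "yes": {
        "question": "Could acting on the claim cause physical, financial, or societal harm?",
        "yes": {"action": "remove_and_log"},
        "no": {"action": "label_with_context"},
    },
    "no": {"action": "no_violation"},
}

def walk_tree(node: dict, answers: list[bool]) -> str:
    """Follow a moderator's yes/no answers down the tree until an action is reached."""
    for answer in answers:
        if "action" in node:
            break
        node = node["yes"] if answer else node["no"]
    return node["action"]

# Answering "yes" to both questions lands on removal; "yes" then "no" lands on labeling.
print(walk_tree(DECISION_TREE, [True, True]))   # remove_and_log
print(walk_tree(DECISION_TREE, [True, False]))  # label_with_context
```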
They also prepare your team for scale. Whether you’re managing a handful of moderators or hundreds, consistency is key to building trust with your users and minimizing reputational risk.
And don’t forget agility. As Alex says, “There weren’t COVID misinformation policies pre-pandemic. Policies had to evolve as new information emerged.” Treat your policies and workflows as living documents that adapt to new threats and data.
5. Adopt a hybrid moderation model
Scaling enforcement means knowing when to delegate.
Alex outlines three primary models:
1. Fully in-house: this offers control and deeper platform knowledge but is more expensive and slower to scale.
2. Third-party moderation: this is more cost-effective and scalable, but can lack context and training.
3. Tech-only enforcement: this is faster and more efficient, but risks false positives and misses nuance.
So what is the sweet spot? The perfect balance? A hybrid approach.
“Your in-house team should own the policies and workflows,” Alex said. “Then leverage a third-party moderation team to enforce those policies at scale. Finally, bring in tech and data providers to flag emerging trends.”
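One hypothetical way to picture that division of labor is a simple routing function: automation handles the clearest cases, the in-house team reviews emerging narratives and owns policy updates, and everything else goes to the third-party moderation queue. The field names, queue names, and confidence threshold here are assumptions made for illustration.

```python
# Illustrative sketch of the hybrid model's division of labor. The function,
# field names, and queues are assumptions for this example, not a real API.

def route_flagged_item(item: dict) -> str:
    """Route a flagged item to the layer best suited to handle it."""
    if item.get("auto_flagged") and item.get("confidence", 0.0) >= 0.98:
        # High-confidence automated detections can be actioned by tech alone.
        return "automated_action_queue"
    if item.get("emerging_narrative"):
        # New misinformation trends surfaced by data providers go to the
        # in-house team, which owns policy and workflow updates.
        return "in_house_policy_review"
    # Everything else is enforced at scale by the third-party moderation team.
    return "third_party_moderation_queue"

print(route_flagged_item({"auto_flagged": True, "confidence": 0.99}))
print(route_flagged_item({"emerging_narrative": True}))
print(route_flagged_item({"auto_flagged": True, "confidence": 0.60}))
```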
Alex points to a recent partnership as a case in point: WebPurify and NewsGuard, an organization that provides reliability ratings for news and information websites, teamed up with a leading ad tech company. WebPurify provided the human moderators and policy guidance, while NewsGuard delivered real-time alerts on emerging misinformation narratives. The result was a 99.24% agreement rate between moderators, creating high-quality data that could be used to train and validate the company’s AI model.
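For readers unfamiliar with the metric, a percent-agreement rate like that one is simply the share of items on which two reviewers reach the same decision. The sketch below shows the arithmetic with invented data; it is not the actual methodology used in the partnership.

```python
# Minimal sketch of a percent-agreement calculation on paired decisions.
# The decision labels and sample data are invented for illustration.

def agreement_rate(decisions_a: list[str], decisions_b: list[str]) -> float:
    """Share of items where two moderators (or a moderator and a model) agree."""
    assert len(decisions_a) == len(decisions_b), "decision lists must align"
    matches = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return matches / len(decisions_a)

mod_a = ["remove", "label", "no_action", "remove"]
mod_b = ["remove", "label", "no_action", "label"]
print(f"{agreement_rate(mod_a, mod_b):.2%}")   # 75.00%
```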
This kind of partnership allows platforms to enforce policies consistently and confidently, without having to solely rely on automation.
6. Plan for major events
High-risk moments need high-impact strategies. Moderating harmful content in real time, such as during unfolding crisis events, presents a whole new set of challenges.
During election seasons, major news cycles, or regional or global crises, misinformation surges. That’s why Alex recommends standing up temporary “war rooms” or “tiger teams” to focus on those moments.
“It’s important to be proactive, not reactive,” she explains. These short-term teams can move faster, escalate issues more efficiently, and stay aligned on platform goals during critical times.
Tech tools help here too. “We rely heavily on NewsGuard to stay ahead of new trends, and on resources like the Ad Fontes Media Bias Chart,” Alex says. These tools help ensure your moderation team is using nonpartisan fact-checking sources, which is crucial when addressing politically sensitive topics.
Planning ahead also reduces burnout. When your moderators know a strategy is in place for high-volume events, they can work with greater confidence and less stress.
7. Design appeals and feedback loops thoughtfully
How do you handle appeals, especially in gray areas? This is a real challenge.
“Transparency is everything,” Alex says. She emphasizes the need for clear user education, ideally before enforcement even occurs. That means publishing clear policies, providing examples, and explaining what happens when someone’s content is flagged.
Appeals should also be handled by specially trained moderators who understand the nuances of misinformation. “You don’t want to get into philosophical debates,” Alex says. “If someone tells others to take a shot of bleach to cure COVID, we can all agree: that’s misinformation.”
User reporting flows should also include dedicated categories for misinformation. That way, issues can be triaged and escalated appropriately, rather than being lumped into generic buckets.
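A dedicated category can be as simple as one more value in the reporting taxonomy, with misinformation reports sorted to the front of the review queue. The categories and priority ordering below are a hypothetical example, not WebPurify’s or any platform’s actual schema.

```python
from enum import Enum

# Hypothetical report categories and triage priorities, invented for this sketch.

class ReportCategory(Enum):
    SPAM = "spam"
    HARASSMENT = "harassment"
    MISINFORMATION = "misinformation"   # dedicated category, not a generic bucket
    OTHER = "other"

TRIAGE_PRIORITY = {
    ReportCategory.MISINFORMATION: 1,   # escalate to specially trained moderators
    ReportCategory.HARASSMENT: 2,
    ReportCategory.SPAM: 3,
    ReportCategory.OTHER: 4,
}

def triage(reports: list[dict]) -> list[dict]:
    """Order incoming user reports so misinformation is reviewed first."""
    return sorted(reports, key=lambda r: TRIAGE_PRIORITY[ReportCategory(r["category"])])

queue = triage([
    {"id": 1, "category": "spam"},
    {"id": 2, "category": "misinformation"},
])
print([r["id"] for r in queue])   # [2, 1] -> misinformation report reviewed first
```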
Conclusion
Ultimately, there’s no silver bullet for scalable enforcement. But there is a system.
By defining your principles early, using a harm-based framework, building actionable workflows, and combining human expertise with the right tech and partners, platforms can navigate the complexity of misinformation with greater confidence.
Enforcement at scale doesn’t mean sacrificing fairness or nuance. It means operationalizing them.
As online risks and their associated challenges continue to evolve, our Trust & Safety Consultancy is helping brands across all industries meet the moment, offering everything from end-to-end risk audits to regulatory compliance to establishing community guidelines and enforcement practices. We’re here to help.