How we calculate the Gap Score
The Gap Score is our proprietary 0–100 metric measuring how much a story is trending on social media relative to its mainstream media (MSM) coverage. A score of 94 means enormous social interest with almost no MSM coverage, which is a significant coverage gap signal. Here's exactly how we compute it.
Important
A high Gap Score means a story is receiving significantly more social attention than mainstream coverage. It is a signal for further investigation, not a verdict on truth or importance.
Core formula
Gap Score = (SVI − MCI) × SDM × SW
Normalized to 0–100 scale · Updated every 15 minutes
SVI · Rate of social mention acceleration across platforms
MCI · Article count × outlet prominence × recency decay
SDM · Up to 1.35× when public sentiment diverges from MSM framing
SW (Coverage Anomaly Weight) · Up to 1.25× when cross-partisan silence or throttling detected
1. Social Velocity Index (SVI)
We measure mention volume and acceleration across topic-relevant subreddits. The base velocity uses a saturating exponential curve over the on-topic post count n (score = 100 × (1 − e^(−n/15))), so going from 0 to 15 posts matters more than going from 50 to 65. An engagement bonus of up to 10 points is added when verified upvote data is available from Reddit's public API.
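A minimal sketch of the base velocity curve described above. The curve itself comes from the text; the function names, the way the engagement bonus is passed in, and the cap at 100 are assumptions made for illustration, since the page does not say how the bonus is derived.

```python
import math


def base_velocity(on_topic_posts: int, scale: float = 15.0) -> float:
    """Saturating curve from the text: 100 * (1 - e^(-n / 15))."""
    return 100.0 * (1.0 - math.exp(-on_topic_posts / scale))


def social_velocity_index(on_topic_posts: int, engagement_bonus: float = 0.0) -> float:
    """Base velocity plus an engagement bonus.

    engagement_bonus is a hypothetical input; the text only says the bonus
    is at most 10 points, so it is clamped to 0..10 here. Capping the total
    at 100 is an assumption.
    """
    bonus = max(0.0, min(10.0, engagement_bonus))
    return min(100.0, base_velocity(on_topic_posts) + bonus)


if __name__ == "__main__":
    early_gain = base_velocity(15) - base_velocity(0)
    late_gain = base_velocity(65) - base_velocity(50)
    print(f"0 -> 15 posts adds {early_gain:.1f} points")
    print(f"50 -> 65 posts adds {late_gain:.1f} points")
```

Because the curve flattens as n grows, the first 15 posts move the score far more than posts 50 through 65 do.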
Platforms in SVI calculation:
Reddit (public API)
Live · Real scores, comments, timestamps
Reddit (ForumScout)
Live · Broader keyword coverage, merged
X / Twitter
Coming soon · Real-time public conversation
Coming soon · Visual + caption signal
TikTok
Coming soon · Virality leading indicator
Reddit data is sourced from two complementary feeds: Reddit's public JSON API (real upvote counts, comment totals, verified timestamps) and ForumScout keyword search (broader topic coverage). Results are merged and deduplicated. Only posts from topic-relevant subreddits contribute to the velocity score. Engagement numbers are only displayed when they come from a verified API response.
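The merge step could look roughly like the sketch below. The `Post` fields, the use of a post ID as the deduplication key, and the rule that a verified API record replaces a keyword-search record are all assumptions for illustration; the text only states that results are merged, deduplicated, and limited to topic-relevant subreddits.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Post:
    post_id: str                   # hypothetical deduplication key
    subreddit: str
    title: str
    upvotes: Optional[int] = None  # None when no verified API data exists
    verified: bool = False         # True only for records from the public API


def merge_posts(
    api_posts: Iterable[Post],
    search_posts: Iterable[Post],
    relevant_subreddits: Set[str],
) -> List[Post]:
    """Merge both feeds, drop off-topic subreddits, keep one record per post.

    relevant_subreddits is expected to contain lowercase names.
    """
    merged = {}
    for post in list(search_posts) + list(api_posts):
        if post.subreddit.lower() not in relevant_subreddits:
            continue
        existing = merged.get(post.post_id)
        # Keep the first record seen, but let a verified one replace it.
        if existing is None or (post.verified and not existing.verified):
            merged[post.post_id] = post
    return list(merged.values())
```

Keeping `upvotes` as `None` for unverified records mirrors the rule that engagement numbers are shown only when they come from a verified API response.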
2. MSM Coverage Index (MCI)
We ingest article counts, headline prominence, and broadcast minutes from 180+ mainstream media sources across left-leaning, right-leaning, and centrist outlets. MCI is a 0–100 score where 100 = saturation coverage.
We deliberately include both sides of the political spectrum to avoid ideological bias in the score itself. A story buried by CNN and Fox simultaneously scores much lower on MCI than one buried only by CNN, and cross-partisan coverage avoidance is a stronger signal than avoidance by one side alone.
The gap formula subtracts MCI directly from SVI at equal weight, so full mainstream coverage of a fully viral story produces a Gap Score of zero.
After subtracting MCI from SVI, a square root compression curve is applied before multipliers. This spreads scores across the full 0–100 range — without it, topics with genuinely near-zero MSM coverage would cluster at 90–100 with no differentiation. The curve maps the raw gap to a base score (rawGap 60 → 50, 80 → 58, 100 → 65), leaving room for sentiment and coverage anomaly multipliers to push higher-signal stories to 90+.
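As a sketch of how these steps fit together: the three example points (60 → 50, 80 → 58, 100 → 65) are consistent with a curve of the form 6.5 × √rawGap, so the constant 6.5 below is inferred from those points rather than stated on this page. Clamping a negative raw gap to zero and capping the final score at 100 are also assumptions.

```python
import math

# Inferred from the example points in the text, not a documented constant.
COMPRESSION_SCALE = 6.5


def compress_gap(svi: float, mci: float) -> float:
    """Square root compression of the raw gap (SVI - MCI)."""
    raw_gap = max(0.0, svi - mci)  # assumption: negative gaps count as zero
    return COMPRESSION_SCALE * math.sqrt(raw_gap)


def gap_score(svi: float, mci: float, sdm: float = 1.0, sw: float = 1.0) -> float:
    """Base score times both multipliers, capped at 100 (assumption)."""
    return min(100.0, compress_gap(svi, mci) * sdm * sw)


if __name__ == "__main__":
    for raw in (60, 80, 100):
        print(raw, round(compress_gap(raw, 0)))
    print(round(gap_score(80, 0, sdm=1.35, sw=1.25), 1))
```

Under these assumptions, a raw gap of 80 with both multipliers at their maximum lands just above 98, which matches the stated intent of leaving headroom for the multipliers to push higher-signal stories past 90.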
Before computing MCI, articles are passed through a topic relevance filter. Each story has a curated set of required keywords, and an article must contain at least one of them in its headline or description to count toward coverage. This prevents loosely related articles (e.g. a Pentagon budget piece matching a UAP query) from artificially inflating the MCI. If every article in the 10-article sample fails the filter but NewsData found matching articles overall, a conservative 30% coverage floor is applied, acknowledging that the topic is likely covered even when our keyword sample was imperfect.
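A rough sketch of the relevance filter and the coverage floor follows. The article field names, the case-insensitive substring match, and the reading of "30% coverage floor" as an MCI of at least 30 on the 0–100 scale are assumptions; the page does not define them.

```python
from typing import Iterable, List


def is_relevant(article: dict, required_keywords: Iterable[str]) -> bool:
    """True if any required keyword appears in the headline or description."""
    headline = article.get("headline") or ""
    description = article.get("description") or ""
    text = f"{headline} {description}".lower()
    return any(keyword.lower() in text for keyword in required_keywords)


def apply_coverage_floor(
    mci_from_sample: float,
    sample: List[dict],
    required_keywords: List[str],
    total_matches_reported: int,
) -> float:
    """Apply the floor when the whole sample fails the filter.

    total_matches_reported is a hypothetical name for the overall match
    count returned by the news search. The floor value of 30 is an
    interpretation of the "30% coverage floor" in the text.
    """
    relevant = [a for a in sample if is_relevant(a, required_keywords)]
    if sample and not relevant and total_matches_reported > 0:
        return max(mci_from_sample, 30.0)
    return mci_from_sample
```

A plain substring match can also hit keywords embedded in longer words, so a production filter would likely need word-boundary matching.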
3. Sentiment Divergence Multiplier
When social sentiment and MSM framing diverge sharply, the raw coverage gap understates the story's significance. A story the public feels positively about that MSM frames negatively (or vice versa) receives a multiplier of up to 1.35×.
Sentiment is scored using a fine-tuned model trained on media framing and social discourse data, producing a score from -1 (very negative) to +1 (very positive) for both the social and MSM signals.
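One way the multiplier could be derived from the two sentiment scores is sketched below. The linear mapping from divergence to multiplier is an assumption; the text gives only the input range (−1 to +1) and the 1.35× ceiling.

```python
def sentiment_divergence_multiplier(
    social_sentiment: float,
    msm_sentiment: float,
    max_multiplier: float = 1.35,
) -> float:
    """Map the distance between two sentiment scores to 1.0..max_multiplier.

    Both inputs are expected in [-1, 1]. The linear interpolation is an
    illustrative assumption, not the documented method.
    """
    for value in (social_sentiment, msm_sentiment):
        if not -1.0 <= value <= 1.0:
            raise ValueError("sentiment scores must be between -1 and +1")
    divergence = abs(social_sentiment - msm_sentiment) / 2.0  # scaled to 0..1
    return 1.0 + (max_multiplier - 1.0) * divergence
```

Identical sentiment on both sides leaves the score unchanged, while fully opposed sentiment (+1 versus −1) yields the full 1.35×.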
4. Coverage Anomaly Weight
The most nuanced component. We look for anomalies associated with editorial avoidance patterns — distinct from simply low interest. This is our most interpretive signal and the one most susceptible to false positives. See limitations #09 for the full caveat.
- Story has high social velocity but zero or negative MSM article growth
- Story is trending in multiple geo-regions simultaneously with low MSM coverage in all
- Social platform engagement throttling detected (sudden velocity drop + comment restrictions)
- Story involves entities with known PR/media relations infrastructure
- Cross-partisan silence: absent from both left and right outlets simultaneously
Coverage Anomaly Weight ranges from 1.0 (no anomaly signal) to 1.25 (strong coverage anomaly pattern). This is the most proprietary component of the Gap Score.
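Purely as an illustration of the 1.0 to 1.25 range, the sketch below treats the five anomaly signals listed above as boolean flags with equal weight. The actual weighting is proprietary and not described on this page, so the field names and the equal-weight scheme are assumptions.

```python
from dataclasses import dataclass, fields


@dataclass
class AnomalySignals:
    # One hypothetical flag per signal in the list above.
    flat_or_falling_msm_growth: bool = False
    multi_region_low_coverage: bool = False
    engagement_throttling: bool = False
    pr_infrastructure_entities: bool = False
    cross_partisan_silence: bool = False


def coverage_anomaly_weight(signals: AnomalySignals, max_weight: float = 1.25) -> float:
    """Scale linearly from 1.0 (no signals) to max_weight (all signals)."""
    names = [f.name for f in fields(signals)]
    active = sum(bool(getattr(signals, name)) for name in names)
    return 1.0 + (max_weight - 1.0) * active / len(names)
```

With this scheme each active signal adds 0.05, so a story showing only cross-partisan silence would receive a weight of 1.05.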
The Causal Gap
ANALYST TIER · A second dimension beyond the core Gap Score formula
Beyond individual story gaps, the Analyst tier surfaces a second type of gap: the causal gap — connections between seemingly unrelated stories that suggest a larger pattern mainstream media is failing to connect.
Our entity extraction model identifies shared people, organizations, assets, and events across stories and computes a connection strength score. The Intelligence Hub visualizations (Conspiracy Board, Story Network Graph, Entity Explorer) render these connections interactively.
We do not editorialize on what the connections mean — we surface the data and let analysts draw their own conclusions.
Gap Score interpretation
Massive social velocity, near-zero MSM coverage. Strong coverage gap signal. Historically often precedes major news breaks.
Very high social interest with minimal MSM coverage. Story likely being actively avoided by mainstream outlets.
Significant gap. Story has strong social momentum but limited mainstream coverage. Monitor closely.
Partial coverage gap. Some MSM pickup but social interest significantly exceeds it.
Story is reasonably well-covered by mainstream media relative to social interest.
Limitations & known biases
- English-language bias: Our current data pipeline is weighted toward English-language content. International stories may be under-represented in social velocity calculations.
- Platform API dependencies: X/Twitter API rate limits may cause delays in real-time data for high-velocity stories. We use third-party data aggregators to mitigate this.
- Coordinated inauthentic behavior: Bot-amplified stories can inflate Social Velocity Index. We apply bot-detection filters but they are not foolproof.
- Niche vs. coverage gap: A very niche story may register a high gap score simply because it appeals to a specific community, not because of a genuine coverage gap. Context matters.
MSM Coverage Set
GapWatch monitors 180+ outlets. The coverage set is updated quarterly. Outlet inclusion is based on reach, editorial independence, and geographic diversity, not political lean.
Major National Newspapers
Broadcast & Cable News
Wire Services
Digital Native News
Business & Finance
Science & Technology
International
Specialty & Investigative
Methodology FAQ
What counts as "mainstream media"?
The named outlets in the MSM Coverage Set above, updated quarterly. Outlets are selected for reach and editorial independence across the political spectrum. No outlet is excluded solely on the basis of political lean.
Which countries and languages are included?
The primary coverage set is English-language. US, Canadian, UK, and Australian outlets are included. International English-language outlets (Al Jazeera English, South China Morning Post, The Guardian) are included. Non-English outlets are not currently in the MSM set, but social signals from non-English platforms are tracked.
How are stories clustered and deduplicated?
GapWatch uses keyword clustering and semantic similarity to group related mentions into a single story. A post mentioning "Pentagon UAP program" and one mentioning "DoD unidentified aerial phenomena" are grouped as the same story. Duplicate articles from the same outlet are counted once.
How does GapWatch handle bot-amplified virality?
Social velocity is measured against baseline activity levels for each platform and topic category. Sudden spikes that don't show organic engagement patterns (comments, reposts, varied authorship) are weighted lower in the Social Velocity Index calculation. GapWatch cannot fully filter coordinated inauthentic behavior but applies normalization to reduce its impact.
How do you distinguish organic interest from coordinated manipulation?
We apply subreddit relevance filtering, account age weighting, and engagement-to-impression ratios where available. High mention counts from newly created accounts or single-platform sources are flagged in the Social Velocity Index.
Is the score based on absolute volume, velocity, or engagement quality?
The Gap Score combines a velocity measure with a coverage measure. The Social Velocity Index (SVI) measures rate of mention acceleration, not absolute volume. A story with 40,000 mentions growing at 8,000/hour scores higher than one with 200,000 mentions growing at 500/hour. The MSM Coverage Index (MCI) measures article count weighted by outlet prominence.
What is the refresh rate?
Stories are refreshed every 15 minutes across all tiers. The pipeline runs on a cron schedule pulling fresh social and MSM data for every tracked story.
How is the "First seen" timestamp determined?
The "First seen" date is set the first time GapWatch detects meaningful social velocity for a story and is permanently preserved — it is never overwritten by subsequent refreshes. This timestamp is the foundation of the "Called It" feature: it proves GapWatch identified a story before mainstream media pickup. The "Last updated" timestamp shown alongside it reflects the most recent data refresh.
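The write-once behavior of "First seen" can be sketched as follows. The dictionary-based story record and the field names are hypothetical; the only rule taken from the text is that the first timestamp is never overwritten while "Last updated" changes on every refresh.

```python
from datetime import datetime, timezone
from typing import Optional


def record_refresh(story: dict, now: Optional[datetime] = None) -> dict:
    """Update a story record in place on each pipeline refresh."""
    now = now or datetime.now(timezone.utc)
    # setdefault only writes when the key is missing, so the first
    # detection time survives every later refresh.
    story.setdefault("first_seen", now)
    story["last_updated"] = now
    return story
```

Calling `record_refresh` twice on the same record leaves `first_seen` at the first call's timestamp and moves only `last_updated`.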
What does a Gap Score of 100 mean?
A score of 100 means GapWatch detected significant social velocity around a story with near-zero mainstream coverage in the measurement window. It does not mean the story is true, important, or being deliberately withheld. It means the gap between social attention and mainstream coverage is at maximum. Always verify the underlying source before drawing conclusions.
Questions about our methodology? press@gapwatch.io