Our Methodology

Transparency in how we categorize, cluster, and present the news.

Bias Categorization

Sources are categorized based on widely recognized media bias ratings from AllSides, Ad Fontes Media, and Media Bias/Fact Check. We group sources into three categories:

Left-Leaning

CNN, MSNBC, Washington Post, NBC News, CBS News, ABC News, NPR, HuffPost, Politico, The Guardian, Vox, Slate, Mother Jones, The Intercept, Salon, and more

Center

Associated Press, Reuters, BBC News, Al Jazeera, USA Today, Axios, The Hill, Newsweek

Right-Leaning

Fox News, Breitbart, Washington Times, National Review, NY Post, Daily Wire, The Blaze, Daily Caller, The Federalist, Washington Examiner, Townhall, RedState, and more

These categorizations reflect a source's overall editorial tendency, not any individual article: a left-leaning outlet may publish a conservative op-ed, and vice versa.

Data Freshness

Articles are fetched directly from RSS feeds across 70+ news sources. The site rebuilds every 2 hours and also refreshes live via an on-demand API, so articles typically appear within minutes of publication.

Promotional and affiliate content is automatically filtered out during ingestion so that only genuine editorial content appears on the site.

Story Clustering

Articles are grouped into story clusters using keyword-overlap matching. The process works as follows:

Keyword extraction — Significant keywords are extracted from each headline.
Stop word removal — Common words (the, a, is, etc.) are removed to focus on meaningful terms.
Union-find clustering — Articles that share enough keywords to likely be covering the same event are grouped together using a union-find algorithm.
Cross-spectrum grouping — Clusters pull articles from across all three bias categories, making it easy to compare how different outlets frame the same story.

This approach is intentionally simple and transparent. It avoids opaque machine learning models in favor of a deterministic, auditable process.
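
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual code: the stop word list, the keyword tokenizer, and the threshold of two shared keywords (MIN_SHARED) are all assumptions chosen for the example.

```python
import re

# Assumed stop word list; the real list is not published in this document.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "on",
              "for", "and", "as", "at", "by", "with"}
MIN_SHARED = 2  # assumed: headlines sharing at least 2 keywords are linked


def keywords(headline):
    """Lowercase the headline, split into words, drop stop words and very short words."""
    words = re.findall(r"[a-z0-9']+", headline.lower())
    return {w for w in words if w not in STOP_WORDS and len(w) > 2}


def cluster(headlines, min_shared=MIN_SHARED):
    """Group headline indices into clusters using union-find."""
    parent = list(range(len(headlines)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            # Always attach the larger root to the smaller one (deterministic).
            parent[max(ri, rj)] = min(ri, rj)

    kws = [keywords(h) for h in headlines]
    for i in range(len(headlines)):
        for j in range(i + 1, len(headlines)):
            if len(kws[i] & kws[j]) >= min_shared:
                union(i, j)

    groups = {}
    for i in range(len(headlines)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


headlines = [
    "Senate passes budget bill after long debate",
    "Budget bill clears Senate in late vote",
    "Wildfire spreads across northern California",
    "California wildfire forces evacuations",
]
print(cluster(headlines))  # [[0, 1], [2, 3]]
```

Because union-find merges transitively, an article can land in a cluster through a chain of pairwise matches even if it shares few keywords with some members. That is one way unrelated stories can occasionally end up grouped together.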

Coverage Analysis

Beyond clustering, Extra Extra surfaces several coverage metrics designed to help you think critically about the media landscape:

Blind Spots

A "blind spot" occurs when a story is covered by sources from one or two political perspectives but ignored by the others. For example, a "Left Blind Spot" means left-leaning sources are not covering a story that right-leaning and/or center sources are reporting on. Blind spots reveal what each side of the spectrum considers (or does not consider) newsworthy.
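
As a rough sketch of that definition, a blind spot check can be reduced to asking which of the three categories are absent from a cluster. The function name, the category labels, and the input shape (one category string per article in the cluster) are assumptions made for illustration.

```python
CATEGORIES = ("left", "center", "right")


def blind_spots(article_categories):
    """Return the categories that have no article in this cluster.

    article_categories: one category label per article, e.g. ["right", "center"].
    An empty cluster has no coverage at all, so it reports no blind spots.
    """
    covered = set(article_categories)
    if not covered:
        return []
    return [c for c in CATEGORIES if c not in covered]


print(blind_spots(["right", "center", "right"]))  # ['left']
print(blind_spots(["left", "center", "right"]))   # []
```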

Balance Scoring

Story clusters are scored based on the diversity of perspectives represented. A story covered by left, center, and right sources receives a higher balance score than one covered by only a single perspective.
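
The exact scoring formula is not described here. One simple formula consistent with the description, offered purely as an assumption, is the fraction of the three perspectives that appear in a cluster:

```python
CATEGORIES = ("left", "center", "right")


def balance_score(article_categories):
    """Hypothetical score: share of the three perspectives present, from 0.0 to 1.0."""
    present = set(article_categories) & set(CATEGORIES)
    return len(present) / len(CATEGORIES)


print(balance_score(["left", "center", "right"]))  # 1.0
print(balance_score(["left", "left"]) < balance_score(["left", "right"]))  # True
```

A formula like this ignores how many articles each perspective contributes; a cluster with ten right-leaning articles and one left-leaning article scores the same as an even split.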

Diversity Metrics

We track the overall diversity of coverage on any given day — how many stories are covered broadly vs. narrowly, and whether certain perspectives are dominating the conversation on a particular topic.
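
A daily breakdown of this kind could be computed along the following lines. The cutoffs are assumptions for the sake of the example: "broad" means all three perspectives are present, "narrow" means exactly one, and anything else counts as "partial".

```python
def coverage_breakdown(clusters):
    """Count clusters by how many distinct perspectives cover them.

    clusters: a list of non-empty clusters, each a list of category labels.
    """
    broad = sum(1 for c in clusters if len(set(c)) == 3)
    narrow = sum(1 for c in clusters if len(set(c)) == 1)
    return {
        "broad": broad,
        "partial": len(clusters) - broad - narrow,
        "narrow": narrow,
    }


day = [
    ["left", "center", "right"],
    ["right", "right"],
    ["left", "center"],
]
print(coverage_breakdown(day))  # {'broad': 1, 'partial': 1, 'narrow': 1}
```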

Limitations & Transparency

No methodology is perfect. We want to be upfront about the limitations of our approach:

Bias is a spectrum, not a label. Placing a source into "left," "center," or "right" is inherently reductive. Individual reporters, articles, and topics within a source can vary widely.
Clustering is imperfect. Keyword-overlap matching may occasionally group unrelated stories or miss connections that a human reader would catch.
RSS feeds are not exhaustive. Not every article published by a source appears in its RSS feed. Some stories may be missed.
U.S.-centric perspective. Our source list focuses primarily on U.S. news outlets. International coverage is included via sources like BBC News, Al Jazeera, and The Guardian, but the overall lens is American.
We don't rate accuracy. Extra Extra categorizes bias, not truthfulness. A source being labeled "center" does not mean it is more accurate than one labeled "left" or "right."

We are committed to improving our methodology over time. If you have suggestions or spot an error, we welcome feedback.