Presentation at the Truth and Trust Online Conference of the paper http://oro.open.ac.uk/62771/
Abstract:
With misinformation being one of the biggest issues of current times, many organisations are emerging to offer verifications of information and assessments of news sources. However, it remains unclear how they relate in terms of coverage, overlap and agreement. In this paper we introduce a comparison of the assessments produced by different organisations, in order to measure their overlap and agreement on news sources. Relying on the general notion of credibility, we map each of the different assessments to a unified scale. Then we compare two different levels of credibility assessments (source level and document level) using the data published by various organisations, including fact-checkers, to see which sources they assess more than others, how much overlap there is between them, and how much agreement there is between their verdicts. Our results show that the overlap between the different origins is generally quite low, meaning that different experts and tools provide evaluations for rather disjoint sets of sources, even when considering fact-checking. For agreement, instead, we find that some origins agree with each other on verdicts more than others.
2. Introduction
Assessments from journalists, fact-checkers, communities, and tools: how do they relate?
1. Overlap: do they evaluate the same sources of information?
2. Agreement: do they have similar verdicts?
3. Granularity level: how does the verification of claims compare to the credibility of the sources where they appear*?
[Diagram: information being assessed by both citizens and experts]
* https://schema.org/appearance
5. Credibility definition
What is credibility?
- Trustworthiness (Hovland and Weiss 1951; Web of Trust) – more a perception, a gut feeling
- Expertise (Hovland and Weiss 1951) – related to the public image and history of the news outlet
- Believability (Meyer, 1988) – the ability to be believed, which together with reputation leads to credibility
- Community affiliation (Meyer, 1988) – the context of the source: point of view / bias / opinion
- Factuality (fact-checkers, NewsGuard) – not publishing false information
- Safety (Web Of Trust) – safe content, free of viruses and scams
- Popularity (PageRank) – how “well-known” the source is
- Transparency (NewsGuard, Newsroom Transparency Tracker) – adhering to a set of standards; accountability
- W3C discussion group “Credibility Signals”: https://credweb.org/signals/
6. Approach (1/2): credibility formulation
1. The specific factors of credibility we consider (credibility definition):
• Factuality
• Believability
• Trustworthiness
• Transparency
2. The dimensions used:
• Value: how positive or negative the assessment is, ∈ [−1, +1]
• Confidence: how certain the value is, ∈ [0, +1]
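To make the formulation concrete, the (value, confidence) pair can be sketched as a small data structure. This is our own illustration: the class and field names are not from the paper, only the bounds above are.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Assessment:
    """A credibility assessment as formulated above.

    value:      how positive or negative the assessment is, in [-1, +1]
    confidence: how certain the value is, in [0, +1]
    """
    value: float
    confidence: float

    def __post_init__(self):
        # Reject values outside the two dimensions' ranges
        if not -1.0 <= self.value <= 1.0:
            raise ValueError("value must lie in [-1, +1]")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, +1]")
```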
7. Approach (2/2): mapping the assessments

- Web Of Trust
  Credibility value: trust.score [0; 100] → cred [−1; 1]
  Confidence: trust.conf [0; 100] → conf [0; 1]
- NewsGuard
  Credibility value: linear, score [0; 100] → cred [−1; 1]; exception: Platform, Satire → cred = 0
  Confidence: conf = 1; exception: Platform, Satire → conf = 0
- Media Bias/Fact Check
  Credibility value: factuality {LOW, MIXED, HIGH} → cred {−1, 0, 1}
  Confidence: conf = 1 if factuality is present, otherwise conf = 0
- OpenSources
  Credibility value: fake → −1, reliable → 1, conspiracy / junksci → −0.8, clickbait / bias → −0.5, rumor / hate → −0.3, all other tags → 0
  Confidence: conf = 1 when credibility is not null, otherwise conf = 0
- International Fact-Checking Network
  Credibility value: starting from cred = 1, apply penalties for partially (0.05) and none (0.1) compliant criteria, with lower bound cred = 0
  Confidence: conf = 0.5 if expired signatory, otherwise conf = 1
- Newsroom Transparency Tracker
  Credibility value: proportional to the number of indicators satisfied; partial compliance counts half
  Confidence: conf = 1
- ClaimReview
  Credibility value: cred = (ratingValue − worstRating) / (bestRating − worstRating) × 2 − 1; otherwise try mapping the alternateName
  Confidence: conf = 1 if the mapping is successful, otherwise conf = 0
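A few of the mappings above can be sketched in code. The function names are ours; only the arithmetic follows the table (the linear rescaling from [0, 100] to [−1, 1] and the ClaimReview normalisation).

```python
def map_web_of_trust(score, conf):
    """Web Of Trust: score [0, 100] -> cred [-1, 1]; conf [0, 100] -> conf [0, 1]."""
    return score / 50.0 - 1.0, conf / 100.0

def map_mbfc(factuality):
    """Media Bias/Fact Check: factuality {LOW, MIXED, HIGH} -> cred {-1, 0, 1};
    conf = 1 if a factuality label is present, otherwise conf = 0."""
    levels = {"LOW": -1.0, "MIXED": 0.0, "HIGH": 1.0}
    if factuality in levels:
        return levels[factuality], 1.0
    return 0.0, 0.0

def map_claimreview(rating_value, worst_rating, best_rating):
    """ClaimReview: cred = (ratingValue - worstRating) / (bestRating - worstRating) * 2 - 1."""
    cred = (rating_value - worst_rating) / (best_rating - worst_rating) * 2.0 - 1.0
    return cred, 1.0
```

For example, a Web Of Trust score of 100 with confidence 100 maps to (1.0, 1.0), and a ClaimReview verdict at the worst end of its rating scale maps to cred = −1.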
8. Analysis: data statistics

Assessor | Sources rated | Average credibility
Web Of Trust | 308,155 | 0.4264
NewsGuard | 2,795 | 0.5433
Media Bias/Fact Check | 2,404 | 0.3874
OpenSources | 811 | −0.6618
International Fact-Checking Network | 86 | 0.8786
Newsroom Transparency Tracker | 52 | 0.4256
ClaimReview | 379* | −0.3349

* ClaimReview assessments are at the claim level; this number reflects the source-level aggregation.
9. Recap: research questions
Comparing the assessments:
1. Overlap: do they evaluate the same sources of information?
2. Agreement: do they have similar verdicts?
3. Granularity level: how does the verification of claims compare to the sources where they appear?
10. Analysis: overlap (definition)
RQ1: do they evaluate the same sources of information?
Overlap definitions:
- Symmetrical: Jaccard index J(A, B) = |A ∩ B| / |A ∪ B|
- Asymmetrical: overlap(A → B) = |A ∩ B| / |B|
overlap ∈ [0, +1]

Example:
|A| = 100,000; |B| = 3,000; |A ∩ B| = 2,000
J(A, B) = 2,000 / 101,000 = 1.98%
overlap(A → B) = 2,000 / 3,000 = 66.67%
overlap(B → A) = 2,000 / 100,000 = 2.00%
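The two overlap measures can be written directly from the definitions; a minimal sketch reproducing the example:

```python
def jaccard(a, b):
    """Symmetrical overlap: |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b) if a or b else 0.0

def overlap(a, b):
    """Asymmetrical overlap(A -> B): |A ∩ B| / |B|."""
    return len(a & b) / len(b) if b else 0.0

# Sets sized as in the example: |A| = 100,000, |B| = 3,000, |A ∩ B| = 2,000
A = set(range(100_000))
B = set(range(98_000, 101_000))
```

With these sets, jaccard(A, B) ≈ 1.98%, overlap(A, B) ≈ 66.67%, and overlap(B, A) = 2.00%, matching the figures above.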
11. Analysis: overlap (results)
1. How to read the figure: the value of each cell is the percentage of sources evaluated by the assessor on the row that have also been evaluated by the assessor on the column.
2. Notable examples:
• Several assessors don’t provide ratings for platforms (facebook.com, twitter.com, youtube.com, …)
3. The main problems:
• Most of the assessors rate just a few sources
• Most of the assessors rate disjoint sets of sources
12. Analysis: agreement (definition and results)
RQ2: do they have similar verdicts?
Agreement definition: pairwise cosine similarity, evaluated on the sources rated by both assessors; agreement ∈ [−1, +1]

Example:
Assessor | Source_1 | Source_2 | Source_3 | Source_4
A | +1.0 | +0.2 | −0.8 | −0.9
B | +0.9 | +0.5 | +0.5 | −1.0
agreement(A, B) = cos α = 0.375
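A plain cosine similarity restricted to the co-rated sources can be sketched as below. Note that a plain cosine over the four values in this table comes out near 0.63, so the 0.375 reported for the example presumably reflects a variant (e.g. confidence weighting) not detailed on the slide; the sketch below is the unweighted version only.

```python
from math import sqrt

def agreement(ratings_a, ratings_b):
    """Cosine similarity of two assessors' credibility values,
    computed only over the sources rated by both."""
    common = ratings_a.keys() & ratings_b.keys()
    dot = sum(ratings_a[s] * ratings_b[s] for s in common)
    na = sqrt(sum(ratings_a[s] ** 2 for s in common))
    nb = sqrt(sum(ratings_b[s] ** 2 for s in common))
    return dot / (na * nb) if na and nb else 0.0

# The four co-rated sources from the example table
A = {"source_1": 1.0, "source_2": 0.2, "source_3": -0.8, "source_4": -0.9}
B = {"source_1": 0.9, "source_2": 0.5, "source_3": 0.5, "source_4": -1.0}
```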
13. Analysis: agreement (problems)
Disagreement examples:
• weeklystandard.com
  • IFCN: expired signatory
  • NewsGuard: “generally maintains basic standards of credibility and transparency”
  • NewsGuard: “does not handle the difference between news and opinion”
  • OpenSources: political, bias
  • Media Bias/Fact Check: factual reporting HIGH
• zerohedge.com
  • NewsGuard: “severely violates basic standards of credibility and transparency”
  • Media Bias/Fact Check: factual reporting MIXED
  • Web of Trust: reputation 4.5/5
• breitbart.com
  • NewsGuard: “generally maintains basic standards of credibility and transparency, with some significant exceptions”
  • Web of Trust: reputation 4/5
  • OpenSources: political, unreliable, bias
Why? The assessors evaluate different criteria / features.
Problems:
• How can we validate or modify our mappings to a single scale? (intuitiveness for the users)
• How can we prioritise disagreeing assessors?
14. Analysis: different granularities
RQ3: do fact-checker verdicts match source credibility?
Compare the extracted assessments with native source-level assessments:
- Credibility: the mean value of the fact-checking verdicts for the source considered
- Does having a fact-checked article compensate for having a false one?
- Is there selection bias from the fact-checkers?
[Diagram: a fact-checked claim has an appearance on a source (breitbart.com) and a review on the fact-checker’s site (politifact.com), yielding an extracted source-level assessment]
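The source-level aggregation described above (the mean of the fact-checking verdicts per appearance source) can be sketched as follows; the verdict data below is purely hypothetical.

```python
from collections import defaultdict

def source_credibility(claim_verdicts):
    """Aggregate claim-level verdicts into a source-level credibility:
    the mean verdict value, grouped by the source the claim appeared on."""
    by_source = defaultdict(list)
    for source, cred in claim_verdicts:
        by_source[source].append(cred)
    return {source: sum(creds) / len(creds) for source, creds in by_source.items()}

# Hypothetical claim-level verdicts: (source of appearance, cred in [-1, 1])
verdicts = [("example.com", -1.0), ("example.com", 0.0), ("other.org", 1.0)]
```

Here source_credibility(verdicts) gives example.com a mean of −0.5 and other.org a mean of 1.0, which also illustrates the compensation question above: one neutral article pulls the average up despite a false one.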
15. Analysis: different granularities (examples and problems)
1. Examples:
• breitbart.com
  • Positive: NewsGuard, Web Of Trust
  • Negative fact-checks: Politifact, Les Décodeurs
• bbc.co.uk
2. Problems:
• A claim may appear on a news outlet that is merely reporting it
  • Example: James Cleverly on BBC “Today”: “The EU has stopped the UK from having free ports.” → Incorrect (FullFact) https://fullfact.org/europe/free-ports/
• Claim appearance annotations in ClaimReview are few
• Platforms with user content: define more granularity levels
  • Accounts / pages
  • Subdomains
16. Open challenges
1. How to handle disagreement? Which assessments should be trusted more?
• Credibility Propagation Graph:
  • Using the credibility of the assessor itself (recursively)
  • Confidence of the origin of the assessment
  • Granularity
  • Default and customizable credibility of starting nodes
2. How to present credibility to the public:
• Intuitiveness
• Avoid backfire effects
• Stimulate interest
• Level of detail
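One possible reading of the Credibility Propagation Graph idea is sketched below. This is our own speculative illustration, not the paper’s design: each node’s credibility is the confidence-weighted mean of the assessments it receives, with each assessment also weighted by the current credibility of its assessor, starting from customizable seed nodes.

```python
def propagate(assessments, seeds, iterations=10):
    """Iteratively propagate credibility through a graph of assessments.

    assessments: list of (assessor, target, value, confidence) tuples
    seeds: default credibility of starting nodes, e.g. {"IFCN": 1.0}
    """
    cred = dict(seeds)
    for _ in range(iterations):
        updated = dict(seeds)  # start each round from the seed defaults
        for target in {t for _, t, _, _ in assessments}:
            num = den = 0.0
            for assessor, t, value, conf in assessments:
                if t != target:
                    continue
                # Weight each assessment by its confidence and by the
                # (non-negative) credibility of the assessor itself
                weight = conf * max(cred.get(assessor, 0.0), 0.0)
                num += weight * value
                den += weight
            if den:
                updated[target] = num / den
        cred = updated
    return cred
```

With a hypothetical chain where a seed node rates a fact-checker, which in turn rates a source, credibility flows two hops down after a couple of iterations; assessors with zero or unknown credibility contribute nothing.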