Applications &
Implications of
Big Data for
Official Statistics
Emmanuel Letouzé
Director & co-Founder
Data-Pop Alliance
DfID, London, February 26, 2015
1. The Emergence of Big Data &
The Statistical Tragedy
Framing and surfacing of the issue
2. Big Data and Official Statistics:
Substitute, Complement, or “It’s complicated”?
3. The Case of the SDGs
A story of fish and fishermen
1. The Emergence of Big Data
vs. The Statistical Tragedy
Framing and surfacing of the issue
Hal Varian’s nowcasting, GDP and light emissions paper…
Line shows returns for “Big Data” on Google Trends between 2007 and 2014; 100=maximum value
“We are at the beginning of what I call The Industrial Revolution of Data.”
Joe Hellerstein, November 19, 2008
Context: the Big Data rush
*Source: Oxfam International, citing Credit Suisse, Jan. 2014
“Data is the new oil”
Google Flu Trends: rise and fall
Hope or Hype?
2. Big Data & Official Statistics
It’s complicated. Or complex.
What is Big Data?
2010-12: the 3 Vs of big data
i. Exhaust
ii. Web
iii. Sensing
Crumbs
Capacities
Communities
What is Big Data?
Now: the 3 Cs of Big Data
Movement of an individual in Rwanda over 4 years using CDRs (Source: J. Blumenstock, 2010)
The new data ecosystem
1. Early warning
2. Real-time awareness
3. Real-time feedback
Source: Letouzé, 2012
“What can it be used for?”
—Taxonomy of applications (1)
1. Descriptive
-e.g. maps, clouds…
2. Predictive:
-forecasting
-inference
3. Prescriptive
-causal inference
Source: Letouzé, Vinck and Meier, 2013
“What can it be used for?”
—Taxonomy of applications (2)
National Statistical Institutes carry out surveys
Telefonica team used their data to ‘predict’ socio-economic levels (SELs) from cell phone usage
Predict the present (SELs for non-surveyed regions) and monitor the future (track changes over time)
Survey from “a major city in Latin America”
Source: “Prediction of Socio-Economic Levels Using Cell-Phone Records” (Telefonica research, 2011)
‘Predicting’ socioeconomic levels?
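The workflow on this slide (learn the link between phone usage and surveyed SELs, then apply it to regions with no survey) can be sketched as follows. This is a minimal illustration, not the Telefonica team's method: the single feature, the region names, every number, and the one-variable least-squares fit are assumptions invented for the example.

```python
# Toy sketch: fit SEL ~ phone-usage feature on surveyed regions,
# then "predict the present" for regions that have no survey.
# All values below are hypothetical and for illustration only.

def fit_line(xs, ys):
    """Ordinary least squares with one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Surveyed regions: (hypothetical average weekly calls per user, surveyed SEL score)
surveyed = [(10.0, 20.0), (20.0, 40.0), (30.0, 60.0), (40.0, 80.0)]
# Regions without survey coverage: only the phone-usage feature is known
unsurveyed = {"region_x": 25.0, "region_y": 35.0}

slope, intercept = fit_line([x for x, _ in surveyed], [y for _, y in surveyed])
predicted_sel = {name: slope * calls + intercept for name, calls in unsurveyed.items()}
print(predicted_sel)
```

Re-running the same fit on later phone data is one way to "monitor the future", provided the relationship learned from the survey still holds.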
Promoting a people-centered Big Data Revolution
Counting people?
Sample bias correction
Then:
blending of hypothesis-based vs. supervised machine learning methods to model bias
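One simple, hypothesis-based form of sample bias correction is post-stratification: each group in the phone-user sample is reweighted so that it counts in proportion to its share of the population. The sketch below assumes two hypothetical groups, made-up population shares (of the kind a census might provide), and made-up sample values. It does not reproduce the blended machine learning approach mentioned on the slide.

```python
# Post-stratification sketch: reweight a biased sample to known population shares.
# Groups, shares, and values are hypothetical.

def poststratified_mean(sample, population_share):
    """sample: list of (group, value); population_share: {group: share, summing to 1}."""
    totals, counts = {}, {}
    for group, value in sample:
        totals[group] = totals.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    # Weight each group's sample mean by its population share, not its sample share
    return sum(population_share[g] * totals[g] / counts[g] for g in totals)

# Phone users over-represent the "urban" group: 4 of 5 records, versus 40% of people
sample = [("urban", 10.0), ("urban", 10.0), ("urban", 10.0), ("urban", 10.0), ("rural", 5.0)]
population_share = {"urban": 0.4, "rural": 0.6}

naive_mean = sum(value for _, value in sample) / len(sample)
corrected_mean = poststratified_mean(sample, population_share)
print(naive_mean, corrected_mean)
```

The correction only works for groups that appear in the sample at all; a group with no phone users would need a different approach.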
Source: Letouzé, 2014, based on primary and secondary sources
What & how much do we know?
…and does it matter?
Poverty prevalence 1990-2030
Fragile States vs. Non-Fragile States
Are official statistics ever more than shadows in
the cave? If so what are they good for?
“Official statistics assumes a key role in ensuring democracy and fostering social progress… [should] ‘provide society with knowledge of itself’”
Enrico Giovannini, former President of Istat,
Co-Chair of the IEAG on the Data Revolution
“Knowledge is power; statistics is democracy”
Former President of Statistics Finland
(2) Official
Statistics as
systems—not
reducible to
producing (1)
(1) Official
Statistics as
data—entirely
defined as
product of (2)*
* According to Fundamental Principles of Official Statistics
What is/are official statistics?
Official
Statistics
(1) Ensure that
societies benefit from
“knowledge of itself”
(according to some
political and
technical standards)
(2) Ensure that
societies benefit
from the presence
of a deliberative
public space
What is/are the purpose(s)
of official statistics?
(2) Ensure that societies
benefit from the
presence of a
deliberative public
space
It’s complex—and it’s political
(1) Ensure that societies
benefit from “knowledge of
itself” (according to some
political and technical
standards)
Official
Statistics
Big Data
Source: Letouzé, 2014, based on primary and secondary sources
(How) can data reduce poverty?
Poverty prevalence 1990-2030
Fragile States vs. Non-Fragile States
…by that Gary King means: it’s about the analytics
Jonathan Glennie
The Guardian, Oct 3, 2013
3. The case of the SDGs
A story of fish and fishermen
SDGs adopted by the OWG | Big data examples | What is monitored | How it is monitored | Country(ies) | Year | Advantages of using big data
1. Poverty eradication | Satellite data to estimate poverty (i) | Poverty | Satellite images, night-lights | Global map | 2009 | Internationally comparable data, which can be updated more frequently
1. Poverty eradication | Estimating poverty maps with cell-phone records (ii) | Poverty | Cell phone records | Côte d’Ivoire | 2013-4 |
1. Poverty eradication | Internet-based data to estimate consumer price index and poverty rates (iii) | Price indexes | Online prices at retailers’ websites | Argentina | 2013 | Cheaper data available at higher frequencies
1. Poverty eradication | Cell-phone records to predict socio-economic levels (iv) | Socio-economic levels | Cell phone records | City in Latin America | 2011 | Data available more regularly and cheaper than official data; informal economy better reflected
2. End hunger, achieve food security and improved nutrition, and promote sustainable agriculture | Mining Indonesian Tweets to understand food price crises (v) | Food price crises | Tweets | Indonesia | 2014 |
2. End hunger, achieve food security and improved nutrition, and promote sustainable agriculture | Uses indicators derived from mobile phone data as a proxy for food security indicators (vi) | Food security | Cell phone data and airtime credit purchases | A country in Central Africa | 2014 |
2. End hunger, achieve food security and improved nutrition, and promote sustainable agriculture | Use of remote-sensing data for drought assessment and monitoring | Drought | Remote sensing | Afghanistan, India, Pakistan (vii); China (viii) | 2004; 2008 |
3. Health | Internet-based data to identify influenza outbreaks (ix) | Influenza | Google search queries | US | 2009 | Real-time data; captures disease cases not officially recorded; data available earlier than official data
3. Health | Data from online searches to monitor influenza epidemics (x) | Influenza | Online searches data | China | 2013 |
3. Health | Detecting influenza epidemics using Twitter (xi) | Influenza | Twitter | Japan | 2011 |
3. Health | Monitoring influenza outbreaks using Twitter (xii) | Influenza | Twitter | US | 2013 |
3. Health | Systems to monitor the activity of influenza-like-illness with the aid of volunteers via the internet (xiii, xiv) | Influenza | Voluntary reporting through the internet | Belgium, Italy, Netherlands, Portugal, United Kingdom, United States | ongoing |
3. Health | Cell-phone data to model malaria spread (xv) | Malaria | Cell-phone data | Kenya | 2012 |
3. Health | Using social and news media to | Cholera | Social and news | Haiti | 2012 |
SDG monitoring & Big Data
SDG achievement & Big Data
Thank You
www.datapopalliance.org
eletouze@datapopalliance.org
@datapopalliance

Editor's Notes

  1. Looking back
  2. Example of our work: Official statistics. Why is it important? Africa's statistical tragedy; Plato's cave. Big data == solution? Methods; technical assistance; data literacy; ethics; convening.

Marcelo Giugale, "Fix Africa's Statistics". Posted: 12/18/2012 4:17 pm EST. Updated: 02/17/2013 5:12 am EST.

How would you feel if you were on an airplane and the pilot made the following announcement: "This is your captain speaking. I'm happy to report that all of our engines checked fine, we have just climbed to 36,000 feet, will soon reach our cruising speed, and should get to our destination right on time.... I think. You see, the airline has not invested enough in our flight instruments over the past 40 years. Some of them are obsolete, some are inaccurate and some are just plain broken. So, to be honest with you, I'm not sure how good the engines really are. And I can only estimate our altitude, speed and location. Apart from that, sit back, relax and enjoy the ride."

This is, in a nutshell, the story of statistics in Africa. Fueled by its many natural resources, the region is growing fast, is finally beginning to reduce poverty and seems headed for success. Or so we think, for there are major problems with its data, problems that call for urgent, game-changing action.

First, we don't really know how big (or small) many African economies are. In about half of them, the system of "national accounts" dates back to the 1960s (1968, to be precise); in the other half, it is from 1993. This means that measuring things like how much is produced, consumed or invested is done with methods from the times when computers were rare, the Internet did not exist and nobody spoke about "globalization." That is, the methodology ignores the fact that some industries have disappeared and new ones were born. How badly does this skew the data? Well, to give you an idea, when Ghana used a newer methodology to update its accounts in 2010, it found out that its economy was about 60 percent bigger than it had previously thought -- and the country instantly became "middle-income" in the global ranking. [Old-timers have a neat way to tell when the size of an economy is underestimated in countries with weak institutions: if what the government collects in taxes is equivalent to more than a fifth of the country's "gross domestic product," then gross domestic product is probably larger than what the official numbers show.]

Second, the latest poverty counts for Africa are, on average, five years old. So we only have guesstimates of how the global financial, food and fuel crises have impacted the distribution of income, wealth and opportunities in the region. This is because, to count the poor, you need "household surveys" -- those face-to-face, home visits where people are asked how much they earn, own, know and so on. In fifteen African countries, this has been done only once since 2000. Ironically, technology now allows for the surveys to be done not only more frequently, but continuously. You give families a cell-phone free of charge in exchange for them answering a questionnaire, say, twice a month. And you don't need to ask every household; about three thousand are enough -- that's the beauty of statistical sampling. So, why is it not done? Coming to that in a minute.

Third, "industrial" surveys are even more infrequent than household surveys -- only a handful of African countries have done at least one in the last ten years. This is a pity. Knowing what your producers are doing -- and what keeps them from producing more -- is critical if you want to design policies that increase employment, productivity and economic growth. To be sure, academics, NGOs, development banks and business organizations carry out sporadic surveys of enterprises for one purpose or another -- from understanding how informal jobs are created to selling logistical services. But regular, comprehensive, nation-wide data is, at best, rare.

What's true for African employers is also true for African employees. "Labor market" surveys are few and far between -- most workers, remember, are informal and tend to shy away from answering questions by public officials. So when you ask about the unemployment rate in Africa, you are likely to be given a number that means little, is old, or both.

And how about the good old "census" -- that once-in-a-while count of a country's entire population? Censuses are the only time when we learn how many we are, how fast we are aging, where we live, how we live and lots of other information that helps governments make smart(er) decisions on things like health care, school construction or crime prevention. Experts say that you should have a census every ten years. Sixteen African countries have fallen behind that tempo -- which means that, at the moment, we don't know much about a third of those who live in the region. Counting people is particularly important in African countries that get income from extracting oil, gas or minerals, which is most of them. That income is supposed to be shared among provinces, counties and municipalities on the basis of their population size -- how that's done in practice if the data is outdated or wrong beats a statistician's guess.

How does one even start tackling a problem like that? The truth is that a lot of money has been invested in improving Africa's statistics. Most of that money came as donations from well-meaning rich countries, and went to fund "institutional development," that is, to train and equip national statistics offices. According to the "Partnership in Statistics for Development in the 21st Century" (aka "PARIS21"), between 2009 and 2011 alone, Africa received 700 million dollars to build up its capacity to collect data. That led to some progress, but dismally short of what is needed. Why? Mostly politics.

Solution? A mix of democracy and technology. It is much more difficult to mess with a country's statistics when people are free to complain about it. Democratization has brought a new appetite for information to the average African citizen, much of it expressed through a data-hungry media. The number of recalcitrant governments that gather data but refuse to release it, or release it when it is obsolete-late, is falling -- slowly but surely. There is even talk of inviting independent experts ("high-level technical commissions") to regularly vet whatever official figures are put out. And the legal walls that protect public statisticians from meddling politicians are hardening -- Senegal pioneered the trend in the early 2000s. This is good, because national statistics offices are like central banks: once you recruit the best technicians, you want to step back and let them do their job.

But what will really revolutionize African statistics is communication technology. The continent is embracing cellular telephony with gusto; it is only a question of time before its people can become regular respondents in censuses and surveys. [Disclaimer: this writer works with a team that is trying to do just that through a project code-named, you guessed, "Listening to Africa."] Satellite imagery can now be used to literally see and gauge, from outer space, economic activity in ports, highways and markets. And tracking what people do, search or talk about on the web -- their "data exhaust" -- gives you a sense of what they are up to as workers, consumers and investors. All of this is yet to be deployed in Africa.
To fix its data problem, the region should not just bring up to par the statistical systems it currently has; it may also want to leapfrog into tomorrow's systems.
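The article's claim that about three thousand households are enough can be checked with the textbook margin-of-error formula for a proportion. The sketch below is only an illustration, not a method from the article or the talk: it assumes simple random sampling, a 95% confidence level (z = 1.96) and the worst-case proportion p = 0.5, and the function name `margin_of_error` is hypothetical.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate half-width of a confidence interval for a proportion.

    Assumes simple random sampling (an assumption made here, not in the
    article). p=0.5 is the worst case; z=1.96 corresponds to about 95%.
    """
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    for n in (300, 3000, 30000):
        # Print the margin of error in percentage points for each sample size.
        print(f"n={n:>6}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

Under these assumptions a sample of 3,000 gives a margin of roughly plus or minus 1.8 percentage points, whatever the size of the population. Real household surveys usually use clustered designs, so the effective margin is typically wider than this simple formula suggests.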
  3. Big data == solution?
  4. Technocratic bias. Algorithmic regulation. Privacy, ethics. Digital divide. Big data hubris? Is more data always better?
  5. What have CDRs been used to predict? Disease dynamics, disaster response, socio-economic indicators, personality.
  6. Are official stats more than shadows in the cave? UN call for a Data Revolution. Post-2015 agenda. Expert panel. Should cite drawing! Plato's Allegory of the Cave: Plato has Socrates describe a gathering of people who have lived chained to the wall of a cave all of their lives, facing a blank wall. The people watch shadows projected on the wall by things passing in front of a fire behind them, and begin to give names to these shadows. The shadows are as close as the prisoners get to viewing reality. He then explains how the philosopher is like a prisoner who is freed from the cave and comes to understand that the shadows on the wall do not make up reality at all, as he can perceive the true form of reality rather than the mere shadows seen by the prisoners.
  7. Example of stop and frisk: (how far) do we want a data-driven society? Who owns the data? Could big data, as a new class of assets, empower people by counteracting the natural tendency of capital to accumulate (cf. Piketty)?