MK99 – Big Data 
1 
Big data & cross-platform analytics 
MOOC lectures Pr. Clement Levallois
Open data for business
Preliminary distinctions to be made 
For any piece of data in my organization: 
(1) Is it recognized as a creation of the mind? If so, intellectual property rights apply. 
(2) Is it recognized as personal data? If so, personal data protection applies. 
Neither (1) nor (2)? Then open data is possible (TOPIC FOR TODAY). 
(3) In all cases: concern for cyber security applies.
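The distinctions above can be read as a small decision procedure. What follows is a minimal sketch, assuming two hypothetical yes/no flags that an organization would have to assess for itself; the names `assess`, `DataAssessment` and the returned labels are illustrative and not part of the course material.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataAssessment:
    """Outcome of the preliminary distinctions for one piece of data."""
    constraints: List[str]
    open_data_possible: bool


def assess(is_creation_of_mind: bool, is_personal_data: bool) -> DataAssessment:
    # (3) Cyber security is a concern in all cases.
    constraints = ["cyber security"]
    # (1) A creation of the mind brings intellectual property rights into play.
    if is_creation_of_mind:
        constraints.append("intellectual property rights")
    # (2) Personal data brings personal data protection into play.
    if is_personal_data:
        constraints.append("personal data protection")
    # Open data is possible only when neither (1) nor (2) applies.
    open_possible = not (is_creation_of_mind or is_personal_data)
    return DataAssessment(constraints, open_possible)


result = assess(is_creation_of_mind=False, is_personal_data=False)
print(result.open_data_possible)  # True
```

The flags are deliberately simplistic: in practice each one is a legal question, not a boolean that code can settle.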
Open data = public data? 
• From its origins to this day, the debate on “open data” has focused on giving public data back to the people 
• Instead of leaving it confiscated in data silos by public administrations
Example 1: postcodes (zip codes) in the UK 
• Public information, funded by public money 
• Yet not publicly available, and expensive to get 
• See: “Give us back our crown jewels” (2006) 
• “A local authority such as Swindon has to pay the Ordnance Survey £38,000 a year to use its addresses and geographical data. It also has to pay the Royal Mail £3,000 for every website that includes the facility for people to look up their postcodes. Yet it was local authorities, which have a statutory duty to collect street addresses, that collected much of this data.” 
• Takeaway: innovation is stifled by closed public data
Example 2: scientific research 
• Scientists, paid with taxpayers' money, produce knowledge which is free and useful to fellow scientists, citizens, patients, etc. 
• Yet their results are published in journals owned by private publishers, with very high subscription prices paid by public universities. The general public can’t access them, unless they pay $20 to $40 per research article. 
• There is a growing movement for “open access” to scientific research: publicly funded research should be publicly available! 
• Takeaway: innovation is stifled by closed public data
Open data: beyond public data 
• It expands the user base of the data 
• It makes the user base more diverse 
• It accelerates usage of the data 
• It broadens use cases 
Takeaway: any dataset, not just public datasets issued by governments, can be made more productive if it is opened up.
4 benefits for organisations 
1. Crowdsourced innovation 
– Ex: TweetDeck was built on top of Twitter APIs, then acquired by Twitter 
– http://techcrunch.com/2011/05/23/twitter-buys-tweetdeck-for-40-million/ 
2. Brand awareness and positive brand attitude 
– Open data = good buzz. Example: the Metropolitan Transportation Authority (MTA) in New York 
– “Our apps are whiz kid certified” http://observer.com/2010/12/mta-proudest-of-the-apps-it-didnt-make/ 
3. Contribution to corporate social responsibility (CSR) 
– JCDecaux and the bike-sharing program in Paris: from closed data to open data, with no visible profit motive 
– https://developer.jcdecaux.com/#/home 
– Still a good entry point into the world of digital urban signs for JCDecaux? 
4. Increased and measurable impact on the targeted audience 
– The World Bank developed APIs that strengthen its outreach and make it objectively measurable 
– http://data.worldbank.org/node/9
When is data “open”? 8 principles for “open gov data”, which can also inspire open data in general 
1. Complete 
All public data is made available. Public data is data that is not subject to valid privacy, security or privilege limitations. 
2. Primary 
Data is as collected at the source, with the highest possible level of granularity, not in aggregate or modified forms. 
3. Timely 
Data is made available as quickly as necessary to preserve the value of the data. 
4. Accessible 
Data is available to the widest range of users for the widest range of purposes. 
5. Machine processable 
Data is reasonably structured to allow automated processing. 
6. Non-discriminatory 
Data is available to anyone, with no requirement of registration. 
7. Non-proprietary 
Data is available in a format over which no entity has exclusive control. 
8. License-free 
Data is not subject to any copyright, patent, trademark or trade secret regulation. Reasonable privacy, security and privilege restrictions may be allowed. 
Source: www.opengovdata.gov
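The eight principles can double as a checklist when reviewing a dataset before release. Below is a minimal sketch, assuming each principle has been self-assessed as a yes/no flag; the key names and the `unmet_principles` helper are hypothetical and only illustrate the idea.

```python
# Keys mirror the eight principles listed above; the spelling of the keys
# is an assumption made for this sketch.
PRINCIPLES = [
    "complete",
    "primary",
    "timely",
    "accessible",
    "machine_processable",
    "non_discriminatory",
    "non_proprietary",
    "license_free",
]


def unmet_principles(dataset: dict) -> list:
    """Return the principles the dataset does not satisfy, in list order.

    A principle missing from the dict is treated as not satisfied.
    """
    return [p for p in PRINCIPLES if not dataset.get(p, False)]


# Hypothetical dataset: a CSV file that requires registration to download
# and carries a restrictive license.
example = {
    "complete": True,
    "primary": True,
    "timely": True,
    "accessible": True,
    "machine_processable": True,
    "non_discriminatory": False,
    "non_proprietary": True,
    "license_free": False,
}

print(unmet_principles(example))  # ['non_discriminatory', 'license_free']
```

Treating a missing flag as unmet is a conservative design choice: a principle nobody has checked should not count as satisfied.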
Open data: the 5-star scale 
★ make your stuff available on the web (whatever format) 
★★ make it available as structured data (e.g. Excel instead of an image scan of a table) 
★★★ use a non-proprietary format (e.g. CSV instead of Excel) 
★★★★ use URLs to identify things, so that people can point at your stuff 
★★★★★ link your data to other people’s data to provide context 
I’d add a ★, just after the 3rd ★, if the data is made available through APIs instead of files to download.
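The scale above can be turned into a simple rating function. This is a minimal sketch, assuming the stars are cumulative (a dataset earns a star only if it has earned all the previous ones); the function name and parameters are hypothetical, and the extra star suggested for APIs is not modeled.

```python
def star_rating(on_web: bool, structured: bool, non_proprietary: bool,
                uses_urls: bool, linked: bool) -> int:
    """Count stars cumulatively: stop at the first criterion that is not met."""
    stars = 0
    for criterion in (on_web, structured, non_proprietary, uses_urls, linked):
        if not criterion:
            break
        stars += 1
    return stars


# An Excel file published on the web: structured, but in a proprietary format.
print(star_rating(True, True, False, False, False))  # 2

# The same table published as CSV.
print(star_rating(True, True, True, False, False))  # 3
```

Because the count stops at the first unmet criterion, linking to other datasets does not compensate for a proprietary format.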
Dos and Don’ts 
• Play the game, don’t fake it 
– Timid open data initiatives by companies trigger adverse reactions 
– Example: the French postal service (La Poste) in 2014 
– http://openstreetmap.fr/blogs/cquest/opendata-la-poste-posture-ou-imposture [in French] 
• Treat your developers well 
– Crowdsourced innovation ≠ free labor 
– When organizing hackathons: do it professionally.
This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com) 
Contact Clement Levallois (levallois [at] em-lyon.com) for more information.
