MK99 – Big Data 1 
Big data & cross-platform analytics 
MOOC lectures Pr. Clement Levallois
MK99 – Big Data 2 
A primer on text mining for business 
• Text mining: computational methods to find interesting information in texts
• Quasi-synonyms:
  – natural language processing (abbreviated NLP)
  – computational linguistics (the name of a scientific discipline)
MK99 – Big Data 3 
Text… what kinds? 
• Books
• Tweets
• Product reviews on Amazon
• LinkedIn profiles
• The whole Wikipedia
• Free text answers in the results of a survey
• Tenders, contracts, laws, …
• Print and online media
• Archival material
• …
MK99 – Big Data 4 
What can be done? 
• Sentiment analysis
  – Does this piece of text have a positive or negative tone?
• Topic modeling / topic detection
  – What is the main theme of this 20-page booklet?
• Semantic disambiguation
  – “Paris” is mentioned in this text. Is this Paris Hilton or Paris, France?
• Named Entity Recognition (NER)
  – Automatically find the individuals, organizations and events named in the text, and the relations between them.
• Semantic enrichment
  – If you search Google for “TV”, results for “television” will also show up
• Language detection
  – “Ich spreche Deutsch” -> this sentence is written in German
• Automatic translation
  – See Google Translate
• Summarizing
  – Shortening a text while keeping its core message intact
• Spelling correction
  – Well, that’s easy
• Topic classification
  – Is this email spam or not?
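
To make the first task in this list concrete, here is a minimal sketch of sentiment analysis based on counting words from two tiny hand-made lists. The word lists and the function name `toy_sentiment` are assumptions made up for illustration; tools used in practice typically rely on trained models rather than on this simple counting rule.

```python
import re

# Hypothetical, hand-picked word lists (illustrative only, far from complete).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "awful"}

def toy_sentiment(text):
    """Return 'positive', 'negative' or 'neutral' by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    # Each positive word adds one point, each negative word removes one.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(toy_sentiment("I love this phone, the screen is great"))
print(toy_sentiment("Terrible battery and poor support"))
```

A rule like this cannot handle negation ("not good") or irony, which is one reason real systems are more elaborate.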
MK99 – Big Data 5 
Amaze me! 
• Demo on sentiment analysis
  – With a tool by Stanford: http://nlp.stanford.edu:8080/sentiment/rntnDemo.html
• Demo on semantic disambiguation
  – With a tool built by a collaborative effort: http://dbpedia-spotlight.github.io/demo/
    (click on “annotate”, and also try replacing the text with one of your own)
MK99 – Big Data 6 
What can’t be done yet (but is actively researched) 
• Detection of irony
• Robust translation
• Reasoning beyond Q&A
What makes things harder
• Non-English texts
• Slang and colloquial forms of speech
• Real-time processing
MK99 – Big Data 7 
Examples of routine operations when working with text (or, how to follow the most basic conversation in computational linguistics)
• Stemming
  – “liked” and “like” will be reduced to a common stem such as “lik” to facilitate further operations
• Lemmatizing
  – Grouping “liked”, “like” and “likes” to count them as one basic semantic unit
• Part-of-Speech tagging (aka POS tagging)
  – Automatically detecting the grammatical function of the terms used in a sentence, to facilitate translation or other tasks
• “Starting the text analysis with a bag-of-words model”
  – An operation that consists of simply listing and counting all the different words in the text.
• N-grams
  – The text “I am Dutch” is made of 3 words: I, am, Dutch. But it can also be interesting to look at bigrams in the text: “I am”, “am Dutch”. Or trigrams: “I am Dutch”.
  – When neighboring words are considered together, as we just did, they are called n-grams. This can reveal interesting things about frequent expressions used in the text.
  – A good example of how useful this can be: visit the Ngram Viewer by Google: https://books.google.com/ngrams
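
The bag-of-words and n-gram operations described above can be sketched in a few lines of plain Python. The helper names `tokenize`, `bag_of_words` and `ngrams` are assumptions chosen for this illustration, and the tokenizer is deliberately simplistic (it lowercases the text and keeps only letters, digits and apostrophes).

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def bag_of_words(text):
    """List and count all the different words in the text."""
    return Counter(tokenize(text))

def ngrams(tokens, n):
    """Return all runs of n neighboring tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = tokenize("I am Dutch")
print(bag_of_words("I am Dutch"))  # word counts
print(ngrams(tokens, 2))           # bigrams
print(ngrams(tokens, 3))           # trigrams
```

Because the text is lowercased first, the bigrams of “I am Dutch” come out as “i am” and “am dutch”. Stemming and lemmatizing are usually done with a dedicated library rather than written by hand, so they are not shown here.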
MK99 – Big Data 8 
Chief benefit: Getting to know individuals better 
• Without text mining, we have access to “external”, “cold” states of the individual
  – Behavior (eg, clicks), external attributes (address, gender, encyclopedia entry), social networks (but relatively cold ones.)
• With text mining, we have access to “internal”, “hot” states:
  – opinions
  – intentions
  – preferences
  – degree of consensus
  – social networks (who mentions whom: how, in which context)
  – implicit attributes of the speaker
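
As an illustration of the “who mentions whom” idea, the sketch below extracts mentions from a few short messages and counts them as weighted links between people. The messages are made up, and the `@name` convention and the function name `mention_edges` are assumptions for this example; the simple pattern would also pick up things like email addresses, which a real pipeline would need to filter out.

```python
import re
from collections import Counter

# Made-up example messages as (author, text) pairs. Not real data.
messages = [
    ("alice", "Great talk by @bob and @carol today!"),
    ("bob", "Thanks @alice, see you next week"),
    ("alice", "@bob can you share the slides?"),
]

def mention_edges(messages):
    """Count (author, mentioned_user) pairs found in the messages."""
    edges = Counter()
    for author, text in messages:
        for user in re.findall(r"@(\w+)", text):
            edges[(author, user.lower())] += 1
    return edges

for (source, target), weight in sorted(mention_edges(messages).items()):
    print(f"{source} -> {target} (mentions: {weight})")
```

The resulting pairs can be loaded into a network analysis tool to see who talks to whom, and the surrounding text of each mention gives the “how” and “in which context”.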
MK99 – Big Data 9 
How easy is it? 
• Too easy… the limit is legal and ethical, not technical
  – “Predicting the Political Alignment of Twitter Users” by Conover et al. (2011). http://cnets.indiana.edu/wp-content/uploads/conover_prediction_socialcom_pdfexpress_ok_version.pdf
  – “Political Tendency Identification in Twitter using Sentiment Analysis Techniques” by Pla and Hurtado (2014). http://anthology.aclweb.org/C/C14/C14-1019.pdf
  – “Private traits and attributes are predictable from digital records of human behavior” by Kosinski et al. (2013). http://www.pnas.org/content/110/15/5802.abstract
(and this gets even more powerful when mixing text mining, network analysis and machine learning)
MK99 – Big Data 10 
What use for text mining in a business context? 
1. Client facing
2. Business management
3. Business development
MK99 – Big Data 11 
1. Market facing activities 
• Refined scoring: propensity scores (including churn), scoring of prospects
• Refined individualization of campaigns
  – ads, email campaigns, coupons, etc.
• Better community management
  – Getting a clear and precise picture of how customers and prospects perceive, talk about, and engage with your brand / product / industry.
MK99 – Big Data 12 
2. Business Management 
• Organizational mapping
  – Getting a view of the organization through text flows.
  – Example: getting a view on the activity of a business school through a map of its scientific publications.
• HRM
  – Finding talents in niche industries, based on the mining of their profiles
• Marketing research
  – refined segmentation + targeting + positioning, measuring customer satisfaction, perceptual mapping.
MK99 – Big Data 13 
3. Business development 
• Developing adjunct services
  – product recommendation systems (eg, Amazon’s)
  – detection and matching of needs (eg, detection of complaints / mood changes)
  – product enhancements (eg, content enrichment through localization/personalization)
• Developing new products entirely, based on
  – different search engines
  – alert systems / automated systems based on monitoring textual input
  – knowledge databases
  – new forms of content curation / high value info creation + delivery
MK99 – Big Data 14 
Interesting players 
through their “Data Services” package 
+ many APIs listed on www.programmableweb.com
MK99 – Big Data 15 
This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com) 
Contact Clement Levallois (levallois [at] em-lyon.com) for more information.

Go for Rakhi Bazaar and Pick the Latest Bhaiya Bhabhi Rakhi.pptxGo for Rakhi Bazaar and Pick the Latest Bhaiya Bhabhi Rakhi.pptx
Go for Rakhi Bazaar and Pick the Latest Bhaiya Bhabhi Rakhi.pptx
 
business environment micro environment macro environment.pptx
business environment micro environment macro environment.pptxbusiness environment micro environment macro environment.pptx
business environment micro environment macro environment.pptx
 
Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...
 
Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...Fordham -How effective decision-making is within the IT department - Analysis...
Fordham -How effective decision-making is within the IT department - Analysis...
 
Driving Business Impact for PMs with Jon Harmer
Driving Business Impact for PMs with Jon HarmerDriving Business Impact for PMs with Jon Harmer
Driving Business Impact for PMs with Jon Harmer
 
WSMM Technology February.March Newsletter_vF.pdf
WSMM Technology February.March Newsletter_vF.pdfWSMM Technology February.March Newsletter_vF.pdf
WSMM Technology February.March Newsletter_vF.pdf
 
14680-51-4.pdf Good quality CAS Good quality CAS
14680-51-4.pdf  Good  quality CAS Good  quality CAS14680-51-4.pdf  Good  quality CAS Good  quality CAS
14680-51-4.pdf Good quality CAS Good quality CAS
 
WSMM Media and Entertainment Feb_March_Final.pdf
WSMM Media and Entertainment Feb_March_Final.pdfWSMM Media and Entertainment Feb_March_Final.pdf
WSMM Media and Entertainment Feb_March_Final.pdf
 
Interoperability and ecosystems: Assembling the industrial metaverse
Interoperability and ecosystems:  Assembling the industrial metaverseInteroperability and ecosystems:  Assembling the industrial metaverse
Interoperability and ecosystems: Assembling the industrial metaverse
 
20200128 Ethical by Design - Whitepaper.pdf
20200128 Ethical by Design - Whitepaper.pdf20200128 Ethical by Design - Whitepaper.pdf
20200128 Ethical by Design - Whitepaper.pdf
 
trending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdf
trending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdftrending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdf
trending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdf
 
Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...
Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...
Horngren’s Financial & Managerial Accounting, 7th edition by Miller-Nobles so...
 
Jewish Resources in the Family Resource Centre
Jewish Resources in the Family Resource CentreJewish Resources in the Family Resource Centre
Jewish Resources in the Family Resource Centre
 
Data Analytics Strategy Toolkit and Templates
Data Analytics Strategy Toolkit and TemplatesData Analytics Strategy Toolkit and Templates
Data Analytics Strategy Toolkit and Templates
 
Unveiling the Soundscape Music for Psychedelic Experiences
Unveiling the Soundscape Music for Psychedelic ExperiencesUnveiling the Soundscape Music for Psychedelic Experiences
Unveiling the Soundscape Music for Psychedelic Experiences
 
Memorándum de Entendimiento (MoU) entre Codelco y SQM
Memorándum de Entendimiento (MoU) entre Codelco y SQMMemorándum de Entendimiento (MoU) entre Codelco y SQM
Memorándum de Entendimiento (MoU) entre Codelco y SQM
 
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptxThe-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
 
Psychic Reading | Spiritual Guidance – Astro Ganesh Ji
Psychic Reading | Spiritual Guidance – Astro Ganesh JiPsychic Reading | Spiritual Guidance – Astro Ganesh Ji
Psychic Reading | Spiritual Guidance – Astro Ganesh Ji
 

A Primer on Text Mining for Business

  • 1. MK99 – Big Data 1 Big data & cross-platform analytics MOOC lectures Pr. Clement Levallois
  • 2. A primer on text mining for business
      • Text mining: computational methods to find interesting information in texts
      • Quasi-synonyms:
          – natural language processing (abbreviated as NLP)
          – computational linguistics (name of a scientific discipline)
  • 3. Text… what kinds?
      • Books
      • Tweets
      • Product reviews on Amazon
      • LinkedIn profiles
      • The whole of Wikipedia
      • Free-text answers in the results of a survey
      • Tenders, contracts, laws, …
      • Print and online media
      • Archival material
      • …
  • 4. What can be done?
      • Sentiment analysis
          – Is the tone of this piece of text positive or negative?
      • Topic modeling / topic detection
          – What is the main theme of this 20-page booklet?
      • Semantic disambiguation
          – “Paris” is mentioned in this text. Is this Paris Hilton or Paris, France?
      • Named Entity Recognition (NER)
          – Automatically find the individuals, organizations and events named in the text, and the relations between them.
      • Semantic enrichment
          – If you search Google for “TV”, results for “television” will also show up
      • Language detection
          – “Ich spreche Deutsch” -> this sentence is written in German
      • Automatic translation
          – See Google Translate
      • Summarizing
          – Shortening a text while keeping its core message intact
      • Spelling correction
          – Well, that’s easy
      • Topic classification
          – Is this email spam or not?
  • 5. Amaze me!
      • Demo on sentiment analysis
          – With a tool by Stanford: http://nlp.stanford.edu:8080/sentiment/rntnDemo.html
      • Demo on semantic disambiguation
          – With a tool built by a collaborative effort: http://dbpedia-spotlight.github.io/demo/ (click on “annotate”, and also replace the text with one of your own)
  • 6. What can’t be done yet (but is actively researched)
      • Detection of irony
      • Robust translation
      • Reasoning beyond Q&A
     What makes things harder
      • Non-English texts
      • Slang and colloquial speech forms
      • Real-time processing
  • 7. Example of routine operations when working with text (or, how to follow the most basic conversation in computational linguistics)
      • Stemming
          – “liked” and “like” will be reduced to their stem “lik” to facilitate further operations
      • Lemmatizing
          – Grouping “liked”, “like” and “likes” to count them as one basic semantic unit
      • Part-of-Speech tagging (aka POS tagging)
          – Automatically detecting the grammatical function of the terms used in a sentence, to facilitate translation or other tasks
      • “Starting the text analysis with a bag-of-words model”
          – An operation that consists of simply listing and counting all the different words in the text.
      • N-grams
          – The text “I am Dutch” is made of 3 words: I, am, Dutch. But it can also be interesting to look at bigrams in the text: “I am”, “am Dutch”. Or trigrams: “I am Dutch”.
          – When neighboring words are considered together, as we just did, they are called n-grams. This can reveal interesting things about frequent expressions used in the text.
          – A good example of how useful this can be: visit the Ngram Viewer by Google: https://books.google.com/ngrams
  • 8. Chief benefit: Getting to know individuals better
      • Without text mining, we have access to “external”, “cold” states of the individual
          – Behavior (e.g., clicks), external attributes (address, gender, encyclopedia entry), social networks (but relatively cold ones)
      • With text mining, we have access to “internal”, “hot” states:
          – opinions
          – intentions
          – preferences
          – degree of consensus
          – social networks (who mentions whom: how, in which context)
          – implicit attributes of the speaker
  • 9. How easy is it?
      • Too easy… the limit is legal and ethical, not technical
          – “Predicting the Political Alignment of Twitter Users” by Conover et al. (2011). http://cnets.indiana.edu/wp-content/uploads/conover_prediction_socialcom_pdfexpress_ok_version.pdf
          – “Political Tendency Identification in Twitter using Sentiment Analysis Techniques” by Pla and Hurtado (2014). http://anthology.aclweb.org/C/C14/C14-1019.pdf
          – “Private traits and attributes are predictable from digital records of human behavior” by Kosinski et al. (2013). http://www.pnas.org/content/110/15/5802.abstract
     (and this gets even more powerful when mixing text mining, network analysis and machine learning)
  • 10. What use for text mining in a business context?
      1. Client facing
      2. Business management
      3. Business development
  • 11. 1. Market-facing activities
      • Refined scoring: propensity scores (including churn), scoring of prospects
      • Refined individualization of campaigns
          – ads, email campaigns, coupons, etc.
      • Better community management
          – Getting a clear and precise picture of how customers and prospects perceive, talk about, and engage with your brand / product / industry.
  • 12. 2. Business management
      • Organizational mapping
          – Getting a view of the organization through text flows.
          – Example: getting a view of the activity of a business school through a map of its scientific publications.
      • HRM
          – Finding talent in niche industries, based on the mining of their profiles
      • Marketing research
          – refined segmentation + targeting + positioning, measuring customer satisfaction, perceptual mapping.
  • 13. 3. Business development
      • Developing adjunct services
          – product recommendation systems (e.g., Amazon’s)
          – detection and matching of needs (e.g., detection of complaints / mood changes)
          – product enhancements (e.g., content enrichment through localization/personalization)
      • Developing new products entirely, based on
          – different search engines
          – alert systems / automated systems based on monitoring textual input
          – knowledge databases
          – new forms of content curation / high-value info creation + delivery
  • 14. Interesting players through their “Data Services” package + many APIs listed on www.programmableweb.com
  • 15. This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com). Contact Clement Levallois (levallois [at] em-lyon.com) for more information.
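
The bag-of-words and n-gram operations described in slide 7 can be sketched in a few lines of Python. This is only a minimal illustration, not part of the original deck: the function names (`tokenize`, `bag_of_words`, `ngrams`) are hypothetical, and the tokenizer is a deliberately naive assumption (lowercase everything, keep runs of letters and apostrophes). Real projects would more likely rely on a dedicated NLP library.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (naive rule, assumed for illustration)."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(tokens):
    """List and count all the different words, ignoring their order."""
    return Counter(tokens)


def ngrams(tokens, n):
    """Return every run of n neighboring tokens, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("I am Dutch")
print(bag_of_words(tokens))  # word counts
print(ngrams(tokens, 2))     # bigrams
print(ngrams(tokens, 3))     # trigrams
```

Because the tokenizer lowercases its input, the slide's example “I am Dutch” yields the bigrams (i, am) and (am, dutch) and the single trigram (i, am, dutch). Counting n-grams over a larger text with `Counter(ngrams(tokens, 2))` is one simple way to surface frequent expressions.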