2. Agenda
• Social Media: An INT perspective
• Common Analytic Pitfalls
• An Analytic Framework
• Case Study: Brand Management
– Problem Definition
– Source Selection
– Data Capture
– Data Reporting
– Data Analysis
• Ways Forward, Future Analysis
• Questions?
3. Intelligence
• Intelligence is information that has been
transformed to meet an operational need
Data -> Operational Lens -> Intelligence
5. Social Media: The INT Perspective
Social Media gets the best
and worst of three disciplines:
– HUMINT
• Pros: Reveals intentions
• Cons: Can be unreliable
– OSINT
• Pros: Fast, Accessible
• Cons: Noise
– SIGINT
• Pros: Network, High Volume
• Cons: Noise
6. Social Media Analysis Goals
• Need to have an end-goal with value to the
organization (operational lens)
• Need to ensure cyclical feedback across
collection, processing, analysis, and
consumption
• Need to make sure that a particular network is
the right source for the task
7. Common Misconceptions
• Social media is not a panacea
– Not everyone uses social media
– Users of social media use it unevenly
– User behavior changes based on situations
• Just because people can talk about anything
does not mean they talk about everything all the
time.
8. Common Pitfalls
• The important thing is often not what people are
saying… but why they are saying it.
• Reporting tools rarely help dig into the why.
• Many common tools, reports, and metrics are
actually misleading:
– Word clouds atomize message context
– Sentiment metrics are often highly inaccurate
– Information in aggregate hides more than it reveals
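A minimal sketch of how an aggregate can hide more than it reveals. The sentiment scores and the "console" / "pc" segment labels are invented for illustration and do not come from the case study.

```python
from statistics import mean

# Hypothetical sentiment scores in [-1, 1]; the segment labels are invented.
scored_tweets = [
    ("console", 0.8), ("console", 0.7), ("console", 0.9),
    ("pc", -0.8), ("pc", -0.9), ("pc", -0.7),
]

def mean_by_segment(rows):
    """Group scores by segment and return the mean score of each group."""
    groups = {}
    for segment, score in rows:
        groups.setdefault(segment, []).append(score)
    return {segment: mean(scores) for segment, scores in groups.items()}

# The overall mean sits near zero and reads as "neutral" ...
overall = mean(score for _, score in scored_tweets)
# ... while the per-segment means show two strongly opposed groups.
by_segment = mean_by_segment(scored_tweets)

print(overall, by_segment)
```

A dashboard showing only the overall figure would suggest indifference; the segmented view shows two audiences that disagree sharply.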
11. Dangers of Disintegration
Source: Matthew Auer, Policy Studies Journal,
Volume 39, Issue 4, pages 709–736, Nov 2011
12. Analytic Framework
• Data Capture (DC)
• Data Reporting (DR)
• Data Analysis (DA)
– 1. What to measure
– 2. What the data is saying
– 3. What should be done based on the data
Source: Avinash Kaushik, Occam’s Razor Blog
http://www.kaushik.net/avinash/web-analytics-consulting-
framework-smarter-decisions/
14. Choosing a Platform
• Social media is still new and evolving, and so
is how we use it.
– Static approaches to social media are flawed
from the outset
– No one metric or set of metrics will always let
you know what is happening
• Need an adaptive platform to facilitate
data capture, reporting, and analysis
15. Case Study: Brand Management
• Industry: Gaming
– Experiencing 10% growth annually
– Overall revenue expected to exceed $80
billion by 2014
• In May, ZeniMax Online Studios
announced Elder Scrolls Online
– Elder Scrolls V: Skyrim was the 2nd largest game of
2011
16. Problem Definition
• As a brand manager, how can I use social
media to track and understand public
attitudes toward my product?
• Challenge is getting relevant information
– Query too large = false positives
– Query too small = miss potential information
17. Source: Twitter
• Twitter has some of the best
analytic potential
– High volume traffic
– High volume user-base
– Open API
• Not without limitations:
– 140 characters
– Limited historical / lookback
18. Platform: Infinit.e
• Infinit.e is a scalable framework for
collecting, storing, enriching, retrieving,
analyzing, and visualizing unstructured
documents and structured records
19. Platform: Infinit.e
• Infinit.e supports the extraction of entities
and creation of associations using a
combination of built in enrichment libraries
and 3rd party NLP APIs.
20. Data Capture – Initial Query
• Twitter search for “Elder Scrolls Online”
– Simplest possible way to access information
– RSS feed for 10 days (Jun 27 – July 6 2012)
22. Data Capture – Entity Map
• Who: TwitterHandle
• What: Hashtags, Keywords (unstructured), URLs
• When: Time / Date Stamp
• Where: Geo (if available)
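A rough sketch of the who / what / when / where mapping applied to a single tweet. This is not Infinit.e's enrichment code; the regular expressions, the `extract_entities` function, and the sample tweet are all assumptions made for illustration.

```python
import re
from datetime import datetime

# Simplified patterns; real tweets need more careful handling.
HASHTAG = re.compile(r"#(\w+)")
HANDLE = re.compile(r"@(\w+)")
URL = re.compile(r"https?://\S+")

def extract_entities(author, text, timestamp, geo=None):
    """Map one tweet onto who / what / when / where (hypothetical helper)."""
    urls = URL.findall(text)
    # Remove URLs first so a "#fragment" inside a link is not read as a hashtag.
    without_urls = URL.sub(" ", text)
    hashtags = [tag.lower() for tag in HASHTAG.findall(without_urls)]
    mentions = HANDLE.findall(without_urls)
    # Keywords: remaining words longer than three characters, lowercased.
    plain = re.sub(r"[#@]\w+", " ", without_urls)
    keywords = [w.lower() for w in re.findall(r"[A-Za-z']+", plain) if len(w) > 3]
    return {
        "who": {"author": author, "mentions": mentions},
        "what": {"hashtags": hashtags, "keywords": keywords, "urls": urls},
        "when": timestamp,
        "where": geo,  # None when the tweet carries no geo information
    }

entities = extract_entities(
    author="example_fan",
    text="Can't wait for Elder Scrolls Online! #ESO #Skyrim via @example_user http://example.com/news",
    timestamp=datetime(2012, 6, 27, 12, 0),
)
print(entities)
```

Capturing entities this broadly up front keeps later analysis options open, since the same records can be re-queried by person, topic, time, or place.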
23. Data Reporting
• Used Infinit.e’s Flash U/I Widget Framework
– Document Browser (Individual Tweets)
– Entity Significance (Top Entities)
– Sentiment (Top Entities w/ Sentiment)
– Query Metrics (Breakdowns of Query Results)
• Framework allows for additional
visualizations to be constructed as needed
• Export options also available for manual
review (e.g. graphml, excel, pdf)
27. Data Analysis
• Analysis needs to be rooted in the
operational need:
“How can I use social media to track and
understand public attitudes toward my
product?”
• Emphasis on hypothesis generation,
testing, and experimentation
28. Data Analysis -> Capture
• Hashtags from an initial subset of Tweets
fed back into the initial query
Diagram: Twitter -> Initial Query -> Results -> Expanded Query -> Results
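The feedback loop can be sketched as follows: count the hashtags in the initial results, then append the most frequent ones to the query. The function names, the sample tweets, and the `OR`-joined query format are assumptions for illustration, not the query syntax used in the case study.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")

def top_hashtags(tweets, n=3):
    """Return the n most frequent hashtags (lowercased) across the tweets."""
    counts = Counter(
        tag.lower() for text in tweets for tag in HASHTAG.findall(text)
    )
    return [tag for tag, _ in counts.most_common(n)]

def expand_query(initial_query, tweets, n=3):
    """Build an expanded query from the initial phrase plus top hashtags."""
    terms = [f'"{initial_query}"'] + [f"#{tag}" for tag in top_hashtags(tweets, n)]
    return " OR ".join(terms)

# Invented sample results for the initial query.
initial_results = [
    "Elder Scrolls Online announced #ESO #MMO",
    "Elder Scrolls Online looks great #ESO",
    "Will Elder Scrolls Online be like #Skyrim? #ESO #Skyrim",
]

expanded = expand_query("Elder Scrolls Online", initial_results, n=2)
print(expanded)
```

Expanding on hashtags widens recall but can pull in franchise-level chatter, so the expanded results still need review against the original question.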
29. Data Analysis - Hashtags
• Top hashtags were
almost all generic /
more abstract
– Undermines tracking and
understanding
– Top hashtags tied to
franchise, not to the
game
30. Data Analysis - Sentiment
• Converted URLs into derivative sources
• 35% additional sources
• Larger text sources offer potential value with
sentiment analysis that tweets alone cannot offer
31. Data Analysis - Sentiment
• Top negative and positive scores provided
glimpses into aggregate attitudes
• Provide starting points for additional analysis
32. Data Analysis - Recommendations
• Actionable recommendations allow
decision makers to make changes
33. Future Data Analysis
• Initial conclusions should be starting points
for new analysis
• Broad entity capture allows for:
– Key influencer identification
– Clustering of tweets for segmentation
– Map / Reduce for aggregate functions
35. Expandable Model
• Identify key influencers on specific topics
• Look at relationships between websites /
blogs and Twitter use (cross-network
analysis)
36. Counting and Summing
• “Traditional” business intelligence analytics
problems solved using aggregate functions:
– Sum
– Count
– Average
– Min
– Max
– Etc.
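A small sketch of these aggregate functions expressed in a map / reduce style. It runs in memory and does not use Infinit.e's Map / Reduce API; the tweet records, handles, and retweet counts are invented.

```python
from collections import defaultdict

# Invented records: one entry per tweet.
tweets = [
    {"handle": "user_a", "retweets": 4},
    {"handle": "user_b", "retweets": 10},
    {"handle": "user_a", "retweets": 6},
]

def map_tweet(tweet):
    """Map step: emit a (key, value) pair for each tweet."""
    yield tweet["handle"], tweet["retweets"]

def shuffle(pairs):
    """Group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_values(values):
    """Reduce step: compute the aggregates for one key."""
    return {
        "sum": sum(values),
        "count": len(values),
        "average": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }

def run(records):
    pairs = (pair for record in records for pair in map_tweet(record))
    return {key: reduce_values(values) for key, values in shuffle(pairs).items()}

stats = run(tweets)
print(stats)
```

Changing the key emitted in the map step (for example, a hashtag instead of a handle) re-segments the same data without touching the reduce step.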
37. Clustering - Topic
• Topic Extraction
– Key words -> Categories
– Categories -> Related Categories
Keyword      Topic
graphics     graphics
screenshots  graphics
resolution   graphics
quests       story
zenimax      company
…            …

Key       Value
graphics  gameplay.pdf
story     gameplay.pdf
company   corporate.txt
…         …
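A toy sketch of the two lookups on this slide, using only the rows shown. Treating two categories as "related" when they share the same value is an assumption about what the slide intends, and the function names are hypothetical.

```python
import re

# Rows taken from the slide; elided rows are not filled in.
KEYWORD_TO_TOPIC = {
    "graphics": "graphics",
    "screenshots": "graphics",
    "resolution": "graphics",
    "quests": "story",
    "zenimax": "company",
}
TOPIC_TO_VALUE = {
    "graphics": "gameplay.pdf",
    "story": "gameplay.pdf",
    "company": "corporate.txt",
}

def topics_for(text):
    """Key words -> Categories: topics whose keywords appear in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({KEYWORD_TO_TOPIC[w] for w in words if w in KEYWORD_TO_TOPIC})

def related_topics(topic):
    """Categories -> Related Categories: topics sharing the same value (assumed)."""
    value = TOPIC_TO_VALUE.get(topic)
    return sorted(
        other for other, v in TOPIC_TO_VALUE.items()
        if v == value and other != topic
    )

found = topics_for("The screenshots look great but the quests seem thin")
print(found, related_topics("graphics"))
```

A fixed keyword table like this is brittle; it is shown only to make the two mapping steps concrete.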
39. Take-Aways
• All data providers can and do change their
formats; users flock to and abandon
platforms – what works today may not
work tomorrow.
• Whatever platform you choose to do
analysis, make sure it’s open and
adaptable or your investment may
degrade over time.
40. Take-Aways (Things to Avoid)
• Data puking (less is more)
• Metrics that cannot be tied to actions
• Visualizations / reports that remove
context
• Taking dashboards at face value
41. Take-Aways (Things to Do)
• Segment data rather than work in aggregate
• Look for the why behind the message
• Always return to the source material
• Explore alternative explanations
• Always consider the ultimate goal
42. Thank You!
Andrew Strite
www.ikanow.com
astrite@ikanow.com
github.com/ikanow/Infinit.e
Editor's Notes
Given my background, I approach the social media problem from an intelligence analysis perspective. That perspective brings its own vocabulary and paradigms, but I believe they are useful for understanding how to build an effective analytic framework.