Synopsis:
The speaker will address the need to rethink classical approaches to analysis and predictive modeling. He will examine "iterative analytics" and extremely fine-grained segmentation down to a single customer -- ultimately building one model per customer, or millions of predictive models, delivering on the promise of a "segment of one". The speaker will also address the speed at which all of this has to work to maintain a competitive advantage for innovative businesses.
Speaker:
Afshin Goodarzi, Chief Analyst, 1010data
A veteran of analytics, Goodarzi has led several teams in designing, building and delivering predictive analytics and business analytical products to a diverse set of industries. Prior to joining 1010data, Goodarzi was the Managing Director of Mortgage at Equifax, responsible for the creation of new data products and supporting analytics for the financial industry. Previously, he led the development of various classes of predictive models aimed at the mortgage industry during his tenure at Loan Performance (Core Logic). Earlier, he worked at BlackRock, the research center for NYNEX (present-day Verizon), and Norkom Technologies. Goodarzi's publications span the fields of data mining, data visualization, optimization and artificial intelligence.
Sponsor:
1010data [ http://1010data.com ]
Microsoft NERD [ http://microsoftnewengland.com ]
Cognizeus [ http://cognizeus.com ]
Rethinking classical approaches to analysis and predictive modeling
1. 1
Predictive Analytics on a Big Data Scale!
Afshin Goodarzi
afshin@1010data.com
April, 2014
2. 2
About 1010data
• Founded in 2000
• Based in NYC
• Big Data analytics platform in the cloud
• Library of pre-built analytical applications
• Speed, power and flexibility second to none
3. 3
We Host/Analyze 14+ Trillion Rows of Data
All Quotes and Trades since 2003 on NYSE are done on 1010data
All mortgages ever issued are analyzed on 1010data
Nearly all real-estate transactions are completed on 1010data
Big Data - Granular Data - Time series Data
All data for ~35,000 Retail outlets across the US are analyzed on 1010data
4. 4
A Typical BI Technology Stack
Administrators
Data Sources
ETL
Inter-Enterprise Users
EDW
Data Cubes/Marts
Reporting / Visualization
Analysis / Modeling
9. 9
Predictive Analytics on a Big Data Scale!
Big Data mandated Analytics and predictive modeling - an example:
The larger data sets have mandated more rigorous sampling strategies as traditional systems have not kept up with the computational needs of predictive analytic solutions on Big Data.
• Can we use all but a small holdout set in predictive modeling?
• What are the challenges?
• What is an approach that works?
• Are the results any good?
• Is this solution only applicable to one industry?
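The first question above (training on everything except a small holdout set) can be illustrated with a minimal sketch. It assumes each record carries a stable identifier and assigns records to the holdout set by hashing that identifier, so the split is reproducible without drawing a sample. The function names, the 1% holdout size, and the synthetic records are illustrative assumptions, not something taken from the talk.

```python
import hashlib


def in_holdout(record_id: str, holdout_pct: float = 1.0) -> bool:
    """Deterministically assign a record to the holdout set by hashing its id."""
    digest = hashlib.sha256(record_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 10_000      # bucket in 0..9999
    return bucket < holdout_pct * 100          # 1.0% -> buckets 0..99


def split(records, key, holdout_pct=1.0):
    """Return (train, holdout); everything not held out is used for training."""
    train, holdout = [], []
    for record in records:
        target = holdout if in_holdout(str(key(record)), holdout_pct) else train
        target.append(record)
    return train, holdout


# Synthetic records, for illustration only.
records = [{"id": i} for i in range(100_000)]
train, holdout = split(records, key=lambda r: r["id"])
print(len(train), len(holdout))
```

Because the assignment depends only on the identifier, rerunning the split on a grown data set keeps previously held-out records in the holdout set.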
10. 10
Common Predictive Modeling Approach
• CPU intensive & error prone steps:
» Data selection
» IV to DV relationship
» Transformations
» Sampling and validation
» Model estimation
» Model testing
» Repeat
http://onlinepubs.trb.org/onlinepubs/nchrp/cd-22/v2chapter5.html
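The loop listed on this slide can be sketched in a few lines. The example below is a toy under stated assumptions: one independent variable (IV), one dependent variable (DV), synthetic data in which the DV depends on the log of the IV, three candidate transformations, an 80/20 train/holdout split, and ordinary least squares as the estimator. None of these choices come from the talk; they only make the "transform, estimate, test, repeat" cycle concrete.

```python
import math
import random


def fit_line(xs, ys):
    """Model estimation: one-variable ordinary least squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(model, xs, ys):
    """Model testing: mean squared error on held-out data."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Data selection: synthetic IV/DV pairs (assumed for illustration).
random.seed(0)
iv = [random.uniform(1, 100) for _ in range(2000)]
dv = [3.0 * math.log(x) + random.gauss(0, 0.2) for x in iv]

# Sampling and validation: first 80% for estimation, last 20% held out.
cut = int(0.8 * len(iv))

# Transformations to try; each pass of the loop is one "repeat".
transforms = {"identity": lambda x: x, "log": math.log, "sqrt": math.sqrt}
results = {}
for name, transform in transforms.items():
    tx = [transform(x) for x in iv]
    model = fit_line(tx[:cut], dv[:cut])
    results[name] = mse(model, tx[cut:], dv[cut:])

best = min(results, key=results.get)
print(best, results)
```

Even in this toy, every candidate transformation costs a full re-estimation and re-test, which is where the CPU cost and the room for error come from as the number of variables and records grows.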
11. 11
“One Segment” => “A Segment of One”
“Any customer can have a car painted any color that he wants so long as it is black.”
re: the Model-T in 1909 (from My Life and Work, Henry Ford, 1922, Chap. 4, p.71)
12. 12
Harry Truman displays a copy of the Chicago Daily Tribune newspaper that erroneously reported the election of Thomas Dewey in 1948. Truman’s narrow victory embarrassed pollsters, members of his own party, and the press who had predicted a Dewey landslide.
13. 13
Build A 30 Day Shopping List For Each Loyal Shopper at a Retail Chain
Shopper     SKU      Probability of purchase in the next 30 days
A. Smith    12345    90%
A. Smith    23567    85%
A. Smith    ….
A. Smith    87996    30%
POS
Loyalty
Econ
House prices
Mortgage Rates
BLS - Unemployment
Inventory
With Permission from A&P
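A table like the one on this slide could be produced in many ways; the talk does not describe the model behind it. As a stand-in, the sketch below uses a naive frequency estimate: for each shopper and SKU, the share of past 30-day windows in which the shopper bought that SKU. The shopper and SKU names and the purchase history are hypothetical, and the estimator ignores the economic and inventory inputs shown on the slide.

```python
from collections import defaultdict


def shopping_list(purchases, history_days, window=30):
    """Rank SKUs per shopper by the share of past windows containing a purchase.

    purchases: iterable of (shopper, sku, day_index), 0 <= day_index < history_days.
    """
    n_windows = history_days // window
    seen = defaultdict(set)  # (shopper, sku) -> window indices with a purchase
    for shopper, sku, day in purchases:
        w = day // window
        if w < n_windows:
            seen[(shopper, sku)].add(w)

    lists = defaultdict(list)
    for (shopper, sku), windows in seen.items():
        lists[shopper].append((sku, len(windows) / n_windows))
    for shopper in lists:
        lists[shopper].sort(key=lambda item: item[1], reverse=True)
    return dict(lists)


# Hypothetical purchase history over 300 days (ten 30-day windows).
purchases = (
    [("shopper_1", "sku_a", d) for d in (3, 35, 61, 95, 130, 155, 185, 215, 245)]
    + [("shopper_1", "sku_b", d) for d in (10, 100, 200)]
)
lists = shopping_list(purchases, history_days=300)
for sku, prob in lists["shopper_1"]:
    print(sku, f"{prob:.0%}")
```

Because each shopper's list is computed only from that shopper's own history, the work splits cleanly by shopper, which is the property that makes one model per customer feasible at scale.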
14. 14
If The Shopper Bought “It” Before Will They Buy “It” Again?
• Classical modeling: variables as either positively or negatively correlated with target
• Shoppers don’t behave the same!
• The demographics attributes have distributions for each variable!
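The point that shoppers do not behave the same can be shown with a small made-up example. Two hypothetical shoppers have each bought an item ten times before; one almost always buys it again and the other almost never does. A single pooled repurchase rate describes neither of them. The data and names are invented for illustration.

```python
from collections import defaultdict

# (shopper, bought_again) for occasions where the shopper had bought the item before.
# Hypothetical data: one habitual buyer, one occasional buyer.
observations = (
    [("loyalist", True)] * 9 + [("loyalist", False)] * 1
    + [("sampler", True)] * 1 + [("sampler", False)] * 9
)

# One segment: a single pooled rate for everyone.
pooled_rate = sum(again for _, again in observations) / len(observations)

# Segment of one: a separate rate per shopper.
counts = defaultdict(lambda: [0, 0])  # shopper -> [repurchases, occasions]
for shopper, again in observations:
    counts[shopper][0] += again
    counts[shopper][1] += 1
per_shopper_rate = {s: hits / total for s, (hits, total) in counts.items()}

print(pooled_rate)        # 0.5
print(per_shopper_rate)   # {'loyalist': 0.9, 'sampler': 0.1}
```

The pooled model would predict a coin flip for both shoppers, while the per-shopper rates separate them.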
16. 16
All sources of Prepay as analyzed in 1989
D
R
M
Interest Rates
House prices
Unemployment
Loan Age
Cost of option
Regional economy
I
http://www.freeusandworldmaps.com/html/US_Counties/US_Counties.html
http://www.tradingeconomics.com/united-states/unemployment-rate
http://www.wfa.gov/
http://www.richmondfed.org/banking/markets_trends_and_statistics/trends/pdf/delinquency_and_foreclosure_rates.pdf