Machine Learning: The Next Step in Manufacturing Performance
© 2018 Minitab, Inc.
August 9, 2018
Cheryl Pammer
Statistician, Minitab Inc
Over 90% of Fortune 100 companies use Minitab to
improve the quality of their products and processes.
Meet the Presenter:
Cheryl Pammer
Minitab Statistician
As a Minitab trainer, statistical consultant, and software designer, Cheryl is immersed in the day-to-day problems faced by practitioners. She aims to provide the sound statistical strategies and software tools that professionals need to solve the real-world problems they face on the job.
Cheryl has an M.S. in Statistics from The Pennsylvania State University and is currently working on a graduate certificate in Data Mining and Applications from Stanford University.
Agenda
• Creating a Culture of Analytics
• Overview of Machine Learning
• Moving From Regression to Regression Trees
• Conclusions and Questions
Creating a Culture of Analytics
• Analytics should not reside solely in an IT department
• Analytics should not reside solely with Data Scientists
• The goal should be to move everyone’s skills further along the analytics curve
[Figure: the analytics curve, progressing from Visualization to Descriptive Stats, Inference, Regression, and Regression Trees (CART)]
Transformational organizations will develop a culture of analytics throughout the organization.
The Democratization of Data Science
“These days every industry is drenched in data, and the organizations that succeed are those that most quickly make sense of their data in order to adapt to what’s coming. The best way to enable fast discovery and deeper insights is to disperse data science expertise across an organization.”
– Jonathan Cornelissen, “The Democratization of Data Science”, Harvard Business Review, July 27, 2018
CRISP-DM: Cross-Industry Standard Process for Data Mining
• Business Understanding
• Data Understanding
• Data Preparation
• Modeling
• Model Evaluation
• Model Deployment
Machine Learning Models
MACHINE LEARNING
• UNSUPERVISED LEARNING: Groups and interprets data based on input data only. Finds hidden patterns or intrinsic structures in data.
  • CLUSTERING
• SUPERVISED LEARNING: Develops predictive models using input and output data. Applied if you have known data for the output you are trying to predict.
  • CLASSIFICATION
  • REGRESSION
Regression Challenges
Regression and logistic regression often don’t work well, particularly with larger observational data sets:
• With enough rows, nearly every predictor tests as statistically significant.
• Deciding which predictors to include in the model is challenging.
• Local effects are ignored.
• Relationships may be nonlinear.
• Complex interactions exist.
• Extreme outliers exist.
• Many values are missing.
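To make the "local effects" and "nonlinear relationships" points concrete, here is a minimal Python sketch. The data are hypothetical (not the webinar's data set): the response jumps once x passes 5. The sketch compares a one-predictor least squares line with a single split that predicts the mean on each side.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def sse(ys, preds):
    """Sum of squared errors between observed and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds))

# Hypothetical data: a step change in the response at x = 5 (a local effect).
xs = list(range(1, 11))
ys = [2.0] * 5 + [8.0] * 5

# Straight-line fit over the whole range.
intercept, slope = fit_line(xs, ys)
line_sse = sse(ys, [intercept + slope * x for x in xs])

# One split at x <= 5, predicting the mean response on each side.
left_mean = mean(y for x, y in zip(xs, ys) if x <= 5)
right_mean = mean(y for x, y in zip(xs, ys) if x > 5)
split_sse = sse(ys, [left_mean if x <= 5 else right_mean for x in xs])

print(f"line SSE = {line_sse:.2f}, split SSE = {split_sse:.2f}")
```

On data like this, the straight line leaves a large residual error because it averages over the jump, while the single split captures it. Real process data will rarely be this clean.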
Let’s Get Started . . .
Water Content Loss Project
A pharmaceutical company needs to determine the root cause of excessive water loss occurring in a specific chemical formulation.
Practice Data Set: simulated data from a current customer
Live Demo
• Fitting a regression model using Minitab 18
• Stepwise regression to eliminate variables
• Residual plots to check the model's parametric assumptions
• Classification and regression trees (CART)
• Identifying the variables causing excessive water loss
• Prioritizing the most important variables (Hot Spots)
Machine Learning Terminology
[Figure: decision tree diagram labeling the root node, decision nodes, terminal nodes, a branch, and splitting]
✓ Decision trees quickly find the X’s that best divide the data into distinct regions relative to Y
✓ Y can be continuous or categorical
Response Variable = Dependent Variable = Target Variable
This is what we are trying to predict.
Example: Water content loss
Algorithm = Method Used = Technique
This is the method that we will use to both predict the target variable and discover the relationships, if any, between the predictors and the target.
Examples: CART decision trees, gradient boosted trees, Random Forests, TreeNet
SPM’s team of data scientists were among the pioneers of machine learning over 40 years ago and the original architects behind the CART, MARS, Random Forests, and TreeNet modeling engines.
Leo Breiman, Jerome Friedman, Richard Olshen, Charles Stone (1984). Classification and Regression Trees.
Predictor Variables = Predictors = Factors
This is what we use to predict the response.
The categorical predictors are: Analyst, Tank, 'Zone Test', 'Filter Changed'
The continuous predictors are: 'Calculated Value', Volume, 'Filter Weight', Flow, 'Nozzle Offset', 'Filter Accumulation'
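The terminology above maps onto a small data structure. The following Python sketch is illustrative only: the predictor names Flow and Volume come from the slide, but the thresholds and predicted values are hypothetical and not taken from the demo.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class TerminalNode:
    prediction: float  # predicted response (e.g. water content loss) for this region

@dataclass
class DecisionNode:
    predictor: str     # name of the X used for the split
    threshold: float   # rows with value <= threshold follow the left branch
    left: "Node"
    right: "Node"

Node = Union[DecisionNode, TerminalNode]

def predict(node: Node, row: dict) -> float:
    """Walk from the root down the branches until a terminal node is reached."""
    while isinstance(node, DecisionNode):
        node = node.left if row[node.predictor] <= node.threshold else node.right
    return node.prediction

# Hypothetical tree: the root node splits on Flow, a second decision node on Volume.
tree = DecisionNode(
    "Flow", 50.0,
    left=TerminalNode(1.2),
    right=DecisionNode("Volume", 300.0, TerminalNode(2.5), TerminalNode(4.1)),
)

print(predict(tree, {"Flow": 60.0, "Volume": 350.0}))
```

Each path from the root to a terminal node reads as a plain rule such as "Flow > 50 and Volume > 300", which is why trees are easy to explain to non-specialists.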
Classification and Regression Trees
• Nonparametric machine learning for classification and regression
• Stepwise procedure in which predictors enter the model one at a time
• Procedure works by carving a high-dimensional data space into a small to moderate set of regions
• A prediction is produced for each region
[Figure: SPM CART tree]
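The carving procedure can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not SPM's or Minitab's implementation: it handles continuous predictors only, chooses each split by minimizing the sum of squared errors, and omits pruning and missing-value handling. The data and the `min_size` parameter are hypothetical.

```python
from statistics import mean

def sse(ys):
    """Sum of squared errors around the mean of ys."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(rows, ys, min_size):
    """Return (predictor index, threshold, total SSE) of the best split, or None."""
    best = None
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, ys) if r[j] <= t]
            right = [y for r, y in zip(rows, ys) if r[j] > t]
            if len(left) < min_size or len(right) < min_size:
                continue
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (j, t, total)
    return best

def grow(rows, ys, min_size=2):
    """Split recursively; stop when no allowed split reduces the SSE."""
    split = best_split(rows, ys, min_size)
    if split is None or split[2] >= sse(ys):
        return {"prediction": mean(ys)}  # terminal node: one prediction per region
    j, t, _ = split
    left = [i for i, r in enumerate(rows) if r[j] <= t]
    right = [i for i, r in enumerate(rows) if r[j] > t]
    return {
        "predictor": j,
        "threshold": t,
        "left": grow([rows[i] for i in left], [ys[i] for i in left], min_size),
        "right": grow([rows[i] for i in right], [ys[i] for i in right], min_size),
    }

def predict(tree, row):
    while "prediction" not in tree:
        side = "left" if row[tree["predictor"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["prediction"]

# Hypothetical data: two predictors; the response depends only on the first.
rows = [(10, 1), (20, 2), (30, 1), (40, 2), (60, 1), (70, 2), (80, 1), (90, 2)]
ys = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
tree = grow(rows, ys)
print(tree["predictor"], tree["threshold"], predict(tree, (45, 2)))
```

Because each split considers every predictor, the variable that enters first is the one that separates the response best, which is how the tree surfaces key drivers.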
Basic Machine Learning Algorithms
[Figures: least squares regression model; regression tree (CART algorithm)]
You likely already know more than you think
• Supervised (Y and X’s) includes Regression, Logistic Regression, CART, Random Forests, Gradient-Boosted Trees
• Unsupervised (only X’s) includes K-Means Clustering, Hierarchical Clustering, Principal Components Analysis
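To contrast with the supervised methods, here is a minimal Python sketch of K-Means clustering on a single variable. It uses only X values and no Y. The data and the starting centers are hypothetical, and the sketch takes the starting centers from the caller instead of choosing them at random.

```python
from statistics import mean

def k_means_1d(xs, centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-centering."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for x in xs:
            nearest = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
            clusters[nearest].append(x)
        # Keep the previous center if a cluster ends up empty.
        centers = [mean(c) if c else centers[k] for k, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical measurements that fall into two groups.
xs = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
centers, clusters = k_means_1d(xs, centers=[0.0, 6.0])
print(centers, clusters)
```

No response variable appears anywhere in the function: the groups come from the structure of the inputs alone, which is what "only X’s" means in the bullet above.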
Adding CART to Your Toolkit
• Regression trees are a natural extension of regression and binary logistic regression. They are useful for
  • determining the key drivers of an outcome
  • predicting what will happen next
• As data sets grow larger and more complicated, regression trees become a vital tool to
  • be faster and more accurate
  • save time
  • uncover relationships you might not have seen before
• Everyone can benefit from having some knowledge of machine learning algorithms and terminology
• Get started, and lead the effort to create a culture of analytics in your organization!
✓ Accommodate larger data sets
✓ Address non-linear relationships
✓ Detect local effects
✓ Uncover complex interactions
✓ Missing values ok
➢ 30+ presentations – view the program now
➢ Get expert advice in The Minitab Lab
➢ Participate in the new User-Centered Design Studio
Visit the Insights page for more information.
What Minitab Offers
Products
• Powerful statistical software everyone can use. Analyze data and present results with confidence.
• Quality improvement. Improved. The tools and reports you need to guarantee process and product excellence.
• Learn to see your data. Online learning solution to master statistics and Minitab® anytime, anywhere.
• Fast, highly accurate platform for data mining and predictive analytics.
Services
• Training: Learn Minitab and Companion first-hand by attending public or customized trainings in your facilities according to your requirements.
• Statistical Consulting: Get personalized help with statistical challenges you face, including collecting the right data, interpreting your analysis and much more.
• Support: Assistance with installation and implementation, updating of versions, use of software and interpretation of results.
Questions?
Please write your questions in the questions pane at any time.
Ready to get started? Try adding CART to your toolkit. Download the trial.