Create an MVPS (Minimal Viable Prediction Service) in 90 mins
Jim Dowling
@jim_dowling
CEO, Hopsworks
WORKSHOP
Build a production ML system with only Python on free serverless services
Presenter: Jim Dowling - CEO & Co-Founder of Hopsworks and an Associate Professor at KTH Royal Institute of Technology. Co-inventor of the open-source Hopsworks platform.
https://www.serverless-ml.org - Free Online Course
Serverless ML Community Discord
Register on app.hopsworks.ai
Register on modal.ai
Register on huggingface.co
- create a new “space”
SLIDES:
1
The scope of a Data Scientist’s work has expanded to building Prediction Services
It’s not enough to just train models anymore
[Diagram labels: Data Engineer; Data Scientist; ML Engineer; Enterprise Data; Data Dump; Model; Model Serving]
Models need operational data
If you work in this kind of Enterprise, Conway’s Law implies you will only train models
// What is the scope of a Data Scientist’s work?
[Chart: Business Value by type of data]
Real-Time Data: Real-time prediction services (personalized search / recommendations). High Business Value.
Batch Data: Batch Prediction Services (demand forecasting, risk estimation, financial planning). Good Business Value.
Static Data: One-off Reports. Medium Business Value.
// Increase Business Value with more up-to-date Data
1. Train a Model on a static dataset and produce an Evaluation Report
OR
2. Build a Minimal Viable Prediction Service to show your stakeholders how/where your model will be used
// What is the scope of a Data Scientist’s work?
https://id2223kth.github.io/assignments/project/ServerlessMLProjectsID22232023.html
// Build Minimal Viable Prediction Services
[Diagram labels: Source; Historical Data; New Data; 1. Feature engineering; 2. Model Training; 3. Inference; Feature Store; Model Registry; Models; logs; Output; AI Powered Product & Services; Monitoring & Compliance; ML Pipelines & MLOps]
// ML System = Feature + Training + Inference Pipelines
2
Write Feature, Training, Inference pipelines to build an Air Quality Prediction Service
Hopsworks, Modal, Streamlit/Gradio
1. Data => Feature Pipeline => Features/Labels
2. Features/Labels => Training Pipeline => Model
3. Model/Features => Inference Pipeline => Predictions
// ML Systems should have 3 independent Machine Learning Pipelines
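The three-pipeline split can be sketched in plain Python. This is only an illustration under loud assumptions: the function names are hypothetical, the "model" is a one-variable least-squares line rather than anything used in the workshop, and in-memory values stand in for the feature store and model registry. The sample values are taken from the weather and air_quality tables shown in the slides.

```python
from statistics import mean


def feature_pipeline(raw_rows):
    """Data => Features/Labels: turn raw observations into typed feature/label rows."""
    return [
        {"city_name": r["city_name"], "date": r["date"],
         "wind_speed_max": float(r["wind_speed_max"]), "pm2_5": float(r["pm2_5"])}
        for r in raw_rows
    ]


def training_pipeline(feature_rows):
    """Features/Labels => Model: fit pm2_5 = intercept + slope * wind_speed_max."""
    xs = [r["wind_speed_max"] for r in feature_rows]
    ys = [r["pm2_5"] for r in feature_rows]
    x_bar, y_bar = mean(xs), mean(ys)
    var = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / var
    return {"slope": slope, "intercept": y_bar - slope * x_bar}


def inference_pipeline(model, feature_rows):
    """Model/Features => Predictions."""
    return [model["intercept"] + model["slope"] * r["wind_speed_max"] for r in feature_rows]


raw = [
    {"city_name": "berlin", "date": "2022-01-01", "wind_speed_max": "14.3", "pm2_5": "5.3"},
    {"city_name": "dublin", "date": "2022-04-01", "wind_speed_max": "9.3", "pm2_5": "2.3"},
    {"city_name": "seattle", "date": "2022-07-01", "wind_speed_max": "11.1", "pm2_5": "3.1"},
    {"city_name": "tacoma", "date": "2022-10-01", "wind_speed_max": "1.3", "pm2_5": "4.3"},
]
features = feature_pipeline(raw)        # would be written to a feature store
model = training_pipeline(features)     # would be saved to a model registry
predictions = inference_pipeline(model, features)
```

Each function only depends on the output of the previous stage, which is what lets the three pipelines run on separate schedules and platforms.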
Prediction Problem with Business KPIs & Data Sources
Prediction Consumer
// Minimal Viable Prediction service - Iteratively Develop it
MVPS (Minimal Viable Prediction Service): Feature Pipeline, Training Pipeline, Inference Pipeline
feature-pipeline.py
training-pipeline.ipynb
batch-inference-pipeline.py
Interactive UI (app.py)
Data sources: Weather Data, Air Quality Data
// Today’s Serverless ML Air Quality Prediction Service
Dashboard (GitHub Pages)
User Interface with Maps
Predictions
[Diagram labels: Streaming Data (event bus); Batch Data (databases & warehouse); Feature Pipeline; Training Pipeline; Online Inference Pipeline; Hopsworks (Feature Groups, Feature Views, Model Registry); features; features, labels; models; predict; ML Pipelines & MLOps]
Hopsworks: write to Feature Groups, read from Feature Views
Data Sources: Open-Meteo and AQPA
https://open-meteo.com/en/docs/air-quality-api
https://aqicn.org/api/
Feature Group - weather
city_name    date          wind_speed_max       wind_direction_dominant  wind_gusts_max       temp_max
<entity_id>  <event_time>  <numerical feature>  <categorical feature>    <numerical feature>  <numerical feature>
string       datetime      double               string                   double               double
berlin       2022-01-01    14.3                 ne                       22.4                 22.7
dublin       2022-04-01    9.3                  n                        18.2                 25.4
seattle      2022-07-01    11.1                 nw                       15.2                 20.8
tacoma       2022-10-01    1.3                  w                        2.5                  28.4
Row (feature vector): the set of feature values with the same primary key.
Feature value: store unencoded to maximize reuse over many models.
Feature Types: string, datetime, double.
entity_id and event_time uniquely identify each row. They are not features.
Feature Group - air_quality
city_name    date          pm2_5
<entity_id>  <event_time>  <numerical feature>
string       datetime      double
berlin       2022-01-01    5.3
dublin       2022-04-01    2.3
seattle      2022-07-01    3.1
tacoma       2022-10-01    4.3
Possible Label: the pm2_5 column could be a target for a prediction problem.
3
Get started
Create an account on Hopsworks
● Use Conda or virtual environments to manage your Python dependencies on your laptop. See more info on how to manage your Python environment here.
● git clone https://github.com/jimdowling/air_quality.git
  cd air_quality
  conda create -n serverlessml python=3.9
  conda activate serverlessml
  pip install -r requirements.txt
// Getting Started…
1. First, create an account on
https://app.hopsworks.ai
2. Click on “User Settings”
3. Create and Save an “API Key”
Register on app.hopsworks.ai
Add HOPSWORKS_API_KEY as an environment variable secret
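A pipeline can then read the key from its environment instead of hard-coding it. The helper below is a minimal sketch; the function name is hypothetical, and only the variable name HOPSWORKS_API_KEY comes from the slides.

```python
import os


def get_hopsworks_api_key(env=os.environ):
    """Return the Hopsworks API key from the environment, or fail with a clear message."""
    key = env.get("HOPSWORKS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("HOPSWORKS_API_KEY is not set; add it as a secret first")
    return key
```

Failing early with an explicit message is easier to debug in a scheduled job than an authentication error raised later inside a client library.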
Create an account on Modal (might need some time to be approved)
// modal.ai
1. Create an account on Hugging Face
2. Create a “Space”
3. Create a Streamlit App
// huggingface.co
1. Add your HOPSWORKS_API_KEY as a Repo Secret
3 Feature Pipelines
WORKSHOP
Streaming or On-Demand Data Pipelines
Backfill data: csv file
Events: msgs in a Kafka Topic, user request input
Incremental Writes: write Pandas DataFrames to the Feature Store
Batch Data Pipelines
Static Data
// Batch and Backfill Feature Pipelines
// What features are computed in a feature pipeline?
● Grouped Aggregations
● Binning
● Filtering
● Rolling/lag features
● Combining data from multiple tables using a common join key.
● Mapping: Transforming values using a user-defined function (UDF).
● Stream mapping: transforming values using a UDF and accumulated state.
● Time series analysis: analyzing or aggregating data over time, such as identifying trends, patterns, or anomalies.
● NLP: classifying text and outputting features (e.g., sentiment)
● Clustering, dimensionality reduction, etc.
Generally, features are not encoded in Feature Pipelines, because encoding prevents reuse of features across models and slows down writes.
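Two of the feature types listed above, rolling/lag features and binning, can be illustrated without any framework. This is a sketch only: the function names and the bin edges are assumptions made for the example, not thresholds from the workshop or from any air quality standard.

```python
def lagged_rolling_mean(values, window):
    """Mean of the previous `window` values at each position; None until enough history."""
    out = []
    for i in range(len(values)):
        history = values[max(0, i - window):i]  # excludes values[i] itself
        out.append(sum(history) / window if len(history) == window else None)
    return out


def bin_pm2_5(value, edges=(12.0, 35.0)):
    """Binning: map a numeric reading to a coarse category (edges are illustrative)."""
    if value <= edges[0]:
        return "low"
    if value <= edges[1]:
        return "medium"
    return "high"


pm2_5 = [5.0, 7.0, 9.0, 11.0, 40.0]
lagged = lagged_rolling_mean(pm2_5, window=3)
bins = [bin_pm2_5(v) for v in pm2_5]
```

Excluding the current value from the window keeps the feature computable at prediction time, when today's label is not yet known.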
[Chart: frameworks by Data Size (10 GBs, 100 GBs, TBs, PBs) and Difficulty]
Batch: Pandas 1.0; Pandas 2.x+, Polars; Spark or SQL (BQ/Snowflake/etc)
Streaming: Spark Streaming; Flink
// Choose the best framework for your feature pipeline
// Write Weather Pandas DataFrame to Hopsworks
weather_df = ...  # 1. read today’s data in as a Pandas DataFrame
# 2. create features in the Pandas DataFrame
weather_fg = fs.get_or_create_feature_group(name="weather",
    version=1,
    description="Weather Daily Updates",
    primary_key=['city'],
    event_time='date'
)
weather_fg.insert(weather_df)  # 3. write Pandas DataFrame to Feature Group
# …
air_quality_df = ...  # 1. read the most recent air quality observations
# 2. create features in the Pandas DataFrame
air_quality_fg = fs.get_or_create_feature_group(name="air_quality",
    version=1,
    description="City Air Quality Data",
    primary_key=['city'],
    expectation_suite=expectation_suite,
    event_time='date'
)
air_quality_fg.insert(air_quality_df)  # 3. write DataFrame to Feature Group
# …
// Write Air Quality Pandas DataFrame to Hopsworks
// Feature Functions in a Python module
def moving_average(df, window=7):
    # Per-city rolling mean of pm2_5; shift(1) inside each group excludes the current day
    df[f'mean_{window}_days'] = df.groupby('city_name')['pm2_5'].transform(
        lambda s: s.rolling(window=window).mean().shift(1))
    return df
features/air_quality.py
One Python module per Feature Group
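import modal

# Modal client API as shown on the slide; newer Modal releases may differ.
stub = modal.Stub("air_quality_daily")
image = modal.Image.debian_slim().pip_install(["hopsworks"])

@stub.function(image=image, schedule=modal.Period(days=1),
               secret=modal.Secret.from_name("jim-hopsworks-ai"))
def g():
    ...  # body elided on the slide

if __name__ == "__main__":
    stub.deploy("air_quality_daily")  # deploy as a scheduled program
    with stub.run():                  # or run it once now
        g()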
// Schedule your Feature Pipeline with Modal
Define program dependencies and program schedule, env variables
Deploy main() function as a scheduled program on Modal
schedule:
  - cron: '0 14 * * *'
..
steps:
  - name: checkout repo content
    uses: actions/checkout@v3
  - name: setup python
    uses: actions/setup-python@v3
    with:
      python-version: '3.9'
      working-directory: .
  - name: install python packages
    run:
      pip install -r requirements.txt
  - name: execute Feature Pipeline
    env:
      HOPSWORKS_API_KEY: ${{ secrets.HOPSWORKS_API_KEY }}
      CONTINENT: "Seattle"
    run: jupyter nbconvert --to notebook --execute 2_feature_pipeline.ipynb
// Schedule Notebooks with Github Actions
// Create Data Validation Rules in Great Expectations
from great_expectations.core import ExpectationSuite, ExpectationConfiguration

expectation_suite = ExpectationSuite(
    expectation_suite_name="transaction_suite")
# pm2_5 readings must lie in [0.0, 1000.0]; the bounds are numbers, not strings
expectation_suite.add_expectation(
    ExpectationConfiguration(
        expectation_type="expect_column_values_to_be_between",
        kwargs={
            "column": "pm2_5",
            "min_value": 0.0,
            "max_value": 1000.0,
        }
    )
)
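To make the rule concrete, the check it describes can be written by hand. This sketch is not the Great Expectations implementation; the function name and the shape of the report are assumptions chosen for illustration.

```python
def validate_between(rows, column, min_value, max_value):
    """Report which values of `column` fall outside [min_value, max_value]."""
    unexpected = [r[column] for r in rows
                  if not (min_value <= r[column] <= max_value)]
    return {
        "success": not unexpected,
        "unexpected_count": len(unexpected),
        "unexpected_values": unexpected,
    }


rows = [{"pm2_5": 5.3}, {"pm2_5": -1.0}, {"pm2_5": 2300.0}, {"pm2_5": 4.3}]
report = validate_between(rows, "pm2_5", 0.0, 1000.0)
```

A feature pipeline could use such a report to decide whether to write the batch, reject it, or raise an alert.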
[Diagram labels: Data Sources (Data warehouse, Applications - Services, Unstructured Data); Feature Pipeline; Validate Data; Great Expectations; Expectation Suite Feature 1; Expectation Suite Feature 2; Validation Reports Feature 1; Validation Reports Feature 2; Feature Group 1 Data; Feature Group 2 Data; Monitor; Search; Alert]
// Data Validation with Great Expectations
https://www.hopsworks.ai/post/data-validation-for-enterprise-ai-using-great-expectations-with-hopsworks
https://docs.hopsworks.ai/3.1/user_guides/fs/feature_group/data_validation/
// Great Expectations in Hopsworks
3 Training Pipeline
WORKSHOP
Redundant Feature: a similar feature is already selected.
Irrelevant Feature: the feature has no predictive power.
Prohibited Feature: the feature cannot be used. May be context dependent.
Useful Feature: it has predictive power for my prediction problem.
Infeasible Feature: a useful feature that can’t be computed for some reason.
// Feature Selection
FEATURE VIEW
Select and Join Features

city_name    date          wind_speed_max       wind_direction_dominant  wind_gusts_max       temp_max
<entity_id>  <event_time>  <numerical feature>  <categorical feature>    <numerical feature>  <numerical feature>
string       datetime      double               string                   double               double
berlin       2022-01-01    14.3                 ne                       22.4                 22.7
dublin       2022-04-01    9.3                  n                        18.2                 25.4
seattle      2022-07-01    11.1                 nw                       15.2                 20.8
tacoma       2022-10-01    1.3                  w                        2.5                  28.4

city_name    date          pm2_5
<entity_id>  <event_time>  <numerical feature>
string       datetime      double
berlin       2022-01-01    5.3
dublin       2022-04-01    2.3
seattle      2022-07-01    3.1
tacoma       2022-10-01    4.3
// Select Features and Create a Feature View
// Data Modelling: Training Data is a Fact/Dimension Model
[Diagram: one Label table and several Feature tables, each with a ts column; ts=TimeStamp]
Join Features to Labels with Entity IDs
select features (wind_speed_max,wind_speed_dir, …)
air_quality_fv FeatureView
filter (city_name == “Seattle”)
transform = { “wind_speed_max” : min_max_scaler }
label = [“pm2_5”]
weather Feature Group
air_quality Feature Group
Training Data, Inference Data
Optional Steps
join on city, date
// Building a Feature View from existing Features
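The select, join, filter, and label steps can be sketched on plain dictionaries. This is an exact-match join written for illustration, not how Hopsworks builds a Feature View; the function name is hypothetical and the rows reuse values from the weather and air_quality tables.

```python
def join_features(air_quality_rows, weather_rows, keys=("city_name", "date")):
    """Inner-join label rows to feature rows on the entity id and event time."""
    weather_by_key = {tuple(w[k] for k in keys): w for w in weather_rows}
    joined = []
    for aq in air_quality_rows:
        match = weather_by_key.get(tuple(aq[k] for k in keys))
        if match is not None:
            joined.append({**match, **aq})
    return joined


weather = [
    {"city_name": "seattle", "date": "2022-07-01", "wind_speed_max": 11.1, "temp_max": 20.8},
    {"city_name": "berlin", "date": "2022-01-01", "wind_speed_max": 14.3, "temp_max": 22.7},
]
air_quality = [
    {"city_name": "seattle", "date": "2022-07-01", "pm2_5": 3.1},
    {"city_name": "dublin", "date": "2022-04-01", "pm2_5": 2.3},
]

# filter (city_name == "seattle"), then split selected features from the label
training_rows = [r for r in join_features(air_quality, weather) if r["city_name"] == "seattle"]
features = [[r["wind_speed_max"], r["temp_max"]] for r in training_rows]
labels = [r["pm2_5"] for r in training_rows]
```

Rows without a match on both keys are dropped, which is why the dublin and berlin rows do not appear in the result.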
// Select Features and create a Feature View
select features from feature groups
create feature view from selected features
fg_air_quality = fs.get_feature_group(name="air_quality", version=1)
fg_weather = fs.get_feature_group(name="weather", version=1)
selected = fg_air_quality.select(['pm2_5']).join(fg_weather.select_all())
fv = fs.create_feature_view(name="air_quality_fv",
    version=1,
    description="Weather and Air Quality",
    labels=['pm2_5'],
    query=selected
)
Feature View - air_quality_fv (rows ordered by Time)
city         date          max_temp        wind_direction_dominant  …  pm2_5
<entity_id>  <event_time>  min_max_scaler  ordinal_encoder          …  min_max_scaler
string       datetime      double          string                   …  double
Amsterdam    2022-01-01    21.4            w                        …  11.0
Berlin       2022-04-01    25.3            nw                       …  10.2
Seattle      2022-07-01    12.1            n                        …  1.4
Dublin       2022-10-01    15.3            ne                       …  2.2
Delhi        2022-11-01    33.9            n                        …  38.2
New York     2022-12-01    5.1             e                        …  19.2
Portland     2023-01-01    -2.4            ssw                      …  3.2
Snapshots: Training Data-v1, Training Data-v2, Batch Inference Data
Row: Feature vector
// Feature Views - Create Data Snapshots for Training or Inference
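One way to picture these snapshots is as time-range filters over the same rows. The sketch below assumes hypothetical cut-off dates and a hypothetical `snapshot` helper; the rows are a subset of the air_quality_fv table.

```python
from datetime import date

rows = [
    {"city": "Amsterdam", "date": date(2022, 1, 1), "pm2_5": 11.0},
    {"city": "Berlin", "date": date(2022, 4, 1), "pm2_5": 10.2},
    {"city": "Seattle", "date": date(2022, 7, 1), "pm2_5": 1.4},
    {"city": "Dublin", "date": date(2022, 10, 1), "pm2_5": 2.2},
    {"city": "Portland", "date": date(2023, 1, 1), "pm2_5": 3.2},
]


def snapshot(rows, start=None, end=None):
    """Rows whose event time falls in [start, end); None means unbounded."""
    return [r for r in rows
            if (start is None or r["date"] >= start)
            and (end is None or r["date"] < end)]


training_v1 = snapshot(rows, end=date(2022, 7, 1))       # earlier training set
training_v2 = snapshot(rows, end=date(2023, 1, 1))       # later, larger training set
batch_inference = snapshot(rows, start=date(2023, 1, 1))  # newest rows only
```

Because every snapshot is defined by event-time bounds, a training set can be recreated later from the same feature data.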
FeatureView for Model-FeedForwardNet
city date max_temp wind_direction_dominant … pm2_5
<entity_id> <event_time> min_max_scaler ordinal_encoder … min_max_scaler
string datetime double string … double
FeatureView for Model-XGBoost
city date max_temp wind_direction_dominant … pm2_5
<entity_id> <event_time> label_encoder …
string datetime double string … double
FeatureView for Model-LogisticRegression
city date max_temp wind_direction_dominant … pm2_5
<entity_id> <event_time> min_max_scaler ordinal_encoder … min_max_scaler
string datetime double string … double
Model-Dependent
Transformations
// Feature Views - Model-Dependent Transformations
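A model-dependent transformation such as min-max scaling is fitted on training data and then reused unchanged at inference. The class below is a tiny stand-in written for illustration (its name is hypothetical, and it is neither the Hopsworks transformation function nor scikit-learn's scaler); the training values are max_temp readings from the table above.

```python
class SimpleMinMaxScaler:
    """Fit min/max on training data, then apply the same parameters everywhere."""

    def fit(self, values):
        self.lo, self.hi = min(values), max(values)
        return self

    def transform(self, values):
        span = self.hi - self.lo
        return [(v - self.lo) / span if span else 0.0 for v in values]


train_max_temp = [21.4, 25.3, 12.1, 15.3]
scaler = SimpleMinMaxScaler().fit(train_max_temp)
scaled_train = scaler.transform(train_max_temp)  # values in [0, 1]
scaled_new = scaler.transform([33.9])            # unseen value, same parameters
```

A value outside the training range scales outside [0, 1]; refitting the scaler at inference time would hide that and make training and serving inconsistent.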
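from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from xgboost import XGBRegressor

X_train, X_test, y_train, y_test = fv.train_test_split(test_size=0.2)
# categorical_feature_ids: names of the categorical columns (defined elsewhere)
categorical_transformer = Pipeline(steps=[("encoder",
    OneHotEncoder(handle_unknown="ignore"))])
preprocessor = ColumnTransformer(transformers=[
    ("cat", categorical_transformer, categorical_feature_ids)])
clf = Pipeline(steps=[("preprocessor", preprocessor), ("regressor", XGBRegressor())])
clf.fit(X_train, y_train)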
// Model Training with Feature Views
get train & test set
model-dependent feature encoding
The pipeline encodes features, then trains the model
import joblib
from hsml.schema import Schema
from hsml.model_schema import ModelSchema

# file name matches the one loaded by the inference pipelines
joblib.dump(clf, 'air_quality_model/air_quality_model.pkl')
input_schema = Schema(X_test)
output_schema = Schema(y_test)
aq_model = mr.sklearn.create_model("air_quality_model",
    metrics={'accuracy': accuracy},
    input_example=X_test.sample().to_numpy(),
    model_schema=ModelSchema(input_schema=input_schema, output_schema=output_schema))
aq_model.save('air_quality_model')
// Store the trained Model in a Model Registry
serialize model
package up model and upload to model registry
get model schema from DFs
4 Inference Pipelines
WORKSHOP
import joblib

fv = fs.get_feature_view(name="air_quality_fv", version=1)
df = fv.get_batch_data(start_time=today)  # today: start of the inference window
mr = project.get_model_registry()
model = mr.get_model("air_quality_model", version=1)
model_dir = model.download()
model = joblib.load(model_dir + "/air_quality_model.pkl")
predictions_df = model.predict(df)
// Batch Inference Pipeline - predict air quality on latest features
download inference data
download model
make predictions
import joblib
import numpy as np

fv = fs.get_feature_view(name="air_quality_fv", version=1)
mr = project.get_model_registry()
model = mr.get_model("air_quality_model", version=1)
model_dir = model.download()
model = joblib.load(model_dir + "/air_quality_model.pkl")

def air_quality(city_name):
    arr = fv.get_feature_vector({"city_name": city_name}, passed_features={})
    y_pred = model.predict(np.asarray(arr).reshape(1, -1))
    return y_pred
get feature view & download model
make a prediction with precomputed and user-supplied features
// Online Inference Pipeline - interactive prediction service
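How precomputed and user-supplied features might be combined into one feature vector can be sketched with a dictionary standing in for the online feature store. Everything here is hypothetical (the names, the feature order, and the rule that request-time values override stored ones); it is not the Hopsworks implementation of `passed_features`.

```python
# Stand-in for an online feature store lookup keyed by entity id
PRECOMPUTED = {
    "seattle": {"wind_speed_max": 11.1, "wind_gusts_max": 15.2, "temp_max": 20.8},
}
FEATURE_ORDER = ["wind_speed_max", "wind_gusts_max", "temp_max"]


def build_feature_vector(city_name, passed_features=None):
    """Merge stored features with request-time values; passed values take precedence."""
    stored = PRECOMPUTED.get(city_name)
    if stored is None:
        raise KeyError(f"no precomputed features for {city_name!r}")
    merged = {**stored, **(passed_features or {})}
    return [merged[name] for name in FEATURE_ORDER]


vector = build_feature_vector("seattle", passed_features={"temp_max": 25.0})
```

Keeping a fixed feature order matters because the model expects columns in the same positions it saw during training.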
import joblib
import numpy as np

fv = fs.get_feature_view(name="air_quality_fv", version=1)
mr = project.get_model_registry()
model = mr.get_model("air_quality_model", version=1)
model_dir = model.download()
model = joblib.load(model_dir + "/air_quality_model.pkl")

# FLINK CALLS THE METHOD BELOW FOR EVERY EVENT
def air_quality(city_name):
    arr = fv.get_feature_vector({"city_name": city_name}, passed_features={})
    y_pred = model.predict(np.asarray(arr).reshape(1, -1))
    return y_pred
get feature view & download model
make a prediction with precomputed and user-supplied features
// Streaming Inference Pipeline
5 Principles of MLOps
WORKSHOP
● ML-enabled products evolve over time
○ The available input data (features) change over time
○ The target you are trying to predict changes over time
● Automate the testing and deployment of ML-enabled Products
○ Safe incremental updates and a tighter iteration loop
● To this end, features and models must be tested
○ Tests should run automatically as part of a CI/CD workflow
// MLOps Principles
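As an illustration of testing features automatically, a feature function can ship with pytest-style unit tests that a CI/CD workflow runs on every change. The feature function below is hypothetical and only serves to give the tests something to check.

```python
def clip_pm2_5(value, lower=0.0, upper=1000.0):
    """Hypothetical feature function: clamp a reading into a plausible range."""
    return max(lower, min(upper, value))


def test_clip_pm2_5_keeps_valid_values():
    assert clip_pm2_5(5.3) == 5.3


def test_clip_pm2_5_clamps_outliers():
    assert clip_pm2_5(-2.0) == 0.0
    assert clip_pm2_5(5000.0) == 1000.0
```

Saved in a test module, functions named `test_*` are collected and run by `pytest` without further configuration.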
// MLOps according to Hopsworks (1) Automated Testing
● ML-Apps: built on models, tested with A/B tests
● Models: tested with model validation tests
● Features: tested with data validation and unit tests, and corrected with imputation/encoding
Stages: Untrusted Raw Data, Tested Features, Tested Models, Tested ML-Apps
// MLOps according to Hopsworks (2) Versioning of Features, Models
[Diagram labels: Models (air_quality_v1, air_quality_v2); Features (weather_v1, air_quality_v1, air_quality_v2); Upgrade; Rollback]
[Diagram labels: DEVELOPMENT Feature Store; STAGING Feature Store; PRODUCTION Feature Store; Feature development; Entity Creation; Unit testing; Test Run; Pull Request; Merge; Manual Execution; CI/CD Execution; On-demand; Orchestrated & Scheduled; Scheduled; backfilled]
// CI/CD for Feature Pipelines
https://www.hopsworks.ai/post/optimize-your-mlops-workflow-with-a-feature-store-ci-cd-and-github-actions
// CI/CD for Model Training
[Diagram labels: Development Branch; Manual Trigger; Main Branch; Pull Request Trigger; Jenkins; PyTest; Model Training; Training Data; Evaluation Sets (for Bias); Evaluate and Validate Model; bias, behaviour, performance tests; deployment-test; DEVELOPMENT Model Registry; STAGING Model Registry]
[Diagram labels: HTTPS client; Network Endpoint (Istio); KServe; Blue: model-A1 (100% traffic); Green: model-A2 (10% traffic, mirrored); Hopsworks; read pre-computed features]
deploy model-A2 with 10% of traffic
// CI/CD for Deployed Models
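Mirroring a share of traffic to a new model version can be sketched as follows. This is an assumption-laden illustration, not how KServe or Istio implement it: the function names, the hash-based sampling, and the two placeholder models are all hypothetical.

```python
import hashlib


def mirrored_to_green(request_id, percent=10):
    """Deterministically pick roughly `percent`% of requests to copy to the green model."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100 < percent


def handle(request_id, features, blue_model, green_model, shadow_log):
    """Blue answers every request; green only sees mirrored copies and never replies."""
    response = blue_model(features)
    if mirrored_to_green(request_id):
        shadow_log.append((request_id, response, green_model(features)))
    return response


def blue(features):   # placeholder for model-A1
    return sum(features)


def green(features):  # placeholder for model-A2
    return sum(features) + 0.5


shadow_log = []
responses = [handle(f"req-{i}", [1.0, 2.0], blue, green, shadow_log) for i in range(1000)]
```

Comparing the logged blue and green outputs offline lets the new version be evaluated on live inputs before it serves any client.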
[Diagram labels: PRODUCTION Feature Store; Source Data (Connector BQ, Connector Snowflake, Connector S3, Connector Redshift); Feature Groups; Feature View; Training Data; Models; Model Deployments; weather_v1; air_quality_v1; training_data_v1; air_quality]
// Lineage for Features and Models
● In Hopsworks, you can make non-breaking schema changes that do not require updating the schema version.
● Appending features with a default value is a non-breaking schema change.
● Breaking schema changes require updating the schema version for a Feature Group.
# Breaking change: create version 2 and backfill it from version 1
fg1 = fs.get_feature_group(name="example", version=1)
df = fg1.read()
fg2 = fs.create_feature_group(name="example", version=2, features=new_features, …)
fg2.insert(df)  # backfill the new feature group with data from the prev version
# Non-breaking change: append features to the existing version
from hsfs.feature import Feature
features = [
    Feature(name="id", type="int", online_type="int"),
    Feature(name="name", type="string", online_type="varchar(20)")]
fg = fs.get_feature_group(name="example", version=1)
fg.append_features(features)
// Schema Versioning
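The difference between the two kinds of change can be shown on plain rows. This is a sketch with hypothetical helper names; treating an `id` type change from int to string as the breaking change is an assumption made for the example.

```python
def append_feature(rows, name, default):
    """Non-breaking change: existing rows get the default, so old readers still work."""
    return [{**r, name: r.get(name, default)} for r in rows]


def backfill_new_version(rows_v1, transform):
    """Breaking change: build version 2 by transforming and copying version 1 rows."""
    return [transform(r) for r in rows_v1]


example_v1 = [{"id": 1, "pm2_5": 5.3}, {"id": 2, "pm2_5": 2.3}]
example_v1 = append_feature(example_v1, "name", default="unknown")

# Changing the type of "id" would break existing readers, so it goes into version 2
example_v2 = backfill_new_version(
    example_v1,
    lambda r: {"id": str(r["id"]), "pm2_5": r["pm2_5"], "name": r["name"]},
)
```

Version 1 stays readable for existing models while new models move to version 2.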
6 Ideas for Serverless ML Systems?
WORKSHOP
[Diagram labels: ChatGPT / GPT-4 (model of world); Feature Store; Vector DB (model of user); embedding; 1. History, Context; 2. Hand-crafted Prompts; Prompt; Personalized Prompt]
User: Help me design a churn model for my customers!
ChatGPT: A churn model helps predict the probability of a customer leaving you within a given time period. Here is a code snippet in Python for estimating customer churn: …..
Personalized ChatGPT: You have 10,000 customers and 8.7% churn annually. This is a relatively high churn rate, so you should estimate churn at a fine granularity using this specific churn model: ….
// IDEA FOR SERVERLESS ML SYSTEM - Personalized GPT-4
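Building such a personalized prompt could look roughly like the sketch below. The function name, the feature names, and the history entry are hypothetical; only the question and the customer and churn figures come from the slide. No LLM is called here.

```python
def personalized_prompt(question, user_features, history):
    """Prefix the user's question with context looked up for that user."""
    context = (
        f"The user has {user_features['customers']:,} customers "
        f"and {user_features['annual_churn_pct']}% annual churn."
    )
    recent = " ".join(history[-2:])  # keep only the most recent context entries
    return f"{context}\nRecent context: {recent}\nQuestion: {question}"


prompt = personalized_prompt(
    "Help me design a churn model for my customers!",
    {"customers": 10000, "annual_churn_pct": 8.7},   # e.g. read from a feature store
    ["Asked about retention campaigns."],            # e.g. retrieved from a vector DB
)
```

The resulting string would be sent to the model in place of the bare question.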
Show love with a star!
SERVERLESS MACHINE LEARNING
www.serverless-ml.org
https://github.com/featurestoreorg/serverless-ml-course ⭐

Serverless ML Workshop with Hopsworks at PyData Seattle

MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...
MetaConfig driven FeatureStore : MakeMyTrip | Presented at Data Con LA 2019 b...Piyush Kumar
 
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv...Data Con LA
 
Hamburg Data Science Meetup - MLOps with a Feature Store
Hamburg Data Science Meetup - MLOps with a Feature StoreHamburg Data Science Meetup - MLOps with a Feature Store
Hamburg Data Science Meetup - MLOps with a Feature StoreMoritz Meister
 
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptxDowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptxLex Avstreikh
 
Building a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache SparkBuilding a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache SparkDatabricks
 
Spark and machine learning in microservices architecture
Spark and machine learning in microservices architectureSpark and machine learning in microservices architecture
Spark and machine learning in microservices architectureStepan Pushkarev
 
PyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdfPyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdfJim Dowling
 
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaDeep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaGoDataDriven
 
Apache Hadoop India Summit 2011 talk "Making Hadoop Enterprise Ready with Am...
Apache Hadoop India Summit 2011 talk  "Making Hadoop Enterprise Ready with Am...Apache Hadoop India Summit 2011 talk  "Making Hadoop Enterprise Ready with Am...
Apache Hadoop India Summit 2011 talk "Making Hadoop Enterprise Ready with Am...Yahoo Developer Network
 
Data Science in the Elastic Stack
Data Science in the Elastic StackData Science in the Elastic Stack
Data Science in the Elastic StackRochelle Sonnenberg
 
Create cross-platform apps that interact with Microsoft Graph and Office 365 ...
Create cross-platform apps that interact with Microsoft Graph and Office 365 ...Create cross-platform apps that interact with Microsoft Graph and Office 365 ...
Create cross-platform apps that interact with Microsoft Graph and Office 365 ...Codemotion
 
Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...
Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...
Analytics Metrics delivery and ML Feature visualization: Evolution of Data Pl...Chester Chen
 
Ml ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupMl ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupJim Dowling
 
Hopsworks Feature Store 2.0 a new paradigm
Hopsworks Feature Store  2.0   a new paradigmHopsworks Feature Store  2.0   a new paradigm
Hopsworks Feature Store 2.0 a new paradigmJim Dowling
 
Apache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseApache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseHao Chen
 
Zotonic tutorial EUC 2013
Zotonic tutorial EUC 2013Zotonic tutorial EUC 2013
Zotonic tutorial EUC 2013Arjan
 
An Architecture for Agile Machine Learning in Real-Time Applications
An Architecture for Agile Machine Learning in Real-Time ApplicationsAn Architecture for Agile Machine Learning in Real-Time Applications
An Architecture for Agile Machine Learning in Real-Time ApplicationsJohann Schleier-Smith
 

Serverless ML Workshop with Hopsworks at PyData Seattle

  • 1. Create a MVPS (minimal viable prediction service) in 90 mins Jim Dowling @jim_dowling CEO, Hopsworks WORKSHOP Build a production ML system with only Python on free serverless services
  • 2. Presenter: Jim Dowling, CEO & Co-Founder of Hopsworks and an Associate Professor at KTH Royal Institute of Technology. Co-inventor of the open-source Hopsworks platform. https://www.serverless-ml.org - Free Online Course. Serverless ML Community Discord
  • 3. Register on app.hopsworks.ai Register on modal.ai Register on huggingface.co - create a new “space” SLIDES:
  • 4. 1 The scope of Data Scientist’s work has expanded to build Prediction Services It’s not enough to just train models anymore
  • 5. Data Engineer Data Scientist ML Engineer Data Dump Model Enterprise Data Model Serving models need operational data If you work in this kind of Enterprise, Conway’s Law implies you will only train models // What is the scope of a Data Scientist’s work?
  • 6. Business Value Real-Time Data Real time prediction services Personalized search / recommendations High Business Value Batch Prediction Services Demand forecasting, risk estimation, financial planning Good Business Value Batch Data Static Data One-off Reports Medium Business Value // Increase Business Value with more up-to-date Data
  • 7. 1. Train a Model on a static dataset and produce an Evaluation Report OR 2. Build a Minimal Viable Prediction Service to show your stakeholders how/where your model will be used // What is the scope of a Data Scientist’s work?
  • 9. AI Powered Product & Services Monitoring & Compliance New Data Historical Data Output Source data data Models 1. Feature engineering 3. Inference 2. Model Training data data data logs models data models Model Registry ML Pipelines & MLOps data // ML System = Feature + Training + Inference Pipelines Feature Store
  • 10. 2 Write Feature, Training, Inference pipelines to build an Air Quality Prediction Service Hopsworks, Modal, Streamlit/Gradio
  • 11. 1. Feature Pipeline => Features/Labels 2. Training Pipeline => Model 3. Inference Pipeline => Predictions Data => Features/Labels => Model/Features => // ML Systems should have 3 independent Machine Learning Pipelines
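The three-pipeline split on this slide can be sketched as three independent functions that only share data through their inputs and outputs. This is a minimal illustration, not the workshop's code: the function names, the toy numbers, and the choice of a straight-line fit with NumPy as the "model" are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def feature_pipeline(raw: pd.DataFrame) -> pd.DataFrame:
    """Data => Features/Labels: clean raw rows and keep the columns we need."""
    features = raw.dropna(subset=["wind_speed_max", "pm2_5"]).copy()
    features["date"] = pd.to_datetime(features["date"])
    return features[["city_name", "date", "wind_speed_max", "pm2_5"]]


def training_pipeline(features: pd.DataFrame) -> np.ndarray:
    """Features/Labels => Model: here the model is just a fitted line."""
    x = features["wind_speed_max"].to_numpy()
    y = features["pm2_5"].to_numpy()
    return np.polyfit(x, y, deg=1)  # returns [slope, intercept]


def inference_pipeline(model: np.ndarray, new_features: pd.DataFrame) -> pd.Series:
    """Model/Features => Predictions."""
    preds = np.polyval(model, new_features["wind_speed_max"].to_numpy())
    return pd.Series(preds, index=new_features.index, name="predicted_pm2_5")


# Toy data (hypothetical values); the last row has no label yet.
raw = pd.DataFrame({
    "city_name": ["seattle"] * 5,
    "date": ["2022-07-01", "2022-07-02", "2022-07-03", "2022-07-04", "2022-07-05"],
    "wind_speed_max": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pm2_5": [3.0, 5.0, 7.0, 9.0, None],
})

features = feature_pipeline(raw)
model = training_pipeline(features)
new_features = pd.DataFrame({"wind_speed_max": [5.0]})
predictions = inference_pipeline(model, new_features)
print(predictions)
```

Because each function depends only on its arguments, each one could be scheduled and deployed separately, which is the property the slide asks for.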
  • 12. Prediction Problem with Business KPIs & Data Sources Prediction Consumer // Minimal Viable Prediction service - Iteratively Develop it MVPs (Minimal Viable Prediction service) Feature Pipeline Inference Pipeline Training Pipeline
  • 13. feature-pipeline.py batch-inference- pipeline.py training-pipeline.ipynb Interactive UI (app.py) Weather Data Air Quality Data // Today’s Serverless ML Air Quality Prediction Service Dashboard (Github Pages)
  • 14. User Interface with Maps Predictions Data features model Model Registry features,labels models data predict data Feature View Feature Group Feature Group Feature Group Hopsworks Feature View Streaming Data event bus Batch Data databases & warehouse Feature Pipeline Online Inference Pipeline Training Pipeline ML Pipelines & MLOps Hopsworks: write to Feature Groups, read from Feature Views features
  • 15. Data Sources: Open-Meteo and AQPA https://open-meteo.com/en/docs/air-quality-api https://aqicn.org/api/
  • 16. Feature Group - weather
      city_name    date          wind_speed_max       wind_direction_dominant  wind_gusts_max       temp_max
      <entity_id>  <event_time>  <numerical feature>  <categorical feature>    <numerical feature>  <numerical feature>
      string       datetime      double               string                   double               double
      berlin       2022-01-01    14.3                 ne                       22.4                 22.7
      dublin       2022-04-01    9.3                  n                        18.2                 25.4
      seattle      2022-07-01    11.1                 nw                       15.2                 20.8
      tacoma       2022-10-01    1.3                  w                        2.5                  28.4
    Feature value: store unencoded to maximize reuse over many models.
    Feature vector (row): set of feature values with the same primary key.
    Feature types: string, datetime, double.
    entity_id and event_time uniquely identify each row. They are not features.
  • 17. Feature Group - air_quality
      city_name    date          pm2_5
      <entity_id>  <event_time>  <numerical feature>
      string       datetime      double
      berlin       2022-01-01    5.3
      dublin       2022-04-01    2.3
      seattle      2022-07-01    3.1
      tacoma       2022-10-01    4.3
    Possible label: the column could be a target for a prediction problem.
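The two feature groups above map directly onto Pandas DataFrames. The sketch below rebuilds them from the values on the slides and checks the stated rule that entity_id and event_time together identify each row. The helper name `is_unique_key` is hypothetical, introduced only for this example.

```python
import pandas as pd

# Values copied from the two slides; dates parsed as datetimes.
dates = pd.to_datetime(["2022-01-01", "2022-04-01", "2022-07-01", "2022-10-01"])
cities = ["berlin", "dublin", "seattle", "tacoma"]

weather_df = pd.DataFrame({
    "city_name": cities,
    "date": dates,
    "wind_speed_max": [14.3, 9.3, 11.1, 1.3],
    "wind_direction_dominant": ["ne", "n", "nw", "w"],  # stored unencoded
    "wind_gusts_max": [22.4, 18.2, 15.2, 2.5],
    "temp_max": [22.7, 25.4, 20.8, 28.4],
})

air_quality_df = pd.DataFrame({
    "city_name": cities,
    "date": dates,
    "pm2_5": [5.3, 2.3, 3.1, 4.3],
})

KEY = ["city_name", "date"]  # entity_id + event_time


def is_unique_key(df: pd.DataFrame, key=KEY) -> bool:
    """True if no two rows share the same (entity_id, event_time) pair."""
    return not df.duplicated(subset=key).any()


# Everything that is not part of the key is a feature (or a possible label).
weather_features = [c for c in weather_df.columns if c not in KEY]

print(is_unique_key(weather_df), is_unique_key(air_quality_df))
print(weather_features)
```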
  • 18. 3 Get started Create an account on Hopsworks
  • 19. // Getting Started…
    ● Use Conda or virtual environments to manage your Python dependencies on your laptop. See more info on how to manage your Python environment here.
    ● git clone https://github.com/jimdowling/air_quality.git
      cd air_quality
      conda create -n serverlessml python=3.9
      conda activate serverlessml
      pip install -r requirements.txt
  • 20. 1. First, create an account on https://app.hopsworks.ai 2. Click on “User Settings” 3. Create and Save an “API Key” Register on app.hopsworks.ai
  • 21. Add HOPSWORKS_API_KEY as an environment variable secret. Create an account on Modal (might need some time to be approved) // modal.ai
  • 22. 1. Create an account on Hugging Face 2. Create a “Space” 3. Create a Streamlit App // huggingface.co
  • 23. 1. Add your HOPSWORKS_API_KEY as a Repo Secret
  • 25. Streaming or On-Demand Data Pipelines Backfill data csv file Events msgs in a Kafka Topic user request input Incremental Writes write Pandas DataFrames to the Feature Store Batch Data Pipelines Static Data // Batch and Backfill Feature Pipelines
  • 26. // What features are computed in a feature pipeline?
    ● Grouped Aggregations
    ● Binning
    ● Filtering
    ● Rolling/lag features
    ● Combining data from multiple tables using a common join key.
    ● Mapping: Transforming values using a user-defined function (UDF).
    ● Stream mapping: Transforming values using a UDF and accumulated state.
    ● Time series analysis: Analyzing or aggregating data over time, such as identifying trends, patterns, or anomalies.
    ● NLP: Classifying text and outputting features (e.g., sentiment).
    ● Clustering, dimensionality reduction, etc.
    Generally, features are not encoded in Feature Pipelines, as encoding prevents reuse of features across models and slows down writes.
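A few of the feature types listed above (grouped aggregation, binning, filtering, lag features) can be illustrated with plain Pandas. This is a sketch on made-up readings; the bin edges and the labels "low", "medium", "high" are assumptions chosen for the example, not air quality standards.

```python
import pandas as pd

# Made-up daily pm2_5 readings for two cities.
df = pd.DataFrame({
    "city_name": ["seattle"] * 4 + ["dublin"] * 4,
    "date": pd.to_datetime(
        ["2022-07-01", "2022-07-02", "2022-07-03", "2022-07-04"] * 2
    ),
    "pm2_5": [3.0, 5.0, 4.0, 8.0, 2.0, 2.5, 3.5, 3.0],
}).sort_values(["city_name", "date"]).reset_index(drop=True)

# Filtering: drop physically impossible readings.
df = df[df["pm2_5"] >= 0].copy()

# Grouped aggregation: per-city mean, broadcast back to every row.
df["pm2_5_city_mean"] = df.groupby("city_name")["pm2_5"].transform("mean")

# Binning: bucket the numeric value into coarse levels (hypothetical edges).
df["pm2_5_level"] = pd.cut(
    df["pm2_5"], bins=[0, 3, 6, 1000], labels=["low", "medium", "high"]
)

# Lag feature: yesterday's reading for the same city.
df["pm2_5_lag_1"] = df.groupby("city_name")["pm2_5"].shift(1)

print(df)
```

The binned column is kept as readable labels rather than integer codes, in line with the slide's advice to leave encoding out of the feature pipeline.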
  • 27. Pandas 1.0 10 GBs 100 GBs TBs PBs Data Size Pandas 2.x+, Polars Spark or SQL (BQ/Snowflake/etc) Spark Streaming Batch Streaming Flink // Choose the best framework for your feature pipeline Difficulty
  • 28. // Write Weather Pandas DataFrame to Hopsworks
      weather_df = ...  # 1. read today's data in as a Pandas DataFrame
      # 2. create features in the Pandas DataFrame
      weather_fg = fs.get_or_create_feature_group(name="weather",
          version=1,
          description="Weather Daily Updates",
          primary_key=['city'],
          event_time='date'
      )
      weather_fg.insert(weather_df)  # 3. write Pandas DataFrame to Feature Group
      # …
  • 29. // Write Air Quality Pandas DataFrame to Hopsworks
      air_quality_df = ...  # 1. read the most recent air quality observations
      # 2. create features in the Pandas DataFrame
      air_quality_fg = fs.get_or_create_feature_group(name="air_quality",
          version=1,
          description="City Air Quality Data",
          primary_key=['city'],
          expectation_suite=expectation_suite,
          event_time='date'
      )
      air_quality_fg.insert(air_quality_df)  # 3. write DataFrame to Feature Group
      # …
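The two slides above leave out how `fs` is obtained. A minimal sketch of the full write path follows, assuming the `hopsworks` Python client is installed and that HOPSWORKS_API_KEY is set in the environment. The helper names `build_weather_df` and `write_weather_features` are hypothetical, and the primary key is set to `city_name` to match the column in the feature group tables (the slides use `'city'`). The Hopsworks import sits inside the function so the DataFrame part can be run without an account.

```python
import os

import pandas as pd


def build_weather_df() -> pd.DataFrame:
    """Stand-in for steps 1 and 2: read today's data and create features."""
    return pd.DataFrame({
        "city_name": ["seattle"],
        "date": pd.to_datetime(["2022-07-01"]),
        "wind_speed_max": [11.1],
        "wind_direction_dominant": ["nw"],
        "wind_gusts_max": [15.2],
        "temp_max": [20.8],
    })


def write_weather_features(weather_df: pd.DataFrame) -> None:
    """Step 3: write the DataFrame to a Hopsworks feature group."""
    import hopsworks  # imported lazily; requires `pip install hopsworks`

    project = hopsworks.login(api_key_value=os.environ["HOPSWORKS_API_KEY"])
    fs = project.get_feature_store()
    weather_fg = fs.get_or_create_feature_group(
        name="weather",
        version=1,
        description="Weather Daily Updates",
        primary_key=["city_name"],  # assumption: matches the DataFrame column
        event_time="date",
    )
    weather_fg.insert(weather_df)


weather_df = build_weather_df()
# write_weather_features(weather_df)  # uncomment once the API key is configured
```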
  • 30. // Feature Functions in a Python module
    features/air_quality.py (one Python module per Feature Group)
      def moving_average(df, window=7):
          df[f'mean_{window}_days'] = (
              df.groupby('city_name')['pm2_5']
                .rolling(window=window).mean()
                .reset_index(0, drop=True).shift(1)
          )
          return df
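A runnable variant of the moving-average feature function is sketched below. It differs from the slide in one respect, which is a choice made for this example rather than the author's code: the one-day shift is applied inside each city group, so the last value of one city cannot spill into the first row of the next. The sample readings are made up.

```python
import pandas as pd


def moving_average(df: pd.DataFrame, window: int = 7) -> pd.DataFrame:
    """Add the mean pm2_5 of the previous `window` days, computed per city."""
    df = df.sort_values(["city_name", "date"]).copy()
    df[f"mean_{window}_days"] = df.groupby("city_name")["pm2_5"].transform(
        # shift(1) excludes today's value, so the feature only uses past data
        lambda s: s.shift(1).rolling(window=window).mean()
    )
    return df


readings = pd.DataFrame({
    "city_name": ["seattle"] * 4 + ["dublin"] * 4,
    "date": pd.to_datetime(
        ["2022-07-01", "2022-07-02", "2022-07-03", "2022-07-04"] * 2
    ),
    "pm2_5": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
})

result = moving_average(readings, window=2)
print(result)
```

With a window of 2, the first two rows of each city have no value, because two earlier days are needed before the average is defined.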
  • 31. // Schedule your Feature Pipeline with Modal
    Define program dependencies and program schedule, env variables:
      stub = modal.Stub("air_quality_daily")
      image = modal.Image.debian_slim().pip_install(["hopsworks"])
      @stub.function(image=image, schedule=modal.Period(days=1),
                     secret=modal.Secret.from_name("jim-hopsworks-ai"))
      def g():
          …
    Deploy main() function as a scheduled program on Modal:
      if __name__ == "__main__":
          stub.deploy("air_quality_daily")
          with stub.run():
              g()
  • 32. // Schedule Notebooks with Github Actions
      schedule:
        - cron: '0 14 * * *'
      ..
      steps:
        - name: checkout repo content
          uses: actions/checkout@v3
        - name: setup python
          uses: actions/setup-python@v3
          with:
            python-version: '3.9'
            working-directory: .
        - name: install python packages
          run: pip install -r requirements.txt
        - name: execute Feature Pipeline
          env:
            HOPSWORKS_API_KEY: ${{ secrets.HOPSWORKS_API_KEY }}
            CONTINENT: "Seattle"
          run: jupyter nbconvert --to notebook --execute 2_feature_pipeline.ipynb
  • 33. // Create Data Validation Rules in Great Expectations
      from great_expectations.core import ExpectationSuite, ExpectationConfiguration
      expectation_suite = ExpectationSuite(expectation_suite_name="transaction_suite")
      expectation_suite.add_expectation(
          ExpectationConfiguration(
              expectation_type="expect_column_values_to_be_between",
              kwargs={
                  "column": "pm2_5",
                  "min_value": 0.0,
                  "max_value": 1000.0,
              }
          )
      )
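To see what a between-values rule checks before wiring it into a feature group, the same test can be written in a few lines of plain Pandas. This is a stand-in for illustration, not the Great Expectations API: the function `validate_between` and the shape of its result dictionary are hypothetical, and missing values are skipped here by choice.

```python
import pandas as pd


def validate_between(df: pd.DataFrame, column: str,
                     min_value: float, max_value: float) -> dict:
    """Check that every non-null value in `column` lies in [min_value, max_value]."""
    values = df[column].dropna()  # assumption: nulls are not counted as failures
    in_range = values.between(min_value, max_value)
    unexpected = values[~in_range]
    return {
        "success": bool(in_range.all()),
        "unexpected_count": int(len(unexpected)),
        "unexpected_values": unexpected.tolist(),
    }


# Made-up readings with two out-of-range values.
air_quality_df = pd.DataFrame({"pm2_5": [5.3, -1.0, 1200.0, 3.1, None]})

report = validate_between(air_quality_df, "pm2_5", min_value=0.0, max_value=1000.0)
print(report)
```

A feature pipeline could refuse to insert the DataFrame, or raise an alert, whenever `success` is false.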
  • 34. Feature Pipeline Data Sources Data warehouse Applications - Services Validate Data Unstructured Data Great Expectations Expectation Suite Feature 1 Expectation Suite Feature 2 Validation Reports Feature 1 Validation Reports Feature 2 Feature Group 1 Data Feature Group 2 Data Monitor Search Alert // Data Validation with Great Expectations https://www.hopsworks.ai/post/data-validation-for-enterprise-ai-using-great-expectations-with-hopsworks
  • 37. // Feature Selection
    ● Useful Feature: It has predictive power for my prediction problem.
    ● Redundant Feature: A similar feature is already selected.
    ● Irrelevant Feature: The feature has no predictive power.
    ● Prohibited Feature: Feature cannot be used. May be context dependent.
    ● Infeasible Feature: Useful feature that can’t be computed for some reason.
  • 38. // Select Features and Create a Feature View. FEATURE VIEW: Select and Join Features
    Weather features:
    city_name | date | wind_speed_max | wind_direction_dominant | wind_gusts_max | temp_max
    <entity_id> | <event_time> | <numerical feature> | <categorical feature> | <numerical feature> | <numerical feature>
    string | datetime | double | string | double | double
    berlin | 2022-01-01 | 14.3 | ne | 22.4 | 22.7
    dublin | 2022-04-01 | 9.3 | n | 18.2 | 25.4
    seattle | 2022-07-01 | 11.1 | nw | 15.2 | 20.8
    tacoma | 2022-10-01 | 1.3 | w | 2.5 | 28.4
    Air quality features:
    city_name | date | pm2_5
    <entity_id> | <event_time> | <numerical feature>
    string | datetime | double
    berlin | 2022-01-01 | 5.3
    dublin | 2022-04-01 | 2.3
    seattle | 2022-07-01 | 3.1
    tacoma | 2022-10-01 | 4.3
  • 39. // Data Modelling: Training Data is a Fact/Dimension Model [Diagram: one Label table and several Feature tables, each with a ts column; ts=TimeStamp.] Join Features to Labels with Entity IDs
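One way to read the Label/Feature timestamp picture is as a point-in-time join: each label row picks up the most recent feature value for the same entity ID whose timestamp is not later than the label's. The sketch below is plain Python under that assumption. It is not the Hopsworks implementation, and `point_in_time_join` and the sample rows are hypothetical.

```python
from bisect import bisect_right
from datetime import date


def point_in_time_join(labels, features):
    """labels: (entity_id, ts, label) tuples; features: (entity_id, ts, value) tuples.

    For each label, attach the latest feature value of the same entity
    with feature ts <= label ts, or None if no such value exists.
    """
    by_entity = {}
    for entity_id, ts, value in sorted(features, key=lambda r: (r[0], r[1])):
        times, values = by_entity.setdefault(entity_id, ([], []))
        times.append(ts)
        values.append(value)

    joined = []
    for entity_id, ts, label in labels:
        times, values = by_entity.get(entity_id, ([], []))
        idx = bisect_right(times, ts) - 1  # last feature row at or before ts
        feature = values[idx] if idx >= 0 else None
        joined.append((entity_id, ts, feature, label))
    return joined


# Illustrative rows only: (city, timestamp, wind_speed_max) and (city, timestamp, pm2_5).
features = [
    ("seattle", date(2022, 7, 1), 11.1),
    ("seattle", date(2022, 7, 3), 7.0),
    ("dublin", date(2022, 4, 1), 9.3),
]
labels = [
    ("seattle", date(2022, 6, 30), 2.9),
    ("seattle", date(2022, 7, 2), 3.1),
    ("seattle", date(2022, 7, 3), 2.5),
    ("tacoma", date(2022, 10, 1), 4.3),
]

training_rows = point_in_time_join(labels, features)
for row in training_rows:
    print(row)
```

Looking only backwards in time keeps future feature values out of the training rows.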
  • 40. // Building a Feature View from existing Features [Diagram: the weather Feature Group and the air_quality Feature Group are joined on city, date into the air_quality_fv FeatureView, which produces Training Data and Inference Data. Steps shown (some marked Optional Steps): select features (wind_speed_max, wind_speed_dir, …); filter (city_name == "Seattle"); transform = {"wind_speed_max": min_max_scaler}; label = ["pm2_5"].]
  • 41. // Select Features and create a Feature View select features from feature groups create feature view from selected features fg_air_quality = fs.get_feature_group(name="air_quality", version=1) fg_weather = fs.get_feature_group(name="weather", version=1) selected = fg_air_quality.select(['pm2_5']).join(fg_weather.select_all()) fv = fs.create_feature_view(name="air_quality_fv", version=1, description="Weather and Air Quality", labels=['pm2_5'], query=selected )
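The select, join, and label steps can be mimicked on plain lists of dicts to see what the feature view hands to training. This is a rough sketch, not the Hopsworks query engine: `select`, `join`, and `split_labels` are hypothetical helpers. The inner join on `city_name` and `date`, and the decision to always keep the join keys, are assumptions of this example.

```python
KEYS = ("city_name", "date")

# Sample rows taken from the slide tables; weather has no seattle row here on purpose.
air_quality = [
    {"city_name": "berlin", "date": "2022-01-01", "pm2_5": 5.3},
    {"city_name": "dublin", "date": "2022-04-01", "pm2_5": 2.3},
    {"city_name": "seattle", "date": "2022-07-01", "pm2_5": 3.1},
]
weather = [
    {"city_name": "berlin", "date": "2022-01-01", "wind_speed_max": 14.3, "temp_max": 22.7},
    {"city_name": "dublin", "date": "2022-04-01", "wind_speed_max": 9.3, "temp_max": 25.4},
]


def select(rows, columns, keys=KEYS):
    """Keep the join keys plus the requested columns."""
    keep = list(keys) + [c for c in columns if c not in keys]
    return [{c: row[c] for c in keep} for row in rows]


def join(left, right, keys=KEYS):
    """Inner join: keep left rows that have a right row with the same keys."""
    index = {tuple(r[k] for k in keys): r for r in right}
    joined = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            joined.append({**row, **match})
    return joined


def split_labels(rows, labels):
    """Separate label columns from feature columns."""
    X = [{k: v for k, v in r.items() if k not in labels} for r in rows]
    y = [{k: r[k] for k in labels} for r in rows]
    return X, y


selected = join(select(air_quality, ["pm2_5"]), weather)
X, y = split_labels(selected, labels=["pm2_5"])
print(X)
print(y)
```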
  • 42. // Feature Views - Create Data Snapshots for Training or Inference. Feature View - air_quality_fv (rows ordered by Time):
    city | date | max_temp | wind_direction_dominant | … | pm2_5
    <entity_id> | <event_time> | min_max_scaler | ordinal_encoder | … | min_max_scaler
    string | datetime | double | string | … | double
    Amsterdam | 2022-01-01 | 21.4 | w | … | 11.0
    Berlin | 2022-04-01 | 25.3 | nw | … | 10.2
    Seattle | 2022-07-01 | 12.1 | n | … | 1.4
    Dublin | 2022-10-01 | 15.3 | ne | … | 2.2
    Delhi | 2022-11-01 | 33.9 | n | … | 38.2
    New York | 2022-12-01 | 5.1 | e | … | 19.2
    Portland | 2023-01-01 | -2.4 | ssw | … | 3.2
    Diagram labels: Training Data-v1, Training Data-v2, Batch Inference Data, Feature vector.
  • 43. // Feature Views - Create Data Snapshots for Training or Inference
  • 44. // Feature Views - Model-Dependent Transformations
    FeatureView for Model-FeedForwardNet:
    city | date | max_temp | wind_direction_dominant | … | pm2_5
    <entity_id> | <event_time> | min_max_scaler | ordinal_encoder | … | min_max_scaler
    string | datetime | double | string | … | double
    FeatureView for Model-XGBoost:
    city | date | max_temp | wind_direction_dominant | … | pm2_5
    <entity_id> | <event_time> | - | label_encoder | … | -
    string | datetime | double | string | … | double
    FeatureView for Model-LogisticRegression:
    city | date | max_temp | wind_direction_dominant | … | pm2_5
    <entity_id> | <event_time> | min_max_scaler | ordinal_encoder | … | min_max_scaler
    string | datetime | double | string | … | double
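A model-dependent transformation such as min_max_scaler is fitted on the training snapshot and then applied unchanged at inference time. The sketch below shows that in plain Python. `MinMaxScaler1D` is a hypothetical stand-in, not the Hopsworks built-in, and splitting the sample max_temp values into training and inference rows is an assumption for illustration.

```python
class MinMaxScaler1D:
    """Scale values to [0, 1] using the min and max seen during fit()."""

    def fit(self, values):
        self.min_ = min(values)
        self.max_ = max(values)
        return self

    def transform(self, values):
        span = self.max_ - self.min_
        if span == 0:
            # A constant training column carries no scale information.
            return [0.0 for _ in values]
        return [(v - self.min_) / span for v in values]


# max_temp values from the sample table, split into training and inference rows.
train_max_temp = [21.4, 25.3, 12.1, 15.3]
inference_max_temp = [33.9, -2.4]

scaler = MinMaxScaler1D().fit(train_max_temp)
scaled_train = scaler.transform(train_max_temp)
scaled_inference = scaler.transform(inference_max_temp)

print(scaled_train)
print(scaled_inference)
```

Inference values outside the training range land outside [0, 1]. This is why the statistics come from the training data and are stored with the feature view rather than recomputed per batch.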
  • 45. // Model Training with Feature Views from sklearn.compose import ColumnTransformer from sklearn.pipeline import Pipeline from sklearn.preprocessing import OneHotEncoder from xgboost import XGBRegressor X_train, X_test, y_train, y_test = fv.train_test_split(test_size=0.2) categorical_transformer = Pipeline(steps=[("encoder", OneHotEncoder(handle_unknown="ignore"))]) preprocessor = ColumnTransformer(transformers=[ ("cat", categorical_transformer, categorical_feature_ids)]) clf = Pipeline(steps=[("preprocessor", preprocessor), ("regressor", XGBRegressor())]) clf.fit(X_train, y_train) Annotations: get train & test set; model-dependent feature encoding; the pipeline encodes features, then trains the model
  • 46. // Store the trained Model in a Model Registry joblib.dump(clf, 'air_quality_model/air_quality_model.pkl') input_schema = Schema(X_test) output_schema = Schema(y_test) aq_model = mr.sklearn.create_model("air_quality_model", metrics={'accuracy': accuracy}, input_example=X_test.sample().to_numpy(), model_schema=ModelSchema(input_schema=input_schema, output_schema=output_schema)) aq_model.save('air_quality_model') Annotations: serialize model; get model schema from DFs; package up model and upload to model registry
  • 48. // Batch Inference Pipeline - predict air quality on latest features fv = fs.get_feature_view(name="air_quality_fv", version=1) df = fv.get_batch_data(start_time=today) mr = project.get_model_registry() model = mr.get_model("air_quality_model", version=1) model_dir = model.download() model = joblib.load(model_dir + "/air_quality_model.pkl") predictions_df = model.predict(df) Annotations: download inference data; download model; make predictions
  • 49. // Online Inference Pipeline - interactive prediction service fv = fs.get_feature_view(name="air_quality_fv", version=1) mr = project.get_model_registry() model = mr.get_model("air_quality_model", version=1) model_dir = model.download() model = joblib.load(model_dir + "/air_quality_model.pkl") def air_quality(city_name): arr = fv.get_feature_vector({"city_name": city_name}, passed_features={}) y_pred = model.predict(np.asarray(arr).reshape(1, -1)) return y_pred Annotations: get feature view & download model; make a prediction with precomputed and user-supplied features
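The phrase "precomputed and user-supplied features" can be pictured as a lookup by entity ID that is then completed with values the caller passes in at request time. The sketch below is plain Python. `ONLINE_STORE`, `FEATURE_ORDER`, `StubModel`, and the `hour_of_day` feature are all hypothetical, and it illustrates the idea rather than the Hopsworks `get_feature_vector` implementation.

```python
# Feature order the (hypothetical) model was trained with.
FEATURE_ORDER = ["wind_speed_max", "wind_gusts_max", "temp_max", "hour_of_day"]

# Stand-in for the online feature store: precomputed features keyed by entity ID.
ONLINE_STORE = {
    "seattle": {"wind_speed_max": 11.1, "wind_gusts_max": 15.2, "temp_max": 20.8},
}


def get_feature_vector(entry, passed_features=None):
    """Merge precomputed features with request-time features, in training order."""
    row = dict(ONLINE_STORE[entry["city_name"]])
    row.update(passed_features or {})
    missing = [name for name in FEATURE_ORDER if name not in row]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [row[name] for name in FEATURE_ORDER]


class StubModel:
    """Placeholder for the downloaded model; returns one number per input row."""

    def predict(self, rows):
        return [0.1 * sum(row) for row in rows]


def air_quality(city_name, hour_of_day, model=StubModel()):
    vector = get_feature_vector({"city_name": city_name},
                                passed_features={"hour_of_day": hour_of_day})
    return model.predict([vector])


vector = get_feature_vector({"city_name": "seattle"}, {"hour_of_day": 14})
prediction = air_quality("seattle", 14)
print(vector, prediction)
```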
  • 50. // Streaming Inference Pipeline fv = fs.get_feature_view(name="air_quality_fv", version=1) mr = project.get_model_registry() model = mr.get_model("air_quality_model", version=1) model_dir = model.download() model = joblib.load(model_dir + "/air_quality_model.pkl") # FLINK CALLS THE METHOD BELOW FOR EVERY EVENT def air_quality(city_name): arr = fv.get_feature_vector({"city_name": city_name}, passed_features={}) y_pred = model.predict(np.asarray(arr).reshape(1, -1)) return y_pred Annotations: get feature view & download model; make a prediction with precomputed and user-supplied features
  • 51. 5 Principles of MLOps WORKSHOP
  • 52. // MLOps Principles
    ● ML-enabled products evolve over time
      ○ The available input data (features) change over time
      ○ The target you are trying to predict changes over time
    ● Automate the testing and deployment of ML-enabled products
      ○ Safe incremental updates and a tighter iteration loop
    ● To this end, features and models must be tested
      ○ Tests should run automatically as part of a CI/CD workflow
  • 53. // MLOps according to Hopsworks (1) Automated Testing. Stages: Untrusted Raw Data, Tested Features, Tested Models, Tested ML-Apps. ● ML-Apps build on models and are tested with A/B tests ● Models are tested with model validation tests ● Features are tested with data validation and unit tests, and corrected with imputation/encoding
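As a small illustration of "features tested with unit tests and corrected with imputation", the sketch below pairs two feature functions with pytest-style tests. `impute_mean` and `gust_ratio` are hypothetical examples invented here and do not come from the workshop code.

```python
def impute_mean(values):
    """Replace missing (None) values with the mean of the observed values."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def gust_ratio(wind_gusts_max, wind_speed_max):
    """Hypothetical derived feature: how gusty the day was relative to wind speed."""
    if wind_speed_max == 0:
        return 0.0  # calm day: avoid dividing by zero
    return wind_gusts_max / wind_speed_max


# pytest would collect these automatically in a CI/CD workflow.
def test_impute_mean_fills_gaps():
    assert impute_mean([1.0, None, 3.0]) == [1.0, 2.0, 3.0]


def test_gust_ratio_handles_calm_days():
    assert gust_ratio(5.0, 0.0) == 0.0
    assert gust_ratio(10.0, 5.0) == 2.0


if __name__ == "__main__":
    test_impute_mean_fills_gaps()
    test_gust_ratio_handles_calm_days()
    print("feature tests passed")
```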
  • 54. // MLOps according to Hopsworks (2) Versioning of Features, Models [Diagram: Upgrade and Rollback between Models air_quality_v1 and air_quality_v2, alongside Features weather_v1, air_quality_v1, and air_quality_v2.]
  • 55. // MLOps according to Hopsworks (2) Versioning of Features, Models [Diagram with the same labels as slide 54.]
  • 56. // MLOps according to Hopsworks (2) Versioning of Features, Models [Diagram with the same labels as slide 54.]
  • 57. // CI/CD for Feature Pipelines [Diagram: feature pipelines move across the DEVELOPMENT, STAGING, and PRODUCTION Feature Stores. Labels: Feature development, Entity Creation, Unit testing, Test Run, Manual Execution, On-demand, Pull Request, CI/CD Execution, backfilled, Merge, Orchestrated & Scheduled, Scheduled.] https://www.hopsworks.ai/post/optimize-your-mlops-workflow-with-a-feature-store-ci-cd-and-github-actions
  • 58. // CI/CD for Model Training [Diagram: two flows with the same steps: Model Training on Training Data, then Evaluate and Validate Model with Evaluation Sets (for Bias), running bias, behaviour, and performance tests plus a deployment-test. Development Branch: Manual Trigger, PyTest, DEVELOPMENT Model Registry. Main Branch: Pull Request Trigger, Jenkins, PyTest, STAGING Model Registry.]
  • 59. // CI/CD for Deployed Models [Diagram: an HTTPS client calls a Network Endpoint (Istio) in front of KServe, which hosts Blue: model-A1 and Green: model-A2. Hopsworks deploys model-A2 with 10% of traffic; traffic labels: 100% traffic and 10% traffic (mirrored). Pre-computed features are read from the PRODUCTION Feature Store.]
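The 100% / 10% (mirrored) labels suggest that every request is served by the current model while a fixed share is also copied to the candidate. Here is a minimal sketch of that routing decision in plain Python. It assumes a deterministic hash bucket per request ID and is not how Istio or KServe implement mirroring; `bucket`, `route`, and the request IDs are hypothetical.

```python
import hashlib


def bucket(request_id, buckets=100):
    """Map a request ID to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def route(request_id, mirror_percent=10):
    """Always serve from model-A1 (blue); also mirror a share to model-A2 (green)."""
    targets = ["model-A1"]
    if bucket(request_id) < mirror_percent:
        targets.append("model-A2")
    return targets


request_ids = [f"req-{i}" for i in range(10_000)]
routes = [route(rid) for rid in request_ids]
mirrored_share = sum("model-A2" in r for r in routes) / len(routes)
print(f"mirrored share: {mirrored_share:.3f}")
```

Hashing the request ID instead of drawing a random number keeps the decision repeatable, which makes it easier to compare the two models' answers for the same request afterwards.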
  • 60. // Lineage for Features and Models [Diagram: Source Data (Connector BQ, Connector Snowflake, Connector S3, Connector Redshift) → Feature Groups (weather_v1, air_quality_v1) → Feature View (air_quality_v1) → Training Data (training_data_v1) → Models (air_quality_v1) → Model Deployments (air_quality).]
  • 61. // Schema Versioning ● In Hopsworks, you can make non-breaking schema changes that do not require updating the schema version. ● Appending features with a default value is a non-breaking schema change. ● Breaking schema changes require updating the schema version for a Feature Group. Non-breaking change (append features): from hsfs.feature import Feature features = [ Feature(name="id",type="int",online_type="int"), Feature(name="name",type="string",online_type="varchar(20)")] fg = fs.get_feature_group(name="example", version=1) fg.append_features(features) Breaking change (new version): fg1 = fs.create_feature_group(name="example", version=1) df = fg1.read() fg2 = fs.create_feature_group(name="example", version=2, features=new_features, …) fg2.insert(df) # backfill the new feature group with data from the previous version
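The breaking versus non-breaking rule can be written down as a small check over two schemas. In this sketch only "appending a feature with a default value is non-breaking" comes from the slide. Treating dropped features, changed types, and appended features without a default as breaking is an assumption of the example, and `classify_schema_change` and `next_version` are hypothetical helpers.

```python
def classify_schema_change(old, new):
    """old/new map feature name -> {"type": str, optional "default": value}."""
    for name, spec in old.items():
        if name not in new:
            return "breaking"  # assumption: dropping a feature breaks readers
        if new[name]["type"] != spec["type"]:
            return "breaking"  # assumption: changing a type breaks readers
    for name in new.keys() - old.keys():
        if "default" not in new[name]:
            return "breaking"  # appended without a default value
    return "non-breaking"


def next_version(current_version, old, new):
    """Bump the feature group version only for breaking changes."""
    if classify_schema_change(old, new) == "breaking":
        return current_version + 1
    return current_version


v1 = {"id": {"type": "int"}, "name": {"type": "string"}}
appended = {**v1, "country": {"type": "string", "default": "unknown"}}
retyped = {"id": {"type": "string"}, "name": {"type": "string"}}

print(classify_schema_change(v1, appended), next_version(1, v1, appended))
print(classify_schema_change(v1, retyped), next_version(1, v1, retyped))
```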
  • 62. 6 Ideas for Serverless ML Systems? WORKSHOP
  • 63. // IDEA FOR SERVERLESS ML SYSTEM - Personalized GPT-4 [Diagram: ChatGPT / GPT-4 (model of world) receives a Personalized Prompt built from (1) History and Context from a Feature Store and (2) Hand-crafted Prompts; a Vector DB of embeddings acts as the model of user.] User: "Help me design a churn model for my customers!" ChatGPT: "A churn model helps predict the probability of a customer leaving you within a given time period. Here is a code snippet in Python for estimating customer churn: ….." Personalized ChatGPT: "You have 10,000 customers and 8.7% churn annually. This is a relatively high churn rate, so you should estimate churn at a fine granularity using this specific churn model: …."
  • 64. Show love with a star! SERVERLESS MACHINE LEARNING www.serverless-ml.org https://github.com/featurestoreorg/serverless-ml-course ⭐