Hotels.com’s Journey to Becoming an Algorithmic Business… Exponential Growth in Data Science Whilst Migrating to Spark+Cloud all at the Same Time with Matt Fryer
In the last year Hotels.com has begun its journey to becoming an algorithmic business. Matt will talk about the team's experience of exponential growth in data science algorithms while, at the same time, migrating from SAS / SQL to Spark as the core underlying architecture, migrating from on-premise to the cloud, and transforming the capability of the data science function. He will also highlight the key enablers that have made this successful, including CEO support, the internal concept of organic intelligence, and how Databricks has helped make this happen, as well as the pitfalls on the journey.
1. Confidential - do not distribute
Hotels.com’s journey to becoming an Algorithmic Business
Matthew Fryer
VP, Chief Data Science Officer
mfryer@hotels.com
2. Part of Expedia, Inc. family
385,000 properties
89 countries
39 languages
>27m Hotels.com Rewards Members
Home of Captain Obvious
Billions of recommendations per day, based on real-time data
Hotels.com
6. “Artificial Intelligence Will Be Travel’s Next Big Thing”
Barry Diller, Chairman & Senior Executive, Expedia, Inc.
The 3 M’s are disruptive technologies:
Mobile
Messaging / NLP
Machine Learning
9. Core Elements of our Data Science Cloud Platform
• Databricks Unified Platform
• Maestro: our internally developed platform on AWS (EMR, Spark, RStudio, IntelliJ, SBT, Jupyter, Zeppelin, Unit / QA, Metastore, Apache Airflow, Keras, TensorFlow)
• Proof of concept on Google Cloud, Beam, Spark & TensorFlow
10. Databricks Unified Platform
Chart is in 1-hour blocks; y-axis = number of 32-core instances
• Key asset to the success of data science at Hotels.com
• Key in driving up data scientist productivity / efficiency / flexibility
• Helps our data science lifecycle run much more easily and quickly, improving speed to market
• Reliable / secure, and facilitates ‘highly elastic’ workflows that exploit cost-effective spot instances on AWS
12. Images are an important factor while choosing a hotel
Reference: The Influence of Visuals in Online Hotel Research and Booking Behaviour
[Bar chart: factors other than price/location. Categories: Loyalty Program, Reviews, Hotel Brand, Star Rating, Destination Info, Images, Hotel Info. Axis: 0% to 80%. Legend: Very Important/Important, Important, Very Important.]
13. Computer Vision problems we try to tackle
• Near Duplicate Detection
• Scene Classification
• Image Ranking
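The slides do not say how near-duplicate photos are detected. As a rough illustration only, one common approach is a perceptual "difference hash" compared by Hamming distance. The sketch below assumes images have already been converted to grayscale and resized to the same small grid; the function names, the sample grids, and the `max_distance` threshold are hypothetical and not taken from the talk.

```python
def dhash(pixels):
    """Difference hash: one bit per horizontally adjacent pixel pair.

    `pixels` is a grid (list of rows) of grayscale values; a bit is 1
    when the left pixel is brighter than its right neighbour.
    """
    return [1 if left > right else 0
            for row in pixels
            for left, right in zip(row, row[1:])]


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def is_near_duplicate(img_a, img_b, max_distance=2):
    """Treat two images as near duplicates if their hashes are close.

    max_distance is an assumed threshold, chosen here for illustration.
    """
    return hamming(dhash(img_a), dhash(img_b)) <= max_distance


# Hypothetical 3x4 grayscale grids standing in for downscaled photos.
original = [[10, 40, 90, 30], [20, 60, 50, 80], [70, 30, 20, 10]]
brighter = [[v + 5 for v in row] for row in original]  # same photo, brighter
different = [[90, 40, 10, 30], [80, 50, 60, 20], [10, 20, 30, 70]]

print(is_near_duplicate(original, brighter))
print(is_near_duplicate(original, different))
```

Because the hash only records whether each pixel is brighter than its neighbour, a uniform brightness shift leaves it unchanged, which is why the brightened copy still matches.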
18. Accuracy & Confusion Matrix
• After many manual, long-winded iterations of regularization and hyperparameter tuning
• We achieved good accuracy and a confusion matrix with few misclassifications
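For readers unfamiliar with the two measures named on this slide, here is a minimal sketch of how a confusion matrix and accuracy relate. The scene labels and predictions are made up for illustration and are not results from the talk; the helper names are hypothetical.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return matrix[true_label][predicted_label] -> count."""
    counts = Counter(zip(y_true, y_pred))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}


def accuracy(matrix):
    """Fraction of predictions on the diagonal (true == predicted)."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[label][label] for label in matrix)
    return correct / total if total else 0.0


# Hypothetical scene labels and model predictions, for illustration only.
labels = ["bedroom", "bathroom", "pool"]
y_true = ["bedroom", "bedroom", "bathroom", "pool", "pool", "bathroom"]
y_pred = ["bedroom", "bathroom", "bathroom", "pool", "pool", "bathroom"]

matrix = confusion_matrix(y_true, y_pred, labels)
for true_label in labels:
    print(true_label, matrix[true_label])
print("accuracy:", accuracy(matrix))
```

Off-diagonal entries (here, one bedroom photo predicted as a bathroom) are the "confusions"; a good classifier concentrates its counts on the diagonal.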
19. Optimizing the photo order for improved customer experiences
[Image panels: Original, Model]
Reference: Radisson Blu Edwardian Berkshire Hotel, London
20. Finding the right hotel in our marketplace is core to our customers’ needs.
21. As an example, different user segments like to stay in different locations
[Map of London with labelled areas: Kensington, Bloomsbury, Heathrow, Canary Wharf, Paddington, Westminster, London City Airport, Chelsea, Battersea, Wimbledon, Wembley, City of London]