SlideShare a Scribd company logo
1 of 17
Download to read offline
Session Replays: Unlocking
User Experience with Data
Saurabh Mathur, Founder, browsee.io
About Me
● Founder Browsee.io
● Previously, build a Customer Data Platform startup
called Connecto. Acquired by Cardekho.com
● Ex - Google
Session Replays
● Video Like Replay of User
Behavior
● How is this made possible ?
Mutation Observers
● Starts with the initial page HTML
● Record all changes on a page
● Record all mouse movements, scroll and
clicks with their elements and positions
https://developer.mozilla.org/en-US/docs/Web/API/MutationObserver
Websockets vs HTTP
● Even transmitting this
information over HTTP would
have significant overhead
● Websockets allow fast
transmission of this data over
a persistent connection
Source:
http://blog.arungupta.me/rest-vs-websocket-comparison-benchmarks/
Data for User Experience
● About 1TB of recordings data every day
persisted upto a year
● For every heatmap, we aggregate every
click and every scroll along with time
spent by the user in their heatmaps
along with.
● About ~2TB of time series data
● About 1TB of Full Page snapshots
Data Storage
● Initially we used Cassandra as our
primary storage for replay data
● We soon moved to ScyllaDB and
were getting better throughput
per node as well as read latency.
Source:
https://www.scylladb.com/2021/08/24/apache-cassandra-4-0-vs-scyll
a-4-4-comparing-performance//
Size Tiered Compaction
● The default replication strategies
is the Size Tiered Compaction
Strategy
● This is the oldest and the default
strategy for both ScyllaDB and
Cassandra.
Size Tiered Compaction
● Major Drawback is it requires about
25-50% free space for compaction.
● Storage is one of our largest cost
head.
● 25-50% disk wastage directly
impacts cost
Level Tiered Compaction
● Keeps small SStables each of size
160MB
● Keeps adding data to next level
sstables. Creates new table if size
is breached.
Level Tiered Compaction
● In our experience it significantly
increased the write processing.
● No improvement in read latency
● The cost we were saving in disk
was getting lost in compute cycles
Incremental Compaction
Strategy
● A variant of Size Tiered Compaction. It
only addresses the largest compaction
step which requires free space.
● At that step, it breaks the compacting
sstables into runs and keeps
compacting the runs while deleting the
completed ones. So much lesser.
Data Security at Rest
● Users can choose which data
center to save and process their
data.
● We also have blacklist caches at
queueing as well as rest to
immediately respond to Data
cleanup requests.
Session Tagging
● Browsee marks several user insights
on session recordings depending on
the user’s Behavior
● Temporal click events like rage clicks
● Repeated events or quickly browsing
through several pages
Live Sessions
● Kafka’s Real time processors allow us to
process session tagging in near real
time.
● Users need real time session views
● We also need incremental session
updates on client side to fetch new data
as the users are watching a live session
Roadmap
● Navigation graph - A graph based time series data which scales
exponentially with site’s pages
● Bot Detection Models
● More User Behavior Models like Multi User Analysis, User Defined Tags
Thanks!

More Related Content

Similar to Session Replays: Unlocking User Experience with Data

Adventures in Observability - Clickhouse and Instana
Adventures in Observability - Clickhouse and InstanaAdventures in Observability - Clickhouse and Instana
Adventures in Observability - Clickhouse and InstanaMarcel Birkner
 
Adventures in Observability: How in-house ClickHouse deployment enabled Inst...
 Adventures in Observability: How in-house ClickHouse deployment enabled Inst... Adventures in Observability: How in-house ClickHouse deployment enabled Inst...
Adventures in Observability: How in-house ClickHouse deployment enabled Inst...Altinity Ltd
 
La vita nella corsia di sorpasso; A tutta velocità, XPages!
La vita nella corsia di sorpasso; A tutta velocità, XPages!La vita nella corsia di sorpasso; A tutta velocità, XPages!
La vita nella corsia di sorpasso; A tutta velocità, XPages!Ulrich Krause
 
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...HostedbyConfluent
 
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...NETWAYS
 
Performance Monitoring at Spreadshirt
Performance Monitoring at SpreadshirtPerformance Monitoring at Spreadshirt
Performance Monitoring at SpreadshirtMartin Breest
 
Intro to XPages for Administrators (DanNotes, November 28, 2012)
Intro to XPages for Administrators (DanNotes, November 28, 2012)Intro to XPages for Administrators (DanNotes, November 28, 2012)
Intro to XPages for Administrators (DanNotes, November 28, 2012)Per Henrik Lausten
 
Data Lessons Learned at Scale
Data Lessons Learned at ScaleData Lessons Learned at Scale
Data Lessons Learned at ScaleCharlie Reverte
 
Caching for Microservices Architectures: Session II - Caching Patterns
Caching for Microservices Architectures: Session II - Caching PatternsCaching for Microservices Architectures: Session II - Caching Patterns
Caching for Microservices Architectures: Session II - Caching PatternsVMware Tanzu
 
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...Data Con LA
 
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...Kim Greene
 
Managing Memory & Locks - Series 1 Memory Management
Managing  Memory & Locks - Series 1 Memory ManagementManaging  Memory & Locks - Series 1 Memory Management
Managing Memory & Locks - Series 1 Memory ManagementDAGEOP LTD
 
Final presentasi gnome asia
Final presentasi gnome asiaFinal presentasi gnome asia
Final presentasi gnome asiaAnton Siswo
 
A Real Time Web Analytics System
A Real Time Web Analytics SystemA Real Time Web Analytics System
A Real Time Web Analytics SystemMahesh Patwardhan
 
OSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles JudithOSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles JudithNETWAYS
 
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...Flink Forward
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixC4Media
 

Similar to Session Replays: Unlocking User Experience with Data (20)

Adventures in Observability - Clickhouse and Instana
Adventures in Observability - Clickhouse and InstanaAdventures in Observability - Clickhouse and Instana
Adventures in Observability - Clickhouse and Instana
 
Adventures in Observability: How in-house ClickHouse deployment enabled Inst...
 Adventures in Observability: How in-house ClickHouse deployment enabled Inst... Adventures in Observability: How in-house ClickHouse deployment enabled Inst...
Adventures in Observability: How in-house ClickHouse deployment enabled Inst...
 
La vita nella corsia di sorpasso; A tutta velocità, XPages!
La vita nella corsia di sorpasso; A tutta velocità, XPages!La vita nella corsia di sorpasso; A tutta velocità, XPages!
La vita nella corsia di sorpasso; A tutta velocità, XPages!
 
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
Analyzing Petabyte Scale Financial Data with Apache Pinot and Apache Kafka | ...
 
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
 
Web Performance Optimization
Web Performance OptimizationWeb Performance Optimization
Web Performance Optimization
 
Performance Monitoring at Spreadshirt
Performance Monitoring at SpreadshirtPerformance Monitoring at Spreadshirt
Performance Monitoring at Spreadshirt
 
Intro to XPages for Administrators (DanNotes, November 28, 2012)
Intro to XPages for Administrators (DanNotes, November 28, 2012)Intro to XPages for Administrators (DanNotes, November 28, 2012)
Intro to XPages for Administrators (DanNotes, November 28, 2012)
 
Data Lessons Learned at Scale
Data Lessons Learned at ScaleData Lessons Learned at Scale
Data Lessons Learned at Scale
 
Log Files
Log FilesLog Files
Log Files
 
Cookie
CookieCookie
Cookie
 
Caching for Microservices Architectures: Session II - Caching Patterns
Caching for Microservices Architectures: Session II - Caching PatternsCaching for Microservices Architectures: Session II - Caching Patterns
Caching for Microservices Architectures: Session II - Caching Patterns
 
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
 
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...
Adm07 The Health Check Extravaganza for IBM Social and Collaboration Environm...
 
Managing Memory & Locks - Series 1 Memory Management
Managing  Memory & Locks - Series 1 Memory ManagementManaging  Memory & Locks - Series 1 Memory Management
Managing Memory & Locks - Series 1 Memory Management
 
Final presentasi gnome asia
Final presentasi gnome asiaFinal presentasi gnome asia
Final presentasi gnome asia
 
A Real Time Web Analytics System
A Real Time Web Analytics SystemA Real Time Web Analytics System
A Real Time Web Analytics System
 
OSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles JudithOSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles Judith
 
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
 

Recently uploaded

The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 

Recently uploaded (20)

The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 

Session Replays: Unlocking User Experience with Data

  • 1. Session Replays: Unlocking User Experience with Data Saurabh Mathur, Founder, browsee.io
  • 2. About Me ● Founder Browsee.io ● Previously, build a Customer Data Platform startup called Connecto. Acquired by Cardekho.com ● Ex - Google
  • 3. Session Replays ● Video Like Replay of User Behavior ● How is this made possible ?
  • 4. Mutation Observers ● Starts with the initial page HTML ● Record all changes on a page ● Record all mouse movements, scroll and clicks with their elements and positions https://developer.mozilla.org/en-US/docs/Web/API/MutationObserver
  • 5. Websockets vs HTTP ● Even transmitting this information over HTTP would have significant overhead ● Websockets allow fast transmission of this data over a persistent connection Source: http://blog.arungupta.me/rest-vs-websocket-comparison-benchmarks/
  • 6. Data for User Experience ● About 1TB of recordings data every day persisted upto a year ● For every heatmap, we aggregate every click and every scroll along with time spent by the user in their heatmaps along with. ● About ~2TB of time series data ● About 1TB of Full Page snapshots
  • 7. Data Storage ● Initially we used Cassandra as our primary storage for replay data ● We soon moved to ScyllaDB and were getting better throughput per node as well as read latency. Source: https://www.scylladb.com/2021/08/24/apache-cassandra-4-0-vs-scyll a-4-4-comparing-performance//
  • 8. Size Tiered Compaction ● The default replication strategies is the Size Tiered Compaction Strategy ● This is the oldest and the default strategy for both ScyllaDB and Cassandra.
  • 9. Size Tiered Compaction ● Major Drawback is it requires about 25-50% free space for compaction. ● Storage is one of our largest cost head. ● 25-50% disk wastage directly impacts cost
  • 10. Level Tiered Compaction ● Keeps small SStables each of size 160MB ● Keeps adding data to next level sstables. Creates new table if size is breached.
  • 11. Level Tiered Compaction ● In our experience it significantly increased the write processing. ● No improvement in read latency ● The cost we were saving in disk was getting lost in compute cycles
  • 12. Incremental Compaction Strategy ● A variant of Size Tiered Compaction. It only addresses the largest compaction step which requires free space. ● At that step, it breaks the compacting sstables into runs and keeps compacting the runs while deleting the completed ones. So much lesser.
  • 13. Data Security at Rest ● Users can choose which data center to save and process their data. ● We also have blacklist caches at queueing as well as rest to immediately respond to Data cleanup requests.
  • 14. Session Tagging ● Browsee marks several user insights on session recordings depending on the user’s Behavior ● Temporal click events like rage clicks ● Repeated events or quickly browsing through several pages
  • 15. Live Sessions ● Kafka’s Real time processors allow us to process session tagging in near real time. ● Users need real time session views ● We also need incremental session updates on client side to fetch new data as the users are watching a live session
  • 16. Roadmap ● Navigation graph - A graph based time series data which scales exponentially with site’s pages ● Bot Detection Models ● More User Behavior Models like Multi User Analysis, User Defined Tags