Cross Region
Data Replication
Design Considerations
Itai Friendinger itai@forter.com
Our financial institutions remain strong, and the American
economy will be open for business as well.
Decision as a Service Example
[Diagram: TX → Fraud Decision (100ms) → TX Decision]
if (isFraud(tx.address, tx.payment)) {
    return DECLINE;
} else {
    return APPROVE;
}
Decision as a Service Example
[Diagram: "Change Account Address" and "Change Account Payment" events → Event Processor (1000ms) → partial update → Unified People Store; TX → Fraud Decision (100ms), which reads the Unified People Store → TX Decision]
Design "It'll Be Fine"
● No Cross Region Replication
[Diagram: single region. raw event → Event Processor → People Store; TX → Fraud Decision → TX Decision]
Design "Leave It to Me"
● Cron Sync every 3 hours
● Replication != Reconciliation
● Replication != Backup
[Diagram: two regions, each with raw event → Event Processor → People Store and TX → Fraud Decision → TX Decision; a Cron Sync links the two People Stores]
Design "Yalla, OK"
● Read-Only RDS Replica
● Proxying data into a single Data Center
● Requires quarterly failover drills
● Cannot withstand a real disaster for long
[Diagram: primary region with raw event → Event Processor → People Store and TX → Fraud Decision → TX Decision; secondary region with an Event Forwarder, a Fraud Decision service and a People Store fed by RDS Replication; raw events and TX are forwarded to the primary region]
Design "Two for the Price of One"
● CloudEndure DRaaS
● Point In Time Recovery
● Requires quarterly failover drills
● For existing apps (Enterprises)
[Diagram: one active region with raw event → Event Processor → People Store and TX → Fraud Decision → TX Decision; Block Device Replication copies the People Store to a second region]
Design "Wait, Wait"
● Google Cloud Spanner Is Here
Geo Distributed Transactions Are Coming
● For green-field apps (Startups)
[Diagram: two regions, each with raw event → Event Processor → People Store and TX → Fraud Decision → TX Decision; the two People Stores are linked by Transactions]
Design "Trust Me"
● Out-of-the-box, real-time, bi-directional, data-center-aware replication
● Write conflict resolution
[Diagram: two regions, each with raw event → Event Processor → People Store and TX → Fraud Decision → TX Decision; 2-way replication between the two People Stores]
Design "Its Sister"
● Replication of Raw Events
● State Divergence
[Diagram: two regions, each with raw event → Event Processor → People Store and TX → Fraud Decision → TX Decision; 2-way replication of the raw events between the regions]
Read Consistency Guarantees
Loosely based on Consistency Explained Through Baseball by Doug Terry
● Strong ⇒ 2:2
○ See all previous writes
● Read own Writes
○ See all writes performed by reader
● Monotonic ⇒ 2:1
○ See all writes since the beginning till N seconds ago
● Eventual ⇒ 1:2
○ See the writes in different order (some still missing)
time   partial update   state
15m    Hapoel = 1       1:0
32m    Maccabi = 1      1:1
89m    Hapoel = 2       2:1
91m    Maccabi = 2      2:2
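The score table can be replayed in a few lines to see why each guarantee maps to the score next to it. This is a minimal sketch: the class name ScoreReplay and its helpers are hypothetical, and "monotonic" is modelled the way the slide defines it, as a prefix of the writes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: replays the partial updates from the score table.
final class ScoreReplay {
    // One partial update: a team's goal count is set to a value.
    static final class Update {
        final String team;
        final int goals;
        Update(String team, int goals) { this.team = team; this.goals = goals; }
    }

    // The four writes from the table, in the order they were issued.
    static final List<Update> WRITES = Arrays.asList(
        new Update("Hapoel", 1), new Update("Maccabi", 1),
        new Update("Hapoel", 2), new Update("Maccabi", 2));

    // Folds the visible updates, in the order given, into a "Hapoel:Maccabi" score.
    static String score(List<Update> visible) {
        int hapoel = 0;
        int maccabi = 0;
        for (Update u : visible) {
            if (u.team.equals("Hapoel")) { hapoel = u.goals; } else { maccabi = u.goals; }
        }
        return hapoel + ":" + maccabi;
    }

    public static void main(String[] args) {
        // Strong: every previous write is visible.
        System.out.println("strong    " + score(WRITES));
        // Monotonic (as defined on the slide): a prefix of the writes, here the first three.
        System.out.println("monotonic " + score(WRITES.subList(0, 3)));
        // Eventual: writes missing and out of order, here only the 91m and 15m writes arrived.
        System.out.println("eventual  " + score(Arrays.asList(WRITES.get(3), WRITES.get(0))));
    }
}
```

The eventual read reports a score (1:2) that never existed during the match, which is the point of the slide.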
Hello Couchbase
read-mutate-write of entire state
Client reaches cluster’s primary node
Conflict Prevention CAS
Optimizations: subdocument API
[Diagram: one cluster with nodes in us-west-2b and us-west-2c; Event Processor (read/mutate/write) and TX Decision (read) are both labelled Strong]
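A read-mutate-write loop guarded by CAS might look like the sketch below. It uses a hypothetical in-memory CasStore, not the Couchbase SDK; the names get, replace and updateField are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a document store with CAS; not the Couchbase SDK.
final class CasStore {
    // A document together with the CAS token it was read at.
    static final class Versioned {
        final Map<String, String> doc;
        final long cas;
        Versioned(Map<String, String> doc, long cas) { this.doc = doc; this.cas = cas; }
    }

    private final Map<String, Versioned> docs = new HashMap<String, Versioned>();

    // Returns a copy of the document and its current CAS token (empty document, CAS 0 if absent).
    synchronized Versioned get(String key) {
        Versioned v = docs.get(key);
        if (v == null) {
            return new Versioned(new HashMap<String, String>(), 0L);
        }
        return new Versioned(new HashMap<String, String>(v.doc), v.cas);
    }

    // Writes only if the caller's CAS token still matches; returns false on a concurrent change.
    synchronized boolean replace(String key, Map<String, String> doc, long expectedCas) {
        Versioned current = docs.get(key);
        long currentCas = current == null ? 0L : current.cas;
        if (currentCas != expectedCas) {
            return false;
        }
        docs.put(key, new Versioned(new HashMap<String, String>(doc), currentCas + 1));
        return true;
    }

    // Read-mutate-write of the entire state, retried until the CAS check succeeds.
    static void updateField(CasStore store, String key, String field, String value) {
        while (true) {
            Versioned read = store.get(key);
            read.doc.put(field, value);
            if (store.replace(key, read.doc, read.cas)) {
                return;
            }
        }
    }

    public static void main(String[] args) {
        CasStore store = new CasStore();
        updateField(store, "person-1", "address", "1 Main St");
        updateField(store, "person-1", "payment", "visa-1234");
        System.out.println(store.get("person-1").doc);
    }
}
```

A writer holding a stale CAS token is rejected instead of silently overwriting the other writer's field, which is what "conflict prevention" means here.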
Hello Couchbase
XDCR replicates entire state between clusters
Optimizations: dedup by key, metadata first
[Diagram: cluster in us-west-2 (nodes us-west-2b, us-west-2c) serving Event Processor (read/mutate/write) and TX Decision (read), labelled Strong; XDCR to a cluster in us-east-1 (nodes us-east-1c, us-east-1b) serving TX Decision (read), labelled Monotonic]
Couchbase Last Write Wins
Conflict Resolution - LWW erases losing side
Remember: NTP, no “sudo date”
Document Version = 48-bit timestamp (Conflict Resolution) + 16-bit CAS (Conflict Prevention)
Design "Trust Me"
[Diagram: clusters in us-west-2 (nodes us-west-2b, us-west-2c) and us-east-1 (nodes us-east-1c, us-east-1b) linked by XDCR; each region has an Event Processor (read/mutate/write) and a TX Decision (read); labels: read-own-writes, Monotonic]
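The version layout and the "LWW erases losing side" behaviour can be sketched as follows. The bit packing shown (timestamp in the high 48 bits, counter in the low 16) is an assumption made for illustration, and LwwVersion is a hypothetical class, not Couchbase code.

```java
// Hypothetical sketch of the version layout on the slide; the exact bit layout is an assumption.
final class LwwVersion {
    // Packs a 48-bit timestamp into the high bits and a 16-bit counter into the low bits.
    static long pack(long timestamp48, int counter16) {
        return ((timestamp48 & 0xFFFFFFFFFFFFL) << 16) | (counter16 & 0xFFFFL);
    }

    static long timestampOf(long version) {
        return version >>> 16;
    }

    static int counterOf(long version) {
        return (int) (version & 0xFFFFL);
    }

    // A replicated document: the whole state plus its version.
    static final class Doc {
        final String state;
        final long version;
        Doc(String state, long version) { this.state = state; this.version = version; }
    }

    // Last write wins: the document with the higher version survives, the other is discarded.
    static Doc resolve(Doc local, Doc remote) {
        return Long.compareUnsigned(remote.version, local.version) > 0 ? remote : local;
    }

    public static void main(String[] args) {
        Doc west = new Doc("address changed", pack(1000L, 0));
        Doc east = new Doc("payment changed", pack(1001L, 0));
        // The later write wins and the address change is lost entirely.
        System.out.println(resolve(west, east).state);
    }
}
```

Because the timestamp dominates the comparison, a region whose clock runs ahead wins every conflict, hence the reminder about NTP.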
Hello Cassandra
Client reaches closest node, blocks until LOCAL_QUORUM
No Conflict Prevention ⇒ Use partial updates or inserts
[Diagram: nodes us-west-2a, us-west-2b, us-west-2c serving Event Processor (partial update); nodes us-east-1a, us-east-1b, us-east-1c serving TX Decision (read); consistency label: Strong (?)]
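The question mark after "Strong" comes down to quorum arithmetic. A minimal sketch, assuming a replication factor of 3 per region (one replica per node drawn); QuorumMath is a hypothetical helper and no Cassandra driver is involved.

```java
// Quorum arithmetic only; no Cassandra driver is involved.
final class QuorumMath {
    // Replicas that must acknowledge: floor(replicationFactor / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Read and write replica sets are guaranteed to overlap when R + W > RF.
    static boolean readsSeeLatestWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor per region
        int q = quorum(rf);
        System.out.println("LOCAL_QUORUM with RF=3 waits for " + q + " replicas");
        System.out.println("overlap within one region: " + readsSeeLatestWrite(q, q, rf));
    }
}
```

The overlap only holds inside one region. A LOCAL_QUORUM write in us-west-2 does not wait for us-east-1, so a LOCAL_QUORUM read there can still miss it.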
Cassandra Last Write Wins per Column
Two clients update payment and address
of same person with exactly same client timestamps.
Design "Trust Me"
[Diagram: an Event Processor (partial update) and a TX Decision (read) in each region, us-west-2 (nodes 2a, 2b, 2c) and us-east-1 (nodes 1a, 1b, 1c); one update is labelled "update payment wins", the other "update address wins"; consistency labels: (?) (?)]
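A sketch of per-column last-write-wins shows how two updates with identical timestamps can each win one column. ColumnLww is hypothetical, not Cassandra's code, and the tie-break used here (the greater value wins) is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-column last-write-wins; not Cassandra's actual code.
final class ColumnLww {
    // A column value with the client-supplied write timestamp.
    static final class Cell {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    // Higher timestamp wins; on a tie this sketch assumes the greater value wins.
    static Cell pick(Cell a, Cell b) {
        if (a.timestamp != b.timestamp) {
            return a.timestamp > b.timestamp ? a : b;
        }
        return a.value.compareTo(b.value) >= 0 ? a : b;
    }

    // Merges two versions of a row column by column.
    static Map<String, Cell> merge(Map<String, Cell> left, Map<String, Cell> right) {
        Map<String, Cell> merged = new HashMap<String, Cell>(left);
        for (Map.Entry<String, Cell> e : right.entrySet()) {
            merged.merge(e.getKey(), e.getValue(), ColumnLww::pick);
        }
        return merged;
    }

    // Builds a row where both columns carry the same client timestamp.
    static Map<String, Cell> row(String payment, String address, long timestamp) {
        Map<String, Cell> r = new HashMap<String, Cell>();
        r.put("payment", new Cell(payment, timestamp));
        r.put("address", new Cell(address, timestamp));
        return r;
    }

    public static void main(String[] args) {
        Map<String, Cell> merged = merge(row("visa", "1 Elm St", 100L), row("amex", "9 Oak Ave", 100L));
        // Payment comes from the first client, address from the second.
        System.out.println(merged.get("payment").value + " / " + merged.get("address").value);
    }
}
```

Both regions converge, but on a person that neither client wrote: one client's payment with the other client's address.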
Cassandra Multi Value per Column
Update different columns of same person
Conflict resolution in TX Decision (on read)
Design "Trust Me"
[Diagram: an Event Processor (partial update) and a TX Decision (read) in each region, us-west-2 (nodes 2a, 2b, 2c) and us-east-1 (nodes 1a, 1b, 1c); one processor issues "update payment1, address1", the other "update payment2, address2"; consistency labels: (?) (?)]
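One way to read this slide: each writer owns its own columns, so nothing is overwritten and the reader resolves the candidates. A minimal sketch under that assumption; MultiValueRow and the "field@writer" column naming are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: each writer owns its own column, so writes never overwrite each other.
final class MultiValueRow {
    // Column name -> value; a sorted map keeps the output deterministic.
    private final Map<String, String> columns = new TreeMap<String, String>();

    // A writer updates only the column suffixed with its own id, e.g. "payment@west".
    void write(String field, String writerId, String value) {
        columns.put(field + "@" + writerId, value);
    }

    // Copies in columns replicated from another region; columns of distinct writers never collide.
    void mergeFrom(MultiValueRow other) {
        columns.putAll(other.columns);
    }

    // The reader sees every candidate value for a field and must resolve them itself.
    Set<String> candidates(String field) {
        Set<String> values = new TreeSet<String>();
        for (Map.Entry<String, String> e : columns.entrySet()) {
            if (e.getKey().startsWith(field + "@")) {
                values.add(e.getValue());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        MultiValueRow west = new MultiValueRow();
        MultiValueRow east = new MultiValueRow();
        west.write("payment", "west", "visa");
        east.write("payment", "east", "amex");
        west.mergeFrom(east);
        System.out.println(west.candidates("payment"));
    }
}
```

The cost moves to the read path: TX Decision has to decide what two candidate payments mean for the fraud check.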
Kafka
Design "Its Brother"
Conflict resolution in Event Processor
Will both regions converge into the same state?
[Diagram: in each region (us-west-2, us-east-1) an Event Source inserts into the local Kafka; mirrors (us-west, us-east) copy the inserts between regions; each region's Event Processor writes to a versioned S3 bucket that TX Decision reads; consistency labels: (?) (?)]
Converging events into state
● Duplicate events
○ Idempotent compare-and-set(x, 2, 5)
○ De-duplication: 2 + 3 + 3 = 5 (the repeated +3 is dropped)
○ Rollback
● Unordered events
○ Commutative 2+3=3+2
○ reordering window (requires state)
● Bulk/Parallel event processing
○ Associative (2+3)+4 = 2+(3+4)
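The three properties can be checked on concrete values. The sketch below is a spot check, not a proof; MergeLaws is a hypothetical helper. Addition is commutative and associative but not idempotent, which is why duplicates need de-duplication, while max satisfies all three.

```java
import java.util.function.BinaryOperator;

// Checks the three properties on concrete values; a spot check, not a proof.
final class MergeLaws {
    // Applying the same value twice changes nothing: op(a, a) == a.
    static boolean idempotent(BinaryOperator<Integer> op, int a) {
        return op.apply(a, a).equals(a);
    }

    // Order of arrival does not matter: op(a, b) == op(b, a).
    static boolean commutative(BinaryOperator<Integer> op, int a, int b) {
        return op.apply(a, b).equals(op.apply(b, a));
    }

    // Grouping into batches does not matter: op(op(a, b), c) == op(a, op(b, c)).
    static boolean associative(BinaryOperator<Integer> op, int a, int b, int c) {
        return op.apply(op.apply(a, b), c).equals(op.apply(a, op.apply(b, c)));
    }

    public static void main(String[] args) {
        BinaryOperator<Integer> sum = Integer::sum;
        BinaryOperator<Integer> max = (x, y) -> Math.max(x, y);
        System.out.println("sum idempotent? " + idempotent(sum, 3));
        System.out.println("max idempotent? " + idempotent(max, 3));
        System.out.println("sum commutative? " + commutative(sum, 2, 3));
        System.out.println("sum associative? " + associative(sum, 2, 3, 4));
    }
}
```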
Kafka Streams API - zooming in
[Diagram: the Design "Its Brother" pipeline from the Kafka slide, zooming in on the Event Processor]
Kafka Streams API
Design "Trust Me"
[Diagram: Event Source (insert) and Kafka MirrorMaker (?) feed kstream1 and kstream2; the Kafka Streams API writes ktable; the Kafka S3 Connector writes to S3]
builder.stream("kstream1", "kstream2")
       .filter(predicate)
       .transform(processor)
       .to("ktable");
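Kafka Processor API and Local Store
Design "Trust Me"
[Diagram: Event Source (insert) and Kafka MirrorMaker (?) feed kstream1 and kstream2; the Kafka Streams API writes ktable; the Kafka S3 Connector writes to S3]
import java.util.HashMap;
import java.util.Map;

Map<String, Object> process(String key, Map<String, Object> event) {
    Map<String, Object> state = kvStore.get(key); // kvStore: the processor's local store
    if (state == null) {
        state = new HashMap<>(); // first event seen for this key
    }
    state.putAll(event); // not commutative (order matters)
    kvStore.put(key, state);
    return state;
}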
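Why "order matters" is a problem across regions can be shown by replaying the same two events in both orders. This is a sketch: OrderSensitivity, its Event type and the event-time tie-break are hypothetical, and the order-insensitive variant is one possible fix, not the deck's.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch comparing arrival-order updates with event-time updates.
final class OrderSensitivity {
    static final class Event {
        final String field;
        final String value;
        final long eventTime;
        Event(String field, String value, long eventTime) {
            this.field = field;
            this.value = value;
            this.eventTime = eventTime;
        }
    }

    // putAll semantics: whichever event is processed last wins.
    static Map<String, Event> applyInArrivalOrder(List<Event> events) {
        Map<String, Event> state = new HashMap<String, Event>();
        for (Event e : events) {
            state.put(e.field, e);
        }
        return state;
    }

    // Picks the event with the later event time; ties are broken by value.
    static Event newer(Event a, Event b) {
        if (a.eventTime != b.eventTime) {
            return a.eventTime > b.eventTime ? a : b;
        }
        return a.value.compareTo(b.value) >= 0 ? a : b;
    }

    // Order-insensitive variant: the result does not depend on processing order.
    static Map<String, Event> applyByEventTime(List<Event> events) {
        Map<String, Event> state = new HashMap<String, Event>();
        for (Event e : events) {
            state.merge(e.field, e, OrderSensitivity::newer);
        }
        return state;
    }

    public static void main(String[] args) {
        Event first = new Event("address", "old street", 1L);
        Event second = new Event("address", "new street", 2L);
        System.out.println(applyInArrivalOrder(Arrays.asList(second, first)).get("address").value);
        System.out.println(applyByEventTime(Arrays.asList(second, first)).get("address").value);
    }
}
```

With mirrored topics the two regions can see the same events in different orders, so only the second variant lets them converge.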
CRDT Graph Model
Conflict-free Replicated Data Type
Idempotent, Commutative, Associative
● Insert Only Graph
● Address / Payment / Person Objects
G-Set: Growing Set CRDT
Conflict resolution method: merge sets
[Diagram: graph with vertices A, B, C]
us-west-2 event    us-east-1 state
{A,B}              {A,B}
{A,C}              {A,B,C}
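A G-Set is small enough to sketch in full. The class below is a hypothetical illustration whose merge is plain set union; it replays the two events from the table.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a grow-only set CRDT.
final class GSet {
    private final Set<String> elements = new TreeSet<String>();

    static GSet of(String... items) {
        GSet s = new GSet();
        s.elements.addAll(Arrays.asList(items));
        return s;
    }

    // Merge is set union, which is idempotent, commutative and associative.
    GSet merge(GSet other) {
        GSet result = new GSet();
        result.elements.addAll(elements);
        result.elements.addAll(other.elements);
        return result;
    }

    Set<String> value() {
        return Collections.unmodifiableSet(elements);
    }

    public static void main(String[] args) {
        GSet state = GSet.of("A", "B");
        state = state.merge(GSet.of("A", "C"));
        System.out.println(state.value());
    }
}
```

Replaying the {A,C} event a second time, or before {A,B}, leaves the same state.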
2P-Set: Two Phase Set CRDT
Comprised of two G-Sets (added and tombstone)
Always grows
Garbage Collection algorithms exist.
[Diagram: graph with vertices A, B, C, D]
us-west-2 event            us-east-1 state
add: {A,B}  rmv: {A}       add: {A,B}      rmv: {A}
add: {A,C}  rmv: {B,D}     add: {A,B,C}    rmv: {A,B,D}
add: {D}                   add: {A,B,C,D}  rmv: {A,B,D}
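The same replay works for a 2P-Set. In this hypothetical sketch the visible value is "added minus tombstones", so D stays invisible even though its add arrives after its remove.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a two-phase set: two grow-only sets, added and tombstones.
final class TwoPhaseSet {
    private final Set<String> added = new TreeSet<String>();
    private final Set<String> removed = new TreeSet<String>();

    // Applies one event; both underlying sets only ever grow.
    void apply(Collection<String> add, Collection<String> rmv) {
        added.addAll(add);
        removed.addAll(rmv);
    }

    Set<String> added() {
        return Collections.unmodifiableSet(added);
    }

    Set<String> removed() {
        return Collections.unmodifiableSet(removed);
    }

    // Visible elements: added and never removed. A removed element cannot come back.
    Set<String> value() {
        Set<String> visible = new TreeSet<String>(added);
        visible.removeAll(removed);
        return visible;
    }

    public static void main(String[] args) {
        TwoPhaseSet state = new TwoPhaseSet();
        state.apply(Arrays.asList("A", "B"), Arrays.asList("A"));
        state.apply(Arrays.asList("A", "C"), Arrays.asList("B", "D"));
        state.apply(Arrays.asList("D"), Collections.<String>emptyList());
        System.out.println("add: " + state.added() + " rmv: " + state.removed() + " visible: " + state.value());
    }
}
```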
2P2P-Graph CRDT
2P-Set for vertices, 2P-Set for edges
resolution method: remove wins
[Diagram: graph with vertices A, B, C, later joined by D via edge AD]
us-west-2 event          us-east-1 state
add_v: {A,B,C}           add_v: {A,B,C}
rmv_v: {}                rmv_v: {}
add_e: {AB,AC,BC}        add_e: {AB,AC,BC}
rmv_e: {}                rmv_e: {}

add_v: {}                add_v: {A,B,C}
rmv_v: {A}               rmv_v: {A}
add_e: {}                add_e: {AB,AC,BC}
rmv_e: {}                rmv_e: {AB,AC}

add_v: {D}               add_v: {A,B,C,D}
rmv_v: {}                rmv_v: {A}
add_e: {AD}              add_e: {AB,AC,BC,AD}
rmv_e: {}                rmv_e: {AB,AC,AD}
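The "remove wins" rule can be sketched by tombstoning every edge that touches a removed vertex. TwoPTwoPGraph is a hypothetical class, and it assumes that vertex names are single letters and that an edge name is its two endpoints, as in the table.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a 2P2P graph: a 2P-Set of vertices and a 2P-Set of edges.
final class TwoPTwoPGraph {
    final Set<String> addV = new TreeSet<String>();
    final Set<String> rmvV = new TreeSet<String>();
    final Set<String> addE = new TreeSet<String>();
    final Set<String> rmvE = new TreeSet<String>();

    // Applies one event, then enforces "remove wins" on the edges.
    void apply(Collection<String> av, Collection<String> rv, Collection<String> ae, Collection<String> re) {
        addV.addAll(av);
        rmvV.addAll(rv);
        addE.addAll(ae);
        rmvE.addAll(re);
        // Assumes single-letter vertex names, so edge "AB" connects "A" and "B".
        for (String edge : addE) {
            if (rmvV.contains(edge.substring(0, 1)) || rmvV.contains(edge.substring(1, 2))) {
                rmvE.add(edge);
            }
        }
    }

    static Set<String> minus(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<String>(a);
        result.removeAll(b);
        return result;
    }

    Set<String> vertices() { return minus(addV, rmvV); }

    Set<String> edges() { return minus(addE, rmvE); }

    public static void main(String[] args) {
        Collection<String> none = Collections.<String>emptyList();
        TwoPTwoPGraph g = new TwoPTwoPGraph();
        g.apply(Arrays.asList("A", "B", "C"), none, Arrays.asList("AB", "AC", "BC"), none);
        g.apply(none, Arrays.asList("A"), none, none);
        g.apply(Arrays.asList("D"), none, Arrays.asList("AD"), none);
        System.out.println("vertices " + g.vertices() + " edges " + g.edges());
    }
}
```

After the three events the tombstoned edges are AB, AC and AD, matching the last row of the table; only B, C, D and the edge BC remain visible.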
Sometimes the state won't converge easily
● Missing events (broken links)
○ integrity checks
○ repair
● Rerunning bulk events after downtime
○ Clocks: Event vs. Ingestion vs. Processor vs. Logical
○ Enrichment: IP address reputation changes daily
Background Reconciliator
Reconciliation: Compare hash (Merkle) trees
Compensation: Merge CRDT states
[Diagram: a Background Reconciliator sits between S3 versioned us-west-2 (read by client2 in us-west-2a) and S3 versioned us-east-1 (read by client1 in us-east-1b)]
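The comparison step can be sketched with a two-level hash tree: one leaf hash per bucket of keys and a root hash over the leaves. HashTreeDiff is a hypothetical helper; the bucketing by key hash and the tree depth are assumptions, and a real Merkle tree would usually be deeper.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical two-level hash tree: one leaf hash per bucket of keys, one root over the leaves.
final class HashTreeDiff {
    static byte[] sha256(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is available on every Java platform
        }
    }

    // Leaf i hashes the sorted key=value pairs whose key falls into bucket i.
    static byte[][] leaves(Map<String, String> store, int buckets) {
        StringBuilder[] content = new StringBuilder[buckets];
        for (int i = 0; i < buckets; i++) {
            content[i] = new StringBuilder();
        }
        for (Map.Entry<String, String> e : new TreeMap<String, String>(store).entrySet()) {
            int bucket = Math.floorMod(e.getKey().hashCode(), buckets);
            content[bucket].append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        byte[][] hashes = new byte[buckets][];
        for (int i = 0; i < buckets; i++) {
            hashes[i] = sha256(content[i].toString().getBytes(StandardCharsets.UTF_8));
        }
        return hashes;
    }

    // The root hashes the concatenation of all 32-byte leaf hashes.
    static byte[] root(byte[][] leaves) {
        byte[] joined = new byte[leaves.length * 32];
        for (int i = 0; i < leaves.length; i++) {
            System.arraycopy(leaves[i], 0, joined, i * 32, 32);
        }
        return sha256(joined);
    }

    // Buckets whose leaf hashes differ; empty when the roots already match.
    static List<Integer> differingBuckets(Map<String, String> a, Map<String, String> b, int buckets) {
        byte[][] leavesA = leaves(a, buckets);
        byte[][] leavesB = leaves(b, buckets);
        List<Integer> diff = new ArrayList<Integer>();
        if (Arrays.equals(root(leavesA), root(leavesB))) {
            return diff;
        }
        for (int i = 0; i < buckets; i++) {
            if (!Arrays.equals(leavesA[i], leavesB[i])) {
                diff.add(i);
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        Map<String, String> west = new TreeMap<String, String>();
        west.put("person-1", "visa");
        Map<String, String> east = new TreeMap<String, String>(west);
        east.put("person-1", "amex");
        System.out.println("buckets to reconcile: " + differingBuckets(west, east, 4));
    }
}
```

Only the buckets returned need to be fetched and merged, which is where the CRDT merge from the compensation step would run.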
Takeaways
● Define business need for cross region
Availability, Latency, Residency, Analytics
● Know your NoSQL
Couchbase != Cassandra != Kafka
● Ask about CRDTs
LWW-Register, MV-Register, 2P-Sets, 2P2P-Graphs
● Use Reconciliation
● Dedicated Fiber and Atomic clocks ARE COMING
“The Internet was designed to be an academic medium.
It was not designed to handle this level of transactions”
Fred Matteson @ schwab.com 1999
Advanced Topics
● The real world is more like a slaughterhouse than a pharmacy
● Multi Data Center Topologies
○ Star (SPOF, simple)
○ Ring (TLV ←→ Eilat ←→ Jerusalem ←→ TLV)
○ Mesh (resilient, complex)
● Data Residency
○ Separate PII from data
○ Peek at other data centers ad-hoc

Reversim 2017 cross region data replication design considerations

  • 1. Cross Region Data Replication Design Considerations Itai Friendinger itai@forter.com
  • 2. Our financial institutions remain strong, and the American economy will be open for business as well. 2/40
  • 3. TX Fraud Decision 100ms Decision as a Service Example if isFraud(tx.address,tx.payment) { return DECLINE; } else { return APPROVE; } TX Decision 3/40
  • 4. Event Processor 1000ms Change Account Address Change Account Payment Unified People Store TX partial update read Decision as a Service Example TX Fraud Decision 100ms TX Decision 4/40
  • 5. Design ‫בסדר‬ ‫יהיה‬ TX Fraud Decision TX Decision Event Processor People Store raw event ● No Cross Region Replication 5/40
  • 6. Design ‫עליי‬ ● Cron Sync every 3 hours ● Replication != Reconciliation ● Replication != Backup TX Fraud Decision Event Processor People Store TX Fraud Decision TX Decision Event Processor People Store raw event Cron Sync raw event TXDecision 6/40
  • 7. ● Read-Only RDS Replica ● Proxying data into a single Data Center ● Requires quarterly failover drills ● Cannot stand a real disaster for long Design ‫פסדר‬ ‫יאללה‬ TX Fraud Decision Event Forwarder People Store TX Fraud Decision TX Decision Event Processor People Store raw event RDS Replication raw event TX Forwarding Decision 7/40
  • 8. Design ‫אחד‬ ‫במחיר‬ ‫שניים‬ ● CloudEndure DRaaS ● Point In Time Recovery ● Requires quarterly failover drills ● For existing apps (Enterprises) People Store TX Fraud Decision TX Decision Event Processor People Store raw event Block Device Replication 8/40
  • 9. Design ‫חכה‬ ‫חכה‬ ● Google Cloud Spanner Is Here Geo Distributed Transactions Is Coming ● For green-field apps (Startups) TX Fraud Decision Event Processor People Store TX Fraud Decision TX Decision Event Processor People Store raw event Transactions raw event TXDecision 9/40
  • 10. Design ‫סמוך‬ ● Out-Of-The-Box Real-Time Bi-Directional Data-Center Aware Replication ● Write Conflict resolution TX Fraud Decision TX Decision Event Processor People Store raw event 2Way Replication TX Fraud Decision Event Processor People Store raw event TXDecision 10/40
  • 11. Design ‫שלה‬ ‫אחות‬ ● Replication of Raw Events ● State Divergence TX Fraud Decision TX Decision Event Processor People Store raw event 2way Replication TX Fraud Decision Event Processor People Store raw event TXDecision 11/40
  • 12. Read Consistency Guarantees Loosely based on Consistency Explained Through Baseball by Doug Terry ● Strong ⇒ 2:2 ○ See all previous writes ● Read own Writes ○ See all writes performed by reader ● Monotonic ⇒ 2:1 ○ See all writes since the beginning till N seconds ago ● Eventual ⇒ 1:2 ○ See the writes in different order (some still missing) time partial update state 15m Hapoel =1 1:0 32m Maccabi =1 1:1 89m Hapoel =2 2:1 91m Maccabi =2 2:2 14/40
  • 13. Hello Couchbase read-mutate-write of entire state Client reaches cluster’s primary node Conflict Prevention CAS Optimizations: subdocument API Strong node us-west-2b node us-west-2c Event Processor (read/m/write) TX Decision (read) Strong 16/40
  • 14. Hello Couchbase XDCR replicates entire state between clusters Optimizations: dedup by key, metadata first Strong Monotonic XDCR node us-west-2b node us-west-2c Event Processor (read/m/write) node us-east-1c node us-east-1b TX Decision (read) TX Decision (read) Strong 17/40
  • 15. Couchbase Last Write Wins Conflict Resolution - LWW erases losing side Remember: NTP, no “sudo date” Document Version = read-own-writes Monotonic XDCR node us-west-2b node us-west-2c Event Processor (read/m/write) node us-east-1c node us-east-1b TX Decision (read) TX Decision (read) Monotonic read-own-writes Event Processor (read/m/write) ‫סמוך‬ Design Conflict Resolution 48bit timestamp Conflict Prevention 16bit CAS 19/40
  • 16. Hello Cassandra
    ● The client reaches the closest node and blocks until LOCAL_QUORUM
    ● No conflict prevention ⇒ use partial updates or inserts
    ● Read consistency: Strong (?)
    [Diagram: Event Processor (partial update) on nodes us-west-2a, us-west-2b, us-west-2c; TX Decision (read) on nodes us-east-1a, us-east-1b, us-east-1c.]
  • 17. Cassandra Last Write Wins per Column (Design "Trust Me")
    ● Two clients update the payment and the address of the same person with exactly the same client timestamps
    ● Diagram labels: "update payment wins" (?) and "update address wins" (?)
    [Diagram: Event Processors (partial update) and TX Decisions (read) on nodes us-west-2a, us-west-2b, us-west-2c and us-east-1a, us-east-1b, us-east-1c.]
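
A per-column merge shows how two writes with identical timestamps can leave a row that neither client wrote. This is a sketch, not Cassandra's storage engine: `ColumnLww`, `Cell`, `winner`, and `row` are hypothetical names, and the tie-break rule (the lexically greater value wins) is an assumption used here to make the outcome deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative per-column last-write-wins merge; not Cassandra's storage engine.
final class ColumnLww {
    static final class Cell {
        final String value;
        final long timestamp; // client-supplied write timestamp

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Picks the winner for one column. On a timestamp tie this sketch assumes
    // the lexically greater value wins, so every replica picks the same cell.
    static Cell winner(Cell a, Cell b) {
        if (a.timestamp != b.timestamp) {
            return a.timestamp > b.timestamp ? a : b;
        }
        return a.value.compareTo(b.value) >= 0 ? a : b;
    }

    // Merges two versions of a row column by column.
    static Map<String, Cell> merge(Map<String, Cell> left, Map<String, Cell> right) {
        Map<String, Cell> merged = new HashMap<>(left);
        for (Map.Entry<String, Cell> e : right.entrySet()) {
            merged.merge(e.getKey(), e.getValue(), ColumnLww::winner);
        }
        return merged;
    }

    // Convenience: a two-column row written at a single timestamp.
    static Map<String, Cell> row(String payment, String address, long timestamp) {
        Map<String, Cell> r = new HashMap<>();
        r.put("payment", new Cell(payment, timestamp));
        r.put("address", new Cell(address, timestamp));
        return r;
    }
}
```

Merging `row("visa", "1 Main St", 100)` with `row("amex", "9 Oak Ave", 100)` keeps the payment from the first client and the address from the second: each column resolved independently.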
  • 18. Cassandra Multi Value per Column (Design "Trust Me")
    ● Update different columns of the same person
    ● Conflict resolution in TX Decision (on read)
    ● Diagram labels: "update payment1, address1" (?) and "update payment2, address2" (?)
    [Diagram: Event Processors (partial update) and TX Decisions (read) on nodes us-west-2a, us-west-2b, us-west-2c and us-east-1a, us-east-1b, us-east-1c.]
  • 19. Kafka (Design "His Brother")
    ● Event Sources insert into Kafka in us-west-2 and us-east-1; mirrors copy the inserts between the regions
    ● Conflict resolution in the Event Processor
    ● Will both regions converge into the same state? (?)
    [Diagram: in each region, Event Source (insert), Kafka, Event Processor, versioned S3, and TX Decision (read); mirror(s) run between us-west and us-east.]
  • 20. Converging events into state
    ● Duplicate events
      ○ Idempotent: compare-and-set(x, 2, 5)
      ○ De-duplication: 2 + 3 + 3 = 5 (the repeated +3 is dropped)
      ○ Rollback
    ● Unordered events
      ○ Commutative: 2 + 3 = 3 + 2
      ○ Reordering window (requires state)
    ● Bulk/parallel event processing
      ○ Associative: (2 + 3) + 4 = 2 + (3 + 4)
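
The three properties above can be combined in one small accumulator. This is a sketch under assumptions: `DedupCounter` is a hypothetical class, every event is assumed to carry a unique id, and the set of seen ids is kept in memory without any eviction.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative counter that converges despite duplicate and reordered events.
final class DedupCounter {
    private final Set<String> seenEventIds = new HashSet<>();
    private long total;

    DedupCounter(long initial) {
        this.total = initial;
    }

    // De-duplication: an event id is applied at most once, so redelivery is harmless.
    // Addition is commutative and associative, so arrival order and batching do not matter.
    void add(String eventId, long delta) {
        if (seenEventIds.add(eventId)) {
            total += delta;
        }
    }

    // Idempotent compare-and-set: repeating the same call leaves the same state.
    boolean compareAndSet(long expected, long update) {
        if (total != expected) {
            return false;
        }
        total = update;
        return true;
    }

    long total() {
        return total;
    }
}
```

Starting from 2, delivering the event "+3" twice still ends at 5, and two replicas that receive the same events in different orders end with the same total.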
  • 21. Kafka Streams API - zooming in (Design "His Brother")
    [Diagram: the pipeline from slide 19. In each region, Event Source (insert), Kafka, Event Processor, versioned S3, and TX Decision (read); mirror(s) copy inserts between us-west and us-east.]
  • 22. Kafka Streams API (Design "Trust Me")
    ● Components: Kafka MirrorMaker (?), Kafka Streams API, Kafka S3 Connector
    ● Flow: Event Source (insert) → kstream1, kstream2 → ktable → S3
    builder.stream("kstream1", "kstream2")
           .filter(predicate)
           .transform(processor)
           .to("ktable")
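  • 23. Kafka Processor API and Local Store (Design "Trust Me")
    ● Components: Kafka MirrorMaker (?), Kafka Streams API, Kafka S3 Connector
    ● Flow: Event Source (insert) → kstream1, kstream2 → ktable → S3
    Map<String, Object> process(Map<String, Object> event) {
      String key = (String) event.get("key");     // assumes the event carries its own key
      Map<String, Object> state = kvStore.get(key);
      if (state == null) state = new HashMap<>(); // first event for this key
      state.putAll(event);                        // not commutative (order matters)
      kvStore.put(key, state);
      return state;
    }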
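
The "not commutative" comment can be made concrete by comparing overwrite semantics with a merge that does not depend on order. This is a plain-Java sketch with no Kafka dependency: `StateMerge` and its methods are hypothetical, and keeping every value per field is just one possible commutative alternative, not the one used in the talk.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative contrast between an order-sensitive and an order-insensitive update.
final class StateMerge {
    // Overwrite semantics, as with putAll: the event applied last wins.
    static Map<String, String> overwrite(Map<String, String> first, Map<String, String> second) {
        Map<String, String> state = new HashMap<>(first);
        state.putAll(second);
        return state;
    }

    // Commutative alternative: keep every value seen per field (a grow-only set per key).
    static Map<String, Set<String>> union(Map<String, String> first, Map<String, String> second) {
        Map<String, Set<String>> state = new HashMap<>();
        collect(state, first);
        collect(state, second);
        return state;
    }

    private static void collect(Map<String, Set<String>> state, Map<String, String> event) {
        for (Map.Entry<String, String> field : event.entrySet()) {
            state.computeIfAbsent(field.getKey(), k -> new TreeSet<>()).add(field.getValue());
        }
    }

    // Convenience: a single-field event.
    static Map<String, String> event(String field, String value) {
        Map<String, String> e = new HashMap<>();
        e.put(field, value);
        return e;
    }
}
```

With two address events, `overwrite(a, b)` and `overwrite(b, a)` disagree, while `union(a, b)` and `union(b, a)` are equal, so regions that see the events in different orders still converge.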
  • 24. CRDT Graph Model
    CRDT: Conflict-free Replicated Data Type (idempotent, commutative, associative)
    ● Insert-only graph
    ● Address / Payment / Person objects
  • 25. G-Set: Growing Set CRDT
    Conflict-free Replicated Data Type: idempotent, commutative, associative
    ● State in us-west-2: {A,B}
    ● State in us-east-1: {A,B}
    [Diagram: elements A and B.]
  • 26. G-Set: Growing Set CRDT
    Conflict resolution method: merge sets
    ● State in us-west-2 and us-east-1: {A,B}
    ● Incoming event: {A,C}
    ● Merged state: {A,B,C}
    [Diagram: elements A, B, and C.]
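
The merge on this slide is a set union, which can be sketched in a few lines. `GSet` is a hypothetical class written for illustration; elements are plain strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Illustrative grow-only set (G-Set) CRDT.
final class GSet {
    private final Set<String> elements = new TreeSet<>();

    GSet(String... initial) {
        elements.addAll(Arrays.asList(initial));
    }

    // Elements can only be added, never removed.
    void add(String element) {
        elements.add(element);
    }

    // Merge is set union: idempotent, commutative, and associative.
    GSet merge(GSet other) {
        GSet merged = new GSet();
        merged.elements.addAll(elements);
        merged.elements.addAll(other.elements);
        return merged;
    }

    Set<String> value() {
        return Collections.unmodifiableSet(elements);
    }
}
```

Merging {A,B} with {A,C} gives {A,B,C} regardless of which side initiates the merge, and merging a set with itself changes nothing.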
  • 27. 2P-Set: Two Phase Set CRDT
    Composed of two G-Sets (added and tombstone)
    ● State in us-west-2 and us-east-1: add: {A,B}, rmv: {A}
    [Diagram: elements A and B.]
  • 28. 2P-Set: Two Phase Set CRDT
    ● State in us-west-2 and us-east-1: add: {A,B}, rmv: {A}
    ● Incoming event: add: {A,C}, rmv: {B,D}
    ● Merged state: add: {A,B,C}, rmv: {A,B,D}
    ● Always grows; garbage collection algorithms exist
    [Diagram: elements A, B, and C.]
  • 29. 2P-Set: Two Phase Set CRDT
    ● State in us-west-2 and us-east-1: add: {A,B}, rmv: {A}
    ● Incoming event: add: {A,C}, rmv: {B,D}
    ● Merged state: add: {A,B,C}, rmv: {A,B,D}
    ● Incoming event: add: {D}
    ● Merged state: add: {A,B,C,D}, rmv: {A,B,D}
    ● Always grows; garbage collection algorithms exist
    [Diagram: elements A, B, C, and D.]
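
The 2P-Set steps above can be reproduced with two grow-only sets. `TwoPhaseSet` is a hypothetical class for illustration; unlike some formal definitions, this sketch records a tombstone even for an element that was never added locally, which matches the rmv: {B,D} event on the slide.

```java
import java.util.Set;
import java.util.TreeSet;

// Illustrative two-phase set (2P-Set): an "added" G-Set plus a "tombstone" G-Set.
final class TwoPhaseSet {
    private final Set<String> added = new TreeSet<>();
    private final Set<String> removed = new TreeSet<>();

    void add(String element) {
        added.add(element);
    }

    // Tombstones are permanent: a removed element can never come back.
    void remove(String element) {
        removed.add(element);
    }

    // Merge is the union of both underlying G-Sets.
    void mergeFrom(TwoPhaseSet other) {
        added.addAll(other.added);
        removed.addAll(other.removed);
    }

    boolean contains(String element) {
        return added.contains(element) && !removed.contains(element);
    }

    // The visible value: everything added that has no tombstone.
    Set<String> value() {
        Set<String> live = new TreeSet<>(added);
        live.removeAll(removed);
        return live;
    }
}
```

After the merge only C is visible, and the later add of D has no effect because D already has a tombstone. Both internal sets keep growing, which is why the slide mentions garbage collection.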
  • 30. 2P2P-Graph CRDT
    2P-Set for vertices, 2P-Set for edges; resolution method: remove wins
    ● State in us-west-2 and us-east-1: add_v: {A,B,C}, rmv_v: {}, add_e: {AB,AC,BC}, rmv_e: {}
    [Diagram: vertices A, B, and C.]
  • 31. 2P2P-Graph CRDT
    2P-Set for vertices, 2P-Set for edges; resolution method: remove wins
    ● State in us-west-2 and us-east-1: add_v: {A,B,C}, rmv_v: {}, add_e: {AB,AC,BC}, rmv_e: {}
    ● Incoming event: add_v: {}, rmv_v: {A}, add_e: {}, rmv_e: {}
    [Diagram: vertices A, B, and C.]
  • 32. 2P2P-Graph CRDT
    2P-Set for vertices, 2P-Set for edges; resolution method: remove wins
    ● State in us-west-2 and us-east-1: add_v: {A,B,C}, rmv_v: {}, add_e: {AB,AC,BC}, rmv_e: {}
    ● Incoming event: add_v: {}, rmv_v: {A}, add_e: {}, rmv_e: {}
    ● Merged state: add_v: {A,B,C}, rmv_v: {A}, add_e: {AB,AC,BC}, rmv_e: {AB,AC}
    [Diagram: vertices A, B, and C.]
  • 33. 2P2P-Graph CRDT
    2P-Set for vertices, 2P-Set for edges; resolution method: remove wins
    ● State in us-west-2 and us-east-1: add_v: {A,B,C}, rmv_v: {}, add_e: {AB,AC,BC}, rmv_e: {}
    ● Incoming event: add_v: {}, rmv_v: {A}, add_e: {}, rmv_e: {}
    ● Merged state: add_v: {A,B,C}, rmv_v: {A}, add_e: {AB,AC,BC}, rmv_e: {AB,AC}
    ● Incoming event: add_v: {D}, rmv_v: {}, add_e: {AD}, rmv_e: {}
    [Diagram: vertices A, B, C, and D.]
  • 34. 2P2P-Graph CRDT
    2P-Set for vertices, 2P-Set for edges; resolution method: remove wins
    ● State in us-west-2 and us-east-1: add_v: {A,B,C}, rmv_v: {}, add_e: {AB,AC,BC}, rmv_e: {}
    ● Incoming event: add_v: {}, rmv_v: {A}, add_e: {}, rmv_e: {}
    ● Merged state: add_v: {A,B,C}, rmv_v: {A}, add_e: {AB,AC,BC}, rmv_e: {AB,AC}
    ● Incoming event: add_v: {D}, rmv_v: {}, add_e: {AD}, rmv_e: {}
    ● Merged state: add_v: {A,B,C,D}, rmv_v: {A}, add_e: {AB,AC,BC,AD}, rmv_e: {AB,AC,AD}
    [Diagram: vertices A, B, C, and D.]
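
The "remove wins" rule from the graph slides can be sketched with four grow-only sets. `TwoPhaseGraph` is a hypothetical class for illustration. One deliberate difference from the slides: instead of writing the incident edges into the removed-edge set, this sketch hides an edge at read time whenever one of its endpoints has been removed, which produces the same visible graph.

```java
import java.util.Set;
import java.util.TreeSet;

// Illustrative 2P2P-Graph: one 2P-Set for vertices, one for edges; removal wins.
final class TwoPhaseGraph {
    private final Set<String> addedVertices = new TreeSet<>();
    private final Set<String> removedVertices = new TreeSet<>();
    private final Set<String> addedEdges = new TreeSet<>();   // stored as "from-to"
    private final Set<String> removedEdges = new TreeSet<>();

    void addVertex(String v) { addedVertices.add(v); }
    void removeVertex(String v) { removedVertices.add(v); }
    void addEdge(String from, String to) { addedEdges.add(from + "-" + to); }
    void removeEdge(String from, String to) { removedEdges.add(from + "-" + to); }

    // Merge is the union of all four grow-only sets.
    void mergeFrom(TwoPhaseGraph other) {
        addedVertices.addAll(other.addedVertices);
        removedVertices.addAll(other.removedVertices);
        addedEdges.addAll(other.addedEdges);
        removedEdges.addAll(other.removedEdges);
    }

    boolean hasVertex(String v) {
        return addedVertices.contains(v) && !removedVertices.contains(v);
    }

    // Remove wins: an edge is visible only while both endpoints are visible,
    // even if it was added concurrently with the removal of an endpoint.
    boolean hasEdge(String from, String to) {
        String edge = from + "-" + to;
        return addedEdges.contains(edge) && !removedEdges.contains(edge)
                && hasVertex(from) && hasVertex(to);
    }
}
```

When one region removes vertex A while the other concurrently adds vertex D and edge A-D, both regions end up with vertices B, C, D and the single edge B-C after merging in either direction.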
  • 35. Sometimes the state won't converge easily
    ● Missing events (broken links)
      ○ Integrity checks
      ○ Repair
    ● Rerunning bulk events after downtime
      ○ Clocks: event vs. ingestion vs. processor vs. logical
      ○ Enrichment: IP address reputation changes daily
  • 36. Background Reconciliator
    ● Reconciliation: compare hash (Merkle) trees
    ● Compensation: merge CRDT states
    [Diagram: client2 (read) in us-west-2a reads versioned S3 in us-west-2; client1 (read) in us-east-1b reads versioned S3 in us-east-1; the Background Reconciliator runs between the two stores.]
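
The two steps, comparing hashes and then merging CRDT states, can be sketched with a two-level hash tree over key buckets. Everything here is an assumption made for illustration: `Reconciler`, the bucket count, and the use of `hashCode` instead of a cryptographic hash are not from the talk, and each stored value is modeled as a G-Set of strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative reconciler: a two-level hash tree over buckets of keys.
final class Reconciler {
    static final int BUCKETS = 4;

    static int bucketOf(String key) {
        return Math.floorMod(key.hashCode(), BUCKETS);
    }

    // Leaf hashes: one per bucket, covering every key and G-Set state in it.
    static int[] leafHashes(Map<String, Set<String>> store) {
        int[] leaves = new int[BUCKETS];
        for (Map.Entry<String, Set<String>> e : new TreeMap<>(store).entrySet()) {
            int b = bucketOf(e.getKey());
            leaves[b] = 31 * leaves[b] + e.hashCode();
        }
        return leaves;
    }

    // Reconciliation: list the buckets whose hashes differ between the regions.
    static List<Integer> differingBuckets(Map<String, Set<String>> west,
                                          Map<String, Set<String>> east) {
        int[] w = leafHashes(west);
        int[] e = leafHashes(east);
        List<Integer> diff = new ArrayList<>();
        for (int b = 0; b < BUCKETS; b++) {
            if (w[b] != e[b]) {
                diff.add(b);
            }
        }
        return diff;
    }

    // Compensation: union the G-Set states of keys in differing buckets, on both sides.
    static void reconcile(Map<String, Set<String>> west, Map<String, Set<String>> east) {
        List<Integer> diff = differingBuckets(west, east);
        Set<String> keys = new TreeSet<>(west.keySet());
        keys.addAll(east.keySet());
        for (String key : keys) {
            if (!diff.contains(bucketOf(key))) {
                continue; // bucket hashes match, nothing to transfer
            }
            Set<String> merged = new TreeSet<>();
            if (west.containsKey(key)) merged.addAll(west.get(key));
            if (east.containsKey(key)) merged.addAll(east.get(key));
            west.put(key, merged);
            east.put(key, new TreeSet<>(merged));
        }
    }
}
```

Only keys in buckets with mismatched hashes are compared and merged, so two regions that already agree exchange nothing beyond the leaf hashes. A real hash tree would have more levels and a collision-resistant hash.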
  • 37. Takeaways
    ● Define the business need for cross-region: availability, latency, residency, analytics
    ● Know your NoSQL: Couchbase != Cassandra != Kafka
    ● Ask about CRDTs: LWW-Register, MV-Register, 2P-Sets, 2P2P-Graphs
    ● Use reconciliation
    ● Dedicated fiber and atomic clocks ARE COMING
  • 38. “The Internet was designed to be an academic medium. It was not designed to handle this level of transactions” Fred Matteson @ schwab.com 1999
  • 39. Advanced Topics
    ● The real world is more like a slaughterhouse than a pharmacy
    ● Multi data center topologies
      ○ Star (SPOF, simple)
      ○ Ring (TLV ←→ Eilat ←→ Jerusalem ←→ TLV)
      ○ Mesh (resilient, complex)
    ● Data residency
      ○ Separate PII from data
      ○ Peek at other data centers ad hoc