Choosing the Right Database: Exploring MySQL Alternatives for Modern Applications by Bhanu Jamwal, Head of Solution Engineering, PingCAP at the Mydbops Opensource Database Meetup 14.
This presentation discusses the challenges in choosing the right database for modern applications, focusing on MySQL alternatives. It highlights the growth of new applications, the need to improve infrastructure, and the rise of cloud-native architecture.
The presentation explores alternatives to MySQL, such as MySQL forks, database clustering, and distributed SQL. It introduces TiDB as a distributed SQL database for modern applications, highlighting its features and top use cases.
Case studies of companies benefiting from TiDB are included. The presentation also outlines TiDB's product roadmap, detailing upcoming features and enhancements.
4. The Need to Improve Underlying Infrastructure
AI & Machine Learning
Databases & Data Centers
Networking
Containers
Compute
Virtual Machines
Storage
Security
Cloud Computing
5. The Rise of Cloud-Native Architecture
[Diagram: client apps (traditional web app, SPA web app; HTML, TypeScript/Angular 2) reach a data center Docker host through an API gateway and web app. The host runs Identity (STS+Users), Catalog, Basket, Marketing, Locations, and Ordering microservices (Ordering API plus a GracePeriod worker service), backed by relational databases, NoSQL databases, and a Redis cache, and connected through an event bus (publish/subscribe channel).]
6. A Constant Barrier to Reliable Transactions at Scale
Legacy Data Storage Architecture:
7. Data Growth: A challenge
[Diagram: one MySQL instance holding DB1 to DB4 is split by manual sharding into MySQL 1 to MySQL 4, one database each, with the data distribution logic and data access logic kept in the application.]
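The manual sharding pattern on this slide, where the application owns both the data distribution logic and the data access logic, can be sketched roughly as follows. This is a minimal illustration, not code from the talk: the shard names, the `shard_for` and `moved_fraction` helpers, and the choice of CRC32 modulo hashing are all assumptions made for the example.

```python
import zlib

# Hypothetical shard map: four MySQL instances, one database each.
SHARDS = ["mysql-1/db1", "mysql-2/db2", "mysql-3/db3", "mysql-4/db4"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Data distribution logic: map a key to one shard with a stable hash."""
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]


def moved_fraction(keys, old_shards, new_shards) -> float:
    """Fraction of keys that land on a different shard after resharding."""
    moved = sum(shard_for(k, old_shards) != shard_for(k, new_shards) for k in keys)
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
print(shard_for("user:42"))  # the shard this key is routed to
# Adding a fifth shard relocates most keys under simple modulo hashing.
print(moved_fraction(keys, SHARDS, SHARDS + ["mysql-5/db5"]))
```

With simple modulo hashing, adding one shard changes the target of most keys, which is one reason resharding becomes an operational burden once this logic lives in every application.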
8. Intent to grow creates Complexity
[Diagram: an application sends reads and writes to five shards (Shard 1 to Shard 5). In Region 1 (active), each shard has a master, a hot standby, and a read replica. In Region 2 (passive), each shard has a hot standby and a read replica.]
25 DB nodes
20 replication channels
Operational complexity
Failover complexity
Resharding complexity
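The totals of 25 DB nodes and 20 replication channels can be reproduced with a few lines of arithmetic. The sketch below assumes the diagram shows five shards, each with a master, hot standby, and read replica in the active region and a hot standby and read replica in the passive region, and it assumes every non-master node consumes exactly one replication channel. The `topology_size` helper is hypothetical.

```python
REGION_1_ROLES = ("master", "hot standby", "read replica")  # active region
REGION_2_ROLES = ("hot standby", "read replica")            # passive region


def topology_size(shards: int) -> tuple[int, int]:
    """Return (db_nodes, replication_channels) for a given shard count."""
    nodes_per_shard = len(REGION_1_ROLES) + len(REGION_2_ROLES)
    db_nodes = shards * nodes_per_shard
    # Assumption: every node except the master consumes one replication channel.
    replication_channels = shards * (nodes_per_shard - 1)
    return db_nodes, replication_channels


print(topology_size(5))  # (25, 20)
print(topology_size(6))  # (30, 24)
```

Under these assumptions, each additional shard adds five nodes and four replication channels to operate and fail over.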
11. MySQL Limitations
Scalability: Unstable performance when scaling write-intensive applications.
High Availability: Setting up HA requires careful planning and configuration. Replication can result in lag and potential data inconsistencies.
Real-Time Analytics: Analytical queries can impact transactional processing.
Handle Modern Applications: Adapting to cloud-native architecture poses challenges to traditional, monolithic systems like MySQL.
13. Adopt MySQL Forks
PROS
❖ Introduce features and performance improvements
❖ Provide continuity of support, potential enhancements
CONS
❖ Still face challenges when dealing with highly concurrent, write-intensive workloads
❖ Quality of support can vary, and some enterprises may be hesitant to rely on community-driven projects
14. Adopt Database clustering
PROS
❖ Allows for horizontal scaling of sharded data
❖ Complexity of sharding is abstracted from developers
❖ Queries are automatically routed to appropriate shards
CONS
❖ Maintenance overhead: if a component such as vtgate or vttablet goes down, it takes 6-8 hours to rebuild
❖ Partition management is manual
❖ Heavy infrastructure requirements
16. What is Distributed SQL?
An evolved database architecture built from the ground up to deliver the transactional guarantees of relational databases and the horizontal scale of NoSQL databases.
[Diagram: three zones (Zone 1, Zone 2, Zone 3), each with a compute layer and a storage layer holding Region 1, Region 2, and Region 3. Region 1 is primary in one zone and secondary in the other two.]
20. What Makes TiDB So Advanced?
MySQL compatible, the TiDB SQL Layer separates compute from storage to make scaling simpler, delivering a true cloud-native architecture.
21. What Makes TiDB So Advanced?
The Placement Driver (PD) Layer functions just like a full-time DBA, monitoring millions of shards and performing hundreds of operations per minute.
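To give a feel for the kind of placement work described here, the following toy scheduler moves shards from the fullest store to the emptiest one until the counts are even. It is only an illustration of automated placement in general: it is not PD's actual algorithm, and the `rebalance` function, store names, and shard names are all hypothetical.

```python
from collections import Counter


def rebalance(placement: dict[str, str], stores: list[str]) -> list[tuple[str, str, str]]:
    """Greedy toy scheduler: move one shard at a time from the fullest store
    to the emptiest store until store counts differ by at most one.
    Mutates `placement` and returns the list of (shard, source, target) moves."""
    moves = []
    while True:
        counts = Counter({s: 0 for s in stores})
        counts.update(placement.values())
        fullest = max(stores, key=lambda s: counts[s])
        emptiest = min(stores, key=lambda s: counts[s])
        if counts[fullest] - counts[emptiest] <= 1:
            return moves
        # Pick a deterministic shard on the fullest store and relocate it.
        shard = next(sh for sh in sorted(placement) if placement[sh] == fullest)
        placement[shard] = emptiest
        moves.append((shard, fullest, emptiest))


# All six shards start on one store; two empty stores have just been added.
placement = {f"shard-{i}": "store-1" for i in range(6)}
stores = ["store-1", "store-2", "store-3"]
moves = rebalance(placement, stores)
print(len(moves), Counter(placement.values()))
```

A real placement service also has to weigh replica safety, load, and data size, which this count-based sketch deliberately ignores.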
22. What Makes TiDB So Advanced?
Consisting of row and column-based storage engines, the TiDB Storage Layer offers built-in high availability and strong consistency that can auto-scale to hundreds of nodes and petabytes of data.
23. The Advantages of TiDB
Horizontal Scaling: Grants total transparency into data workloads without manual sharding.
High Availability: Guarantees auto-failover and self-healing for continuous access to data.
Mixed Workloads: Streamlined tech stack makes it easier to produce real-time analytics.
MySQL Compatibility: Enjoy the most MySQL compatible distributed SQL database on the planet.
Multi-Cloud: Deploy database clusters anywhere in the world.
Open Source: Unlock business innovation with a database that’s 100% open source.
Robust Security: Protect data with enterprise-grade encryption both in-flight and at-rest.
24. Top Use Cases for TiDB
MySQL Alternative: Migrate to a more affordable and elastic MySQL alternative that supports real-time analytics right out of the box.
Real-Time Analytics: Enable your business to process and query new data as it’s created to guide decision making, enhance resource utilization, and improve customer experiences.
Application Modernization: Boost developer productivity with a modern, distributed SQL database that offers true elastic scale and relentless reliability combined with mixed workload processing.
Tech Stack Unification: Reduce costs and system complexity with a unified data stack that can replace traditional relational databases, NoSQL databases, and lightweight data warehouses.
25. TiDB is Trusted by Global Innovation Leaders
3,000+ global adopters use TiDB in production
33. [Architecture diagram: TiDB Data Platform, Version 1. Compute engine: TiDB-Svr nodes plus MPP analytical workers. Storage: row engines for transactional (Trx) storage and columnar engines for analytical storage. Three PD nodes act as data placement manager and timestamp oracle, alongside monitoring, alerting, and diagnosis. Tooling: Data Migration, Lightning, TiProxy, BR/PiTR, TiSpark, TiCDC, TiUP, TiOperator, and a managed service ("Deploy Anywhere & Any Form"). Inputs: client applications and BI tools via MySQL connectors and the MySQL protocol, data files at various storages, MySQL-compatible databases, and streaming. Downstream ecosystem: streaming and sink storage.]
34. TiDB Storage with Multi-Model & Indexing Techniques
[Architecture diagram, Version 2: Sales Order, CRM, Ads, and Business Intelligence applications, each serving several tenants (Tenant 1 to Tenant 7), sit on a federated query engine with multi-tenancy. Data integration covers data importing from data files, data federation, and TiCDC.]
35. Highlighted Features
● General Purpose - Built for general purpose and suitable for various mission-critical applications.
● Extreme Scalability - Scales to hundreds of terabytes, handling hundreds of thousands of transactions per second.
● Seamlessly Elastic - Scale in and out as needed anytime without interfering with the business.
● Zero Downtime - Get back to normal in seconds when your machine or rack goes down.
● Easy to Tame - MySQL compatible with smooth operations on hundreds of machines.
● Online Schema Changes - Change schemas anytime anywhere without pausing your applications.
● Real-time Insights - Analyze current data in a consistent way with only one simple command.
● AI-Powered - Use natural language to describe what you want.
● Run Anywhere - Deploy on private DCs, various cloud environments via bare metal, K8s, or managed services.
● Proven Technology - A trusted solution by many big names for mission-critical use cases.
37. Major Industries Using TiDB
MySQL Replacement: Migrate to a more affordable and elastic MySQL alternative that supports real-time analytics right out of the box.
Application Modernization: Boost developer productivity with a modern, distributed SQL database that offers true elastic scale and relentless reliability combined with mixed workload processing.
Real-Time Analytics: Enable your business to process and query new data as it’s created to guide decision making, enhance resource utilization, and improve customer experiences.
NoSQL Replacement: Scale your modern applications with better performance and consistency, all without worrying about the limitations that come with NoSQL databases.
Single View: Extract the value of your data across multiple businesses for all real-time applications while ensuring strong consistency.
Operational Data Management: Deliver smooth database operations with zero downtime for schema changes, hardware failure, or upgrades.
Tech Stack Unification: Reduce costs and system complexity with a unified data stack that can replace traditional relational databases, NoSQL databases, and lightweight data warehouses.
Adopters shown across these use cases: Block, Databricks, Airbnb, Airtable, Snap, RD Station, Certik, Amber AI, Nuro AI, Pinterest, Niantic, Catalyst.
40. Airbnb
A leading online marketplace for booking short- and long-term homestays and experiences.
Problem: Airbnb is building a data service layer to function as a single source of truth for its business. Currently, the data service layer is powered by multiple Amazon Aurora instances and application-layer sharding. However, the company experienced many critical issues with Amazon Aurora, including forced MySQL upgrades, lack of database visibility, limited write scalability, and poor technical support.
Solution: To reduce the number of Amazon Aurora clusters they have to maintain while also supporting business growth, the team decided to build its data service layer with TiDB. By deploying TiDB, Airbnb now enjoys built-in horizontal scalability, improved visibility into all database instances, less infrastructure to maintain, and cross-Kubernetes deployments.
Results: Airbnb’s success with TiDB has allowed the company to:
● Build a data service with consolidated infrastructure
● Create a single source of truth for its key-value data store
● Embrace TiDB as a key-value and relational database
41. Databricks
An enterprise software company founded by the creators of Apache Spark that develops a web-based Spark platform.
Problem: Databricks was using MySQL to manage its control plane, which contains multiple cloud services in AWS, GCP, and Azure supporting users, cluster management, and web applications. As the company’s Azure cloud usage grew by 3x, Azure MySQL was unable to handle the increased load: queries became slow or even unresponsive for large customers.
Solution: Databricks evaluated TiDB against Azure MySQL and found that TiDB outperformed Azure MySQL in latency and QPS on comparable hardware resources. Development teams were also impressed by TiDB’s horizontal scalability, the absence of manual sharding, and zero downtime for scale-in and scale-out.
Results: With TiDB, Databricks has seen better performance with no MySQL scalability issues. Additional benefits include:
● Lower average P99 and P999 latencies
● More than 10x QPS compared to Azure MySQL
● Reduced hardware costs and maintenance burden, as the company can now host several control plane services in a single cluster
42. Pinterest
An image sharing and online social media service for saving and discovery of information using images, animated GIFs, and videos.
Problem: Pinterest had long recognized the need to optimize its data storage system to accelerate innovation in an ML platform for enhanced user recommendations. Using HBase, the company was carrying a large data footprint that was becoming too expensive, too time-consuming and complex to manage, and too slow to meet user expectations.
Solution: Pinterest was impressed with TiDB’s distributed SQL architecture and how it allows developers to build storage applications faster without making painful tradeoffs. The company decided to adopt TiDB to replace HBase. This has led to better data consistency, a lower total cost of ownership (TCO), and more powerful features.
Results: This partnership with PingCAP to use TiDB is reaping major benefits for Pinterest, including:
● 30-90% reduction in tail latencies
● 50%+ reduction in hardware instance costs
● Reduced complexity from 6 systems to 1
● Stronger consistency between tables and indexes
43. Square / Block
A global financial services company focused on helping small and medium businesses accept credit card payments and use mobile devices as payment point-of-sale systems.
Problem: The Square / Cash App team needed a solution that combined scale, simplicity, and strong consistency. Their architecture was sharded MySQL on Vitess. This solution required major application refactoring and complicated sharding orchestration, and it lacked strong transactional consistency.
Solution: The Square / Cash App team chose TiDB. Being MySQL compatible made the transition simple. TiDB eliminated the need for manual sharding, significantly simplifying their architecture. TiDB also provided strong consistency through full ACID compliance. Eliminating sharding enabled simple, no-downtime scale-in and scale-out with no application refactoring.
Results: Successfully meeting project requirements on one data platform meant that the team could turn their focus to the innovation needs of the business. With TiDB in place, they were able to achieve better performance results, with more data and no sharding:
● 4,000 QPS with 5 ms response time, with 2 TB of data on TiDB
44. Catalyst
A Customer Success Platform (CSP) that helps teams centralize siloed customer data, get a clear line of sight into customer health, and scale customer journeys that drive retention and growth.
Problem: Catalyst used PostgreSQL to handle all the data it collected externally. However, as its business grew and data sources expanded quickly, PostgreSQL wasn’t able to keep up with its needs. Catalyst tried to store the data as JSON documents, but this impacted query performance. Additionally, due to the increased amount of data being stored, costs skyrocketed.
Solution: To handle these increasing demands, Catalyst redesigned its entire data processing and storage system from the ground up. TiDB was chosen to power this new architecture’s data serving layer for pre-processing data for real-time customer queries.
Results: By adopting TiDB, Catalyst’s CSP now provides:
● Better customer experience with 60x faster query responses
● A more resilient system
● Amplified data storage and processing
● Real-time analytical capabilities
Catalyst also reduced its overall storage, operation, and maintenance costs.
45. Flipkart
A leading online Indian e-commerce company owned by Walmart that sells everything from books to consumer electronics, home essentials, and lifestyle products.
Problem: Flipkart runs one of India’s largest MySQL fleets, with over 400 applications, thousands of microservices, and countless varieties of end-to-end e-commerce operations running across 700+ MySQL clusters. Flipkart’s tech stack became ever more complex in order to process and store this amount and variety of data. As the business kept growing, its previous database solutions started to hit their limits.
Solution: After thorough evaluations and testing, TiDB beat out several other vendors due to its horizontal scalability, high availability, distributed SQL data model, and ability to withstand high data throughput with low latency.
Results: With TiDB, Flipkart simplified its applications by retaining its SQL data model while guaranteeing complete ACID compliance. Additionally, the company no longer needed to manage multiple shards, nodes, and replication channels, making operations easier to manage. Since TiDB was deployed, there has been no single point of failure and no system downtime.
47. Scalable by Design - Roadmap
Timeline: Mid 2023, End of 2023, 2-3 years projection
Core Ability
● Multi-RocksDB storage engine: Increased write velocity, faster scaling operations, larger clusters
● Dynamic Region: Reliable and consistent performance for tremendous data
● Cascades Optimizer: A new smarter optimizer architecture
● General plan cache: Improve general read performance
Mixed Workload Processing
● TiFlash performance boost: TiFlash optimization such as late materialization, runtime filter, etc.
● Unlimited transaction size: For large batch processing
Ecosystem
● Distributed TiCDC on Single Table: Distributed replication in static mode
● Import major performance boost: Expecting 3-4 times improvements
● M:N Source and Sink Replication: Fully dynamic distributed replication with MySQL support
* Only selected features are presented.
48. Versatile by Nature - Roadmap
Timeline: Mid 2023, End of 2023, 2-3 years projection
● Multi-tenancy Phase I: Quota based resource group
● Multi-tenancy Phase II: Fine-grained resource control, isolation to reduce cost
● Multi-model Support: Support more data models other than KV & JSON and relational model
● Production ready TiCDC sink to S3 and Azure object store: Enhance ecosystem to better work with big data
● MySQL 8.0: Makes TiDB compatible with MySQL 8.0
● Federated Query: Query engine across multiple storages
● Full text search & GIS: Flexible indexing techniques for more scenarios
● UDF: User defined functions
49. Reliable by Default - Reliable Roadmap
Timeline: Mid 2023, End of 2023, 2-3 years projection
● TiCDC/PiTR recovery objectives enhancements: Increase business continuity and minimize the impact of system failures
● Improved cluster/node level fault tolerance: Resilience enhancement
● Enhanced TiDB memory management: Less to no OOM crash
● TiProxy: Zero downtime scaling / upgrading
● TiFlash spill to disk: Avoid TiFlash OOM
● Global Table
● End-to-end data correctness check: Prevent data errors or corruption through TiCDC
● Polished Hint Mechanism: Make the hint mechanism more elegant and cover more cases
50. Reliable by Default - Security Roadmap
Timeline: Mid 2023, End of 2023, 2-3 years projection
● JWT authentication: Secure, standard authentication
● Column-level/row-level access control: Finer-grained access control
● Enhanced client-side encryption
● LDAP integration: Authenticate via LDAP server over TLS
● Unified TLS CA/Key rotation policy: Enhanced security and operational efficiency for all TiDB components
● Enhanced data masking
● Audit log enhancement: Enhance with greater details
51. Miscellaneous Operations - Roadmap
Timeline: Mid 2023, End of 2023, 2-3 years projection
● Fastest online DDL distributed framework: Complete the distributed framework to support fastest online DDL
● Analyze speed up on large tables: Under new DDL framework & resource control
● AI-indexing: AI model as the index
● SQL-based data import: User-friendly operational enhancement
● SQL-based data management: For TiCDC, data migration, and backup & restore tools
● Heterogeneous migration support: Migrate from PG, Oracle or SQL Server
● Table Level Flashback: SQL support for traveling a single table to a specific time point
● Automatic pause/resume DDL during upgrade: Ensure a smooth upgrade experience
● Re-invented AI-SQL performance advisor: Your AI SQL tuning expert
● Production Ready TTL: Easier data life cycle management
● TiCDC to support multiple upstreams (i.e., N:1 TiDB to TiCDC)
● Enhanced data lifecycle management