Mohammad Zaid Patel from Mydbops takes you on a journey through PostgreSQL table partitioning.
✅ Why Data Organization?
Understand the importance and benefits of organized data in databases.
✅ Advantages of Organizing Your Data:
Better retrieval, improved performance, data integrity, and efficient storage.
✅ Data Organization Techniques:
Index creation, data archival, schemas, functional naming, and relationships.
✅ Table Partitioning in PostgreSQL:
Dive into the design technique of dividing large tables for efficient data management.
✅ Types of Table Partitioning:
Range, List, and Hash methods for unique data organization.
✅ Partitioning Techniques in PostgreSQL:
Manual partitioning, and the pg_partman extension for streamlined partition creation.
✅ Limitations of Table Partitioning:
Considerations and challenges associated with this technique.
✅ Best Practices for Partitioned Table Maintenance:
Tips on choosing the right partition key, understanding query patterns, and more.
Data Organisation: Table Partitioning in PostgreSQL
1. Mastering data organisation:
Deep dive into PostgreSQL table partitioning
Presented by
Mohammad Zaid Patel
Mydbops
Mydbops MyWebinar - 30
Dec 23rd, 2023
2. About Me
● PostgreSQL Database consultant
● Experienced in PostgreSQL and related DB technologies
● Active Blogger
● Tech Speaker
● Likes Cricket, Music & Comedy
4. Agenda
● Why data organisation?
● Advantages of data organisation
● Techniques of data organisation
● Table partitioning in PostgreSQL
● Table partition techniques
● Demo for table partitioning using pg_partman
● Limitations of table partitioning
● Best practices for table partitioning
6. Why data organization?
● Databases are used for storing and retrieving data
● Database becomes a black box
● Piled up with un-organised data
● Performance degradation of the database
7. Why data organization?
● Organization of data is essential for database functionality
● Plays a crucial role in database efficiency
● Assures data integrity
● Improves the usability of the stored data
10. Advantages of organizing your data
● Better data retrieval
● Improved query performance
● Data integrity
● Efficient storage and resource utilization
● Ease in maintenance
● Ease in scalability of the database
● Data analysis and reporting
12. Data organization techniques
Index creation :
- Creates a data structure that allows quick access to rows based on specific column values
- Faster data retrieval for conditions involving indexed columns
- Types: B-tree, GIN, Hash indexes, etc.
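To make the index types above concrete, here is a minimal sketch. The `players` table and the index names are hypothetical and exist only for this illustration. On a very small table the planner may still prefer a sequential scan.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE players (
    id      integer PRIMARY KEY,
    name    text NOT NULL,
    club    text NOT NULL,
    profile jsonb
);

-- B-tree (the default type): equality and range conditions, ORDER BY.
CREATE INDEX idx_players_club ON players (club);

-- Hash: equality conditions only.
CREATE INDEX idx_players_name_hash ON players USING hash (name);

-- GIN: containment queries on composite values such as jsonb or arrays.
CREATE INDEX idx_players_profile_gin ON players USING gin (profile);

-- A condition on an indexed column is a candidate for an index scan.
EXPLAIN SELECT * FROM players WHERE club = 'RCB';
```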
Data Archival:
- Process of moving the old data to a different location
- Clean up database
- Helps in managing the data efficiently over time
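One possible way to archive old rows is sketched below. The `orders` and `orders_archive` tables and the cutoff date are assumptions made for the example. In practice the archive often lives in another database or on cheaper storage.

```sql
-- Hypothetical live table and an archive table with the same structure.
CREATE TABLE orders (
    id         bigint PRIMARY KEY,
    created_at date NOT NULL,
    amount     numeric(10,2)
);
CREATE TABLE orders_archive (LIKE orders INCLUDING ALL);

-- Move rows older than the cutoff in a single statement:
-- the DELETE returns the removed rows, the INSERT stores them.
WITH moved AS (
    DELETE FROM orders
    WHERE created_at < DATE '2023-01-01'
    RETURNING *
)
INSERT INTO orders_archive
SELECT * FROM moved;
```

Because both steps run in one statement, a failure leaves the rows in the live table.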
13. Data organization techniques
Schemas:
- Collection of database objects organized into a named namespace
- Provides a way to logically group and organize database objects within a database
- A database can hold multiple schemas, each with its own set of objects
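A short sketch of schemas as namespaces follows. The schema and object names (`sales`, `reporting`, `invoices`, `invoice_totals`) are hypothetical.

```sql
-- Two schemas in the same database, each with its own objects.
CREATE SCHEMA sales;
CREATE SCHEMA reporting;

CREATE TABLE sales.invoices (
    id    integer PRIMARY KEY,
    total numeric(10,2)
);

CREATE VIEW reporting.invoice_totals AS
    SELECT count(*) AS invoices, sum(total) AS revenue
    FROM sales.invoices;

-- Objects are addressed as schema.object ...
SELECT * FROM reporting.invoice_totals;

-- ... or resolved through the search path.
SET search_path TO sales, public;
SELECT count(*) FROM invoices;
```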
Functional naming of database objects:
- Helps identify the type of a database object (e.g. index, view, sequence) from its name
- Makes it easier for users to understand the database structure and work with the objects more effectively
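As an illustration only, the sketch below uses one possible naming convention. The `idx_`, `seq_` and `vw_` prefixes are an assumption, not a PostgreSQL rule; any consistent scheme serves the same purpose.

```sql
-- Hypothetical objects; the prefix tells the reader what each one is.
CREATE TABLE customers (
    id    integer PRIMARY KEY,
    email text NOT NULL
);

CREATE INDEX    idx_customers_email  ON customers (email);  -- index
CREATE SEQUENCE seq_customers_ticket;                       -- sequence
CREATE VIEW     vw_customer_emails AS                       -- view
    SELECT id, email FROM customers;
```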
14. Data organization techniques
Relationships among database objects:
- Avoids "orphaned" records and maintains consistency across related tables
- Enables efficient queries that involve data from multiple tables, such as JOIN operations
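A minimal sketch of such a relationship, using a foreign key. The `clubs` and `club_players` tables are hypothetical and only loosely follow the cricket example used in this deck.

```sql
-- Parent table: one row per club.
CREATE TABLE clubs (
    code      text PRIMARY KEY,
    home_city text NOT NULL
);

-- Child table: every player must reference an existing club.
CREATE TABLE club_players (
    id        integer PRIMARY KEY,
    name      text NOT NULL,
    club_code text NOT NULL REFERENCES clubs (code)
);

INSERT INTO clubs VALUES ('RCB', 'Bengaluru'), ('CSK', 'Chennai');
INSERT INTO club_players VALUES (1, 'Virat', 'RCB'), (2, 'Dhoni', 'CSK');

-- This insert would be rejected, because 'XYZ' is not a known club:
-- INSERT INTO club_players VALUES (3, 'Nobody', 'XYZ');

-- The relationship is what the JOIN follows.
SELECT p.name, c.home_city
FROM club_players AS p
JOIN clubs AS c ON c.code = p.club_code;
```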
Table partitioning:
- Involves dividing large tables into smaller, more manageable pieces
- Improves query performance, data management, and maintenance tasks
- Distributes data across partitions according to a partition key
16. Table Partitioning in PostgreSQL
● Database design technique that involves dividing large tables into smaller,
more manageable segments called partitions
Normal table
id  Name    Cricket club
01  Virat   RCB
02  Dhoni   CSK
03  Rohit   MI
04  Hardik  MI
05  Siraj   RCB
06  Jadeja  CSK
Partitioned table (parent table)
id  Name    Cricket club  Partition
01  Virat   RCB           Child table-1
05  Siraj   RCB           Child table-1
02  Dhoni   CSK           Child table-2
06  Jadeja  CSK           Child table-2
03  Rohit   MI            Child table-3
04  Hardik  MI            Child table-3
17. Table Partitioning in PostgreSQL
● The partition key determines how data is distributed across partitions
● Efficient data management, especially for operations like data insertion, updates, and deletions
● Enhances query performance by allowing the database to scan only the relevant partitions when queries involve conditions on the partition key
● Ease in data archival
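The selective scanning described on this slide (partition pruning) can be observed with EXPLAIN. This is a sketch under assumptions: the `events` table and its partitions are hypothetical, and `enable_partition_pruning` is left at its default of on.

```sql
-- Hypothetical table partitioned by month on event_date.
CREATE TABLE events (
    id         integer,
    event_date date NOT NULL,
    PRIMARY KEY (id, event_date)
) PARTITION BY RANGE (event_date);

CREATE TABLE events_2023_01 PARTITION OF events
    FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
CREATE TABLE events_2023_02 PARTITION OF events
    FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');

-- Condition on the partition key: the plan should touch only events_2023_01.
EXPLAIN SELECT * FROM events WHERE event_date = DATE '2023-01-15';

-- No condition on the partition key: the plan has to visit every partition.
EXPLAIN SELECT * FROM events WHERE id = 42;
```

Comparing the two plans shows why queries that omit the partition key gain little from partitioning.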
19. Types of table partitioning in PostgreSQL
PostgreSQL supports several partitioning methods, including range partitioning, list partitioning, and hash partitioning.
Range Partitioning:
- Divides a table into partitions based on ranges of values in a chosen partition key column
- When partitioning by a date column, each partition may represent a specific time period, such as a month or a year
Creating a child table :
CREATE TABLE child_table_2022_01_01 PARTITION OF parent_table
FOR VALUES FROM ('2022-01-01') TO ('2022-02-01');
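For a fuller picture, the sketch below builds a small range-partitioned table end to end and shows where each row is stored. The `signups` table and its partition names are hypothetical. The FROM bound is inclusive and the TO bound is exclusive.

```sql
-- Hypothetical parent table, partitioned by month.
CREATE TABLE signups (
    id               integer,
    username         text,
    date_of_creation date NOT NULL
) PARTITION BY RANGE (date_of_creation);

CREATE TABLE signups_2023_01 PARTITION OF signups
    FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
CREATE TABLE signups_2023_02 PARTITION OF signups
    FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');

-- Rows are inserted through the parent and routed automatically.
-- 2023-01-31 belongs to the January partition; 2023-02-01 is excluded
-- from January (upper bound is exclusive) and lands in February.
INSERT INTO signups VALUES
    (1, 'Jake', '2023-01-31'),
    (2, 'Anne', '2023-02-01');

-- tableoid reveals which child table holds each row.
SELECT tableoid::regclass AS stored_in, id, username, date_of_creation
FROM signups
ORDER BY id;
```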
20. Types of table partitioning in PostgreSQL
Range based partitioned table
id  Username  date_of_creation  Partition
01  Jake      01-01-2023        Child table-1
05  Ryan      01-01-2023        Child table-1
02  Anne      02-01-2023        Child table-2
06  Austin    02-01-2023        Child table-2
03  Daniel    03-01-2023        Child table-3
04  Taylor    03-01-2023        Child table-3
21. Types of table partitioning in PostgreSQL
List based Partitioning:
- Divides a table into partitions based on specific values in a chosen partition key column
- Rows whose key matches one of the values listed for a partition are stored in that partition
Creating a child table :
CREATE TABLE new_york_child_table PARTITION OF parent_table
FOR VALUES IN ('New York');
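A self-contained sketch of list partitioning follows. The `app_users` table and partition names are hypothetical. The DEFAULT partition is an optional addition that catches values not listed anywhere else.

```sql
-- Hypothetical parent table, partitioned by city.
CREATE TABLE app_users (
    id       integer,
    username text,
    city     text NOT NULL
) PARTITION BY LIST (city);

CREATE TABLE app_users_new_york PARTITION OF app_users
    FOR VALUES IN ('New York');
CREATE TABLE app_users_tokyo PARTITION OF app_users
    FOR VALUES IN ('Tokyo');
CREATE TABLE app_users_london PARTITION OF app_users
    FOR VALUES IN ('London');

-- Optional: rows with any other city go here instead of being rejected.
CREATE TABLE app_users_other PARTITION OF app_users DEFAULT;

INSERT INTO app_users VALUES
    (1, 'Jake', 'New York'),
    (2, 'Anne', 'Tokyo'),
    (3, 'Daniel', 'London'),
    (4, 'Maria', 'Madrid');   -- not listed, stored in app_users_other

SELECT tableoid::regclass AS stored_in, id, username, city
FROM app_users
ORDER BY id;
```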
22. Types of table partitioning in PostgreSQL
List based partitioned table
id  Username  City      Partition
01  Jake      New York  Child table-1
05  Ryan      New York  Child table-1
02  Anne      Tokyo     Child table-2
06  Austin    Tokyo     Child table-2
03  Daniel    London    Child table-3
04  Taylor    London    Child table-3
23. Types of table partitioning in PostgreSQL
Hash based Partitioning:
- Divides a table into partitions based on the hash value of a chosen partition key
column
- Uniform distribution of data across partitions by using a hash function
Creating a child table :
CREATE TABLE child_table_1 PARTITION OF parent_table
FOR VALUES WITH (MODULUS 3, REMAINDER 0);
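A hash-partitioned table needs one child per remainder, so MODULUS 3 calls for remainders 0, 1 and 2. The sketch below uses a hypothetical `accounts` table and generated rows to show the roughly even spread.

```sql
-- Hypothetical parent table, partitioned by the hash of id.
CREATE TABLE accounts (
    id       integer NOT NULL,
    username text
) PARTITION BY HASH (id);

-- One partition for each possible remainder of (hash value % 3).
CREATE TABLE accounts_p0 PARTITION OF accounts
    FOR VALUES WITH (MODULUS 3, REMAINDER 0);
CREATE TABLE accounts_p1 PARTITION OF accounts
    FOR VALUES WITH (MODULUS 3, REMAINDER 1);
CREATE TABLE accounts_p2 PARTITION OF accounts
    FOR VALUES WITH (MODULUS 3, REMAINDER 2);

-- Load some generated rows through the parent.
INSERT INTO accounts
SELECT g, 'user_' || g
FROM generate_series(1, 3000) AS g;

-- Row counts per partition should be close to each other.
SELECT tableoid::regclass AS stored_in, count(*) AS row_count
FROM accounts
GROUP BY tableoid
ORDER BY tableoid;
```

The partition a given id lands in depends on PostgreSQL's internal hash function, not on `id % 3`.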
24. Types of table partitioning in PostgreSQL
Hash based partitioned table (MODULUS 3; remainder values are illustrative)
id  Username  City      Remainder value  Partition
01  Anne      New York  0                Child table-1
05  Taylor    Tokyo     0                Child table-1
02  Jake      Tokyo     1                Child table-2
06  Austin    London    1                Child table-2
03  Daniel    London    2                Child table-3
04  Ryan      New York  2                Child table-3
26. Partitioning techniques in PostgreSQL
1. Manual Partitioning:
- Parent and child tables are created manually
- Child tables must also be maintained manually
E.g
Creating a parent table :
CREATE TABLE parent_table (
    id integer,
    date_column date,
    value integer,
    CONSTRAINT pkey PRIMARY KEY (id, date_column)
) PARTITION BY RANGE (date_column);
27. Partitioning techniques in PostgreSQL
E.g
Creating a child table :
CREATE TABLE child_table_2022 PARTITION OF parent_table
FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');
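With manual partitioning the ongoing maintenance is also done by hand. The sketch below uses a hypothetical `measurements` table with the same shape as the example above to show two typical chores: creating the next partition in advance and retiring an old one.

```sql
-- Hypothetical parent table and its first yearly partition.
CREATE TABLE measurements (
    id          integer,
    date_column date,
    value       integer,
    PRIMARY KEY (id, date_column)
) PARTITION BY RANGE (date_column);

CREATE TABLE measurements_2022 PARTITION OF measurements
    FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');

-- Chore 1: create the next partition before its data arrives.
-- Without it (and without a DEFAULT partition), inserts for 2023 fail.
CREATE TABLE measurements_2023 PARTITION OF measurements
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

-- Chore 2: retire an old partition. DETACH keeps the data as a
-- standalone table that can be backed up or archived.
ALTER TABLE measurements DETACH PARTITION measurements_2022;

-- Once the data is safely archived, the table can be dropped:
-- DROP TABLE measurements_2022;
```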
28. Partitioning techniques in PostgreSQL
2. Partitioning using pg_partman extension:
- Extension designed to create and manage sets of partitioned tables
- The creation of child tables is fully managed by the extension itself
- The extension includes a background worker (BGW) process to streamline partition maintenance
- Optional retention policy that can automatically discard partitions that are no longer necessary
- The maintenance function takes care of partition management on a regular schedule
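The retention policy and maintenance function mentioned above might be used as sketched below. This assumes pg_partman is installed in a schema named `partman` and that `public.parent_table` is already registered with it. The 30-day retention value and the BGW settings in the comments are example values, not recommendations.

```sql
-- Keep 30 days of partitions; with retention_keep_table = false the
-- expired child tables are dropped rather than only detached.
UPDATE partman.part_config
SET retention            = '30 days',
    retention_keep_table = false
WHERE parent_table = 'public.parent_table';

-- Run maintenance by hand for one partition set: pre-creates upcoming
-- partitions and applies the retention policy.
SELECT partman.run_maintenance('public.parent_table');

-- To let the background worker call maintenance on a schedule,
-- postgresql.conf needs entries along these lines (restart required):
--   shared_preload_libraries = 'pg_partman_bgw'
--   pg_partman_bgw.interval  = 3600         -- seconds between runs
--   pg_partman_bgw.dbname    = 'demo_db'
--   pg_partman_bgw.role      = 'postgres'   -- hypothetical role name
```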
29. Partitioning techniques in PostgreSQL
pg_partman extension is required:
demo_db=# \dx
                          List of installed extensions
    Name    | Version | Schema  |                     Description
------------+---------+---------+------------------------------------------------------
 pg_partman | 4.7.3   | partman | Extension to manage partitioned tables by time or ID
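If the extension is not listed yet, it can be installed roughly as follows. This sketch assumes the pg_partman package is already present on the database server and follows the common practice of giving the extension its own `partman` schema.

```sql
-- Dedicated schema for the extension's objects.
CREATE SCHEMA IF NOT EXISTS partman;

-- Install the extension into that schema.
CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;

-- Confirm the installed version without psql meta-commands.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_partman';
```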
30. Partitioning techniques in PostgreSQL
Creating a parent table :
CREATE TABLE parent_table (
    id integer,
    created_date date,
    CONSTRAINT table01_pkey PRIMARY KEY (id, created_date)
) PARTITION BY RANGE (created_date);
Creating the child tables with pg_partman :
-- pg_partman 4.x (the version listed above) expects p_type;
-- 'native' selects PostgreSQL's declarative partitioning.
SELECT partman.create_parent(
    p_parent_table => 'public.parent_table',
    p_control => 'created_date',
    p_type => 'native',
    p_interval => '1 day',
    p_premake => 2
);
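After the call to create_parent, the result can be checked as sketched here. It assumes the `partman` schema and `public.parent_table` from this demo. The exact set of child tables depends on the current date and on the pg_partman version.

```sql
-- Child tables that pg_partman created for the partition set.
SELECT * FROM partman.show_partitions('public.parent_table');

-- Configuration that pg_partman stored for the partition set.
SELECT parent_table, control, partition_interval, premake
FROM partman.part_config
WHERE parent_table = 'public.parent_table';
```

In psql, `\d+ parent_table` shows the same child tables together with their partition bounds.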
33. Limitations of Table partitioning
● Useful only when queries filter on the partition key
● pg_partman requires regular supervision for partition management
● Complexities in data migration services
● Complexities in tables with foreign key relationships
● Child tables that are kept as backups need to be managed separately
● Data accumulating in the default table causes pg_partman functions to fail
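One way to watch for the last limitation is to monitor the default partition. The sketch assumes the pg_partman 4.x setup from the demo, where the default child is named `parent_table_default`; adjust the name if your setup differs.

```sql
-- Rows sitting in the default partition have no matching child table.
-- Ideally this count stays at zero.
SELECT count(*) AS rows_in_default
FROM ONLY public.parent_table_default;

-- pg_partman also ships a monitoring helper that reports default
-- tables containing rows across all partition sets it manages.
SELECT * FROM partman.check_default();
```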
35. Best practices for partitioned table maintenance
● Choose the Right Partition Key
● Understand Query Patterns
● Monitor and Tune Performance
● Choose Appropriate Partitioning Method
● Limit the Number of Partitions
● Implement Data Archiving
● Regularly Update PostgreSQL Version
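To support the monitoring and partition-count points above, a query along these lines can help. It is a sketch that assumes PostgreSQL 12 or later (for `pg_partition_tree`) and the `public.parent_table` from the demo.

```sql
-- Size of every leaf partition, including its indexes.
SELECT relid AS partition_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_partition_tree('public.parent_table')
WHERE isleaf
ORDER BY relid::text;

-- Total number of leaf partitions in the set.
SELECT count(*) AS partition_count
FROM pg_partition_tree('public.parent_table')
WHERE isleaf;
```

A steadily growing partition count is a signal to review the partition interval or the retention policy.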