Tree-like data relationships are common, but working with trees in SQL usually requires awkward recursive queries. This talk describes alternative solutions in SQL, including:
- Adjacency List
- Path Enumeration
- Nested Sets
- Closure Table
Code examples show how to use these designs in PHP, and the talk offers guidelines for choosing one design over another.
2. Me
• Software developer
• C, Java, Perl, PHP, Ruby
• SQL maven
• MySQL Consultant at Percona
• Author of SQL Antipatterns: Avoiding the Pitfalls of Database Programming
www.percona.com
3. Problem
• Store & query hierarchical data
- Categories/subcategories
- Bill of materials
- Threaded discussions
4. Example: Bug Report Comments
(1) Fran: What’s the cause of this bug?
    (2) Ollie: I think it’s a null pointer.
        (3) Fran: No, I checked for that.
    (4) Kukla: We need to check valid input.
        (5) Ollie: Yes, that’s a bug.
        (6) Fran: Yes, please add a check.
            (7) Kukla: That fixed it.
www.percona.com
7. Adjacency List
• Naive solution nearly everyone uses
• Each entry knows its immediate parent
comment_id  parent_id  author  comment
1           NULL       Fran    What’s the cause of this bug?
2           1          Ollie   I think it’s a null pointer.
3           2          Fran    No, I checked for that.
4           1          Kukla   We need to check valid input.
5           4          Ollie   Yes, that’s a bug.
6           4          Fran    Yes, please add a check.
7           6          Kukla   That fixed it.
8. Insert a New Node
INSERT INTO Comments (parent_id, author, comment)
  VALUES (5, 'Fran', 'I agree!');
(1) Fran: What’s the cause of this bug?
    (2) Ollie: I think it’s a null pointer.
        (3) Fran: No, I checked for that.
    (4) Kukla: We need to check valid input.
        (5) Ollie: Yes, that’s a bug.
            (8) Fran: I agree!
        (6) Fran: Yes, please add a check.
            (7) Kukla: That fixed it.
10. Move a Node or Subtree
UPDATE Comments SET parent_id = 3
  WHERE comment_id = 6;
(1) Fran: What’s the cause of this bug?
    (2) Ollie: I think it’s a null pointer.
        (3) Fran: No, I checked for that.
            (6) Fran: Yes, please add a check.
                (7) Kukla: That fixed it.
    (4) Kukla: We need to check valid input.
        (5) Ollie: Yes, that’s a bug.
14. Query Immediate Child/Parent
• Query a node’s children:
SELECT * FROM Comments c1
LEFT JOIN Comments c2
ON (c2.parent_id = c1.comment_id);
• Query a node’s parent:
SELECT * FROM Comments c1
JOIN Comments c2
ON (c1.parent_id = c2.comment_id);
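The adjacency list operations shown so far (insert, move, and the immediate child/parent queries) can be tried end to end with a small script. This is only a sketch: it assumes SQLite through Python's standard sqlite3 module rather than MySQL, and the helper names children_of and parent_of are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Comments (
  comment_id INTEGER PRIMARY KEY,
  parent_id  INTEGER REFERENCES Comments(comment_id),
  author     TEXT NOT NULL,
  comment    TEXT NOT NULL
);
INSERT INTO Comments (comment_id, parent_id, author, comment) VALUES
  (1, NULL, 'Fran',  'What''s the cause of this bug?'),
  (2, 1,    'Ollie', 'I think it''s a null pointer.'),
  (3, 2,    'Fran',  'No, I checked for that.'),
  (4, 1,    'Kukla', 'We need to check valid input.'),
  (5, 4,    'Ollie', 'Yes, that''s a bug.'),
  (6, 4,    'Fran',  'Yes, please add a check.'),
  (7, 6,    'Kukla', 'That fixed it.');
""")


def children_of(comment_id):
    """Immediate children: rows whose parent_id points at this node."""
    rows = conn.execute(
        "SELECT c2.comment_id FROM Comments c1 "
        "JOIN Comments c2 ON c2.parent_id = c1.comment_id "
        "WHERE c1.comment_id = ? ORDER BY c2.comment_id",
        (comment_id,),
    )
    return [row[0] for row in rows]


def parent_of(comment_id):
    """Immediate parent, or None for the root."""
    row = conn.execute(
        "SELECT c2.comment_id FROM Comments c1 "
        "JOIN Comments c2 ON c1.parent_id = c2.comment_id "
        "WHERE c1.comment_id = ?",
        (comment_id,),
    ).fetchone()
    return row[0] if row else None


children_before = children_of(4)

# Insert a new node under #5, then move the subtree rooted at #6 under #3.
new_id = conn.execute(
    "INSERT INTO Comments (parent_id, author, comment) VALUES (5, 'Fran', 'I agree!')"
).lastrowid
conn.execute("UPDATE Comments SET parent_id = 3 WHERE comment_id = 6")

print(children_before, children_of(5), children_of(4), parent_of(6))
```

Both writes touch a single row, which is why the design rates as easy for inserts and moves.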
15. Can’t Handle Deep Trees
SELECT * FROM Comments c1
LEFT JOIN Comments c2 ON (c2.parent_id = c1.comment_id)
LEFT JOIN Comments c3 ON (c3.parent_id = c2.comment_id)
LEFT JOIN Comments c4 ON (c4.parent_id = c3.comment_id)
LEFT JOIN Comments c5 ON (c5.parent_id = c4.comment_id)
LEFT JOIN Comments c6 ON (c6.parent_id = c5.comment_id)
LEFT JOIN Comments c7 ON (c7.parent_id = c6.comment_id)
LEFT JOIN Comments c8 ON (c8.parent_id = c7.comment_id)
LEFT JOIN Comments c9 ON (c9.parent_id = c8.comment_id)
LEFT JOIN Comments c10 ON (c10.parent_id = c9.comment_id)
...
• It still doesn’t support unlimited depth!
17. SQL-99 recursive syntax
WITH [RECURSIVE] CommentTree
(comment_id, bug_id, parent_id, author, comment, depth)
AS (
SELECT *, 0 AS depth FROM Comments
WHERE parent_id IS NULL
UNION ALL
SELECT c.*, ct.depth+1 AS depth FROM CommentTree ct
JOIN Comments c ON (ct.comment_id = c.parent_id)
)
SELECT * FROM CommentTree WHERE bug_id = 1234;
✓ PostgreSQL, Oracle 11g, IBM DB2, Microsoft SQL Server, Apache Derby
✗ MySQL, SQLite, Informix, Firebird, etc.
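The recursive query can be exercised against the example comments as a rough sketch. It assumes SQLite 3.8.3 or later (which added WITH RECURSIVE after this talk was given) driven from Python's sqlite3 module, and it assumes a bug_id column with the value 1234 on every row. The columns are listed explicitly instead of using SELECT * so the order matches the CTE's column list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Comments (
  comment_id INTEGER PRIMARY KEY,
  bug_id     INTEGER NOT NULL,
  parent_id  INTEGER REFERENCES Comments(comment_id),
  author     TEXT NOT NULL,
  comment    TEXT NOT NULL
);
INSERT INTO Comments VALUES
  (1, 1234, NULL, 'Fran',  'What''s the cause of this bug?'),
  (2, 1234, 1,    'Ollie', 'I think it''s a null pointer.'),
  (3, 1234, 2,    'Fran',  'No, I checked for that.'),
  (4, 1234, 1,    'Kukla', 'We need to check valid input.'),
  (5, 1234, 4,    'Ollie', 'Yes, that''s a bug.'),
  (6, 1234, 4,    'Fran',  'Yes, please add a check.'),
  (7, 1234, 6,    'Kukla', 'That fixed it.');
""")

rows = conn.execute("""
WITH RECURSIVE CommentTree
  (comment_id, bug_id, parent_id, author, comment, depth)
AS (
  -- Anchor member: the root comments, at depth 0.
  SELECT comment_id, bug_id, parent_id, author, comment, 0
    FROM Comments
   WHERE parent_id IS NULL
  UNION ALL
  -- Recursive member: children of rows already in the tree.
  SELECT c.comment_id, c.bug_id, c.parent_id, c.author, c.comment, ct.depth + 1
    FROM CommentTree ct
    JOIN Comments c ON (ct.comment_id = c.parent_id)
)
SELECT comment_id, depth FROM CommentTree
 WHERE bug_id = 1234
 ORDER BY comment_id
""").fetchall()

# Map each comment to its depth below the root.
depths = dict(rows)
print(depths)
```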
19. Path Enumeration
• Store chain of ancestors in each node
• Good for breadcrumbs
comment_id  path      author  comment
1           1/        Fran    What’s the cause of this bug?
2           1/2/      Ollie   I think it’s a null pointer.
3           1/2/3/    Fran    No, I checked for that.
4           1/4/      Kukla   We need to check valid input.
5           1/4/5/    Ollie   Yes, that’s a bug.
6           1/4/6/    Fran    Yes, please add a check.
7           1/4/6/7/  Kukla   That fixed it.
21. Query Ancestors and Subtrees
• Query ancestors of comment #7:
SELECT * FROM Comments
  WHERE '1/4/6/7/' LIKE path || '%';
• Query descendants of comment #4:
SELECT * FROM Comments
  WHERE path LIKE '1/4/%';
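A quick way to see what the two LIKE patterns match is to run them on the example paths. The sketch below assumes SQLite via Python's sqlite3 module, where || is string concatenation, and leaves out the comment text for brevity. Note that both patterns also match the starting node itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    " comment_id INTEGER PRIMARY KEY, path TEXT NOT NULL, author TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?)",
    [
        (1, "1/", "Fran"),
        (2, "1/2/", "Ollie"),
        (3, "1/2/3/", "Fran"),
        (4, "1/4/", "Kukla"),
        (5, "1/4/5/", "Ollie"),
        (6, "1/4/6/", "Fran"),
        (7, "1/4/6/7/", "Kukla"),
    ],
)

# Ancestors of #7: every stored path that is a prefix of the path of #7.
ancestors_of_7 = [
    row[0]
    for row in conn.execute(
        "SELECT comment_id FROM Comments "
        "WHERE '1/4/6/7/' LIKE path || '%' ORDER BY comment_id"
    )
]

# Descendants of #4: every stored path that starts with the path of #4.
descendants_of_4 = [
    row[0]
    for row in conn.execute(
        "SELECT comment_id FROM Comments WHERE path LIKE '1/4/%' ORDER BY comment_id"
    )
]

print(ancestors_of_7, descendants_of_4)
```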
22. Add a New Child of #7
INSERT INTO Comments (author, comment)
  VALUES ('Ollie', 'Good job!');
SELECT path FROM Comments
  WHERE comment_id = 7;
UPDATE Comments
  SET path = $parent_path || LAST_INSERT_ID() || '/'
  WHERE comment_id = LAST_INSERT_ID();
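The three statements work together: the new row's path cannot be written until its generated ID is known. A minimal sketch of that sequence follows, assuming SQLite via Python's sqlite3 module, with cursor.lastrowid standing in for MySQL's LAST_INSERT_ID() and a hypothetical helper named add_child.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    " comment_id INTEGER PRIMARY KEY, path TEXT,"
    " author TEXT NOT NULL, comment TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?, ?)",
    [
        (1, "1/", "Fran", "What's the cause of this bug?"),
        (2, "1/2/", "Ollie", "I think it's a null pointer."),
        (3, "1/2/3/", "Fran", "No, I checked for that."),
        (4, "1/4/", "Kukla", "We need to check valid input."),
        (5, "1/4/5/", "Ollie", "Yes, that's a bug."),
        (6, "1/4/6/", "Fran", "Yes, please add a check."),
        (7, "1/4/6/7/", "Kukla", "That fixed it."),
    ],
)


def add_child(parent_id, author, comment):
    """Insert a comment under parent_id and return (new id, new path)."""
    (parent_path,) = conn.execute(
        "SELECT path FROM Comments WHERE comment_id = ?", (parent_id,)
    ).fetchone()
    # Insert without a path first, because the path needs the generated ID.
    new_id = conn.execute(
        "INSERT INTO Comments (author, comment) VALUES (?, ?)", (author, comment)
    ).lastrowid
    new_path = f"{parent_path}{new_id}/"
    conn.execute(
        "UPDATE Comments SET path = ? WHERE comment_id = ?", (new_path, new_id)
    )
    return new_id, new_path


new_id, new_path = add_child(7, "Ollie", "Good job!")
print(new_id, new_path)
```

In a real application the three steps would normally run inside one transaction so that no reader sees the row without its path.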
24. Nested Sets
• Each comment encodes its descendants
using two numbers:
- A comment’s left number is less than all numbers
used by the comment’s descendants.
- A comment’s right number is greater than all
numbers used by the comment’s descendants.
- A comment’s numbers are between all
numbers used by the comment’s ancestors.
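One way to obtain numbers that satisfy these three rules is a depth-first walk that hands out a counter value when it enters a node and another when it leaves. The sketch below is an illustration only: the function name number_nested_sets and the parent map are assumptions, with the parent map mirroring the example bug-report thread.

```python
from typing import Dict, List, Optional, Tuple


def number_nested_sets(
    parents: Dict[int, Optional[int]]
) -> Dict[int, Tuple[int, int]]:
    """Assign (left, right) numbers by a depth-first walk of the tree."""
    children: Dict[Optional[int], List[int]] = {}
    for node, parent in sorted(parents.items()):
        children.setdefault(parent, []).append(node)

    numbers: Dict[int, Tuple[int, int]] = {}
    counter = 0

    def visit(node: int) -> None:
        nonlocal counter
        counter += 1
        left = counter  # taken on the way down
        for child in children.get(node, []):
            visit(child)
        counter += 1  # taken on the way back up
        numbers[node] = (left, counter)

    for root in children.get(None, []):
        visit(root)
    return numbers


# Parent of each comment in the example thread (None marks the root).
PARENTS = {1: None, 2: 1, 3: 2, 4: 1, 5: 4, 6: 4, 7: 6}
NUMBERS = number_nested_sets(PARENTS)
print(NUMBERS)
```

Every descendant's pair falls strictly inside its ancestors' pairs, which is what the range queries rely on.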
25. What Does This Look Like?
(1) Fran: What’s the cause of this bug? [1, 14]
    (2) Ollie: I think it’s a null pointer. [2, 5]
        (3) Fran: No, I checked for that. [3, 4]
    (4) Kukla: We need to check valid input. [6, 13]
        (5) Ollie: Yes, that’s a bug. [7, 8]
        (6) Fran: Yes, please add a check. [9, 12]
            (7) Kukla: That fixed it. [10, 11]
comment_id  nsleft  nsright  author  comment
1           1       14       Fran    What’s the cause of this bug?
2           2       5        Ollie   I think it’s a null pointer.
3           3       4        Fran    No, I checked for that.
4           6       13       Kukla   We need to check valid input.
5           7       8        Ollie   Yes, that’s a bug.
6           9       12       Fran    Yes, please add a check.
7           10      11       Kukla   That fixed it.
• nsleft and nsright: these are not foreign keys
29. Query Ancestors of #7
[Figure: the numbered comment tree, with #7 labeled as the child and the nodes above it labeled as ancestors.]
SELECT * FROM Comments child
  JOIN Comments ancestor ON child.nsleft
    BETWEEN ancestor.nsleft AND ancestor.nsright
  WHERE child.comment_id = 7;
31. Query Subtree Under #4
[Figure: the numbered comment tree, with #4 labeled as the parent and the nodes below it labeled as descendants.]
SELECT * FROM Comments parent
  JOIN Comments descendant ON descendant.nsleft
    BETWEEN parent.nsleft AND parent.nsright
  WHERE parent.comment_id = 4;
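Both nested-set range queries (ancestors of #7 and the subtree under #4) can be checked against the numbered example rows. This is a sketch that assumes SQLite via Python's sqlite3 module and omits the comment text. Because BETWEEN is inclusive, each result also contains the starting node.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    " comment_id INTEGER PRIMARY KEY, nsleft INTEGER NOT NULL,"
    " nsright INTEGER NOT NULL, author TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?, ?)",
    [
        (1, 1, 14, "Fran"),
        (2, 2, 5, "Ollie"),
        (3, 3, 4, "Fran"),
        (4, 6, 13, "Kukla"),
        (5, 7, 8, "Ollie"),
        (6, 9, 12, "Fran"),
        (7, 10, 11, "Kukla"),
    ],
)

# Ancestors: nodes whose [nsleft, nsright] range encloses the child's nsleft.
ancestors_of_7 = [
    row[0]
    for row in conn.execute(
        "SELECT ancestor.comment_id FROM Comments child "
        "JOIN Comments ancestor ON child.nsleft "
        "  BETWEEN ancestor.nsleft AND ancestor.nsright "
        "WHERE child.comment_id = 7 ORDER BY ancestor.nsleft"
    )
]

# Subtree: nodes whose nsleft falls inside the parent's range.
subtree_of_4 = [
    row[0]
    for row in conn.execute(
        "SELECT descendant.comment_id FROM Comments parent "
        "JOIN Comments descendant ON descendant.nsleft "
        "  BETWEEN parent.nsleft AND parent.nsright "
        "WHERE parent.comment_id = 4 ORDER BY descendant.nsleft"
    )
]

print(ancestors_of_7, subtree_of_4)
```

Ordering by nsleft returns the ancestors root first and the subtree in depth-first order.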
33. Insert New Child of #5
(1) Fran: What’s the cause of this bug? [1, 16] (was [1, 14])
    (2) Ollie: I think it’s a null pointer. [2, 5]
        (3) Fran: No, I checked for that. [3, 4]
    (4) Kukla: We need to check valid input. [6, 15] (was [6, 13])
        (5) Ollie: Yes, that’s a bug. [7, 10] (was [7, 8])
            (8) Fran: I agree! [8, 9] (new)
        (6) Fran: Yes, please add a check. [11, 14] (was [9, 12])
            (7) Kukla: That fixed it. [12, 13] (was [10, 11])
36. Insert New Child of #5
UPDATE Comments
SET nsleft = CASE WHEN nsleft >= 8 THEN nsleft+2
ELSE nsleft END,
nsright = nsright+2
WHERE nsright >= 7;
INSERT INTO Comments (nsleft, nsright, author, comment)
VALUES (8, 9, 'Fran', 'I agree!');
• Recalculate left values for all nodes to the right of
the new child. Recalculate right values for all
nodes above and to the right.
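Running the two statements on the example rows shows how many nodes get renumbered to make room for a single new leaf. The sketch assumes SQLite via Python's sqlite3 module, and the comment column is given an empty default (an assumption) so that the seven seed rows can be loaded without their text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    " comment_id INTEGER PRIMARY KEY, nsleft INTEGER NOT NULL,"
    " nsright INTEGER NOT NULL, author TEXT NOT NULL,"
    " comment TEXT NOT NULL DEFAULT '')"
)
conn.executemany(
    "INSERT INTO Comments (comment_id, nsleft, nsright, author) VALUES (?, ?, ?, ?)",
    [
        (1, 1, 14, "Fran"),
        (2, 2, 5, "Ollie"),
        (3, 3, 4, "Fran"),
        (4, 6, 13, "Kukla"),
        (5, 7, 8, "Ollie"),
        (6, 9, 12, "Fran"),
        (7, 10, 11, "Kukla"),
    ],
)

# Open a gap of two numbers at position 8 (the old nsright of #5).
conn.execute("""
UPDATE Comments
   SET nsleft = CASE WHEN nsleft >= 8 THEN nsleft + 2 ELSE nsleft END,
       nsright = nsright + 2
 WHERE nsright >= 7
""")

# Place the new leaf in the gap.
conn.execute(
    "INSERT INTO Comments (nsleft, nsright, author, comment) "
    "VALUES (8, 9, 'Fran', 'I agree!')"
)

numbers = {
    comment_id: (nsleft, nsright)
    for comment_id, nsleft, nsright in conn.execute(
        "SELECT comment_id, nsleft, nsright FROM Comments"
    )
}
print(numbers)
```

Five of the seven existing rows change, which is the cost behind the "Hard" rating for inserts in this design.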
37. Query Immediate Parent of #6
• Parent of #6 is an ancestor who has no descendant who is also an ancestor of #6.
SELECT parent.* FROM Comments AS c
  JOIN Comments AS parent
    ON (c.nsleft > parent.nsleft AND c.nsleft < parent.nsright)
  LEFT OUTER JOIN Comments AS in_between
    ON (c.nsleft > in_between.nsleft AND c.nsleft < in_between.nsright
      AND in_between.nsleft > parent.nsleft AND in_between.nsleft < parent.nsright)
  WHERE c.comment_id = 6 AND in_between.comment_id IS NULL;
• The comparisons are strict: with inclusive bounds every candidate parent would match itself as in_between, and the query would return no rows.
• Querying an immediate child is a similar problem.
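The immediate-parent query can be verified for every node of the example tree. The sketch below assumes SQLite via Python's sqlite3 module and a hypothetical helper named parent_of. It uses strict comparisons (< and >) rather than inclusive ranges, since inclusive bounds would let every candidate parent qualify as its own in_between row and no parent would survive the IS NULL filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    " comment_id INTEGER PRIMARY KEY, nsleft INTEGER NOT NULL,"
    " nsright INTEGER NOT NULL, author TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?, ?)",
    [
        (1, 1, 14, "Fran"),
        (2, 2, 5, "Ollie"),
        (3, 3, 4, "Fran"),
        (4, 6, 13, "Kukla"),
        (5, 7, 8, "Ollie"),
        (6, 9, 12, "Fran"),
        (7, 10, 11, "Kukla"),
    ],
)

PARENT_SQL = """
SELECT parent.comment_id FROM Comments AS c
  JOIN Comments AS parent
    ON (c.nsleft > parent.nsleft AND c.nsleft < parent.nsright)
  LEFT OUTER JOIN Comments AS in_between
    ON (c.nsleft > in_between.nsleft AND c.nsleft < in_between.nsright
      AND in_between.nsleft > parent.nsleft
      AND in_between.nsleft < parent.nsright)
 WHERE c.comment_id = ? AND in_between.comment_id IS NULL
"""


def parent_of(comment_id):
    """Closest enclosing node, or None for the root."""
    row = conn.execute(PARENT_SQL, (comment_id,)).fetchone()
    return row[0] if row else None


parents = {node: parent_of(node) for node in range(1, 8)}
print(parents)
```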
41. Closure Table
CREATE TABLE TreePaths (
ancestor INT NOT NULL,
descendant INT NOT NULL,
PRIMARY KEY (ancestor, descendant),
FOREIGN KEY(ancestor)
REFERENCES Comments(comment_id),
FOREIGN KEY(descendant)
REFERENCES Comments(comment_id)
);
42. Closure Table
• Many-to-many table
• Stores every path from each node
to each of its descendants
• A node even connects to itself
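To make "every path from each node to each of its descendants" concrete, the rows can be generated from a plain parent map by walking upward from each node. This is an illustrative sketch: the function name closure_rows and the parent map are assumptions, with the map mirroring the example bug-report thread.

```python
from typing import Dict, List, Optional, Tuple


def closure_rows(parents: Dict[int, Optional[int]]) -> List[Tuple[int, int]]:
    """Return every (ancestor, descendant) pair, including each node with itself."""
    rows = []
    for node in parents:
        ancestor: Optional[int] = node  # start with the self-referencing row
        while ancestor is not None:
            rows.append((ancestor, node))
            ancestor = parents[ancestor]  # step up one level
    return sorted(rows)


# Parent of each comment in the example thread (None marks the root).
PARENTS = {1: None, 2: 1, 3: 2, 4: 1, 5: 4, 6: 4, 7: 6}
ROWS = closure_rows(PARENTS)
print(len(ROWS), ROWS)
```

Each node contributes one row per ancestor plus one for itself, so the row count depends on how deep the tree is rather than only on how many nodes it has.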
43. Closure Table illustration
[Figure: the example comment tree, built up over four slides.]
47. What Does This Look Like?
Comments:
comment_id  author  comment
1           Fran    What’s the cause of this bug?
2           Ollie   I think it’s a null pointer.
3           Fran    No, I checked for that.
4           Kukla   We need to check valid input.
5           Ollie   Yes, that’s a bug.
6           Fran    Yes, please add a check.
7           Kukla   That fixed it.
TreePaths:
ancestor  descendant
1         1
1         2
1         3
1         4
1         5
1         6
1         7
2         2
2         3
3         3
4         4
4         5
4         6
4         7
5         5
6         6
6         7
7         7
• Requires O(n²) rows (but far fewer in practice)
49. Query Descendants of #4
SELECT c.* FROM Comments c
JOIN TreePaths t
ON (c.comment_id = t.descendant)
WHERE t.ancestor = 4;
50. Paths Starting from #4
[Figure: the example comment tree, illustrating the paths that start at #4.]
51. Query Ancestors of #6
SELECT c.* FROM Comments c
JOIN TreePaths t
ON (c.comment_id = t.ancestor)
WHERE t.descendant = 6;
52. Paths Terminating at #6
[Figure: the example comment tree, illustrating the paths that terminate at #6.]
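The descendant and ancestor queries are mirror images of each other: one filters on ancestor and joins on descendant, the other does the reverse. A small sketch, assuming SQLite via Python's sqlite3 module and the 18 example paths, with the comment text omitted:

```python
import sqlite3

PATHS = [
    (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7),
    (2, 2), (2, 3), (3, 3),
    (4, 4), (4, 5), (4, 6), (4, 7),
    (5, 5), (6, 6), (6, 7), (7, 7),
]
AUTHORS = ["Fran", "Ollie", "Fran", "Kukla", "Ollie", "Fran", "Kukla"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Comments (
  comment_id INTEGER PRIMARY KEY,
  author     TEXT NOT NULL
);
CREATE TABLE TreePaths (
  ancestor   INT NOT NULL,
  descendant INT NOT NULL,
  PRIMARY KEY (ancestor, descendant),
  FOREIGN KEY (ancestor) REFERENCES Comments(comment_id),
  FOREIGN KEY (descendant) REFERENCES Comments(comment_id)
);
""")
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?)", list(enumerate(AUTHORS, start=1))
)
conn.executemany("INSERT INTO TreePaths VALUES (?, ?)", PATHS)

# Descendants of #4: follow paths that start at 4.
descendants_of_4 = [
    row[0]
    for row in conn.execute(
        "SELECT c.comment_id FROM Comments c "
        "JOIN TreePaths t ON (c.comment_id = t.descendant) "
        "WHERE t.ancestor = 4 ORDER BY c.comment_id"
    )
]

# Ancestors of #6: follow paths that end at 6.
ancestors_of_6 = [
    row[0]
    for row in conn.execute(
        "SELECT c.comment_id FROM Comments c "
        "JOIN TreePaths t ON (c.comment_id = t.ancestor) "
        "WHERE t.descendant = 6 ORDER BY c.comment_id"
    )
]

print(descendants_of_4, ancestors_of_6)
```

Because every node has a path to itself, each result includes the starting node.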
53. Insert New Child of #5
INSERT INTO Comments
  VALUES (8, 'Fran', 'I agree!');
INSERT INTO TreePaths (ancestor, descendant)
  SELECT ancestor, 8 FROM TreePaths
  WHERE descendant = 5
  UNION ALL SELECT 8, 8;
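The insert copies every path that ends at the parent and re-points it at the new node, then adds the self-referencing row. The following sketch runs those statements on the example data, assuming SQLite via Python's sqlite3 module and placeholder comment text for the seven existing rows:

```python
import sqlite3

PATHS = [
    (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7),
    (2, 2), (2, 3), (3, 3),
    (4, 4), (4, 5), (4, 6), (4, 7),
    (5, 5), (6, 6), (6, 7), (7, 7),
]
AUTHORS = ["Fran", "Ollie", "Fran", "Kukla", "Ollie", "Fran", "Kukla"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Comments (
  comment_id INTEGER PRIMARY KEY,
  author     TEXT NOT NULL,
  comment    TEXT NOT NULL
);
CREATE TABLE TreePaths (
  ancestor   INT NOT NULL,
  descendant INT NOT NULL,
  PRIMARY KEY (ancestor, descendant)
);
""")
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?)",
    [(i, author, "(text omitted)") for i, author in enumerate(AUTHORS, start=1)],
)
conn.executemany("INSERT INTO TreePaths VALUES (?, ?)", PATHS)

# Add comment #8 as a child of #5.
conn.execute("INSERT INTO Comments VALUES (8, 'Fran', 'I agree!')")
conn.execute("""
INSERT INTO TreePaths (ancestor, descendant)
  SELECT ancestor, 8 FROM TreePaths
   WHERE descendant = 5
  UNION ALL SELECT 8, 8
""")

paths_to_8 = conn.execute(
    "SELECT ancestor, descendant FROM TreePaths "
    "WHERE descendant = 8 ORDER BY ancestor"
).fetchall()
total_paths = conn.execute("SELECT COUNT(*) FROM TreePaths").fetchone()[0]
print(paths_to_8, total_paths)
```

No existing row is modified, in contrast to the nested sets insert.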
54. Copy Paths from Parent
(1) Fran: What’s the cause of this bug?
    (2) Ollie: I think it’s a null pointer.
        (3) Fran: No, I checked for that.
    (4) Kukla: We need to check valid input.
        (5) Ollie: Yes, that’s a bug.
            (8) Fran: I agree!
        (6) Fran: Yes, please add a check.
            (7) Kukla: That fixed it.
58. Delete Paths Terminating at #7
[Figure: the example comment tree, built up over four slides to illustrate deleting the paths that terminate at #7.]
62. Delete Subtree Under #4
DELETE FROM TreePaths
WHERE descendant IN
(SELECT descendant FROM TreePaths
WHERE ancestor = 4);
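The subtree delete can be checked by counting what is left. The sketch below assumes SQLite via Python's sqlite3 module. The first statement, which deletes the paths terminating at leaf #7, is not taken from the slides; it is an assumption about the simplest form such a delete could take.

```python
import sqlite3

PATHS = [
    (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7),
    (2, 2), (2, 3), (3, 3),
    (4, 4), (4, 5), (4, 6), (4, 7),
    (5, 5), (6, 6), (6, 7), (7, 7),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TreePaths ("
    " ancestor INT NOT NULL, descendant INT NOT NULL,"
    " PRIMARY KEY (ancestor, descendant))"
)
conn.executemany("INSERT INTO TreePaths VALUES (?, ?)", PATHS)


def count_paths():
    return conn.execute("SELECT COUNT(*) FROM TreePaths").fetchone()[0]


# Assumed leaf delete: remove every path that terminates at #7.
conn.execute("DELETE FROM TreePaths WHERE descendant = 7")
after_leaf_delete = count_paths()

# Subtree delete: remove every path that ends at #4 or at anything below it.
conn.execute("""
DELETE FROM TreePaths
 WHERE descendant IN
   (SELECT descendant FROM TreePaths
     WHERE ancestor = 4)
""")
remaining = conn.execute(
    "SELECT ancestor, descendant FROM TreePaths ORDER BY ancestor, descendant"
).fetchall()
print(after_leaf_delete, remaining)
```

Only the paths are removed here; rows in Comments would be deleted separately if the comments themselves should go away.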
63. Delete Any Paths Under #4
[Figure: the example comment tree, built up over four slides to illustrate deleting every path under #4.]
67. Path Length
• Add a length column
• MAX(length) is depth of tree
• Makes it easier to query immediate parent or child:
SELECT c.*
  FROM Comments c
  JOIN TreePaths t
    ON (c.comment_id = t.descendant)
  WHERE t.ancestor = 4
    AND t.length = 1;
ancestor  descendant  length
1         1           0
1         2           1
1         3           2
1         4           1
1         5           2
1         6           2
1         7           3
2         2           0
2         3           1
3         3           0
4         4           0
4         5           1
4         6           1
4         7           2
5         5           0
6         6           0
6         7           1
7         7           0
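With the length column in place, immediate children, the immediate parent, and the depth of the tree each become a single filter. A minimal sketch, assuming SQLite via Python's sqlite3 module; the join to Comments is left out so that only TreePaths is needed.

```python
import sqlite3

# (ancestor, descendant, length) for the example thread.
PATHS = [
    (1, 1, 0), (1, 2, 1), (1, 3, 2), (1, 4, 1), (1, 5, 2), (1, 6, 2), (1, 7, 3),
    (2, 2, 0), (2, 3, 1), (3, 3, 0),
    (4, 4, 0), (4, 5, 1), (4, 6, 1), (4, 7, 2),
    (5, 5, 0), (6, 6, 0), (6, 7, 1), (7, 7, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TreePaths ("
    " ancestor INT NOT NULL, descendant INT NOT NULL, length INT NOT NULL,"
    " PRIMARY KEY (ancestor, descendant))"
)
conn.executemany("INSERT INTO TreePaths VALUES (?, ?, ?)", PATHS)

# Immediate children of #4: paths of length 1 that start at 4.
children_of_4 = [
    row[0]
    for row in conn.execute(
        "SELECT descendant FROM TreePaths "
        "WHERE ancestor = 4 AND length = 1 ORDER BY descendant"
    )
]

# Immediate parent of #6: the path of length 1 that ends at 6.
parent_of_6 = conn.execute(
    "SELECT ancestor FROM TreePaths WHERE descendant = 6 AND length = 1"
).fetchone()[0]

# Depth of the tree: the longest stored path.
depth = conn.execute("SELECT MAX(length) FROM TreePaths").fetchone()[0]

print(children_of_4, parent_of_6, depth)
```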
69. Choosing the Right Design
Design            Tables  Query Child  Query Subtree  Delete Node  Insert Node  Move Subtree  Referential Integrity
Adjacency List    1       Easy         Hard           Easy         Easy         Easy          Yes
Path Enumeration  1       Hard         Easy           Easy         Easy         Easy          No
Nested Sets       1       Hard         Easy           Hard         Hard         Hard          No
Closure Table     2       Easy         Easy           Easy         Easy         Easy          Yes
71. Hierarchical Test Data
• Integrated Taxonomic Information System
- http://itis.gov/
- Free authoritative taxonomic information on plants,
animals, fungi, microbes
- 518,756 scientific names (as of Feb 2011)
74. California Poppy: ITIS Entry
SELECT * FROM Hierarchy
  WHERE hierarchy_string LIKE '%-18956';
hierarchy_string
202422-564824-18061-18063-18064-18879-18880-18954-18956
• ITIS data uses path enumeration, but I converted it to closure table
76. Hierarchical Data Classes
abstract class ZendX_Db_Table_TreeTable
extends Zend_Db_Table_Abstract
{
public function fetchTreeByRoot($rootId, $expand)
public function fetchBreadcrumbs($leafId)
}
77. Hierarchical Data Classes
class ZendX_Db_Table_Row_TreeRow
extends Zend_Db_Table_Row_Abstract
{
public function addChildRow($childRow)
public function getChildren()
}
class ZendX_Db_Table_Rowset_TreeRowset
extends Zend_Db_Table_Rowset_Abstract
{
public function append($row)
}
78. Using TreeTable
class ItisTable extends ZendX_Db_Table_TreeTable
{
    protected $_name = "longnames";
    protected $_closureName = "treepaths";
}
$itis = new ItisTable();
80. Breadcrumbs SQL
SELECT a.* FROM longnames AS a
INNER JOIN treepaths AS c ON a.tsn = c.a
WHERE (c.d = 18956)
ORDER BY c.l DESC
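The breadcrumbs query can be tried on a stand-in for the ITIS tables. Everything in this sketch beyond the SQL itself is an assumption: it uses SQLite via Python's sqlite3 module, reads the columns a, d, and l as ancestor, descendant, and path length, treats the California poppy's hierarchy_string as a root-to-leaf chain, and fills completename with placeholder values rather than real taxon names.

```python
import sqlite3

# TSNs from the hierarchy_string for the California poppy, root first.
CHAIN = [202422, 564824, 18061, 18063, 18064, 18879, 18880, 18954, 18956]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE longnames (tsn INTEGER PRIMARY KEY, completename TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE treepaths ("
    " a INTEGER NOT NULL, d INTEGER NOT NULL, l INTEGER NOT NULL,"
    " PRIMARY KEY (a, d))"
)
# Placeholder names only; the real ITIS names are not reproduced here.
conn.executemany(
    "INSERT INTO longnames VALUES (?, ?)",
    [(tsn, f"taxon-{tsn}") for tsn in CHAIN],
)
# One closure row per (ancestor, descendant) pair along the chain.
conn.executemany(
    "INSERT INTO treepaths VALUES (?, ?, ?)",
    [
        (CHAIN[i], CHAIN[j], j - i)
        for i in range(len(CHAIN))
        for j in range(i, len(CHAIN))
    ],
)

# The longest path comes first, so the root leads the breadcrumb trail.
breadcrumbs = [
    row[0]
    for row in conn.execute(
        "SELECT a.tsn FROM longnames AS a "
        "INNER JOIN treepaths AS c ON a.tsn = c.a "
        "WHERE (c.d = 18956) "
        "ORDER BY c.l DESC"
    )
]
print(breadcrumbs)
```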
81. How Does it Perform?
• Query profile = 0.0006 sec
• MySQL EXPLAIN:
table  type    key      ref    rows  extra
c      ref     tree_dl  const  9     Using where; Using index
a      eq_ref  primary  c.a    1
82. Dump Tree
$tree = $itis->fetchTreeByRoot(18880); // Papaveraceae
print_tree($tree);
function print_tree($tree, $prefix = '')
{
    print "{$prefix} {$tree->completename}\n";
    foreach ($tree->getChildren() as $child) {
        print_tree($child, "{$prefix}  ");
    }
}
84. Dump Tree SQL
SELECT d.*, p.a AS _parent
  FROM treepaths AS c
  INNER JOIN longnames AS d ON c.d = d.tsn
  LEFT JOIN treepaths AS p ON p.d = d.tsn
    AND p.a IN (202422, 564824, 18053, 18020)  -- show children of these nodes
    AND p.l = 1
  WHERE (c.a = 202422)
    AND (p.a IS NOT NULL OR d.tsn = 202422)
  ORDER BY c.l, d.completename;
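The reason for selecting each row's parent alongside it is that the application can then rebuild the tree in memory from one result set. Below is a simplified sketch of that idea, not the ZendX implementation: it assumes SQLite via Python's sqlite3 module, uses the small comment tree with a hypothetical nodes table in place of longnames, drops the filter that limits output to expanded nodes, and introduces the helper names fetch_tree and render for illustration.

```python
import sqlite3

# Parent of each node in the example thread (None marks the root).
PARENTS = {1: None, 2: 1, 3: 2, 4: 1, 5: 4, 6: 4, 7: 6}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE treepaths (a INTEGER, d INTEGER, l INTEGER, PRIMARY KEY (a, d))"
)
for node in PARENTS:
    conn.execute("INSERT INTO nodes VALUES (?, ?)", (node, f"node-{node}"))
    ancestor, length = node, 0
    while ancestor is not None:  # one closure row per ancestor, plus self
        conn.execute(
            "INSERT INTO treepaths VALUES (?, ?, ?)", (ancestor, node, length)
        )
        ancestor, length = PARENTS[ancestor], length + 1


def fetch_tree(root_id):
    """Fetch a subtree in one query and link each row to its parent."""
    rows = conn.execute(
        "SELECT d.id, d.name, p.a AS _parent "
        "FROM treepaths AS c "
        "JOIN nodes AS d ON c.d = d.id "
        "LEFT JOIN treepaths AS p ON p.d = d.id AND p.l = 1 "
        "WHERE c.a = ? ORDER BY c.l, d.name",
        (root_id,),
    ).fetchall()
    tree = {node_id: {"name": name, "children": []} for node_id, name, _ in rows}
    for node_id, _, parent_id in rows:
        if node_id != root_id:
            tree[parent_id]["children"].append(tree[node_id])
    return tree[root_id]


def render(node, prefix=""):
    """Return the tree as indented lines, one per node."""
    lines = [f"{prefix}{node['name']}"]
    for child in node["children"]:
        lines.extend(render(child, prefix + "  "))
    return lines


lines = render(fetch_tree(4))
print("\n".join(lines))
```

Ordering by path length guarantees that a parent row is seen before its children, so the linking pass never meets a child whose parent is missing from the result.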
86. How Does it Perform?
• Query profile = 0.20 sec on MacBook Pro
• MySQL EXPLAIN:
table  type    key       ref         rows    extra
c      ref     tree_adl  const       114240  Using index; Using temporary; Using filesort
d      eq_ref  primary   c.d         1
p      ref     tree_dl   c.d, const  1       Using where; Using index