Lei Wang, China Everbright Bank
Track 2: Ecology and Solutions
https://open.mi.com/conference/hbasecon-asia-2019
THE COMMUNITY EVENT FOR APACHE HBASE™
July 20th, 2019 - Sheraton Hotel, Beijing, China
https://hbase.apache.org/hbaseconasia-2019/
6. Research Background
Existing Products or Solutions
1. Native Filter: low performance
2. HBase + Solr (ES): complex architecture, insufficient performance, high maintenance cost
3. Phoenix: heavyweight solution, inactive community, incomplete functionality
4. HBase on Cloud: unusable because of security requirements
Our requirements
1. Non-Invasive
2. High performance
3. Universal
4. Simple architecture
5. Support transaction consistency
7. Pharos
1. Name
Comes from the English word ‘pharos’ (a lighthouse)
2. Business Scenarios
• Read-only, T+1 batch load
• Read-mostly with few writes (experimental)
3. Design Principle
Non-Invasive, Simple architecture
8. Research Process
2018.4: Startup
2018.11: V0.1 Release (Single Index, Multi Condition, Multi Data Type)
2019.3: V0.2 Release (Multi Index, Sorting, Paging, Cache)
2019.7: V0.22 Release (Index Builder Improvement, Refactoring Code)
2019.11: V0.3 (Bitmap Index, CBO Improvement, More Complex Conditions)
Todo…: Transaction Consistency Index
9. Pharos V0.22 Features (on HBase 1.2.6 or CDH 5.8.3-HBase 1.2.0)
1. Single Index (single-column, multi-column), Multi Index.
2. Paging, Sorting.
3. Multi-Condition Query, including equal, less (or equal), greater (or equal).
4. AND / OR logic operations.
5. Multiple Data Types, including Char, Date, Double, and so on.
6. Simple Function Compute, for example record count.
7. Batch Index Creation.
14. Global Index vs. Partition (Local) Index
Global Index
• Supports unique indexes.
• Index creation and updates become part of distributed transactions, so write performance is poor.
• Queries cross different nodes, so query performance may suffer.
Partition Index
• Index and data are co-located, so queries can be pushed down to each node, which gives good performance.
• Avoids distributed transactions.
• Does not support unique indexes or other global constraints.
15. Storage Policy
Shadow Column Family
Index and data live in the same region but in different column
families. We only need to control how the start of the index
row key is generated. It is non-invasive.
Single Index Table
The region is the smallest unit of balancing. We must
guarantee that each index region is placed together with the
corresponding data region, so we must modify the
balancer. It is invasive.
16. Index Data Structure
Index Key
1. Start key, to keep the index co-located with the data
2. Index name / number
3. Indexed column value
4. Reference data row key
Index Value
1. Version info
2. Metadata for deserialization
3. Transaction flag
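A minimal sketch of how the four index key components listed above could be concatenated into a single row key. The byte layout, the 2-byte index number, the `SEP` separator, and the function name `build_index_key` are assumptions made for illustration; they are not Pharos's actual encoding.

```python
import struct

SEP = b"\x00"  # hypothetical separator; assumes it never occurs inside a column value


def build_index_key(region_start_key: bytes, index_id: int,
                    column_value: bytes, data_row_key: bytes) -> bytes:
    """Concatenate: region start key | index number | indexed value | data row key.

    Prefixing with the region start key makes the index entry sort at or after
    the start of the region that holds the referenced data row.
    """
    return (region_start_key
            + struct.pack(">H", index_id)   # fixed-width, big-endian index number
            + column_value + SEP            # separator ends the variable-length value
            + data_row_key)


# Two entries of the same index in the same region sort by indexed value.
k1 = build_index_key(b"r1", 1, b"alice", b"row9")
k2 = build_index_key(b"r1", 1, b"bob", b"row1")
```

Because the indexed value comes before the data row key, a range scan over one index in one region returns entries in value order, which is what region-side sorting can rely on.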
17. Sorting Mechanisms
Client as Global Coordinator, Keep Arch Simple
Two-Phase Sorting
1. Region-Side Sorting
Based on the natural order of the index
2. Client-Side Sorting
Merge the results from the regions; if data skew occurs,
query again.
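The client-side phase can be sketched as a k-way merge over result lists that each region has already returned in index order. This is an illustrative sketch only: `merge_region_results` and the sample data are hypothetical, and the re-query on data skew is not modeled here.

```python
import heapq
from itertools import islice


def merge_region_results(region_results, limit):
    """Merge per-region lists (each already sorted) and keep the first `limit` entries."""
    return list(islice(heapq.merge(*region_results), limit))


# Each inner list stands for what one region returned, already in index order.
regions = [[1, 4, 9], [2, 3, 10], [5]]
top4 = merge_region_results(regions, 4)
```

`heapq.merge` is lazy, so the client only consumes as many entries as the limit requires rather than sorting everything again.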
18. Paging Mechanisms
Reason
1. Paging is a universal requirement.
2. The matched index entries usually do not fit in memory.
Design Strategy
Add a global session that shields internal complexity.
One session breakpoint maps to multiple regions' breakpoints.
Implementation
1. Assuming that the data is evenly distributed, the page size
is divided among the regions.
2. On the region side, we limit the number of index entries returned and cache
the breakpoint.
3. On the client side, we merge the results; if there are not enough, we query again.
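The three implementation steps can be sketched with an in-memory simulation. Everything here is an assumption for illustration: `Region` stands in for a region-side scanner, `PagingSession` for the global session, and breakpoints are plain offsets kept on the client for brevity (the slides describe caching them on the region side).

```python
import heapq
import math
from collections import deque


class Region:
    """Stand-in for one region's index entries, kept in index order."""

    def __init__(self, entries):
        self.entries = sorted(entries)

    def fetch(self, breakpoint, count):
        """Return up to `count` entries starting at offset `breakpoint`."""
        return self.entries[breakpoint:breakpoint + count]


class PagingSession:
    """One session breakpoint that maps to one breakpoint per region."""

    def __init__(self, regions):
        self.regions = regions
        self.breakpoints = [0] * len(regions)  # entries already consumed per region

    def next_page(self, page_size):
        # Step 1: spread the page size evenly over the regions.
        quota = max(1, math.ceil(page_size / len(self.regions)))
        buffers = [deque() for _ in self.regions]
        heap = []  # (entry, region index) for the head of each region's buffer

        def refill(i):
            # Step 2: the region returns a bounded batch from its breakpoint.
            buffers[i].extend(self.regions[i].fetch(self.breakpoints[i], quota))
            if buffers[i]:
                heapq.heappush(heap, (buffers[i].popleft(), i))

        for i in range(len(self.regions)):
            refill(i)

        # Step 3: merge on the client; query a region again when its batch runs out.
        page = []
        while heap and len(page) < page_size:
            entry, i = heapq.heappop(heap)
            page.append(entry)
            self.breakpoints[i] += 1
            if buffers[i]:
                heapq.heappush(heap, (buffers[i].popleft(), i))
            else:
                refill(i)
        return page


session = PagingSession([Region([1, 2, 3, 4, 5, 6]), Region([10, 11]), Region([7, 20])])
first = session.next_page(3)
second = session.next_page(3)
```

In this example the first region holds most of the small values, so its quota of one entry per fetch is exhausted repeatedly and the session queries it again, which is the skew case the slide mentions.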
19. Cache Mechanisms
Choose local cache instead of distributed cache
Keep Arch Simple
Client-side cache + server-side cache
20. Index Builder
Avoid rebuilding indexes due to region splitting.
Depending on the row key design, a bulk load may lead to
region splitting.
After data loading, stable regions can be obtained,
so we can then create the indexes and keep them co-located
with the reference data.
23. Transaction Consistency
Inspired by Google’s Percolator
Bob -> Joe $7
A variant of two-phase commit
The state of the primary data is the state of the
transaction.
In the write process, only the primary state is modified.
In a subsequent query, the state of the secondary data
can be updated asynchronously based on the
primary.
Step 4 -> Transaction complete
24. Transaction Consistency
In the write process, we can
make most of the data consistent, but
not all.
In the read process, we confirm the
transaction state from the data row state,
then update the index state.
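A minimal in-memory sketch of this read-time repair, assuming the data row acts as the primary and the index entry as the secondary. The `Store` class, the `PENDING`/`COMMITTED` flags, and the `fail_before_commit` switch are hypothetical names used only to illustrate the idea; the real protocol (locks, timestamps, cleanup of abandoned writes) is not modeled.

```python
PENDING, COMMITTED = "pending", "committed"


class Store:
    def __init__(self):
        self.data = {}   # row key -> {"value": ..., "state": ...}   (primary)
        self.index = {}  # (indexed value, row key) -> transaction flag (secondary)

    def write(self, row_key, value, fail_before_commit=False):
        # Stage both the index entry and the data row as pending.
        self.index[(value, row_key)] = PENDING
        self.data[row_key] = {"value": value, "state": PENDING}
        if fail_before_commit:
            return  # simulate a writer that stops before committing
        # Commit by changing only the primary state; the index flag is
        # deliberately left for a later read to repair.
        self.data[row_key]["state"] = COMMITTED

    def query(self, value):
        hits = []
        for (v, row_key), flag in list(self.index.items()):
            if v != value:
                continue
            if flag == PENDING:
                row = self.data.get(row_key)
                if row and row["state"] == COMMITTED and row["value"] == value:
                    self.index[(v, row_key)] = COMMITTED  # repair index state on read
                else:
                    continue  # primary not committed: hide the entry
            hits.append(row_key)
        return sorted(hits)


store = Store()
store.write("r1", "x")
store.write("r2", "x", fail_before_commit=True)
rows = store.query("x")
```

After the query, the index entry for `r1` carries the committed flag, while the entry for `r2` stays pending and invisible because its data row was never committed.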