A presentation by Wei Wang at the Apache Pegasus meetup in 2022.
Apache Pegasus is a horizontally scalable, strongly consistent and high-performance key-value store.
Learn more about Pegasus: https://pegasus.apache.org, https://github.com/apache/incubator-pegasus
19. Data generation: Pegasus-Spark
Convert to SST files for bulk load
[Diagram: original data -> Transform (Pegasus-Spark): Distinct, Repartition, Sort -> HDFS (SST files) -> nodes]
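The three transform steps named on this slide (Distinct, Repartition, Sort) can be illustrated with a minimal plain-Python sketch. It is not the Pegasus-Spark implementation: the function names, the CRC32-based partitioner, and the "last value wins" deduplication rule are all assumptions made for illustration. The one firm constraint it reflects is that an SST file stores its keys in sorted order, so each partition's rows must be sorted before they are written.

```python
import zlib
from collections import defaultdict


def partition_of(key: bytes, partition_count: int) -> int:
    # Hypothetical partitioner: CRC32 of the key modulo the partition count.
    # The real Pegasus hashing scheme is not described on the slide.
    return zlib.crc32(key) % partition_count


def prepare_for_sst(records, partition_count):
    """Return {partition_id: [(key, value), ...]} ready to be written as SST data."""
    # Distinct: keep a single value per key (assumption: the last one wins).
    unique = {}
    for key, value in records:
        unique[key] = value

    # Repartition: group rows by the partition that would own each key.
    partitions = defaultdict(list)
    for key, value in unique.items():
        partitions[partition_of(key, partition_count)].append((key, value))

    # Sort: order rows by key inside each partition, as SST files require.
    return {
        pid: sorted(rows, key=lambda kv: kv[0])
        for pid, rows in partitions.items()
    }


if __name__ == "__main__":
    original_data = [(b"user2", b"x"), (b"user1", b"y"), (b"user2", b"z")]
    for pid, rows in sorted(prepare_for_sst(original_data, 4).items()):
        print(pid, rows)
```

In a Spark job the same three steps would run distributed across executors, and each sorted partition would then be written out as an SST file on HDFS for bulk load.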