GitLab, the open source alternative to GitHub written in Rails, does not scale automatically out of the box: it stores its Git repositories on a single filesystem, which makes storage capacity hard to expand. Rather than attaching a NAS server, we decided to replace the filesystem with cloud-based object storage (such as S3). This required changes to both the Ruby layer and the deeper C layers. In this talk, we show how we made the change and how we overcame the performance loss introduced by network I/O.
28. GitLab Sharding
• Introduces Sidekiq sharding as well
• Introduces many changes to the application layer as well
- need to have super user authentication
- need to eliminate every page with requests across shards (e.g. admin page of repo sizes)
• Tedious changes on the application level.
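To make the cross-shard problem concrete, here is a minimal sketch of why a page such as the admin list of repo sizes is awkward under sharding. The shard names, the hash-based routing rule, and every function name are assumptions for illustration; the slides do not say how repositories would be assigned to shards.

```python
import hashlib

# Hypothetical shard names; a real deployment would list actual hosts.
SHARDS = ["shard-0", "shard-1", "shard-2"]


def shard_for(repo_path: str) -> str:
    """Pick a shard from a stable hash of the repository path (assumed rule)."""
    digest = hashlib.sha1(repo_path.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:4], "big") % len(SHARDS)]


def shards_for_repo_page(repo_path: str) -> set:
    """A page about one repository only needs the shard that holds it."""
    return {shard_for(repo_path)}


def shards_for_size_report(repo_paths) -> set:
    """A report over many repositories needs every shard that holds one of them."""
    return {shard_for(path) for path in repo_paths}
```

A single-repository page stays on one shard, while a report across all repositories fans out to most or all of them, which is the kind of page the slide says has to be eliminated.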
29. How to deal with FS?
• 🤔 Hardware Network-Attached Storage?
• 🤔 Software Network-Attached Storage?
• 🤔 Remote Procedure Calls to FS shards?
• 🤔 Kill it?
30. • Hard-NAS: Alibaba has non-IOE policies.
• Soft-NAS: Alibaba does not have it yet.
• RPC: GitRPC? Good. GitHub does that.
• Kill FS: Use the cloud. Try something new!
31. by “cloud” we mean…
• Amazon S3: Amazon Simple Storage Service
• Alibaba OSS: Alibaba Object Storage Service
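As a rough illustration of why object storage can stand in for the filesystem, the sketch below stores Git objects as flat key/value pairs. Git objects are content-addressed (the ID is the SHA-1 of a `type size\0` header plus the content), so a bucket keyed by object ID is a natural fit. The class names, the key layout, and the in-memory dictionary standing in for an S3 or OSS bucket are all assumptions, not the backend presented in the talk.

```python
import hashlib
import zlib


class InMemoryObjectStore:
    """Stand-in for an S3/OSS bucket: flat keys mapped to bytes, no directories."""

    def __init__(self):
        self._blobs = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def exists(self, key: str) -> bool:
        return key in self._blobs


class ObjectStoreOdb:
    """Hypothetical object database that keeps Git objects in a key/value store."""

    def __init__(self, store, prefix: str):
        self.store = store
        self.prefix = prefix  # assumed layout: one key prefix per repository

    def _key(self, oid: str) -> str:
        return f"{self.prefix}/objects/{oid}"

    def write(self, obj_type: str, content: bytes) -> str:
        # Git hashes "<type> <size>\0<content>" with SHA-1 to get the object ID.
        raw = f"{obj_type} {len(content)}".encode("ascii") + b"\x00" + content
        oid = hashlib.sha1(raw).hexdigest()
        key = self._key(oid)
        if not self.store.exists(key):  # same content, same key: skip the upload
            self.store.put(key, zlib.compress(raw))
        return oid

    def read(self, oid: str):
        raw = zlib.decompress(self.store.get(self._key(oid)))
        header, _, content = raw.partition(b"\x00")
        obj_type, size = header.decode("ascii").split(" ")
        if int(size) != len(content):
            raise ValueError(f"corrupt object {oid}")
        return obj_type, content
```

Every `get` and `put` here would be a network round trip against a real bucket, which is where the performance loss mentioned in the abstract comes from.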
32. gitlab-rails uses libgit2, git, and grit
• grit
- used in wikis
- via gollum-lib
- via gollum-grit_adapter
- eliminate-able via gollum-rugged_adapter
33. gitlab-rails uses libgit2, git, and grit
• libgit2
- via gitlab_git
- via rugged
- backend replace-able
• git
- via gitlab-shell
- via gitlab-workhorse
- via popen
- backend hard-to-replace (FS)
72. • develop libgit2 backends for AWS S3
• gitlab: favour libgit2, eliminate direct calls to git
• gitlab: add settings to choose backends
• gollum: use rugged as the default
• libgit2: improve performance, e.g. pack builder
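The "add settings to choose backends" item could look something like the sketch below: a registry maps a setting value to a backend class, with the filesystem as the default. The setting keys, class names, and option names are hypothetical; GitLab's actual configuration is not shown in the slides.

```python
from dataclasses import dataclass


@dataclass
class FilesystemBackend:
    """Hypothetical backend that keeps repositories on a local path."""
    path: str


@dataclass
class ObjectStorageBackend:
    """Hypothetical backend that keeps repositories in a bucket."""
    bucket: str
    prefix: str = ""


# Assumed setting values; neither name comes from GitLab's real settings.
BACKENDS = {
    "filesystem": FilesystemBackend,
    "object_storage": ObjectStorageBackend,
}


def build_backend(settings: dict):
    """Instantiate the backend named by settings["backend"] (default: filesystem)."""
    options = dict(settings)  # copy so the caller's dict is left untouched
    kind = options.pop("backend", "filesystem")
    try:
        backend_class = BACKENDS[kind]
    except KeyError:
        raise ValueError(f"unknown repository backend: {kind!r}") from None
    return backend_class(**options)
```

For example, `build_backend({"backend": "object_storage", "bucket": "repos"})` returns an `ObjectStorageBackend`, while omitting the `backend` key keeps the filesystem behaviour.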