This presentation unveils roboMUA, an avant-garde AI solution poised to revolutionize the beauty industry by championing inclusivity and diversity. At its core, roboMUA addresses the critical issue of skin diversity in computer vision technology. Traditional algorithms falter due to a lack of diverse data, leading to biased and inaccurate results. This not only perpetuates customer dissatisfaction but also contributes to financial and environmental waste.
roboMUA’s mission is to dismantle these barriers by curating the most comprehensive skin tone datasets, developing predictive and generative AI models that cater to an extensive range of skin shades, and fostering a community that connects users with beauty professionals attuned to their unique needs.
The market opportunity for such innovation is vast, with the beauty industry valued at approximately $508 billion. roboMUA’s technology has the potential to tap into this lucrative market by offering personalized product matches, reducing returns, and enhancing customer satisfaction.
The presentation delves into the specifics of roboMUA’s offerings, including foundation shade matching for over 100 skin tones, virtual try-ons for hairstyles, and AI-driven shapewear recommendations. It also highlights the platform’s ability to bridge the gap between tech and local beauty expertise through a sophisticated chatbot service.
Furthermore, the deck presents a compelling use of NVIDIA technologies: CUDA and cuBLAS for computing masked mean colors when applying virtual foundation, and TensorRT, which allows us to accurately detect faces and precisely locate facial landmarks for makeup applications like lipstick. Throughout, it emphasizes the transformative impact roboMUA could have on the industry.
In essence, roboMUA embodies a future where technology mirrors the rich tapestry of human diversity, empowering individuals to express their authentic beauty without constraints. The presentation encapsulates this vision, demonstrating how roboMUA is not just a product but a movement towards a more inclusive and equitable beauty landscape.
2. THE POWER OF NVIDIA TECHNOLOGIES
NVIDIA technologies play a crucial role in boosting online beauty retail by enhancing the user experience and improving the efficiency of key processes.
CUDA & cuBLAS: With NVIDIA CUDA and cuBLAS, we can compute masked mean colors for a seamless foundation application.
TensorRT: Enables accurate face detection and landmark prediction, ensuring precise and efficient beauty analysis and virtual makeup application.
3. Compute Unified Device Architecture (CUDA)
CUDA is NVIDIA’s parallel computing platform and programming model.
It allows software to use NVIDIA graphics processing units (GPUs) for general-purpose processing, an approach called general-purpose computing on GPUs (GPGPU).
Algorithms written for CUDA are accelerated by thousands of parallel threads running on the GPU.
With CUDA, we leverage the massive parallelism of GPUs to accelerate image segmentation processes.
This is particularly useful for computationally intensive tasks like processing videos for live virtual makeup try-ons.
Overall, CUDA is a powerful platform that helps unlock the full potential of virtual try-on applications and take
advantage of the incredible performance of NVIDIA GPUs.
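As a rough illustration of the one-thread-per-pixel pattern described above, the following Python sketch simulates how a CUDA kernel maps threads to pixels. It is not roboMUA's code: the function names, the block size, and the use of simple thresholding as a stand-in for a real segmentation model are all assumptions made for the example. On a GPU the loop bodies would run concurrently rather than one after another.

```python
def threshold_kernel(block_idx, thread_idx, block_dim, pixels, mask, threshold):
    """Body of one simulated GPU thread: classify a single pixel."""
    # Global index, the Python analogue of blockIdx.x * blockDim.x + threadIdx.x.
    i = block_idx * block_dim + thread_idx
    # Bounds guard: the last block may contain more threads than remaining pixels.
    if i < len(pixels):
        mask[i] = 1 if pixels[i] >= threshold else 0


def launch(pixels, threshold, block_dim=4):
    """Run the kernel for every (block, thread) pair and return the mask."""
    mask = [0] * len(pixels)
    grid_dim = (len(pixels) + block_dim - 1) // block_dim  # ceiling division
    for block_idx in range(grid_dim):          # sequential here, parallel on a GPU
        for thread_idx in range(block_dim):
            threshold_kernel(block_idx, thread_idx, block_dim, pixels, mask, threshold)
    return mask


pixels = [12, 200, 90, 130, 255, 3, 128]  # hypothetical grayscale values
print(launch(pixels, threshold=128))      # [0, 1, 0, 1, 1, 0, 1]
```

Because each pixel is handled independently, the work scales across as many threads as the GPU can schedule, which is what makes per-frame video processing feasible.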
4. Basic Linear Algebra Subprograms (cuBLAS)
cuBLAS is an implementation of BLAS (Basic Linear Algebra Subprograms) on top of the NVIDIA CUDA runtime.
It allows us to access the computational resources of the NVIDIA Graphics Processing Unit (GPU).
cuBLAS is designed to leverage NVIDIA GPUs for various matrix multiplication operations.
It provides a set of basic linear algebra operations, such as vector addition and scaling, vector and matrix
multiplication, and matrix inversion.
Using cuBLAS can significantly speed up computations, especially when dealing with large matrices or when high
precision is required.
We compute the masked mean colors by performing a matrix-vector multiplication on the color and mask data of an
image. These operations are parallelized and performed more efficiently using the CUDA platform and the cuBLAS
library. This is particularly useful in image processing tasks such as virtual makeup applications.
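The masked mean computation can be sketched in plain Python as follows. This is a minimal illustration, assuming the color data is laid out as a 3 x N matrix (one row per channel) and the mask as a length-N vector of 0/1 weights; the layout, function name, and sample values are assumptions for the example. On the GPU, the same product corresponds to a GEMV-style call (cuBLAS provides `cublasSgemv` for single precision), followed by a division by the mask sum.

```python
def masked_mean_color(colors, mask):
    """Mean color over the masked pixels.

    colors: three rows (R, G, B), each holding one value per pixel.
    mask:   one weight per pixel (1 inside the region, 0 outside).
    """
    total = sum(mask)
    if total == 0:
        raise ValueError("mask selects no pixels")
    # Matrix-vector product: each channel row dotted with the mask vector.
    sums = [sum(c * m for c, m in zip(row, mask)) for row in colors]
    return [s / total for s in sums]


colors = [
    [200, 100, 50, 180],  # R
    [150, 100, 40, 130],  # G
    [120, 100, 30, 100],  # B
]
mask = [1, 0, 0, 1]       # only the first and last pixels belong to the region
print(masked_mean_color(colors, mask))  # [190.0, 140.0, 110.0]
```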
6. FACE DETECTION with TensorRT
Face detection involves finding the bounding box coordinates of each face present in the input image. A deep learning model, such as a convolutional neural network (CNN), optimized and run with TensorRT, performs this task efficiently and accurately.
[Figure: Input Image → Pre-Processing → Face Detection]
TensorRT is a high-performance deep learning inference optimizer and runtime library developed by NVIDIA for the deployment of trained models on GPUs.
It is designed to optimize and accelerate the inference of deep learning models, making it an excellent choice for real-time face detection. It achieves this by combining the following:
Layer Fusion: convolution, bias, and ReLU layers fused to form a single layer
Precision Calibration: from FP32 precision to lower precisions like FP16 or INT8
Kernel Auto-Tuning: select the best layers, algorithms, and optimal batch size based on the target GPU platform
Dynamic Tensor Memory Management: optimizes memory usage by allocating tensor memory based on each tensor’s size and lifetime
Multi-Stream Execution: parallel execution of independent inference tasks
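To make the layer fusion item concrete, here is a small Python sketch comparing three separate passes (convolution, bias, ReLU) with a single fused pass. It only illustrates the idea; it does not use or reproduce TensorRT's internals, and the 1-D convolution, function names, and sample values are assumptions chosen to keep the example short.

```python
def conv1d(x, w):
    """Valid 1-D cross-correlation of signal x with weights w."""
    n = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) for i in range(n)]


def unfused(x, w, b):
    """Three separate layers, each writing an intermediate result."""
    y = conv1d(x, w)                   # pass 1: convolution
    y = [v + b for v in y]             # pass 2: bias
    return [max(0.0, v) for v in y]    # pass 3: ReLU


def fused(x, w, b):
    """One pass: convolution, bias, and ReLU computed per output element."""
    n = len(x) - len(w) + 1
    return [
        max(0.0, sum(x[i + k] * w[k] for k in range(len(w))) + b)
        for i in range(n)
    ]


x = [1.0, -2.0, 3.0, 0.5, -1.0]
w = [0.5, -1.0, 0.25]
b = 0.1
assert unfused(x, w, b) == fused(x, w, b)  # same result, fewer passes over memory
print(fused(x, w, b))
```

The fused version produces the same values while avoiding two intermediate buffers, which is where the savings in memory traffic and kernel launches come from.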
...
7. LANDMARK PREDICTION with TensorRT
Landmark prediction involves identifying and localizing specific facial landmarks that provide valuable information
about the shape and structure of the face. TensorRT can be used to optimize deep learning models for landmark
prediction, enabling real-time performance on GPUs.
[Figure: Landmark Prediction → Output Image]
TensorRT is an optimization tool provided by NVIDIA that accelerates deep learning inference on GPUs, making it ideal
for landmark prediction. By using TensorRT we achieve lower latency and higher throughput from our models when
deployed for inference.
Since TensorRT is a general-purpose deep learning inference optimizer and runtime that works with models trained in the major deep learning frameworks, we also use it for inference on our skin shade detection models, which are built with TensorFlow and PyTorch.
...
9. THE ART OF BLUSH APPLICATION
Attaining a natural and flawless blush application requires crucial steps like:
Prepping the Skin
Choosing the Right Shade
Picking the Right Brush
Applying to the Apples of the Cheeks
Blending for a Natural Finish
Building & Layering
Setting with Setting Spray
With GPU-accelerated Gaussian blur, our image
segmentation technology can enhance the visual
representation of the blush application, providing a
realistic and detailed guide for makeup enthusiasts.
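A Gaussian blur of this kind can be sketched in plain Python as below. The example softens the hard edge of a blush mask so the color fades out gradually. It is an illustrative CPU version under assumed parameters (the radius, sigma, edge clamping, and the 5 x 5 mask are all hypothetical); a GPU version would compute each output pixel in its own thread.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian weights of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Blur one row or column, repeating the border sample at the edges."""
    radius = len(kernel) // 2
    last = len(values) - 1
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # clamp index into range
            acc += values[j] * w
        out.append(acc)
    return out


def blur_2d(image, kernel):
    """Separable blur: rows first, then columns."""
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


# Hypothetical hard-edged blush mask: a single fully opaque pixel in the center.
mask = [[0.0] * 5 for _ in range(5)]
mask[2][2] = 1.0
soft = blur_2d(mask, gaussian_kernel(radius=1, sigma=1.0))
for row in soft:
    print([round(v, 3) for v in row])
```

The blurred mask can then serve as a per-pixel opacity, so the blush is strongest at the center and blends smoothly into the surrounding skin.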
Blush application is an essential step in any makeup
routine as it adds a healthy flush of color to the cheeks
and enhances the overall complexion.
DIFFERENT FACES. DIFFERENT SHAPES.
11. ACHIEVING REALISTIC MAKEUP LOOKS
The following are important when it comes to achieving realistic makeup looks and effects:
Color Matching
Intensity Control
Opacity Control
Texture & Finish
Enhancing your natural beauty comes easily when these factors are properly accounted for, so we make sure our virtual try-on (VTO) code handles each of them well.
VIRTUAL LIPSTICK APPLICATION
12. Virtual Lipstick By Blending the Original & Transformed Pixels using Opacity
CODE SNIPPET
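The slide's own snippet is not preserved in this text. As a stand-in, here is a minimal Python sketch of opacity blending, assuming RGB pixels stored as tuples, a binary lip mask, and a flat lipstick color as the "transformed" pixel. All names and values are hypothetical and are not taken from roboMUA's code.

```python
def blend_pixel(original, transformed, opacity):
    """Alpha-blend one RGB pixel.

    opacity 0.0 keeps the original pixel; 1.0 gives the fully transformed one.
    """
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be in [0, 1]")
    return tuple(
        round((1 - opacity) * o + opacity * t)
        for o, t in zip(original, transformed)
    )


def apply_lipstick(pixels, lip_mask, color, opacity):
    """Blend `color` into pixels where lip_mask is 1; leave the rest untouched."""
    return [
        blend_pixel(p, color, opacity) if m else p
        for p, m in zip(pixels, lip_mask)
    ]


pixels = [(200, 150, 140), (180, 90, 100)]  # one skin pixel, one lip pixel
lip_mask = [0, 1]
print(apply_lipstick(pixels, lip_mask, color=(160, 20, 40), opacity=0.5))
# [(200, 150, 140), (170, 55, 70)]
```

Keeping opacity as an explicit parameter is what would let a user-facing slider adjust how strongly the lipstick color covers the natural lip tone.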
13. THE IMPACT OF AI MODELS & GPU TECHNOLOGY
BENEFITS
The integration of AI models and GPU technology in online beauty retail offers several benefits:
Enhanced Personalization
Virtual Try-On
Improved Customer Experience
Increased Sales
FUTURE POTENTIAL
The impact of AI models and GPU technology in online beauty retail is still evolving.
As technology advances, we can expect even more sophisticated AI models and GPU capabilities.
This will further enhance the customer experience, drive innovation, and shape the future of online beauty retail.
14. IMPLEMENTING AI-POWERED SOLUTIONS
PRODUCT RECOMMENDATION
Using AI-powered recommendation systems to suggest personalized beauty products to customers. Some of the beauty brands we have partnered with are:
Nude Barre (Shapewear & Bodywear)
MyNudeShade (Custom Tights)
VIRTUAL TRY-ON
Using facial recognition and image processing to overlay virtual makeup such as lipstick, as well as different hairstyles and hair colors, on customers’ heads and faces.
15. DEPLOYMENT & OVERCOMING CHALLENGES
MODEL DEPLOYMENT & ACCESSIBILITY
Models are serialized for deployment to production, including skin shade and undertone detection models and the face detection and landmark prediction models used in VTO features.
TensorRT is used for inference when utilizing these models.
Models and VTO features are served via APIs to our web app and our Android and iOS apps.
Models are also available as published APIs for developers to integrate into their systems.
CHALLENGES ENCOUNTERED
Continuous building of image datasets is needed because sufficient data for the problem space is lacking.
Different lighting conditions, unusual head poses, tilted heads, and the zoom level of the uploaded image can affect VTO results.
MITIGATION STRATEGIES
Users are provided with specific directions on how to take pictures for VTO.
A model is being developed to instruct users in real time on proper placement of the face and head before a picture is taken.
A pre-defined frame is being created for users to place their faces in for optimum results.
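The pre-defined frame idea can be illustrated with a simple geometric check. The sketch below is a hypothetical example, not the model under development: it assumes a face detector has already returned a bounding box, and the function name, box format, messages, and the `min_fill` threshold are all assumptions.

```python
def check_face_placement(face_box, frame_box, min_fill=0.5):
    """Return a short instruction for the user, or "ok".

    Boxes are (left, top, right, bottom) in pixels.
    min_fill is the smallest acceptable ratio of face area to frame area.
    """
    fl, ft, fr, fb = face_box
    gl, gt, gr, gb = frame_box
    # The whole face must sit inside the guide frame.
    if fl < gl or ft < gt or fr > gr or fb > gb:
        return "move your face inside the frame"
    # A face that is too small suggests the photo is zoomed out too far.
    face_area = (fr - fl) * (fb - ft)
    frame_area = (gr - gl) * (gb - gt)
    if face_area < min_fill * frame_area:
        return "move closer to the camera"
    return "ok"


frame = (100, 100, 500, 600)  # hypothetical guide frame
print(check_face_placement((150, 150, 450, 550), frame))  # ok
print(check_face_placement((50, 150, 450, 550), frame))   # move your face inside the frame
print(check_face_placement((200, 200, 300, 300), frame))  # move closer to the camera
```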
16. ENHANCING VIRTUAL TRY-ON: IMPROVEMENTS
LIPSTICK APPLICATION: PRECISION & CUSTOMIZATION
Our lipstick feature is one of our best VTO solutions, boasting approximately 90% accuracy for natural-looking application.
A critical area for improvement is fixing the white patches that sometimes appear at the corners of the lips when the lips are slightly parted.
Balancing intensity and opacity is key. We’re considering adding sliders so users can freely adjust these factors to fit the current image, since the right values for a lipstick look are not universal.
BLUSH APPLICATION: CHALLENGES & SOLUTIONS
The biggest challenge for the blush application is accurately capturing the contour of the face so that the blush is placed correctly on the apples of the cheeks.
This issue is compounded by the fact that five main face shapes need to be catered for: oval, round, square, oblong, and heart. These greatly affect how the model applies the blush in terms of correct placement and blend.
The accuracy of the current solution sits at about 68%, since one size does not fit all in this scenario.
17. LEVERAGING TENSORS IN VTO FEATURES
The introduction of NVIDIA technologies greatly improved the processing and response times for our VTO features, reducing them from about 10 seconds initially to under 1 second in most cases.
We have several VTO features that are built using tensors and thus benefit from NVIDIA technologies to rapidly provide VTO capabilities to users. These include hair dye/color, hairstyles, lipstick, blush, and foundation, with more in the development pipeline.
18. FINAL THOUGHTS
FUTURE OPPORTUNITIES
Augmented Reality Integration: Image segmentation can be integrated with augmented reality technology to provide virtual try-on experiences, allowing customers to see how products will look on them before making a purchase.
Virtual Beauty Advisors: By analyzing customer images and preferences, image segmentation can be used to create virtual beauty advisors that provide personalized product recommendations and application techniques.
CONCLUSION
Image segmentation has the potential to revolutionize the
online beauty retail industry by improving product discovery,
personalization, and even visual search capabilities. By
leveraging this technology, beauty retailers can enhance the
customer experience and drive sales.