SlideShare a Scribd company logo
1 of 7
Download to read offline
Accelerate natural language processing with AWS
EC2 M7i instances featuring 4th Gen Intel Xeon
Scalable processors
Compared to AWS M7g instances with AWS Graviton3 processors,
AWS M7i instances with Intel Xeon Scalable processors analyzed up
to 10.65 times the sentences per second and delivered up to 8.62
times the performance per dollar in RoBERTa model testing
Natural language processing (NLP) supports many everyday applications, from search result
suggestions to text autocorrections to text generation chatbots. Deep learning frameworks for NLP,
such as Bidirectional Encoder Representations from Transformers (BERT) and its variations, demand
robust compute power to analyze text and make predictions. For businesses running these workloads
in the cloud, selecting the right instance can enable faster text analysis.
Using Intel®
Extension for PyTorch, we tested the deep learning performance of two types of Amazon
Web Services (AWS) Elastic Cloud Compute (EC2) instances: M7i instances enabled by 4th Gen Intel
Xeon®
Scalable processors and M7g instances enabled by AWS Graviton3 processors. To measure
how the instances performed at different sizes, we tested both types with 4, 16, and 64 vCPUs. We
ran tests with the RoBERTa model, a variant of BERT, using a benchmark from the Intel Model Zoo. We
optimized the model for each processor type.
In our tests, the M7i instances analyzed up to 10.65 times the sentences per second compared to
the M7g instances. These results indicate that businesses running similar RoBERTa workloads could
analyze textual data faster and complete more work per instance with M7i instances featuring 4th Gen
Intel Xeon Scalable processors.
the throughput on
16vCPU instances
the throughput on
64vCPU instances
Up to
10.65x
the performance
per dollar on
64vCPU instances
Up to
4.66x
Up to
8.62x
This project was commissioned by Intel.
Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors October 2023
A Principled Technologies report: Hands-on testing. Real-world results.
How we tested
We ran two types of AWS EC2 instances in the US-east-1f region:
• M7i instances featuring 4th Gen Intel Xeon Scalable processors
• M7g instances featuring AWS Graviton3 processors
We optimized the RoBERTa model for each instance type. The 4th Gen Intel Xeon Scalable processors featured
Intel Advanced Matrix Extensions (Intel AMX) with 8-bit integer, bfloat16, and float16 support. To determine
the performance businesses of various sizes might expect, we ran tests on smaller, medium-size, and larger
instances, as Figure 1 shows. To account for different types of datasets businesses may analyze, we ran tests at
different batch sizes, which indicate the number of samples that go through the neural network at a time. We
chose a small batch size of 1 and a large batch size of 32. Finally, to show performance at different precisions
an organization may use, we tested with both BF16 and FP32 precision. While FP32 was “the standard type for
neural network computations for a long time,” BF16 is a newer precision type that some organizations have used
to replace FP16 and FP32 precision.1
In addition to measuring the number of sentences per second each instance could analyze at two batch sizes
and with two precisions, we also calculated the performance per price of each instance. We based this data on
the results of our testing and instance pricing we researched on August 10, 2023. To read more about our tests,
configurations, and calculations, see the science behind the report.
What better RoBERTa performance might mean for you
BERT, pre-trained on the entirety of the English language Wikipedia, currently helps Google interpret
user searches.2
Developed by Facebook AI, the RoBERTa model we used in our testing trained on a
larger dataset, including the English language Wikipedia, 63 million news articles, and 38 GB of Reddit
content.3
According to one article, “RoBERTa has been shown to outperform BERT and other state-of-
the-art models on a variety of natural language processing tasks, including language translation, text
classification, and question answering. It has also been used as a base model for many other successful
NLP models and has become a popular choice for research and industry applications.”4
At all three
instance sizes we tested, the M7i instances enabled by 4th Gen Intel Xeon Scalable processors handled
greater throughput than the M7g instances with Graviton3 processors. And they did so using different
batch sizes and precision types. This means that selecting M7i instances could help businesses run
RoBERTa workloads faster, which could lead to a more seamless experience for users on RoBERTa-
powered apps or the ability to support more users. Additionally, with the better value that the M7i
instances delivered, organizations might accomplish more RoBERTa work per instance, improving their
bottom lines as they save on instance uptime.
Figure 1: Key specifications for each instance size we tested. Source: Principled Technologies.
Small instances
4 vCPUs
16GB RAM
Medium instances
16 vCPUs
64GB RAM
Large instances
64 vCPUs
256GB RAM
October 2023 | 2
Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors
Analyze text faster at BF16 and FP32 precision
We first compared RoBERTa performance at BF16 precision, measuring the sentences per second
each instance analyzed. At both batch sizes and across vCPU counts, M7i instances featuring 4th
Gen Intel Xeon Scalable processors handled a higher throughput rate than M7g instances with
Graviton3 processors. As Figures 2 and 3 show, M7i instances analyzed up to 10.65 times the
sentences per second.
Normalized RoBERTa throughput
with BF16 at batch size 1 Higher is better
3.54
8.00
6.00
4.00
0.00
12.00
10.00
4 vCPUs
Normalized
sentences/second
4.66
16 vCPUs
10.65
64 vCPUs
1.00 1.00 1.00
2.00
M7i M7g
Instance size
Figure 2: Relative RoBERTa performance of M7i instances, in sentences analyzed per second, compared to M7g instances.
Both types of instances used BF16 precision at batch size: 1. Higher is better. Source: Principled Technologies.
8.00
6.00
4.00
0.00
12.00
10.00
2.00
Normalized RoBERTa throughput
with BF16 at batch size 32 Higher is better
3.62
4 vCPUs
Normalized
sentences/second
4.58
16 vCPUs
5.17
64 vCPUs
1.00 1.00 1.00
M7i M7g
Instance size
Figure 3: Relative RoBERTa performance of M7i instances, in sentences analyzed per second, compared to M7g instances.
Both types of instances used BF16 precision at batch size: 32. Higher is better. Source: Principled Technologies.
Up to 10.65x
the throughput on
64vCPU instances
Up to 5.17x
the throughput on
64vCPU instances
Note: The graphs in this report use two different y-axis scales to keep to a consistent size. Please be mindful of each graph’s
data range as you compare.
October 2023 | 3
Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors
When we compared performance using FP32 precision, we again saw performance increases from M7i
instances featuring 4th Gen Intel Xeon Scalable processors. For both batch sizes we tested, the M7i instances
outperformed the M7g instances by as much as 4.25 times (Figures 4 and 5).
Normalized RoBERTa throughput
with FP32 at batch size 1 Higher is better
1.82
3.00
2.00
1.00
0.00
5.00
4.00
4 vCPUs
1.00
Normalized
sentences/second
2.16
16 vCPUs
1.00
4.25
64 vCPUs
1.00
M7i M7g
Instance size
Figure 4: Relative RoBERTa performance of M7i instances, in sentences analyzed per second, compared to M7g instances.
Both types of instances used FP32 precision at batch size: 1. Higher is better. Source: Principled Technologies.
Normalized RoBERTa throughput
with FP32 at batch size 32 Higher is better
1.93
3.00
2.00
1.00
0.00
5.00
4.00
4 vCPUs
Normalized
sentences/second
2.01
16 vCPUs
1.93
64 vCPUs
1.00 1.00 1.00
M7i M7g
Instance size
Figure 5: Relative RoBERTa performance of M7i instances, in sentences analyzed per second, compared to M7g instances.
Both types of instances used FP32 precision at batch size: 32. Higher is better. Source: Principled Technologies.
Up to 4.25x
the throughput on
64vCPU instances
Up to 2.01x
the throughput on
16vCPU instances
About 4th Gen Intel Xeon Scalable processors
According to Intel, 4th Gen Intel Xeon Scalable processors feature
“the most built-in accelerators of any CPU on the market to improve
performance in AI, data analytics, networking, storage, and HPC.”5
Along with PCIe Gen5 technology, DDR5 memory, and CXL 1.1
capabilities, 4th Gen Intel Xeon Scalable processors also offer Intel
Accelerator Engines for a variety of workloads.6
For more information, visit https://www.intel.com/content/www/
us/en/products/docs/processors/xeon-accelerated/4th-gen-xeon-
scalable-processors.html.
October 2023 | 4
Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors
Take advantage of higher-value performance
Not only did M7i instances analyze sentences at a faster rate, but this performance improvement
made them a better value. Dividing each instance’s RoBERTa throughput by its per-hour price,
we found that M7i instances performed more RoBERTa work for every dollar they cost. At BF16
precision, M7i instances delivered up to 8.62 times the throughput per cost of M7g instances, as
Figures 6 and 7 show.
8.00
6.00
4.00
0.00
12.00
10.00
2.00
Normalized RoBERTa performance per
dollar with BF16 at batch size 1 Higher is better
2.87
4 vCPUs
Normalized
throughput/$
per
hour
3.77
16 vCPUs
8.62
64 vCPUs
M7i M7g
1.00 1.00 1.00
Instance size
Figure 6: Throughput per instance cost of M7i instances compared to M7g instances. Both instances used BF16 precision at
batch size 1. Higher is better. Source: Principled Technologies.
Normalized RoBERTa performance per
dollar with BF16 at batch size 32 Higher is better
2.93
3.00
2.00
1.00
0.00
5.00
4.00
4 vCPUs
1.00
Normalized
throughput/$
per
hour
3.71
16 vCPUs
1.00
4.19
64 vCPUs
1.00
M7i M7g
Instance size
Figure 7: Throughput per instance cost of M7i instances compared to M7g instances. Both instances used BF16 precision at
batch size 32. Higher is better. Source: Principled Technologies.
Up to 8.62x
the performance
per dollar on
64vCPU instances
Up to 4.19x
the performance
per dollar on
64vCPU instances
October 2023 | 5
Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors
Using the same calculations as we did for BF16 precision, we saw that with FP32 precision, the M7i instances
again offered a better value than the M7g instances we tested. Figures 8 and 9 show that M7i instances handled
up to 3.44 times the throughput per cost of the M7g instances using FP32 precision.
Normalized RoBERTa performance per
dollar with FP32 at batch size 1 Higher is better
1.47
3.00
2.00
1.00
0.00
5.00
4.00
4 vCPUs
1.00
Normalized
throughput/$
per
hour
1.75
16 vCPUs
1.00
3.44
64 vCPUs
1.00
M7i M7g
Instance size
Figure 8: Throughput per instance cost of M7i instances compared to M7g instances. Both instances used FP32 precision at
batch size 1. Higher is better. Source: Principled Technologies.
Normalized RoBERTa performance per
dollar with FP32 at batch size 32 Higher is better
1.56
3.00
2.00
1.00
0.00
5.00
4.00
4 vCPUs
1.00
Normalized
throughput/$
per
hour
1.63
16 vCPUs
1.00
1.56
64 vCPUs
1.00
M7i M7g
Instance size
Figure 9: Throughput per instance cost of M7i instances compared to M7g instances. Both instances used FP32 precision at
batch size 32. Higher is better. Source: Principled Technologies.
Up to 3.44x
the performance
per dollar on
64vCPU instances
Up to 1.63x
the performance
per dollar on
16vCPU instances
October 2023 | 6
Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors
Conclusion
For natural language processing workloads, the instance type you choose can make an enormous difference
in performance—how quickly they can make sense of textual data—and value—how much work they can
accomplish at a given cost. Our RoBERTa test results indicate that AWS EC2 M7i instances enabled by 4th
Gen Intel Xeon Scalable processors outperformed M7g instances with AWS Graviton3 processors at different
vCPU counts, batch sizes, and precisions, processing up to 10.65 times as many sentences per second. These
performance gains led to M7i instances delivering a better value, achieving up to 8.62 times the throughput per
dollar. With an instance that can analyze text more quickly, your business could offer a smoother experience on
RoBERTa-supported apps or potentially condense RoBERTa workloads onto fewer instances.
1. Grigory Sapunov, “FP64, FP32, FP16, BFLOAT16, TF32, and other members of the ZOO,” accessed August 25, 2023,
https://moocaholic.medium.com/fp64-fp32-fp16-bfloat16-tf32-and-other-members-of-the-zoo-a1ca7897d407.
2. Ben Lutkevich, “BERT language model,” accessed August 18, 2023,
https://www.techtarget.com/searchenterpriseai/definition/BERT-language-model.
3. GeeksforGeeks, “Overview of ROBERTa model,” accessed August 24, 2023,
https://www.geeksforgeeks.org/overview-of-roberta-model/.
4. GeeksforGeeks, “Overview of ROBERTa model.”
5. Intel, “4th Gen Xeon Scalable Processors,” accessed August 18, 2023, https://www.intel.com/content/www/us/en/prod-
ucts/docs/processors/xeon-accelerated/4th-gen-xeon-scalable-processors.html/.
6. Intel, “4th Gen Xeon Scalable Processors.”
Principled Technologies is a registered trademark of Principled Technologies, Inc.
All other product names are the trademarks of their respective owners.
For additional information, review the science behind this report.
Principled
Technologies®
Facts matter.®
Principled
Technologies®
Facts matter.®
This project was commissioned by Intel.
Read the science behind this report at https://facts.pt/2aqfhRt
October 2023 | 7
Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors

More Related Content

Similar to Accelerate natural language processing with AWS EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors

Foundations of Amazon EC2 - SRV319
Foundations of Amazon EC2 - SRV319 Foundations of Amazon EC2 - SRV319
Foundations of Amazon EC2 - SRV319 Amazon Web Services
 
Age of Language Models in NLP
Age of Language Models in NLPAge of Language Models in NLP
Age of Language Models in NLPTyrone Systems
 
Amazon EC2 Instances, Featuring Performance Optimisation Best Practices
Amazon EC2 Instances, Featuring Performance Optimisation Best PracticesAmazon EC2 Instances, Featuring Performance Optimisation Best Practices
Amazon EC2 Instances, Featuring Performance Optimisation Best PracticesAmazon Web Services
 
Get a clearer picture of potential cloud performance by looking beyond SPECra...
Get a clearer picture of potential cloud performance by looking beyond SPECra...Get a clearer picture of potential cloud performance by looking beyond SPECra...
Get a clearer picture of potential cloud performance by looking beyond SPECra...Principled Technologies
 
Google Cloud N2 VM instances featuring 3rd Gen Intel Xeon Scalable processors...
Google Cloud N2 VM instances featuring 3rd Gen Intel Xeon Scalable processors...Google Cloud N2 VM instances featuring 3rd Gen Intel Xeon Scalable processors...
Google Cloud N2 VM instances featuring 3rd Gen Intel Xeon Scalable processors...Principled Technologies
 
Finish Microsoft SQL Server data analysis faster with new M5n series instance...
Finish Microsoft SQL Server data analysis faster with new M5n series instance...Finish Microsoft SQL Server data analysis faster with new M5n series instance...
Finish Microsoft SQL Server data analysis faster with new M5n series instance...Principled Technologies
 
Elastic Compute Cloud (EC2) on AWS Presentation
Elastic Compute Cloud (EC2) on AWS PresentationElastic Compute Cloud (EC2) on AWS Presentation
Elastic Compute Cloud (EC2) on AWS PresentationKnoldus Inc.
 
Amazon EC2 Foundations - CMP203 - re:Invent 2017
Amazon EC2 Foundations - CMP203 - re:Invent 2017Amazon EC2 Foundations - CMP203 - re:Invent 2017
Amazon EC2 Foundations - CMP203 - re:Invent 2017Amazon Web Services
 
AWS EC2 M6i instances with 3rd Gen Intel Xeon Scalable processors accelerated...
AWS EC2 M6i instances with 3rd Gen Intel Xeon Scalable processors accelerated...AWS EC2 M6i instances with 3rd Gen Intel Xeon Scalable processors accelerated...
AWS EC2 M6i instances with 3rd Gen Intel Xeon Scalable processors accelerated...Principled Technologies
 
Complete online analytics processing work faster with Google Cloud Platform N...
Complete online analytics processing work faster with Google Cloud Platform N...Complete online analytics processing work faster with Google Cloud Platform N...
Complete online analytics processing work faster with Google Cloud Platform N...Principled Technologies
 
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...Principled Technologies
 
Complete artificial intelligence workloads faster using Microsoft Azure virtu...
Complete artificial intelligence workloads faster using Microsoft Azure virtu...Complete artificial intelligence workloads faster using Microsoft Azure virtu...
Complete artificial intelligence workloads faster using Microsoft Azure virtu...Principled Technologies
 
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...Principled Technologies
 
Scale Up Performance with Intel® Development
Scale Up Performance with Intel® DevelopmentScale Up Performance with Intel® Development
Scale Up Performance with Intel® DevelopmentIntel IT Center
 
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...Principled Technologies
 
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech TalksDeep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech TalksAmazon Web Services
 
Improve AI inference performance with HPE ProLiant DL380 Gen11 servers, power...
Improve AI inference performance with HPE ProLiant DL380 Gen11 servers, power...Improve AI inference performance with HPE ProLiant DL380 Gen11 servers, power...
Improve AI inference performance with HPE ProLiant DL380 Gen11 servers, power...Principled Technologies
 
Re invent announcements_2016_hcls_use_cases_mchampion
Re invent announcements_2016_hcls_use_cases_mchampionRe invent announcements_2016_hcls_use_cases_mchampion
Re invent announcements_2016_hcls_use_cases_mchampionMia D Champion
 

Similar to Accelerate natural language processing with AWS EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors (20)

Foundations of Amazon EC2 - SRV319
Foundations of Amazon EC2 - SRV319 Foundations of Amazon EC2 - SRV319
Foundations of Amazon EC2 - SRV319
 
Age of Language Models in NLP
Age of Language Models in NLPAge of Language Models in NLP
Age of Language Models in NLP
 
Amazon EC2 Instances, Featuring Performance Optimisation Best Practices
Amazon EC2 Instances, Featuring Performance Optimisation Best PracticesAmazon EC2 Instances, Featuring Performance Optimisation Best Practices
Amazon EC2 Instances, Featuring Performance Optimisation Best Practices
 
Amazon EC2 Foundations
Amazon EC2 FoundationsAmazon EC2 Foundations
Amazon EC2 Foundations
 
Get a clearer picture of potential cloud performance by looking beyond SPECra...
Get a clearer picture of potential cloud performance by looking beyond SPECra...Get a clearer picture of potential cloud performance by looking beyond SPECra...
Get a clearer picture of potential cloud performance by looking beyond SPECra...
 
Podila QCon SF 2016
Podila QCon SF 2016Podila QCon SF 2016
Podila QCon SF 2016
 
Google Cloud N2 VM instances featuring 3rd Gen Intel Xeon Scalable processors...
Google Cloud N2 VM instances featuring 3rd Gen Intel Xeon Scalable processors...Google Cloud N2 VM instances featuring 3rd Gen Intel Xeon Scalable processors...
Google Cloud N2 VM instances featuring 3rd Gen Intel Xeon Scalable processors...
 
Finish Microsoft SQL Server data analysis faster with new M5n series instance...
Finish Microsoft SQL Server data analysis faster with new M5n series instance...Finish Microsoft SQL Server data analysis faster with new M5n series instance...
Finish Microsoft SQL Server data analysis faster with new M5n series instance...
 
Elastic Compute Cloud (EC2) on AWS Presentation
Elastic Compute Cloud (EC2) on AWS PresentationElastic Compute Cloud (EC2) on AWS Presentation
Elastic Compute Cloud (EC2) on AWS Presentation
 
Amazon EC2 Foundations - CMP203 - re:Invent 2017
Amazon EC2 Foundations - CMP203 - re:Invent 2017Amazon EC2 Foundations - CMP203 - re:Invent 2017
Amazon EC2 Foundations - CMP203 - re:Invent 2017
 
AWS EC2 M6i instances with 3rd Gen Intel Xeon Scalable processors accelerated...
AWS EC2 M6i instances with 3rd Gen Intel Xeon Scalable processors accelerated...AWS EC2 M6i instances with 3rd Gen Intel Xeon Scalable processors accelerated...
AWS EC2 M6i instances with 3rd Gen Intel Xeon Scalable processors accelerated...
 
Complete online analytics processing work faster with Google Cloud Platform N...
Complete online analytics processing work faster with Google Cloud Platform N...Complete online analytics processing work faster with Google Cloud Platform N...
Complete online analytics processing work faster with Google Cloud Platform N...
 
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
 
Complete artificial intelligence workloads faster using Microsoft Azure virtu...
Complete artificial intelligence workloads faster using Microsoft Azure virtu...Complete artificial intelligence workloads faster using Microsoft Azure virtu...
Complete artificial intelligence workloads faster using Microsoft Azure virtu...
 
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...
Improve deep learning inference  performance with Microsoft Azure Esv4 VMs wi...
 
Scale Up Performance with Intel® Development
Scale Up Performance with Intel® DevelopmentScale Up Performance with Intel® Development
Scale Up Performance with Intel® Development
 
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...
Complete more PostgreSQL work with new Microsoft Azure Lsv3-series VMs featur...
 
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech TalksDeep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
 
Improve AI inference performance with HPE ProLiant DL380 Gen11 servers, power...
Improve AI inference performance with HPE ProLiant DL380 Gen11 servers, power...Improve AI inference performance with HPE ProLiant DL380 Gen11 servers, power...
Improve AI inference performance with HPE ProLiant DL380 Gen11 servers, power...
 
Re invent announcements_2016_hcls_use_cases_mchampion
Re invent announcements_2016_hcls_use_cases_mchampionRe invent announcements_2016_hcls_use_cases_mchampion
Re invent announcements_2016_hcls_use_cases_mchampion
 

More from Principled Technologies

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Realize 2.1X the performance with 20% less power with AMD EPYC processor-back...
Realize 2.1X the performance with 20% less power with AMD EPYC processor-back...Realize 2.1X the performance with 20% less power with AMD EPYC processor-back...
Realize 2.1X the performance with 20% less power with AMD EPYC processor-back...Principled Technologies
 
Upgrade your cloud infrastructure with Dell PowerEdge R760 servers and VMware...
Upgrade your cloud infrastructure with Dell PowerEdge R760 servers and VMware...Upgrade your cloud infrastructure with Dell PowerEdge R760 servers and VMware...
Upgrade your cloud infrastructure with Dell PowerEdge R760 servers and VMware...Principled Technologies
 
A comparison of security features in Dell, HP, and Lenovo PC systems
A comparison of security features in Dell, HP, and Lenovo PC systemsA comparison of security features in Dell, HP, and Lenovo PC systems
A comparison of security features in Dell, HP, and Lenovo PC systemsPrincipled Technologies
 
Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Principled Technologies
 
Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Principled Technologies
 
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...Principled Technologies
 
Scale up your storage with higher-performing Dell APEX Block Storage for AWS
Scale up your storage with higher-performing Dell APEX Block Storage for AWSScale up your storage with higher-performing Dell APEX Block Storage for AWS
Scale up your storage with higher-performing Dell APEX Block Storage for AWSPrincipled Technologies
 
Get in and stay in the productivity zone with the HP Z2 G9 Tower Workstation
Get in and stay in the productivity zone with the HP Z2 G9 Tower WorkstationGet in and stay in the productivity zone with the HP Z2 G9 Tower Workstation
Get in and stay in the productivity zone with the HP Z2 G9 Tower WorkstationPrincipled Technologies
 
Improving database performance and value with an easy migration to Azure Data...
Improving database performance and value with an easy migration to Azure Data...Improving database performance and value with an easy migration to Azure Data...
Improving database performance and value with an easy migration to Azure Data...Principled Technologies
 
Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Principled Technologies
 
Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Principled Technologies
 
Set up students and teachers to excel now and in the future with Intel proces...
Set up students and teachers to excel now and in the future with Intel proces...Set up students and teachers to excel now and in the future with Intel proces...
Set up students and teachers to excel now and in the future with Intel proces...Principled Technologies
 
Finding the path to AI success with the Dell AI portfolio - Summary
Finding the path to AI success with the Dell AI portfolio - SummaryFinding the path to AI success with the Dell AI portfolio - Summary
Finding the path to AI success with the Dell AI portfolio - SummaryPrincipled Technologies
 
Finding the path to AI success with the Dell AI portfolio
Finding the path to AI success with the Dell AI portfolioFinding the path to AI success with the Dell AI portfolio
Finding the path to AI success with the Dell AI portfolioPrincipled Technologies
 
Achieve strong performance and value on Azure SQL Database Hyperscale
Achieve strong performance and value on Azure SQL Database HyperscaleAchieve strong performance and value on Azure SQL Database Hyperscale
Achieve strong performance and value on Azure SQL Database HyperscalePrincipled Technologies
 
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Principled Technologies
 
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Principled Technologies
 
Utilizing Azure Cosmos DB for intelligent AI‑powered applications
Utilizing Azure Cosmos DB for intelligent AI‑powered applicationsUtilizing Azure Cosmos DB for intelligent AI‑powered applications
Utilizing Azure Cosmos DB for intelligent AI‑powered applicationsPrincipled Technologies
 

More from Principled Technologies (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Realize 2.1X the performance with 20% less power with AMD EPYC processor-back...
Realize 2.1X the performance with 20% less power with AMD EPYC processor-back...Realize 2.1X the performance with 20% less power with AMD EPYC processor-back...
Realize 2.1X the performance with 20% less power with AMD EPYC processor-back...
 
Upgrade your cloud infrastructure with Dell PowerEdge R760 servers and VMware...
Upgrade your cloud infrastructure with Dell PowerEdge R760 servers and VMware...Upgrade your cloud infrastructure with Dell PowerEdge R760 servers and VMware...
Upgrade your cloud infrastructure with Dell PowerEdge R760 servers and VMware...
 
A comparison of security features in Dell, HP, and Lenovo PC systems
A comparison of security features in Dell, HP, and Lenovo PC systemsA comparison of security features in Dell, HP, and Lenovo PC systems
A comparison of security features in Dell, HP, and Lenovo PC systems
 
Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...
 
Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...Increase security, sustainability, and efficiency with robust Dell server man...
Increase security, sustainability, and efficiency with robust Dell server man...
 
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...
Scale up your storage with higher-performing Dell APEX Block Storage for AWS ...
 
Scale up your storage with higher-performing Dell APEX Block Storage for AWS
Scale up your storage with higher-performing Dell APEX Block Storage for AWSScale up your storage with higher-performing Dell APEX Block Storage for AWS
Scale up your storage with higher-performing Dell APEX Block Storage for AWS
 
Get in and stay in the productivity zone with the HP Z2 G9 Tower Workstation
Get in and stay in the productivity zone with the HP Z2 G9 Tower WorkstationGet in and stay in the productivity zone with the HP Z2 G9 Tower Workstation
Get in and stay in the productivity zone with the HP Z2 G9 Tower Workstation
 
Improving database performance and value with an easy migration to Azure Data...
Improving database performance and value with an easy migration to Azure Data...Improving database performance and value with an easy migration to Azure Data...
Improving database performance and value with an easy migration to Azure Data...
 
Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...
 
Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...Realize better value and performance migrating from Azure Database for Postgr...
Realize better value and performance migrating from Azure Database for Postgr...
 
Set up students and teachers to excel now and in the future with Intel proces...
Set up students and teachers to excel now and in the future with Intel proces...Set up students and teachers to excel now and in the future with Intel proces...
Set up students and teachers to excel now and in the future with Intel proces...
 
Finding the path to AI success with the Dell AI portfolio - Summary
Finding the path to AI success with the Dell AI portfolio - SummaryFinding the path to AI success with the Dell AI portfolio - Summary
Finding the path to AI success with the Dell AI portfolio - Summary
 
Finding the path to AI success with the Dell AI portfolio
Finding the path to AI success with the Dell AI portfolioFinding the path to AI success with the Dell AI portfolio
Finding the path to AI success with the Dell AI portfolio
 
Achieve strong performance and value on Azure SQL Database Hyperscale
Achieve strong performance and value on Azure SQL Database HyperscaleAchieve strong performance and value on Azure SQL Database Hyperscale
Achieve strong performance and value on Azure SQL Database Hyperscale
 
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
 
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
Improve backup and recovery outcomes by combining Dell APEX Data Storage Serv...
 
Utilizing Azure Cosmos DB for intelligent AI‑powered applications
Utilizing Azure Cosmos DB for intelligent AI‑powered applicationsUtilizing Azure Cosmos DB for intelligent AI‑powered applications
Utilizing Azure Cosmos DB for intelligent AI‑powered applications
 

Recently uploaded

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 

Recently uploaded (20)

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 

Accelerate natural language processing with AWS EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors

  • 1. Accelerate natural language processing with AWS EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors Compared to AWS M7g instances with AWS Graviton3 processors, AWS M7i instances with Intel Xeon Scalable processors analyzed up to 10.65 times the sentences per second and delivered up to 8.62 times the performance per dollar in RoBERTa model testing Natural language processing (NLP) supports many everyday applications, from search result suggestions to text autocorrections to text generation chatbots. Deep learning frameworks for NLP, such as Bidirectional Encoder Representations from Transformers (BERT) and its variations, demand robust compute power to analyze text and make predictions. For businesses running these workloads in the cloud, selecting the right instance can enable faster text analysis. Using Intel® Extension for PyTorch, we tested the deep learning performance of two types of Amazon Web Services (AWS) Elastic Cloud Compute (EC2) instances: M7i instances enabled by 4th Gen Intel Xeon® Scalable processors and M7g instances enabled by AWS Graviton3 processors. To measure how the instances performed at different sizes, we tested both types with 4, 16, and 64 vCPUs. We ran tests with the RoBERTa model, a variant of BERT, using a benchmark from the Intel Model Zoo. We optimized the model for each processor type. In our tests, the M7i instances analyzed up to 10.65 times the sentences per second compared to the M7g instances. These results indicate that businesses running similar RoBERTa workloads could analyze textual data faster and complete more work per instance with M7i instances featuring 4th Gen Intel Xeon Scalable processors. the throughput on 16vCPU instances the throughput on 64vCPU instances Up to 10.65x the performance per dollar on 64vCPU instances Up to 4.66x Up to 8.62x This project was commissioned by Intel. Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors October 2023 A Principled Technologies report: Hands-on testing. Real-world results.
  • 2. How we tested We ran two types of AWS EC2 instances in the US-east-1f region: • M7i instances featuring 4th Gen Intel Xeon Scalable processors • M7g instances featuring AWS Graviton3 processors We optimized the RoBERTa model for each instance type. The 4th Gen Intel Xeon Scalable processors featured Intel Advanced Matrix Extensions (Intel AMX) with 8-bit integer, bfloat16, and float16 support. To determine the performance businesses of various sizes might expect, we ran tests on smaller, medium-size, and larger instances, as Figure 1 shows. To account for different types of datasets businesses may analyze, we ran tests at different batch sizes, which indicate the number of samples that go through the neural network at a time. We chose a small batch size of 1 and a large batch size of 32. Finally, to show performance at different precisions an organization may use, we tested with both BF16 and FP32 precision. While FP32 was “the standard type for neural network computations for a long time,” BF16 is a newer precision type that some organizations have used to replace FP16 and FP32 precision.1 In addition to measuring the number of sentences per second each instance could analyze at two batch sizes and with two precisions, we also calculated the performance per price of each instance. We based this data on the results of our testing and instance pricing we researched on August 10, 2023. To read more about our tests, configurations, and calculations, see the science behind the report. What better RoBERTa performance might mean for you BERT, pre-trained on the entirety of the English language Wikipedia, currently helps Google interpret user searches.2 Developed by Facebook AI, the RoBERTa model we used in our testing trained on a larger dataset, including the English language Wikipedia, 63 million news articles, and 38 GB of Reddit content.3 According to one article, “RoBERTa has been shown to outperform BERT and other state-of- the-art models on a variety of natural language processing tasks, including language translation, text classification, and question answering. It has also been used as a base model for many other successful NLP models and has become a popular choice for research and industry applications.”4 At all three instance sizes we tested, the M7i instances enabled by 4th Gen Intel Xeon Scalable processors handled greater throughput than the M7g instances with Graviton3 processors. And they did so using different batch sizes and precision types. This means that selecting M7i instances could help businesses run RoBERTa workloads faster, which could lead to a more seamless experience for users on RoBERTa- powered apps or the ability to support more users. Additionally, with the better value that the M7i instances delivered, organizations might accomplish more RoBERTa work per instance, improving their bottom lines as they save on instance uptime. Figure 1: Key specifications for each instance size we tested. Source: Principled Technologies. Small instances 4 vCPUs 16GB RAM Medium instances 16 vCPUs 64GB RAM Large instances 64 vCPUs 256GB RAM October 2023 | 2 Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors
  • 3. Analyze text faster at BF16 and FP32 precision We first compared RoBERTa performance at BF16 precision, measuring the sentences per second each instance analyzed. At both batch sizes and across vCPU counts, M7i instances featuring 4th Gen Intel Xeon Scalable processors handled a higher throughput rate than M7g instances with Graviton3 processors. As Figures 2 and 3 show, M7i instances analyzed up to 10.65 times the sentences per second. Normalized RoBERTa throughput with BF16 at batch size 1 Higher is better 3.54 8.00 6.00 4.00 0.00 12.00 10.00 4 vCPUs Normalized sentences/second 4.66 16 vCPUs 10.65 64 vCPUs 1.00 1.00 1.00 2.00 M7i M7g Instance size Figure 2: Relative RoBERTa performance of M7i instances, in sentences analyzed per second, compared to M7g instances. Both types of instances used BF16 precision at batch size: 1. Higher is better. Source: Principled Technologies. 8.00 6.00 4.00 0.00 12.00 10.00 2.00 Normalized RoBERTa throughput with BF16 at batch size 32 Higher is better 3.62 4 vCPUs Normalized sentences/second 4.58 16 vCPUs 5.17 64 vCPUs 1.00 1.00 1.00 M7i M7g Instance size Figure 3: Relative RoBERTa performance of M7i instances, in sentences analyzed per second, compared to M7g instances. Both types of instances used BF16 precision at batch size: 32. Higher is better. Source: Principled Technologies. Up to 10.65x the throughput on 64vCPU instances Up to 5.17x the throughput on 64vCPU instances Note: The graphs in this report use two different y-axis scales to keep to a consistent size. Please be mindful of each graph’s data range as you compare. October 2023 | 3 Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors
  • 4. When we compared performance using FP32 precision, we again saw performance increases from M7i instances featuring 4th Gen Intel Xeon Scalable processors. For both batch sizes we tested, the M7i instances outperformed the M7g instances by as much as 4.25 times (Figures 4 and 5). Normalized RoBERTa throughput with FP32 at batch size 1 Higher is better 1.82 3.00 2.00 1.00 0.00 5.00 4.00 4 vCPUs 1.00 Normalized sentences/second 2.16 16 vCPUs 1.00 4.25 64 vCPUs 1.00 M7i M7g Instance size Figure 4: Relative RoBERTa performance of M7i instances, in sentences analyzed per second, compared to M7g instances. Both types of instances used FP32 precision at batch size: 1. Higher is better. Source: Principled Technologies. Normalized RoBERTa throughput with FP32 at batch size 32 Higher is better 1.93 3.00 2.00 1.00 0.00 5.00 4.00 4 vCPUs Normalized sentences/second 2.01 16 vCPUs 1.93 64 vCPUs 1.00 1.00 1.00 M7i M7g Instance size Figure 5: Relative RoBERTa performance of M7i instances, in sentences analyzed per second, compared to M7g instances. Both types of instances used FP32 precision at batch size: 32. Higher is better. Source: Principled Technologies. Up to 4.25x the throughput on 64vCPU instances Up to 2.01x the throughput on 16vCPU instances About 4th Gen Intel Xeon Scalable processors According to Intel, 4th Gen Intel Xeon Scalable processors feature “the most built-in accelerators of any CPU on the market to improve performance in AI, data analytics, networking, storage, and HPC.”5 Along with PCIe Gen5 technology, DDR5 memory, and CXL 1.1 capabilities, 4th Gen Intel Xeon Scalable processors also offer Intel Accelerator Engines for a variety of workloads.6 For more information, visit https://www.intel.com/content/www/ us/en/products/docs/processors/xeon-accelerated/4th-gen-xeon- scalable-processors.html. October 2023 | 4 Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors
  • 5. Take advantage of higher-value performance Not only did M7i instances analyze sentences at a faster rate, but this performance improvement made them a better value. Dividing each instance’s RoBERTa throughput by its per-hour price, we found that M7i instances performed more RoBERTa work for every dollar they cost. At BF16 precision, M7i instances delivered up to 8.62 times the throughput per cost of M7g instances, as Figures 6 and 7 show. 8.00 6.00 4.00 0.00 12.00 10.00 2.00 Normalized RoBERTa performance per dollar with BF16 at batch size 1 Higher is better 2.87 4 vCPUs Normalized throughput/$ per hour 3.77 16 vCPUs 8.62 64 vCPUs M7i M7g 1.00 1.00 1.00 Instance size Figure 6: Throughput per instance cost of M7i instances compared to M7g instances. Both instances used BF16 precision at batch size 1. Higher is better. Source: Principled Technologies. Normalized RoBERTa performance per dollar with BF16 at batch size 32 Higher is better 2.93 3.00 2.00 1.00 0.00 5.00 4.00 4 vCPUs 1.00 Normalized throughput/$ per hour 3.71 16 vCPUs 1.00 4.19 64 vCPUs 1.00 M7i M7g Instance size Figure 7: Throughput per instance cost of M7i instances compared to M7g instances. Both instances used BF16 precision at batch size 32. Higher is better. Source: Principled Technologies. Up to 8.62x the performance per dollar on 64vCPU instances Up to 4.19x the performance per dollar on 64vCPU instances October 2023 | 5 Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors
  • 6. Using the same calculations as we did for BF16 precision, we saw that with FP32 precision, the M7i instances again offered a better value than the M7g instances we tested. Figures 8 and 9 show that M7i instances handled up to 3.44 times the throughput per cost of the M7g instances using FP32 precision. Normalized RoBERTa performance per dollar with FP32 at batch size 1 Higher is better 1.47 3.00 2.00 1.00 0.00 5.00 4.00 4 vCPUs 1.00 Normalized throughput/$ per hour 1.75 16 vCPUs 1.00 3.44 64 vCPUs 1.00 M7i M7g Instance size Figure 8: Throughput per instance cost of M7i instances compared to M7g instances. Both instances used FP32 precision at batch size 1. Higher is better. Source: Principled Technologies. Normalized RoBERTa performance per dollar with FP32 at batch size 32 Higher is better 1.56 3.00 2.00 1.00 0.00 5.00 4.00 4 vCPUs 1.00 Normalized throughput/$ per hour 1.63 16 vCPUs 1.00 1.56 64 vCPUs 1.00 M7i M7g Instance size Figure 9: Throughput per instance cost of M7i instances compared to M7g instances. Both instances used FP32 precision at batch size 32. Higher is better. Source: Principled Technologies. Up to 3.44x the performance per dollar on 64vCPU instances Up to 1.63x the performance per dollar on 16vCPU instances October 2023 | 6 Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors
  • 7. Conclusion For natural language processing workloads, the instance type you choose can make an enormous difference in performance—how quickly they can make sense of textual data—and value—how much work they can accomplish at a given cost. Our RoBERTa test results indicate that AWS EC2 M7i instances enabled by 4th Gen Intel Xeon Scalable processors outperformed M7g instances with AWS Graviton3 processors at different vCPU counts, batch sizes, and precisions, processing up to 10.65 times as many sentences per second. These performance gains led to M7i instances delivering a better value, achieving up to 8.62 times the throughput per dollar. With an instance that can analyze text more quickly, your business could offer a smoother experience on RoBERTa-supported apps or potentially condense RoBERTa workloads onto fewer instances. 1. Grigory Sapunov, “FP64, FP32, FP16, BFLOAT16, TF32, and other members of the ZOO,” accessed August 25, 2023, https://moocaholic.medium.com/fp64-fp32-fp16-bfloat16-tf32-and-other-members-of-the-zoo-a1ca7897d407. 2. Ben Lutkevich, “BERT language model,” accessed August 18, 2023, https://www.techtarget.com/searchenterpriseai/definition/BERT-language-model. 3. GeeksforGeeks, “Overview of ROBERTa model,” accessed August 24, 2023, https://www.geeksforgeeks.org/overview-of-roberta-model/. 4. GeeksforGeeks, “Overview of ROBERTa model.” 5. Intel, “4th Gen Xeon Scalable Processors,” accessed August 18, 2023, https://www.intel.com/content/www/us/en/prod- ucts/docs/processors/xeon-accelerated/4th-gen-xeon-scalable-processors.html/. 6. Intel, “4th Gen Xeon Scalable Processors.” Principled Technologies is a registered trademark of Principled Technologies, Inc. All other product names are the trademarks of their respective owners. For additional information, review the science behind this report. Principled Technologies® Facts matter.® Principled Technologies® Facts matter.® This project was commissioned by Intel. Read the science behind this report at https://facts.pt/2aqfhRt October 2023 | 7 Accelerate natural language processing with Amazon EC2 M7i instances featuring 4th Gen Intel Xeon Scalable processors