This document provides an overview of next generation tokenization for data protection and compliance. It discusses how tokenization has evolved from traditional approaches to provide significantly improved performance, scalability, and security compared to encryption and other older tokenization methods. Memory-based tokenization in particular is highlighted as delivering extremely fast tokenization speeds without the need for replication or synchronization between servers. The document also examines use cases and challenges around securing data in cloud and distributed environments and how tokenization addresses these issues through centralized policy management and transparency.
Next Generation Tokenization for Cloud Data Protection and Compliance
1. Next Generation Tokenization for Compliance and Cloud Data Protection
Ulf Mattsson
CTO Protegrity
ulf . mattsson [at] protegrity . com
2. Ulf Mattsson
20 years with IBM Development & Global Services
Inventor of 22 patents – Encryption and Tokenization
Co-founder of Protegrity (Data Security)
Research member of the International Federation for Information Processing (IFIP) WG 11.3 Data and Application Security
Member of
• PCI Security Standards Council (PCI SSC)
• American National Standards Institute (ANSI) X9
• Cloud Security Alliance (CSA)
• Information Systems Security Association (ISSA)
• Information Systems Audit and Control Association (ISACA)
5. PCI DSS is Evolving
[Diagram: PCI DSS requires encrypting data on public networks (SSL) and encrypting data at rest in the OS file system and storage system, while clear-text data still flows through the application and database on the private network; an attacker is shown targeting the flow.]
Source: PCI Security Standards Council, 2011
6. Protecting the Data Flow – PCI/PII Example
[Diagram: a PCI/PII data flow with enforcement points; unprotected sensitive information entering the flow leaves each enforcement point as protected sensitive information.]
11. Hiding Data in Plain Sight – Data Tokenization
[Diagram: at data entry, the clear value 400000 123456 7899 is sent to the tokenization server, which returns the token 400000 222222 7899; applications and databases see only the token, while the protected value appears as unreadable data such as Y&SFD%))S(.]
12. What is Tokenization and what is the Benefit?
Tokenization
• Tokenization is a process that replaces sensitive data in systems with inert data, called tokens, which have no value to a thief (see the sketch after this slide).
• Tokens resemble the original data in data type and length.
Benefit
• Greatly improved transparency to systems and processes that need to be protected
Result
• Reduced remediation
• Reduced need for key management
• Reduced points of attack
• Reduced PCI DSS audit costs for retail scenarios
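To make the idea concrete, here is a minimal sketch of a vault-based tokenizer in Python: it swaps a value for a random token of the same type and length and keeps the mapping for authorized detokenization. This illustrates the concept only, not Protegrity's implementation; the class and method names are invented for the example.

```python
import secrets

class TokenVault:
    """Toy token vault: random, format-preserving tokens plus a lookup table."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Multi-use behavior: the same input always gets the same token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        # Draw random digits of the same length, so the token matches the
        # original data type and length and carries no value to a thief.
        while True:
            token = "".join(secrets.choice("0123456789") for _ in value)
            if token not in self._token_to_value and token != value:
                break
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only systems allowed to reach the vault can recover the original.
        return self._token_to_value[token]

vault = TokenVault()
token = vault.tokenize("4000001234567899")
assert token.isdigit() and len(token) == 16
assert vault.detokenize(token) == "4000001234567899"
```

The growing lookup table in this sketch is exactly the "large, expanding footprint" that slide 28 attributes to traditional tokenization; the memory-tokenization approach discussed later avoids it with static randomized tables.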
13. What is Encryption and Tokenization?
Encryption uses a cipher system: cryptographic algorithms and cryptographic keys.
Tokenization uses a code system: code books and index tokens.
Source: McGraw-Hill Encyclopedia of Science & Technology
14. Best Practices for Tokenization
Token generation approaches covered by Visa's best practices (published July 14, 2010):
• Unique sequence number
• One-way irreversible hash function, with a secret per merchant for multi-use tokens
• Randomly generated value
15. Comments on Visa’s Tokenization Best Practices
Visa's recommendation should simply have been to use a random number.
You should not write your own "home-grown" token servers.
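The contrast behind that comment can be shown in a few lines of Python. The sketch below illustrates the three token-generation approaches from the best-practices slide; the secret and values are hypothetical. A hashed token is still derived from the PAN, so anyone who learns the per-merchant secret can brute-force the comparatively small PAN space, while a random token has no relationship to the PAN at all.

```python
import hashlib
import secrets

pan = "4000001234567899"

# 1. Unique sequence number: unrelated to the PAN, but its ordering
#    can leak information about when tokens were issued.
sequence_token = "1000000000000042"  # next value from a shared counter

# 2. One-way hash with a per-merchant secret (a multi-use token).
#    Still a function of the PAN: with the secret, the small PAN
#    space can be brute-forced offline.
merchant_secret = b"example-merchant-secret"  # hypothetical secret
hash_token = hashlib.sha256(merchant_secret + pan.encode()).hexdigest()

# 3. Randomly generated value: carries no information about the PAN,
#    which is why the slide argues this should be the recommendation.
random_token = "".join(secrets.choice("0123456789") for _ in pan)
```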
18. Positioning of Different Protection Options
[Comparison matrix: strong encryption, formatted encryption, and data tokens are each rated from best to worst on security & compliance, total cost of ownership, and use of encoded data; the individual ratings are shown graphically in the original slide.]
19. Comparing Field Encryption & Tokenization
The slide ranks each format by intrusiveness to applications and databases and by output length (original vs. longer):
• Hashing (standard encryption; longer output): !@#$%a^///&*B()..,,,gft_+!@4#$2%p^&*
• Strong encryption (standard encryption; longer output): !@#$%a^.,mhu7/////&*B()_+!@
• Alpha encoding (tokenizing / formatted encryption; original length): aVdSaH 1F4hJ 1D3a
• Numeric encoding (tokenizing / formatted encryption; original length): 666666 777777 8888
• Partial encoding (tokenizing / formatted encryption; original length): 123456 777777 1234
• Clear text data: 123456 123456 1234
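The "partial encoding" row, where only the middle digits change, is easy to sketch. The helper below is written for this document rather than taken from any product: it randomizes the middle of a 16-digit number while leaving the first six and last four digits in the clear, which is what keeps such tokens transparent to applications that only need the BIN and the last four digits.

```python
import secrets

def partial_token(pan: str, head: int = 6, tail: int = 4) -> str:
    """Replace only the middle digits, keeping the first `head` and
    last `tail` digits in the clear (sketch of 'partial encoding')."""
    middle_len = len(pan) - head - tail
    middle = "".join(secrets.choice("0123456789") for _ in range(middle_len))
    return pan[:head] + middle + pan[-tail:]

# e.g. "1234561234561234" -> "1234567777771234" (middle digits random)
print(partial_token("1234561234561234"))
```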
21. Speed of Different Protection Methods
[Chart: transactions per second for 16-digit values, on a logarithmic scale from 100 to 10,000,000, compared across traditional data tokenization, format preserving encryption, data type preservation, AES CBC encryption standard, and memory data tokenization.]
*: Speed will depend on the configuration
22. Security of Different Protection Methods
[Chart: security level, from low to high, compared across the same five methods: traditional data tokenization, format preserving encryption, data type preservation, AES CBC encryption standard, and memory data tokenization.]
23. Speed and Security of Different Protection Methods
[Chart: the speed* and security views combined, plotting transactions per second for 16-digit values (100 to 10,000,000, logarithmic) against security level (low to high) for traditional data tokenization, format preserving encryption, data type preservation, AES CBC encryption standard, and memory data tokenization.]
*: Speed will depend on the configuration
24. Different Approaches for Tokenization
Traditional Tokenization
• Dynamic model or pre-generated model
• 5 to 5,000 tokenizations per second
Next Generation Tokenization
• Memory tokenization
• 200,000 to 9,000,000+ tokenizations per second
• "The tokenization scheme offers excellent security, since it is based on fully randomized tables." *
• "This is a fully distributed tokenization approach with no need for synchronization and there is no risk for collisions." *
*: Prof. Dr. Ir. Bart Preneel, Katholieke Universiteit Leuven, Belgium
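Here is a conceptual sketch of the "fully randomized tables" idea quoted above, assuming nothing about Protegrity's actual algorithm: a random permutation over fixed-size digit blocks acts as a pre-generated code book. Because the table is static, identical copies can run on any number of servers with no replication or synchronization, and because a permutation is bijective, collisions are impossible by construction.

```python
import random

TABLE_SIZE = 10_000                      # code book over 4-digit blocks
rng = random.Random(2011)                # in practice the table is generated once
perm = list(range(TABLE_SIZE))           # and distributed securely, not seeded
rng.shuffle(perm)                        # from a known constant
inv = [0] * TABLE_SIZE
for i, p in enumerate(perm):
    inv[p] = i                           # inverse table for detokenization

def _map(digits: str, table: list) -> str:
    assert len(digits) % 4 == 0
    return "".join(f"{table[int(digits[i:i+4])]:04d}"
                   for i in range(0, len(digits), 4))

def tokenize(digits: str) -> str:
    return _map(digits, perm)

def detokenize(token: str) -> str:
    return _map(token, inv)

token = tokenize("4000001234567899")
assert detokenize(token) == "4000001234567899"
```

Real memory-tokenization schemes combine multiple table lookups and rounds so that equal blocks do not produce equal token blocks; this single-lookup version only shows why static randomized tables remove the need for server synchronization and cannot collide.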
26. Evaluating Encryption & Tokenization Approaches
The matrix rates database file encryption, database column encryption, centralized tokenization (old), and memory tokenization (new) from best to worst on:
• Availability, scalability, latency, and CPU consumption
• Data flow protection and compliance scoping
• Security: key management, randomness, and separation of duties
[The individual ratings are shown graphically in the original slide.]
27. Evaluating Field Encryption & Distributed Tokenization
The matrix rates strong field encryption, formatted encryption, and memory tokenization from best to worst on:
• Disconnected environments
• Distributed environments
• Performance impact when loading data
• Transparency to applications
• Expanded storage size
• Transparency to database schemas
• Long life-cycle data
• Unix or Windows mixed with "big iron" (EBCDIC)
• Easy re-keying of data in a data flow
• High-risk data
• Security: compliance with PCI and NIST
[The individual ratings are shown graphically in the original slide.]
28. Tokenization Summary
Footprint
• Traditional tokenization: large and expanding. The large, expanding footprint is its Achilles' heel, and the source of poor performance, poor scalability, and limits on expanded use.
• Memory tokenization: small and static. The small, static footprint is the enabling factor that delivers extreme performance, scalability, and expanded use.
High availability, DR, and distribution
• Traditional tokenization: complex replication required. Deploying more than one token server for high availability or scalability requires complex and expensive replication or synchronization between the servers.
• Memory tokenization: no replication required. Any number of token servers can be deployed with no need for replication or synchronization between them, a simple, elegant, yet powerful solution.
Reliability
• Traditional tokenization: prone to collisions. The synchronization and replication required to support many deployed token servers is prone to collisions, a characteristic that severely limits the usability of traditional tokenization.
• Memory tokenization: no collisions. Protegrity Tokenization's lack of need for replication or synchronization eliminates the potential for collisions.
Performance, latency, and scalability
• Traditional tokenization: adversely impacts performance and scalability. The large footprint severely limits the ability to place the token server close to the data; the distance between the data and the token server creates latency that adversely affects performance and scalability, to the extent that some use cases are not possible.
• Memory tokenization: little or no latency; the fastest tokenization in the industry. The small footprint enables the token server to be placed close to the data to reduce latency; when placed in memory, it eliminates latency.
Extendibility
• Traditional tokenization: practically impossible. Given all the issues inherent in traditional tokenization of a single data category, tokenizing more data categories may be impractical.
• Memory tokenization: unlimited tokenization capability. Protegrity Tokenization can be used to tokenize many data categories with minimal or no impact on footprint or performance.
29. Token Flexibility for Different Categories of Data
Input → token examples, with token properties:
• Credit card: 3872 3789 1620 3675 → 8278 2789 2990 2789 (numeric)
• Medical ID: 29M2009ID → 497HF390D (alpha-numeric)
• Date: 10/30/1955 → 12/25/2034 (date)
• E-mail address: bob.hope@protegrity.com → empo.snaugs@svtiensnni.snk (alpha-numeric; delimiters in input preserved)
• SSN with delimiters: 075-67-2278 → 287-38-2567 (numeric; delimiters in input preserved)
• Credit card with policy masking: 3872 3789 1620 3675 → 8278 2789 2990 3675 (numeric; last 4 digits exposed)
• Credit card with presentation mask: 3872 3789 1620 3675 (clear, encrypted, or tokenized at rest) → 3872 37## #### #### (presentation mask: expose first 6 digits)
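The common thread in that table is that a token preserves whatever structure the policy says downstream systems need. As a rough sketch of that idea, invented for this document with per-call randomness standing in for a real token table, the helper below keeps delimiters and character classes and can leave a trailing portion of the value exposed:

```python
import secrets
import string

def tokenize_like(value: str, expose_tail: int = 0) -> str:
    """Replace alphanumerics with random same-class characters, keeping
    delimiters and (optionally) the last `expose_tail` characters."""
    keep_from = len(value) - expose_tail
    out = []
    for i, ch in enumerate(value):
        if i >= keep_from or not ch.isalnum():
            out.append(ch)                                 # delimiter or exposed tail
        elif ch.isdigit():
            out.append(secrets.choice(string.digits))      # digit -> digit
        else:
            out.append(secrets.choice(string.ascii_lowercase))  # letter -> letter
    return "".join(out)

print(tokenize_like("075-67-2278"))             # e.g. 287-38-2567
print(tokenize_like("3872 3789 1620 3675", 4))  # last 4 digits exposed
```

A production tokenizer would draw these replacements from a consistent token table rather than fresh randomness, so that the same input always maps to the same token.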
31. Some Tokenization Use Cases
Customer 1
• Vendor lock-in: What if we want to switch payment processor?
• Performance challenge: What if we want to rotate the tokens?
• Performance challenge with initial tokenization
Customer 2
• Reduced PCI compliance cost by 50%
• Performance challenge with initial tokenization
• End-to-end: looking to expand tokenization to all stores
Customer 3
• Desired a single vendor
• Desired use of encryption and tokenization
• Looking to expand tokens beyond CCN to PII
Customer 4
• Remove compensating controls on the mainframe
• Pushing tokens through to avoid compensating controls
32. Tokenization Use Case #2
A leading retail chain
• 1500 locations in the U.S. market
Simplify PCI Compliance
• 98% of Use Cases out of audit scope
• Ease of install (had 18 PCI initiatives at one time)
Tokenization solution was implemented in 2 weeks
• Reduced PCI Audit from 7 months to 3 months
• No 3rd Party code modifications
• Proved to be the best performance option
• 700,000 transactions per day
• 50 million card holder data records
• Conversion took 90 minutes (plan was 30 days)
• Next step – tokenization servers at 1500 locations
34. Risks Associated with Cloud Computing
[Survey chart: percentage of respondents (0 to 70%) citing each risk of cloud computing: handing over sensitive data to a third party; threat of data breach or loss; weakening of corporate network security; uptime/business continuity; financial strength of the cloud computing provider; inability to customize applications.]
Source: The evolving role of IT managers and CIOs, findings from the 2010 IBM Global IT Risk Study
35. What Amazon AWS's PCI Compliance Means to You (Dec 7, 2010)
1. Just because AWS is certified doesn't mean you are. You still need to deploy a PCI-compliant application/service, and anything on AWS is still within your assessment scope.
2. The open question: PCI DSS 2.0 doesn't address multi-tenancy concerns.
3. AWS being certified as a service provider doesn't mean all cloud IaaS providers will be.
4. You can store PAN data on S3, but it still needs to be encrypted in accordance with PCI DSS requirements.
5. Amazon doesn't do this for you; it's something you need to implement yourself, including key management, rotation, logging, etc. (see the sketch after this list).
6. If you deploy a server instance in EC2, it still needs to be assessed by your QSA.
7. What this certification really does is eliminate any doubt that you are allowed to deploy an in-scope PCI system on AWS.
8. This is a big deal, but your organization's assessment scope isn't necessarily reduced.
9. It might be reduced when you move to something like a tokenization service, where you reduce your handling of PAN data.
Source: securosis.com
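Points 4 and 5 amount to client-side encryption: you encrypt PAN data before it reaches S3 and you own the keys. Here is a minimal sketch in Python, assuming the cryptography and boto3 packages and a hypothetical bucket name; a real deployment would pull the key from a key manager and add rotation and logging, as point 5 notes.

```python
import boto3
from cryptography.fernet import Fernet

# Encrypt the PAN yourself before upload: AWS's PCI certification does
# not do this for you (point 5 above).
key = Fernet.generate_key()          # in practice: fetched from your key manager
ciphertext = Fernet(key).encrypt(b"4000001234567899")

# Store only ciphertext in S3; bucket and object key are hypothetical.
s3 = boto3.client("s3")
s3.put_object(Bucket="example-pan-store", Key="pan/record-123.enc", Body=ciphertext)
```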
37. Data Protection Challenges
The actual protection of the data is not the challenge.
Centralized solutions are needed to manage complex security requirements (see the sketch after this slide):
• Based on security policies with transparent key management
• Many methods to secure the data
• Auditing, monitoring, and reporting
Solutions that minimize the impact on business operations:
• Highest level of performance and transparency
Rapid deployment
Affordable, with low TCO
Enabling and maintaining compliance
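One way to picture "centralized solutions based on security policies" is a single policy definition that every enforcement point consults instead of hard-coding its own decisions. The sketch below is purely illustrative: the data classes, rules, and the masking stand-in for a real token-server call are all invented for this example.

```python
# Central policy: one place maps each data class to a protection method.
POLICY = {
    "credit_card": {"method": "tokenize", "expose_tail": 4},
    "ssn":         {"method": "tokenize", "expose_tail": 0},
    "phi_notes":   {"method": "encrypt"},
}

def protect(data_class: str, value: str) -> str:
    rule = POLICY[data_class]
    # Auditing/monitoring hook: every protection event is logged centrally.
    print(f"audit: {rule['method']} applied to {data_class}")
    if rule["method"] == "tokenize":
        tail = rule["expose_tail"]
        body = "#" * (len(value) - tail)    # stand-in for a token-server call
        return body + (value[-tail:] if tail else "")
    return "<ciphertext>"                   # stand-in for an encryption call

print(protect("credit_card", "4000001234567899"))  # ############7899
```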
38. Protegrity Data Security Management
[Architecture diagram: an Enterprise Data Security Administrator manages the central policy and audit log; encryption services are enforced at the File System Protector, Database Protector, and Application Protector, alongside the Tokenization Server and Secure Archive.]
39. About Protegrity
Proven enterprise data security software and innovation leader
• Sole focus on the protection of data
• Patented Technology, Continuing to Drive Innovation
Growth driven by compliance and risk management
• PCI (Payment Card Industry)
• PII (Personally Identifiable Information)
• PHI (Protected Health Information) – HIPAA
• State and Foreign Privacy Laws, Breach Notification Laws
• High cost of an information breach ($4.8M average cost), plus the immeasurable costs of brand damage and loss of customers
• Requirements to eliminate the threat of data breach and non-compliance
Cross-industry applicability
• Retail, Hospitality, Travel and Transportation
• Financial Services, Insurance, Banking
• Healthcare
• Telecommunications, Media and Entertainment
• Manufacturing and Government
43. Please contact us for more information
Ulf Mattsson, CTO
ulf . mattsson [at] protegrity . com
April G. Healy, Global Alliance Director
april . healy [at] protegrity . com
Editor's Notes
My years at IBM and Protegrity allowed me to research data breaches, compliance aspects, and new approaches to data protection in different environments, including cloud, virtualization, web, client-server, and mainframe-centric environments. Every technology transition opened up new ways to attack systems.

My work in the PCI Security Standards Council involves defining how the payment card industry should use emerging technologies in the areas of cloud, encryption, and tokenization of data.

CSA published Top Threats to Cloud Computing; the threat research is updated twice yearly. On April 27, 2010, the Cloud Security Alliance released the Cloud Controls Matrix, a controls framework aligned with CSA guidance that assists both cloud providers and cloud consumers. The CSA Application Security whitepaper was released July 28, 2010, and research tools and processes to perform consistent measurements of cloud providers were released October 12, 2010.

CloudAudit (as of 10/20/2010, now a CSA project): the goal of CloudAudit is to provide a common interface and namespace that allows cloud computing providers to automate the Audit, Assertion, Assessment, and Assurance (A6) of their infrastructure (IaaS), platform (PaaS), and application (SaaS) environments, and to allow authorized consumers of their services to do likewise via an open, extensible, and secure interface and methodology.