ISACA New Delhi, India: Privacy and Big Data
1. Bridging the Gap Between Privacy and Big Data
Ulf Mattsson, CTO
Protegrity
ulf.mattsson AT protegrity.com
2. Ulf Mattsson, CTO Protegrity
20 years with IBM
• Research & Development & Global Services
Inventor
• Encryption, Tokenization & Intrusion Prevention
Involvement
• PCI Security Standards Council (PCI SSC)
• American National Standards Institute (ANSI) X9
• Encryption & Tokenization
• International Federation for Information Processing
• IFIP WG 11.3 Data and Application Security
• ISACA New York Metro chapter
4. Agenda
1. What is Big Data & Cloud?
2. Risk & Drivers for Data Security
3. The Evolution of Data Security Methods
4. Data De-Identification
5. Off-Shoring & Outsourcing
6. Use Cases & Case Studies
5. Who is Protegrity?
Proven enterprise data protection software leader since the 90’s.
Business driven by compliance
• PCI (Payment Card Industry)
• PII (Personally Identifiable Information)
• PHI (Protected Health Information) – HIPAA
• State and Industry Privacy Laws
Servicing many Industries
• Retail, Hospitality, Travel and Transportation
• Financial Services, Insurance, Banking
• Healthcare
• Telecommunications, Media and Entertainment
• Manufacturing and Government
7. What is Big Data?
Hadoop
• Designed to handle the emerging “4 V’s”
• Massively Parallel Processing (MPP)
• Elastic scale
• Usually Read-Only
• Allows for data insights on massive, heterogeneous
data sets
• Includes an ecosystem of components:
Application Layers: Hive, Pig, Other
MapReduce
Storage Layers: HDFS
Physical Storage
10. Cloud Services
Services usually provided by a third party
• Can be virtual, public, private, or hybrid
Increasing adoption – up 12% from 2012*
Often an outsourced solution, sometimes cross-border
Allows for greater accessibility of data and low overhead
*Source: GigaOM
13. Drivers for Data Security
Regulations & Laws
• Payment Card Industry Data Security Standard (PCI DSS)
• National Privacy Laws
• Cross-Border & Outsourcing Privacy Laws
Expanding Threat Landscape
• Hackers & APT
• Internal Threats & Rogue Privileged Users
• Excessive Privilege or Security Negligence
Sensitive Data Insight & Usability
• Unprotected Sensitive or Restricted Data is Unusable for
Marketing, Monetization, Outsourcing, etc.
Vulnerabilities in Emerging Technologies
15. PCI Data Security Standards Council
Founded in 2006, comprised of four major credit card
brands
Each card brand enforcement program issues fines,
fees and schedule deadlines
• Visa's Cardholder Information Security Program (CISP)
http://www.visa.com/cisp
• MasterCard's Site Data Protection (SDP) program
http://www.mastercard.com/us/sdp/index.html
• Discover's Discover Information Security and Compliance
(DISC) program
http://www.discovernetwork.com/fraudsecurity/disc.html
• American Express Data Security Operating Policy (DSOP)
http://www.americanexpress.com/datasecurity
16. PCI DSS
Build and maintain a secure network.
1. Install and maintain a firewall configuration to protect data
2. Do not use vendor-supplied defaults for system passwords and other security parameters
Protect cardholder data.
3. Protect stored data
4. Encrypt transmission of cardholder data and sensitive information across public networks
Maintain a vulnerability management program.
5. Use and regularly update anti-virus software
6. Develop and maintain secure systems and applications
Implement strong access control measures.
7. Restrict access to data by business need-to-know
8. Assign a unique ID to each person with computer access
9. Restrict physical access to cardholder data
Regularly monitor and test networks.
10. Track and monitor all access to network resources and cardholder data
11. Regularly test security systems and processes
Maintain an information security policy.
12. Maintain a policy that addresses information security
17. PCI DSS 3.0
Protection of cardholder data in memory
Clarification of key management dual control and split
knowledge
Recommendations on making PCI DSS business-as-usual and best practices
Security policy and operational procedures added
Increased password strength
New requirements for point-of-sale terminal security
More robust requirements for penetration testing
18. PCI DSS Cloud Guidelines
Relevant to all sensitive data that is outsourced to cloud
1. Clients retain responsibility for the data they put in the cloud
2. Public-cloud providers often have multiple data centers, which may
often be in multiple countries or regions
3. The client may not know the location of their data, or the data may
exist in one or more of several locations at any particular time
4. A client may have little or no visibility into the controls
5. In a public-cloud environment, one client’s data is typically stored
with data belonging to multiple other clients. This makes a public
cloud an attractive target for attackers
20. National Privacy Laws - USA
Health Insurance Portability and Accountability Act – HIPAA
1. Names
2. All geographical subdivisions smaller than a State
3. All elements of dates (except year) related to individual
4. Phone numbers
5. Fax numbers
6. Electronic mail addresses
7. Social Security numbers
8. Medical record numbers
9. Health plan beneficiary numbers
10. Account numbers
11. Certificate/license numbers
12. Vehicle identifiers and serial numbers
13. Device identifiers and serial numbers
14. Web Universal Resource Locators (URLs)
15. Internet Protocol (IP) address numbers
16. Biometric identifiers, including finger prints
17. Full face photographic images
18. Any other unique identifying number
22. National Privacy Laws - India
Information Technology Act – 2000 (IT Act)
• Requires that the corporate body and Data Processor
implement reasonable security practices and standards
• IS/ISO/IEC 27001 requirements recognized
Information Technology Act – 2008 (Amended IT Act)
• Damages for negligence and wrongful gain or loss
• Criminal punishment for disclosing Sensitive Personal
Information (SPI)
India Privacy Law – 2011
• Expanded definition of SPI to passwords, financial data,
health data, medical treatment records, and more
Right to Privacy Bill – 2013 (Proposed)
• Increased jail terms & fines for disclosure of SPI
• Addresses data handled for foreign clients
24. Cross-Border & Outsourcing Laws
The laws of the sending country apply to data sent
across international borders, including outsourced
operations
• i.e. National Privacy Laws
APEC Cross-Border Privacy Laws
• Non-binding privacy enforcement in Asia-Pacific region
35. Sensitive Data Insight & Usability
Big Data and Cloud environments are designed for
access and deep insight into vast data pools
Data can be monetized not only by marketing
analytics, but through sale or use by a third party
The more accessible and usable the data is, the
greater this ROI benefit can be
Security concerns and regulations are often viewed
as opponents to data insight
36. Big Data Vulnerabilities and Concerns
Big Data (Hadoop) was designed for data access,
not security
Security in a read-only environment introduces new
challenges
Massive scalability and performance requirements
Sensitive data regulations create a barrier to
usability, as data cannot be stored or transferred in
the clear
Transparency and data insight are required for ROI
on Big Data
37. Cloud Vulnerabilities and Concerns
Public cloud security is often not visible to the client,
but the client is still responsible for security
Greater access to shared data sets by more users
creates additional points of vulnerability
Data redundancy for high availability, often across
multiple data centers, increases vulnerability
Virtualization can create numerous security issues
Transparency and data insight are required for ROI
How do you lock this?
43. Access Control
Old and flawed: Minimal access levels so people can only carry out their jobs
[Figure: risk (low to high) plotted against access privilege level (low to high)]
45. Applying the protection profile to the content of data fields allows for a wider range of authority options
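One way to picture a protection profile applied at the field level is a small policy table that decides, per role and per field, whether a user sees the clear value, a token, or nothing. The sketch below is only an illustration under assumptions: the role names, field names, and the three decisions ("clear", "token", "deny") are hypothetical and are not taken from any product.

```python
# Hypothetical policy: role -> field -> what that role may see.
POLICY = {
    "fraud_analyst": {"cc_number": "clear", "name": "token"},
    "marketing": {"cc_number": "token", "name": "token"},
}


def view_field(role, field, clear_value, token_value):
    """Return the clear value, the token, or None according to the policy."""
    # Roles or fields that are not listed fall back to "deny".
    decision = POLICY.get(role, {}).get(field, "deny")
    if decision == "clear":
        return clear_value
    if decision == "token":
        return token_value
    return None


# Illustrative values only.
clear_cc = "3678 2289 3907 3378"
token_cc = "3846 2290 3371 3378"
print(view_field("fraud_analyst", "cc_number", clear_cc, token_cc))
print(view_field("marketing", "cc_number", clear_cc, token_cc))
print(view_field("contractor", "cc_number", clear_cc, token_cc))
```

Because the decision is made per field rather than per table or file, the same record can be shared with several roles while each role sees a different level of detail.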
46. How the New Approach is Different
Old: Minimal access levels – Least Privilege to avoid high risks
New: Much greater flexibility and lower risk in data accessibility
[Figure: risk (low to high) plotted against access privilege level (low to high)]
47. Reduction of Pain with New Protection Techniques
Input Value: 3872 3789 1620 3675
• Strong Encryption (AES, 3DES). Output: !@#$%a^.,mhu7///&*B()_+!@
• Format Preserving Encryption (DTP, FPE). Output: 8278 2789 2990 2789
• Vault-based Tokenization. Output: 8278 2789 2990 2789. Format preserving, greatly reduced key management
• Vaultless Tokenization. Output: 8278 2789 2990 2789. No vault
[Figure: pain and TCO (high to low) along a timeline marked 1970, 2000, 2005, 2010]
48. Fine Grained Security: Encryption of Fields
Production Systems
Non-Production Systems
Encryption of fields
• Reversible
• Policy Control (Authorized / Unauthorized Access)
• Lacks Integration Transparency
• Complex Key Management
• Example: !@#$%a^.,mhu7///&*B()_+!@
49. Fine Grained Security: Masking of Fields
Production Systems
Non-Production Systems
Masking of fields
• Not reversible
• No Policy, Everyone can access the data
• Integrates Transparently
• No Complex Key Management
• Example: 0389 3778 3652 0038
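The masking properties listed above (not reversible, no keys to manage, output that still looks like the original) can be sketched in a few lines. This is a minimal illustration, assuming that masking here means replacing every digit with an unrelated random digit while keeping separators; the function name is hypothetical.

```python
import secrets


def mask_digits(value: str) -> str:
    """Replace each digit with a random digit; keep spaces and punctuation."""
    return "".join(
        secrets.choice("0123456789") if ch.isdigit() else ch
        for ch in value
    )


masked = mask_digits("3872 3789 1620 3675")
print(masked)  # a different 16-digit string in the same 4-4-4-4 layout
```

No mapping between input and output is stored anywhere, which is why the result cannot be reversed and why no key management is involved.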
50. Fine Grained Security: Tokenization of Fields
Tokenization (Pseudonymization)
• No Complex Key Management
• Business Intelligence
• Example: 0389 3778 3652 0038
Production Systems
• Reversible
• Policy Control (Authorized / Unauthorized Access)
Non-Production Systems
• Not Reversible
• Integrates Transparently
51. Fine Grained Data Security Methods
Tokenization and Encryption are Different
Used Approach
• Encryption: Cipher System (cryptographic algorithms, cryptographic keys)
• Tokenization: Code System (code books, index tokens)
Source: McGraw-Hill Encyclopedia of Science & Technology
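The "code book" idea behind tokenization can be sketched as a lookup table that maps each sensitive value to a randomly generated token of the same format. The class below is a toy, in-memory illustration of a vault-style code book under that assumption; the class and method names are hypothetical, and a real deployment would need persistent, access-controlled storage.

```python
import secrets


class TokenVault:
    """Toy code book: random format-preserving tokens stored in two dicts."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    @staticmethod
    def _random_token(value: str) -> str:
        # Replace digits with random digits, keep separators as they are.
        return "".join(
            secrets.choice("0123456789") if ch.isdigit() else ch
            for ch in value
        )

    def tokenize(self, value: str) -> str:
        if not any(ch.isdigit() for ch in value):
            raise ValueError("value must contain at least one digit")
        if value in self._token_by_value:
            return self._token_by_value[value]  # same value, same token
        token = self._random_token(value)
        # Retry if the token is already in use or equals the input.
        while token in self._value_by_token or token == value:
            token = self._random_token(value)
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        return self._value_by_token[token]


vault = TokenVault()
token = vault.tokenize("3872 3789 1620 3675")
print(token)
print(vault.detokenize(token))
```

Unlike a cipher, there is no mathematical relationship between the token and the value: reversing a token requires access to the table itself.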
52. Fine Grained Data Security Methods
Vault-based vs. Vaultless Tokenization
Footprint
• Vault-based Tokenization: Large, Expanding.
• Vaultless Tokenization: Small, Static.
High Availability, Disaster Recovery
• Vault-based Tokenization: Complex, expensive replication required.
• Vaultless Tokenization: No replication required.
Distribution
• Vault-based Tokenization: Practically impossible to distribute geographically.
• Vaultless Tokenization: Easy to deploy at different geographically distributed locations.
Reliability
• Vault-based Tokenization: Prone to collisions.
• Vaultless Tokenization: No collisions.
Performance, Latency, and Scalability
• Vault-based Tokenization: Will adversely impact performance & scalability.
• Vaultless Tokenization: Little or no latency. Fastest industry tokenization.
53. The Future of Tokenization
PCI DSS 3.0
• Split knowledge and dual control
PCI SSC Tokenization Task Force
• Tokenization and use of HSM
Card Brands – Visa, MC, AMEX …
• Tokens with control vectors
ANSI X9
• Tokenization and use of HSM
54. Security of Different Protection Methods
[Figure: security level (low to high) of Basic Data Tokenization, Format Preserving Encryption, AES CBC Encryption Standard, and Vaultless Data Tokenization]
55. Speed of Different Protection Methods
[Figure: transactions per second*, on a scale from 100 to 10 000 000, for Vault-based Data Tokenization, Format Preserving Encryption, AES CBC Encryption Standard, and Vaultless Data Tokenization]
*: Speed will depend on the configuration
56. Risk Adjusted Data Protection
There is always a trade-off between security and usability.
Data Security Methods
• System without data protection
• Monitoring + Blocking + Obfuscation
• Data Type Preservation Encryption
• Strong Encryption
• Vaultless Tokenization
• Hashing
• Anonymisation
[Table: each method rated from worst to best on Performance, Storage, Security, and Transparency]
58. What is de-identification of identifiable data?
The solution to protecting identifiable data is to properly de-identify it.
Personally Identifiable Information
Health Information / Financial Information
Redact the information – remove it.
The identifiable portion of the record is de-identified with any
number of protection methods such as masking, tokenization,
encryption, redacting (removed), etc.
The method used will depend on your use case and the
reason that you are de-identifying the data.
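Choosing a method per field, as described above, can be sketched as a small dispatch table. This is a hedged illustration only: the field names, the choice of method for each field, and the "redact unknown fields by default" rule are assumptions made for the example.

```python
def redact(value):
    """Remove the value entirely."""
    return None


def mask(value):
    """Hide letters and digits but keep the overall shape."""
    return "".join("X" if ch.isalnum() else ch for ch in value)


def keep(value):
    """Leave non-identifying data untouched."""
    return value


# Hypothetical assignment of a protection method to each field.
METHOD_BY_FIELD = {"name": mask, "ssn": redact, "diagnosis": keep}


def de_identify(record: dict) -> dict:
    result = {}
    for field, value in record.items():
        # Fields without an explicit rule are redacted to stay on the safe side.
        method = METHOD_BY_FIELD.get(field, redact)
        new_value = method(value)
        if new_value is not None:
            result[field] = new_value
    return result


print(de_identify({"name": "Joe Smith", "ssn": "076-39-2778", "diagnosis": "flu"}))
```

Swapping the function assigned to a field changes the protection method without touching the rest of the pipeline, which mirrors the point that the method depends on the use case.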
59. Identifiable Sensitive Information
Field | Real Data | Tokenized / Pseudonymized
Name | Joe Smith | csu wusoj
Address | 100 Main Street, Pleasantville, CA | 476 srta coetse, cysieondusbak, CA
Date of Birth | 12/25/1966 | 01/02/1966
Telephone | 760-278-3389 | 760-389-2289
E-Mail Address | joe.smith@surferdude.org | eoe.nwuer@beusorpdqo.org
SSN | 076-39-2778 | 937-28-3390
CC Number | 3678 2289 3907 3378 | 3846 2290 3371 3378
Business URL | www.surferdude.com | www.sheyinctao.com
Fingerprint | | Encrypted
Photo | | Encrypted
X-Ray | | Encrypted
Healthcare / Financial Services | Dr. visits, prescriptions, hospital stays and discharges, clinical, billing, etc. Financial Services Consumer Products and activities | Protection methods can be equally applied to the actual healthcare data, but not needed with de-identification
60. De-Identified Sensitive Data
Field | Real Data | Tokenized / Pseudonymized
Name | Joe Smith | csu wusoj
Address | 100 Main Street, Pleasantville, CA | 476 srta coetse, cysieondusbak, CA
Date of Birth | 12/25/1966 | 01/02/1966
Telephone | 760-278-3389 | 760-389-2289
E-Mail Address | joe.smith@surferdude.org | eoe.nwuer@beusorpdqo.org
SSN | 076-39-2778 | 076-28-3390
CC Number | 3678 2289 3907 3378 | 3846 2290 3371 3378
Business URL | www.surferdude.com | www.sheyinctao.com
Fingerprint | | Encrypted
Photo | | Encrypted
X-Ray | | Encrypted
Healthcare / Financial Services | Dr. visits, prescriptions, hospital stays and discharges, clinical, billing, etc. Financial Services Consumer Products and activities | Protection methods can be equally applied to the actual data, but not needed with de-identification
61. How Should I Secure Different Data?
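Use Case (simple to complex)
• Simple: Card Holder Data (PCI)
• Personally Identifiable Information (PII)
• Complex: Protected Health Information (PHI)
Type of Data: Un-structured to Structured
Methods: Tokenization of Fields, Encryption of Files
[Figure: use case complexity plotted against type of data, showing where each method applies]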
62. Research Brief
Tokenization Gets Traction
Aberdeen has seen a steady increase in enterprise
use of tokenization for protecting sensitive data over
encryption
Nearly half of the respondents (47%) are currently
using tokenization for something other than cardholder
data
Over the last 12 months, tokenization users had 50%
fewer security-related incidents than tokenization nonusers
Author: Derek Brink, VP and Research Fellow, IT Security and IT GRC
63. Vaultless Tokenization & Data Insight
The business intelligence exposed through Vaultless
Tokenization can allow many users and processes to
perform job functions on protected data
Extreme flexibility in data de-identification can allow
responsible data monetization
Data remains secure throughout data flows, and can
maintain a one-to-one relationship with the original
data for analytic processes
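The value of a one-to-one relationship for analytics can be shown with a small sketch: when the same input always yields the same token, counts and joins computed on tokens match those computed on the original data. The example uses a keyed hash (HMAC-SHA256 from the Python standard library) purely as a stand-in for a deterministic token; it is not vaultless tokenization as any vendor implements it, it is not reversible, and the key and sample data are made up.

```python
import hashlib
import hmac
from collections import Counter

# Demo key only; a real key would come from a key management system.
SECRET_KEY = b"demo-key-not-for-production"


def pseudonymize(value: str) -> str:
    """Deterministic stand-in token: same input always gives the same output."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


# Hypothetical visit log keyed by customer name.
visits = ["joe", "ann", "joe", "joe", "ann"]

clear_counts = sorted(Counter(visits).values())
token_counts = sorted(Counter(pseudonymize(v) for v in visits).values())

print(clear_counts)
print(token_counts)
```

An analyst working only with the tokens gets the same visit distribution without ever seeing a customer name.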
66. Privacy Impacts BPO & Offshore Business Solutions
Business Process Outsourcing (BPO)
• Business Processes
• E.g. Loans, Mortgages, Call Centre, Claims Processing, ERP,
etc.
• Application Development
• Need to de-identify Data for Testing and Development
Off-Shoring
• Same as Outsourcing, but data is sent for business functions
(like call center, etc.) off-shore.
Laws governing your ability to send real data to 3rd parties are
already restrictive, and becoming more so
Penalties for infringement are growing more severe
Risk of data breaches and data theft is increased
67. Examples
Major Bank in EU wants to centralise EDW
operations in a single country and therefore send
customer data from country A to country B. Privacy
Laws in country A prohibit this.
Private Bank in Europe wants to offshore Finance
Operations. Privacy Law prohibits transfer of citizen
data to India.
Retail Bank in Scandinavia wants to offshore
Customer Services. Privacy law prevents transfer of
citizen data to the Far East.
69. Protegrity Use Case: UniCredit
CHALLENGES
The primary challenge was to protect PII – names and addresses, phone and email, policy and account numbers,
birth dates, etc. – to the satisfaction of EU Cross Border Data Security requirements. This included incoming
source data from various European banking entities, and existing data within those systems, which would be
consolidated at the Italian HQ.
70. Case Study - Large US Chain Store
Reduced cost
• 50 % shorter PCI audit
Quick deployment
• Minimal application changes
• 98 % application transparent
Top performance
• Performance better than encryption
Stronger security
71. Case Study: Large Chain Store
Why? Reduce compliance cost by 50%
• 50 million Credit Cards, 700 million daily transactions
• Performance Challenge: 30 days with Basic to 90 minutes with
Vaultless Tokenization
• End-to-End Tokens: Started with the D/W and expanding to
stores
• Lower maintenance cost – don’t have to apply all 12 requirements
• Better security – able to eliminate several business and daily
reports
• Quick deployment
• Minimal application changes
• 98 % application transparent
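The figures quoted in this case study can be put in perspective with some quick arithmetic. The calculation below assumes the 700 million daily transactions are spread evenly over 24 hours, which the slide does not state.

```python
MINUTES_PER_DAY = 24 * 60
SECONDS_PER_DAY = 24 * 60 * 60

# 30 days with basic tokenization versus 90 minutes with vaultless tokenization.
before_minutes = 30 * MINUTES_PER_DAY
after_minutes = 90
speedup = before_minutes / after_minutes

# Average load implied by 700 million daily transactions (even spread assumed).
average_tps = 700_000_000 / SECONDS_PER_DAY

print(f"speedup: {speedup:.0f}x")
print(f"average load: {average_tps:.0f} transactions per second")
```

Under these assumptions the job runs about 480 times faster, and the average load is roughly 8,100 transactions per second; peak load would be higher.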
74. Aadhaar Data Stores
Mongo cluster (all enrolment records/documents – demographics + photo)
• Low latency indexed read (documents per sec), high latency random search (seconds per read)
Solr cluster (all enrolment records/documents – selected demographics only)
• Low latency indexed read (documents per sec), low latency random search (documents per sec)
MySQL: UID master (sharded) and Enrolment DB (all UID generated records - demographics only, track & trace, enrolment status)
• Low latency indexed read (milliseconds per read), high latency random search (seconds per read)
HBase, Region Ser. 1 to 20 (all enrolment biometric templates)
• High read throughput (MB per sec), low-to-medium latency read (milliseconds per read)
HDFS, Data Node 1 to 20 (all raw packets)
• High read throughput (MB per sec), high latency read (seconds per read)
NFS, LUN 1 to 4 (all archived raw packets)
• Moderate read throughput, high latency read (seconds per read)
[Figure: the clustered stores are drawn as groups of shards]
75. Protegrity Summary
Proven enterprise data security software and innovation leader
• Sole focus on the protection of data
• Patented Technology, Continuing to Drive Innovation
Cross-industry applicability
• Retail, Hospitality, Travel and Transportation
• Financial Services, Insurance, Banking
• Healthcare
• Telecommunications, Media and Entertainment
• Manufacturing and Government
76. Please contact us for more information
Ulf.Mattsson@protegrity.com
Info@protegrity.com
Elaine.Evans@protegrity.com
www.protegrity.com