As I talk to Enterprise customers I’m finding a lot of confusion about when to use encryption or tokenization, and how to think about these two data protection technologies. Once you understand how each of these technologies works, you’ll see that there is no easy answer about which is best for you, or when one is better than the other. I want to share some general guidelines I’ve developed to help with this conundrum.
Encryption is a well-known technology with clear standards, and it has been used for data protection for a long time. Most of the compliance regulations (PCI, HIPAA/HITECH, state privacy regulations, etc.) make clear reference to encryption and widely accepted standards. So encryption is a no-brainer: when done correctly, it is going to meet compliance regulations and your security goals.
But encryption has a nasty side effect. When you encrypt fields in your database that serve as indexes or keys, you disrupt the indexing capability of the field and introduce unacceptable performance burdens on your system. Encrypting an index or key field often means re-engineering the application, which is costly and time-consuming.
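To make that concrete, here is a minimal Python sketch (using the third-party cryptography package; the sample card value and field are made up) of why an encrypted column stops behaving like an index: the same plaintext encrypts to a different ciphertext each time, so equality lookups and joins on that field no longer work without decrypting.

```python
# Illustrative only: shows why encrypting an indexed column breaks equality
# lookups. Uses the third-party "cryptography" package; the sample card
# number is made up.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)

def encrypt_field(value: str) -> bytes:
    nonce = os.urandom(12)  # random nonce per encryption
    return nonce + aesgcm.encrypt(nonce, value.encode(), None)

card = "4111111111111111"
ct1 = encrypt_field(card)
ct2 = encrypt_field(card)

# The same plaintext encrypts to two different ciphertexts, so an index
# built on the encrypted column can no longer be used for equality joins
# or range scans; the application has to decrypt first, row by row.
print(ct1 == ct2)                 # False
print(len(ct1))                   # binary value, longer than the 16-digit plaintext
```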
Enter the new kid on the block – Tokenization.
When you tokenize data you replace the sensitive data with a surrogate, or token, value. The token itself is not sensitive data, but it maintains the characteristics of the original sensitive data. It walks like a duck, it quacks like a duck, but from a compliance point of view, it is NOT a duck. Tokenizing data lets you maintain those precious index and key relationships in your databases, and minimizes the number of changes you have to make to your applications.
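As a rough illustration of what “maintains the characteristics” can mean, here is a hypothetical Python sketch that builds a token with the same length and last four digits as a card number; actual vendor products generate and manage tokens in their own ways.

```python
# A minimal sketch of format-preserving tokenization, assuming a card-number
# style field. The token keeps the length and last four digits of the
# original value, so column definitions, indexes, and reports that only need
# those characteristics keep working. This is an illustration, not a
# description of any particular vendor's token format.
import secrets

def make_token(card_number: str) -> str:
    random_digits = len(card_number) - 4
    random_part = "".join(secrets.choice("0123456789") for _ in range(random_digits))
    return random_part + card_number[-4:]   # same length, real last four digits

token = make_token("4111111111111111")
print(token)       # e.g. "9305172846301111", which looks like a card number
print(len(token))  # 16, so the database schema is unchanged
```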
So, why not use tokenization for everything? Is this the magic bullet we’ve been searching for?
Hold on there, Cowboy. There are some things you should think about.
Tokenization solutions typically work by creating a separate database to store the token and its relationship to the original sensitive data. This means that every time you need to register a new token, retrieve the attributes of a token, or recover the sensitive data, you have to make a request to the tokenization solution to do that work for you. Got 10 million records in your database? That is going to have a major impact on performance. Applications that need high performance may not be the best fit for a tokenization approach – you might really want to use encryption in that environment.
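Here is a simplified sketch of that round-trip pattern, with a made-up TokenVault class standing in for a vendor’s token server; the point is that every row you tokenize or detokenize is a separate request to a separate system.

```python
# A simplified sketch of the vault round-trip pattern described above.
# TokenVault and its methods are hypothetical stand-ins for a vendor API;
# in a real deployment each call is a network request to the token server.
import secrets

class TokenVault:
    """Stands in for a separate tokenization server and its database."""
    def __init__(self):
        self._store = {}  # token -> original sensitive value

    def tokenize(self, value: str) -> str:
        token = secrets.token_hex(8)   # one round-trip per new token
        self._store[token] = value
        return token

    def detokenize(self, token: str) -> str:
        return self._store[token]      # another round-trip to recover the data

vault = TokenVault()

# Tokenizing a large table means one vault request per row; with 10 million
# rows, that per-request latency dominates the batch job's run time.
rows = ["4111111111111111", "5500005555555559", "340000000000009"]
tokens = [vault.tokenize(card) for card in rows]
originals = [vault.detokenize(t) for t in tokens]
```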
Then there is the question of compliance. Tokenization is a newer technology. At this point there are no standards for tokenization solutions, and no reference to tokenization in the published regulations. So, are you really compliant if you tokenize sensitive data? I think so, but you should be aware that this is an unsettled question.
When you tokenize data, you are creating a separate repository of information about the original sensitive data. In most cases you will probably be using a solution from a vendor. Since the tokenization solution contains sensitive data, it will itself be in scope for compliance. Has the vendor used encryption, key management, and secure communications that meet compliance regulations? How do you know? If you are going to deploy a tokenization solution you will want to see NIST certification of the solution’s encryption and key management so that you are not just relying on the claims of the vendor.
Most Enterprise customers will probably find uses for both encryption and tokenization. Encryption is great for those high-performance production applications. Tokenization is great for maintaining database relationships and reducing risk in development, test, QA, and business intelligence databases. Both can help you protect your company’s sensitive data!
For more information on tokenization, view our recorded webcast titled "Tokenization and Compliance - Five Ways to Reduce Costs and Increase Security."