7 Key Security Criteria for Data Tools: A Buyer’s Guide

How many specialized SaaS tools does your company have for moving, analyzing, and profiling data across departments? If you’re like most digital companies today, the answer is “more and more.”

In the last decade, the decentralization of data competencies and the increased adoption of cloud-based deployment models have resulted in a wider-than-ever range of data tools on the market, and a greater-than-ever need to use them. They include:

  • Extract, transform, load (ETL) and extract, load, transform (ELT) platforms
  • Integration platforms as a service (iPaaS)
  • Data warehousing solutions
  • Data visualization solutions
  • ML/AI engines that ingest data and build models on top of it

Using such tools to build a composable data architecture is key to staying agile in the fast-evolving world of digital business, but it raises two important questions about security: How can we ensure that vendors of SaaS data solutions are keeping our data and our customers’ data safe? What security mechanisms do these tools offer to help us as users prevent the leakage of sensitive data?

With GDPR fines and hacking incidents on the rise, companies should be answering these questions before buying.

7 Security Criteria to Guide Buyers of Data Tools

Below are seven criteria to look out for when evaluating the suitability of any data tool for inclusion in your data stack.

SOC 2 Type 2 Certification

SOC 2 (System and Organization Controls 2) is a type of audit report that measures how securely service organizations—in particular those providing cloud-based services—handle customer data. More specifically, it is an independent assessment of controls for security, availability, processing integrity, confidentiality, and privacy. 

Whereas Type 1 reports are only snapshots of an organization’s controls at a specific point in time, Type 2 reports evaluate the effectiveness of a service organization’s controls over a longer period of time (typically 6 to 12 months).

SOC 2 Type 2 certification (strictly speaking, an attestation report issued by an independent auditor) is therefore the stronger benchmark for how cloud service providers manage customer data.

ISO 27001 Certification or Compliance

ISO 27001 is an international standard that sets out requirements for information security management systems (ISMS). Certification or compliance with this standard is a reasonable substitute for, or proper supplement to, SOC 2 Type 2 certification. (However, SOC 2 Type 2 certification is more thorough than self-declared ISO 27001 compliance, because it requires an assessment by an independent auditor. Formal ISO 27001 certification also involves an external audit.)

The areas of data security that the standard addresses include company security policy, asset management, physical and environmental security, access control, incident management, and regulatory compliance.

Compliance with Important Local and Regional Data Protection Standards

Such standards include GDPR for Europe, CCPA and HIPAA in the US, LGPD for Brazil, and POPIA for South Africa.

Though it’s not possible to be certified for all of these (there is no GDPR certification, for example), data tool vendors operating in these areas should be taking steps to ensure compliance. A SOC 2 Type 2 report is a helpful signal here, because many of its controls overlap with what these regulations require, but it does not by itself guarantee compliance with them. Ask vendors about each regulation that applies to you.

Data Masking/Exclusion Capability

Data masking and exclusion capabilities are something to look for in data integration tools specifically, because they keep personally identifiable information (PII) and other sensitive data from flowing from one system to another.

CRMs and ERPs, for instance, are systems that contain a lot of sensitive data on customers and employees, like address information, payroll information, and so on. These systems have strictly controlled access levels, so as long as the sensitive data stays inside them, there isn’t much of a security risk. If your data integration tool offers data exclusion functionality, you can simply exclude—or not extract—sensitive data when moving datasets across systems.
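As a rough illustration of data exclusion, the sketch below drops sensitive fields from records before they leave the source system. The field names and the exclude_fields helper are hypothetical and stand in for whatever exclusion setting a given integration tool exposes.

```python
from typing import Any, Dict, Iterable, List, Set

# Hypothetical list of fields that should never leave the source system.
SENSITIVE_FIELDS: Set[str] = {"home_address", "salary"}


def exclude_fields(
    records: Iterable[Dict[str, Any]], excluded: Set[str]
) -> List[Dict[str, Any]]:
    """Return copies of the records without the excluded fields."""
    return [
        {key: value for key, value in record.items() if key not in excluded}
        for record in records
    ]


# Example records as they might come out of a CRM or ERP extract.
crm_rows = [
    {"customer_id": 1, "plan": "pro", "home_address": "1 Main St", "salary": 50000},
    {"customer_id": 2, "plan": "free", "home_address": "2 Side St", "salary": 42000},
]
safe_rows = exclude_fields(crm_rows, SENSITIVE_FIELDS)
```

Because the function builds new dictionaries, the original records stay untouched in the source system and only the reduced copies move downstream.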

But, sometimes, sensitive information needs to be extracted; for example, when an email address is the only reasonable customer identifier for use in downstream systems. If your data integration tool offers masking functionality, you can mask or hash this data during extraction. This way, the uniqueness of the email address as an identifier is preserved, but the email address itself is hidden from view.  
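One possible way to hash an email address during extraction is sketched below. It assumes a keyed hash (HMAC-SHA256) rather than a plain hash, since unkeyed hashes of email addresses can be reversed by guessing common addresses. The mask_email function and the example key are hypothetical, and a real tool may use a different masking scheme.

```python
import hashlib
import hmac


def mask_email(email: str, secret_key: bytes) -> str:
    """Replace an email address with a stable, non-reversible token.

    The address is normalized first so that the same customer always
    maps to the same token, which keeps it usable as an identifier.
    """
    normalized = email.strip().lower()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Assumption: in practice the key would come from a secrets manager,
# never from source code.
example_key = b"replace-with-a-managed-secret"
token = mask_email("Jane.Doe@example.com", example_key)
```

The same input and key always produce the same token, so downstream systems can still join on it, while the address itself never appears in the extracted data.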

Role-Based Access Control (RBAC)

Vendors should allow you to set multiple permission levels for the use of their tools (for viewing, editing, authorizing, and so on) and to group those permissions into roles. This makes managing access controls much easier, because it allows administrators to assign permissions to roles rather than to individual users.
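The idea of assigning permissions to roles rather than to users can be sketched in a few lines. The role names, user names, and the is_allowed helper below are hypothetical and do not reflect any particular vendor's permission model.

```python
from enum import Enum, auto
from typing import Dict, Set


class Permission(Enum):
    VIEW = auto()
    EDIT = auto()
    AUTHORIZE = auto()


# Permissions are attached to roles, not to individual users.
ROLE_PERMISSIONS: Dict[str, Set[Permission]] = {
    "viewer": {Permission.VIEW},
    "editor": {Permission.VIEW, Permission.EDIT},
    "admin": {Permission.VIEW, Permission.EDIT, Permission.AUTHORIZE},
}

# Each user is given a role; changing a role updates every user who has it.
USER_ROLES: Dict[str, str] = {"alice": "admin", "bob": "viewer"}


def is_allowed(user: str, permission: Permission) -> bool:
    """Check a user's permission through their role; unknown users get nothing."""
    role = USER_ROLES.get(user)
    if role is None:
        return False
    return permission in ROLE_PERMISSIONS.get(role, set())
```

With this layout, granting a whole team edit rights means changing one entry in ROLE_PERMISSIONS instead of touching every user record.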

Data Encryption Capability

There are two basic types of data encryption: encryption at rest and encryption in transit. For tools that move data around, ask about encryption in transit (Does the vendor support SSH tunneling and VPNs?). For tools that store data, ask about encryption at rest (What kind of ciphers are being used? AES-256 is the common standard.).

For both types of tools, ask how encryption keys are stored. Best practice is to use a third-party key management service backed by hardware security modules (HSMs), like AWS Key Management Service. Vendors using services like this don’t have direct access to the master keys, because the keys stay inside the service and are used only through its API.
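On the user side, encryption in transit can also be enforced by the client that connects to a data tool. The sketch below assumes TLS, which the text above does not name, as one common form of encryption in transit. The strict_tls_context helper is a hypothetical name, and only Python's standard ssl module is used.

```python
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that refuses weak connections."""
    # The default context already verifies certificates and hostnames.
    context = ssl.create_default_context()
    # Reject anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


tls_context = strict_tls_context()
```

A context like this can be passed to standard library clients, for example through the context argument of urllib.request.urlopen, so that a connection fails instead of silently downgrading.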

Logging and Auditing Systems

Make sure that each tool has a logging system in place. This way, you can keep a complete overview of all activity within the tool and, if a security incident ever does occur, you will have all the records you need to conduct a forensic analysis.
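To make the idea concrete, here is a minimal sketch of what a structured audit record might look like, written with Python's standard logging module. The field names (user, action, resource) and both helper functions are assumptions for illustration, not the log format of any specific tool.

```python
import io
import json
import logging
from datetime import datetime, timezone
from typing import Dict, TextIO


def build_audit_logger(stream: TextIO) -> logging.Logger:
    """Create a logger that writes one JSON record per line to the stream."""
    logger = logging.getLogger("audit_example")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep audit records out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def log_event(
    logger: logging.Logger, user: str, action: str, resource: str
) -> Dict[str, str]:
    """Record who did what to which resource, with a UTC timestamp."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "resource": resource,
    }
    logger.info(json.dumps(event, sort_keys=True))
    return event


# Usage: an in-memory buffer stands in for a real log destination.
audit_buffer = io.StringIO()
audit_logger = build_audit_logger(audit_buffer)
log_event(audit_logger, "alice", "export", "customers_table")
```

Records in a consistent, machine-readable shape like this are much easier to search and correlate during a forensic analysis than free-form log messages.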

No Silver Bullet, Only Due Diligence

Even though we are relying more and more on the measures that tool vendors take to ensure the security of our data, we ourselves are just as responsible for it as we’ve always been.

Indeed, no matter how secure a data tool is on the vendor’s side, it should still give us the capabilities we need to do our part.

If any given data tool meets the above criteria, you’ll know that its vendor exercises due diligence in data security, so that you can, too.

About the Author

Petr Nemeth is the founder and CEO of Dataddo—a fully managed, no-code data integration platform that connects cloud-based services, dashboarding applications, data warehouses, and data lakes. The platform offers ETL, ELT, reverse ETL, and data replication functionality, as well as an extensive portfolio of 200+ connectors, enabling business professionals with any level of technical expertise to send data from virtually any source to any destination. Before founding Dataddo, Petr worked as a developer, analyst, and system architect for telco, IT, and media companies on large-scale projects involving the internet of things, big data, and business intelligence.
