# How Data Discovery and Classification Can Help Secure PII

In cybersecurity, data discovery is often an overlooked part of an organization’s approach to securing data. However, its importance to any modern organization or enterprise cannot be overstated. Data discovery tools give security teams data visibility: the ability to know where sensitive data is and whether it’s in use. Organizations can use this visibility to determine where to focus their security efforts, what data to secure, how to comply with industry regulations, and what internal policies are necessary for managing data.

## What does data discovery involve?

Data discovery is exactly what it sounds like: it’s the act of discovering data across your organization’s systems. Within the context of information security, data discovery is a process generally carried out by auditing tools designed to scan applications, networks, or endpoints for specific types of data. These tools range from data loss prevention (DLP) solutions, like Nightfall, to access brokers and other monitoring or data policy enforcement tools.

Depending on your organization’s use case, it could make sense to use a discovery tool narrowly, such as within a single application, or broadly across all of your infrastructure. To illustrate this, consider a hypothetical outpatient clinic that wishes to share patient care notes with affiliated clinics over Slack to help triage incoming patients more quickly. In such a case, it might make sense to invest in a HIPAA-compliant data visibility solution specifically for Slack.

## What are the features of an effective data discovery platform?

Ultimately, your needs will determine which data discovery tools make sense to use. However, you should generally consider the following features integral to your efforts to increase your organization’s data visibility:

### 1. Use a platform that makes classification part of the discovery process

Although being able to monitor your data is central to the concept of data discovery, having the ability to classify your data is arguably even more important. Data classification lets you parse files and/or strings of data to properly categorize data found within structured or unstructured data sources. If this process is conducted with a high degree of accuracy (i.e., with few false positives and few missed detections), it should let you determine the content and context of the data your organization uses and stores. Furthermore, it will enable your organization to draw actionable insights about what to do with its data and how to secure it.

There are many approaches to data classification. For instance, some data discovery tools with classification features might use regular expressions, or regexes, to determine the content of data. In contrast, other tools might apply heuristics to assess the context of data. Our platform, Nightfall, differs from these traditional approaches in that we’ve built custom machine learning detectors specifically trained to identify common types of PII across a variety of SaaS and IaaS environments. This allows our platform to both account for context and improve the accuracy of our detection and classification capabilities compared to other solutions. For example, with Radar, our GitHub secrets detection tool, we found that our API key detectors have significantly fewer false positives than the most popular tools. You can learn more about Radar here.
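To make the regex and heuristic approaches described above more concrete, here is a minimal Python sketch of a classifier that combines the two. It is an illustration only and does not represent how Nightfall's machine learning detectors work. The pattern set, the keyword lists, the `Finding` type, and the `classify` function are all hypothetical names chosen for this example. A regex finds candidate strings, a Luhn checksum filters out digit runs that cannot be card numbers, and nearby keywords serve as a simple context heuristic.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical patterns for illustration; real detectors need far broader coverage.
PATTERNS = {
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CREDIT_CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}

# Hypothetical context keywords: words near a match that raise confidence.
CONTEXT_KEYWORDS = {
    "US_SSN": ("ssn", "social security"),
    "CREDIT_CARD": ("card", "visa", "mastercard", "payment"),
}


@dataclass
class Finding:
    label: str
    text: str
    confidence: str  # "high" when supporting context is nearby, else "low"


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def classify(text: str, window: int = 40) -> List[Finding]:
    """Find candidate PII with regexes, then score each match by context."""
    findings = []
    lowered = text.lower()
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            # Content check: drop digit runs that fail the card checksum.
            if label == "CREDIT_CARD" and not luhn_valid(match.group()):
                continue
            # Context check: look for related keywords around the match.
            nearby = lowered[max(0, match.start() - window): match.end() + window]
            has_context = any(k in nearby for k in CONTEXT_KEYWORDS[label])
            findings.append(
                Finding(label, match.group(), "high" if has_context else "low")
            )
    return findings
```

Calling `classify("Patient SSN: 123-45-6789")` yields a single `US_SSN` finding with high confidence because the keyword "ssn" appears next to the match. A pattern match with no supporting keywords is still reported, but with low confidence, which shows why regexes alone tend to generate false positives.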

### 2. Additionally, consider platforms that enable workflow implementation or remediation

The key benefit of data discovery and classification tools is that they typically provide teams with the insights needed to create thorough data use and storage policies. Effective programs go one step further by allowing administrators to implement workflows that enforce these policies across the applications or networks where their data lives. Our Slack bot, for example, enables teams to automatically detect, quarantine, and delete offending PII across designated channels. Furthermore, we allow remediation to be turned into a learning moment by permitting administrators to customize the messages end-users see after our bot flags their activity. You can learn more about our Slack data discovery and DLP features here.
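As a rough illustration of how a policy can be turned into an enforced workflow, the Python sketch below maps detected data types to remediation actions for a set of designated channels. Everything in it is an assumption made for this example: the `Policy`, `Message`, and `Outcome` types, the `enforce` function, and the pluggable `detect` callable are hypothetical, and the code does not call the Slack API or the Nightfall API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Set


class Action(Enum):
    NOTIFY = "notify"
    QUARANTINE = "quarantine"
    DELETE = "delete"


@dataclass
class Policy:
    channels: Set[str]            # channels the policy applies to
    actions: Dict[str, Action]    # detected data label -> remediation action
    user_message: str             # admin-customized notice shown to the end user


@dataclass
class Message:
    channel: str
    user: str
    text: str


@dataclass
class Outcome:
    action: Action
    notice: str


def enforce(
    message: Message,
    policy: Policy,
    detect: Callable[[str], List[str]],
) -> List[Outcome]:
    """Decide which remediation steps a message triggers under a policy.

    `detect` is any function that returns the data labels found in a string.
    This function only decides what should happen; carrying out the action
    (hiding or deleting the message) is left to the integration.
    """
    if message.channel not in policy.channels:
        return []  # channel is not covered by this policy
    outcomes = []
    for label in detect(message.text):
        action = policy.actions.get(label)
        if action is None:
            continue  # the policy has no rule for this data type
        notice = policy.user_message.format(user=message.user, label=label)
        outcomes.append(Outcome(action, notice))
    return outcomes
```

Keeping the decision (`enforce`) separate from detection and from the side effects makes the policy easy to test and lets the same rules be reused across applications. The customizable `user_message` corresponds to the learning-moment idea described above.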

### 3. Look for tools that automate as many processes as possible

Currently, one of the biggest trends in information security is automation, and for good reason. The security landscape has grown more complex as cloud, IoT, bring your own device (BYOD), shadow IT, and other trends transform where data lives and how much of it exists. This is to say nothing of the talent shortage within cybersecurity, with some estimates suggesting that 3.5 million security jobs will go unfilled by 2021. Given these circumstances, automation isn’t just a good idea but a genuine necessity for the success of security teams today. The purpose of automation isn’t to set and forget. Rather, security solutions that successfully leverage AI or are otherwise automated give organizations the elbow room needed to build comprehensive data policies. Security teams using automated tools can be more effective because they won’t need to respond manually to every potential incident or security misconfiguration. Nightfall’s workflows are designed with this principle in mind, allowing security policies to be seamlessly enforced from the platform’s dashboard.

### 4. Consider cloud-native discovery platforms for securing your SaaS and IaaS stack

Gartner predicts that this year the public cloud market will grow from $227.8 billion to $266.4 billion. With cloud adoption already mainstream, many organizations face the problem of business-critical data being scattered across cloud services and infrastructure. Data discovery tools, then, need to account for this by being designed specifically for the cloud. Nightfall, as a cloud-native security solution, circumvents many of the visibility issues associated with more traditional DLP tools by directly integrating with business applications. This gives you transparency into what’s happening within your applications and infrastructure, regardless of your network or device configurations.