
How to Discover and Protect Sensitive Data in HAR Files

by Isaac Madan, October 26, 2023

In light of the recent data breach at Okta, it’s important to pay attention to the potential risks of sharing HAR files in SaaS data silos like Zendesk.

A HAR (HTTP Archive) file is a JSON-based file format that records detailed information about the interactions between a web browser and a website. It captures data such as requests, responses, cookies, and timings. HAR files are commonly used for debugging and analyzing web performance issues, since they provide valuable insight into network traffic and can help identify potential security vulnerabilities. However, HAR files may contain sensitive data, such as login credentials, which should be handled with caution to ensure data privacy and security. Additional examples of sensitive data that may be found within HAR files include:

  • Cookies, which may contain access tokens or credentials
  • Authorization headers, which may also contain access tokens or credentials
  • POST body contents, which can reveal sensitive user inputs like passwords or credit card numbers
  • URL query parameters, which can expose IDs, search terms, or other private information
  • Cached resources, including local files like images and scripts
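To make these locations concrete, here is a minimal Python sketch that walks a parsed HAR file and lists the fields most likely to hold secrets. It assumes the standard HAR 1.2 layout (log.entries, each with request and response objects). The function name, the header list, and the sample data are illustrative assumptions, and the sketch only points at where to look. It does not classify the values themselves.

```python
import json

# Header names that commonly carry credentials (an illustrative, incomplete list).
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie"}


def find_sensitive_fields(har: dict) -> list[tuple[int, str, str]]:
    """Return (entry index, location, field name) for fields that often hold secrets."""
    hits = []
    for i, entry in enumerate(har.get("log", {}).get("entries", [])):
        request = entry.get("request", {})
        response = entry.get("response", {})
        for side, message in (("request", request), ("response", response)):
            for header in message.get("headers", []):
                if header.get("name", "").lower() in SENSITIVE_HEADERS:
                    hits.append((i, f"{side}.headers", header["name"]))
            for cookie in message.get("cookies", []):
                hits.append((i, f"{side}.cookies", cookie.get("name", "")))
        # Query parameters and bodies are reported wholesale for manual review.
        for param in request.get("queryString", []):
            hits.append((i, "request.queryString", param.get("name", "")))
        if request.get("postData", {}).get("text"):
            hits.append((i, "request.postData", "text"))
        if response.get("content", {}).get("text"):
            hits.append((i, "response.content", "text"))
    return hits


# Made-up sample data; a real file would be loaded with json.load(open(path)).
sample_har = {
    "log": {
        "entries": [
            {
                "request": {
                    "method": "POST",
                    "url": "https://example.com/login?session=abc123",
                    "headers": [
                        {"name": "Authorization", "value": "Bearer example-token"},
                        {"name": "Accept", "value": "application/json"},
                    ],
                    "cookies": [{"name": "sid", "value": "example-cookie"}],
                    "queryString": [{"name": "session", "value": "abc123"}],
                    "postData": {
                        "mimeType": "application/json",
                        "text": json.dumps({"password": "example-password"}),
                    },
                },
                "response": {
                    "headers": [{"name": "Set-Cookie", "value": "sid=example-cookie"}],
                    "cookies": [],
                    "content": {"mimeType": "text/html", "text": "<html></html>"},
                },
            }
        ]
    }
}

for index, location, name in find_sensitive_fields(sample_har):
    print(f"entry {index}: {location} -> {name}")
```

Even on a single request, several distinct fields need attention, which is why manual review scales poorly on files with thousands of entries.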

Most help center docs that reference HAR files provide a generic disclaimer like: “Make sure that you secure your HAR files accordingly.” However, that’s much easier said than done. HAR files can be massive; on some occasions they consist of tens of thousands of lines. With this in mind, it can be difficult to thoroughly scrub credentials and secrets out in a timely manner, unless you have access to an automated secrets scanning solution like Nightfall.

There are a few ways you can keep your secrets safe with Nightfall:

  1. Discover data at rest: Use Nightfall for Data at Rest to scan historical data in silos like Zendesk—including stored HAR files.
  2. Detect sensitive HAR files upon upload: Deploy Nightfall for SaaS to scan customer support tickets for HAR files in near-real time.
  3. Scan specific files: Upload and scan compressed HAR files in the Nightfall Detection Playground and sandbox environment.
  4. Programmatically scan and redact sensitive data: Read, scan, and scrub secrets and keys from HAR files using Nightfall’s AI-powered detection engine.

Let’s take a closer look at each of these methods.

Discover data at rest

Nightfall for Data at Rest leverages AI-powered scanning capabilities to provide complete visibility into SaaS data silos for apps like Zendesk. In short, Nightfall can pinpoint sensitive PII, credentials, and more within historical data, including HAR files. This functionality helps businesses proactively identify and address the potential security risks associated with HAR files. It also helps keep their customer service platforms compliant with leading compliance frameworks like SOC 2.

Detect sensitive HAR files upon upload

As with any possible data leak, it’s crucial to flag sensitive data in HAR files immediately upon upload. By utilizing Nightfall for SaaS, organizations can set up custom rules and policies to ensure that HAR files are automatically scanned for authorization headers, login credentials, cookies, and PII during the ticket creation process. This ensures that any HAR files containing sensitive information are promptly detected and flagged for further review.

For instance, Nightfall’s detection engine can flag active JWT Bearer authentication tokens and API keys in both JSON and URL parameter format. Once these sensitive data types are detected, Nightfall provides several remediation options, such as deleting HAR files from customer support platforms. By promptly detecting and addressing sensitive information within HAR files, organizations can uphold the trust of their customers and protect their valuable data assets.
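To illustrate what “JSON and URL parameter format” means in practice, the following Python sketch uses two simple regular expressions to find JWT-shaped strings and secrets passed as URL query parameters. This is a rough heuristic written for illustration, not Nightfall’s detection engine: the patterns, the parameter names, and the function name are all assumptions, and the sketch cannot tell whether a token is still active.

```python
import re

# JWTs are three base64url segments separated by dots; a JSON header encodes to "eyJ...".
JWT_PATTERN = re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")

# A few common parameter names (illustrative; real traffic uses many more).
URL_SECRET_PATTERN = re.compile(
    r"[?&](?:api_key|apikey|access_token|token)=([^&\s\"']+)", re.IGNORECASE
)


def find_candidate_secrets(text: str) -> list[tuple[str, int, int]]:
    """Return (kind, start, end) character ranges for strings that look like secrets."""
    findings = []
    for match in JWT_PATTERN.finditer(text):
        findings.append(("jwt", match.start(), match.end()))
    for match in URL_SECRET_PATTERN.finditer(text):
        # Report only the parameter value, not the parameter name.
        findings.append(("url_parameter", match.start(1), match.end(1)))
    return sorted(findings, key=lambda finding: finding[1])


# Made-up sample values; neither token is real.
sample = (
    '{"Authorization": "Bearer eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxIn0.c2lnbmF0dXJl", '
    '"url": "https://example.com/v1?api_key=not-a-real-key&page=2"}'
)

for kind, start, end in find_candidate_secrets(sample):
    print(kind, sample[start:end])
```

Pattern matching like this tends to produce both false positives and misses, which is one reason a purpose-built detection engine is preferable for production use.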

Scan specific files

Are you working with a HAR file and wondering if it contains sensitive data? Simply upload and scan your file in Nightfall’s Playground. The Playground is a safe and secure environment to test the scanning capabilities of Nightfall's detection engine.

Here’s how to scan a HAR file on the Nightfall Playground:

  1. Go to playground.nightfall.ai in your web browser.
  2. Click “I’m Scanning Files” to select the HAR file you want to scan. Then click the “Upload” button.
  3. Select the detection rule for “Credentials & Secrets” to start. Advanced users can also customize their own detection rules here.
  4. Receive a detailed report of any sensitive data detected within the HAR file.
  5. Review the report to understand any potential risks.
  6. Take action to secure any potentially sensitive information.

The Playground offers a quick way to scan and evaluate the contents of a HAR file for sensitive data. However, please note that the Playground is a testing environment and should not be used for scanning sensitive or confidential data. For production use and comprehensive data protection, we recommend integrating Nightfall's Data Loss Prevention platform into your organization's infrastructure.

Programmatically scan and redact sensitive data

Say you’re looking to scan all the HAR files that live in a certain SaaS data silo. One way you could do this is by integrating a DLP solution like Nightfall with the SaaS app via APIs.

Let’s walk through a specific example involving Zendesk. In this case, our goal is to write a service that iterates through every Zendesk ticket, pulls all attachments from tickets and comments, and scans for sensitive secrets and credentials.

Here’s how to proceed:

  1. Authenticate to the Zendesk API.
  2. Use the tickets endpoint to retrieve tickets with attachments. Enumerate all of them by paging through the results.
  3. For each ticket_id, call the comments endpoint to retrieve all comments with attachments on a given ticket. Enumerate all of them by paging through the results.
  4. Set up a webhook server for scanning files with Nightfall. If you’re starting from scratch, follow the Nightfall quickstart guide or take a moment to learn more about how file scanning works.
  5. Create a detection rule in Nightfall to scan for secrets and credentials. Follow these steps to generate your policy template.
  6. Send each file attachment to the Nightfall webhook server to be scanned by the file scanning endpoint using the detection rules that you’ve configured.
  7. Implement logic in the webhook server to write findings out to a file.
  8. To extend functionality further, write logic that uses the data classification findings to redact sensitive content from the HAR files. Nightfall’s findings provide character and byte ranges that you can use to redact the files, which are essentially JSON.
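Steps 1 through 3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference implementation: it assumes Zendesk API token authentication over HTTP Basic auth, cursor-based pagination that returns meta.has_more and links.next, and attachment records that carry a file_name field. The helper names (make_zendesk_getter, paginate, iter_har_attachments) are hypothetical. The HTTP call is passed in as a function so the paging logic can be exercised without network access.

```python
import base64
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator

# Encodes to "page%5Bsize%5D=100", requesting cursor pagination with 100 items per page.
PAGE_QUERY = urllib.parse.urlencode({"page[size]": 100})


def make_zendesk_getter(email: str, api_token: str) -> Callable[[str], dict]:
    """Build a function that GETs a URL with API token auth (assumed Basic auth format)."""
    credentials = base64.b64encode(f"{email}/token:{api_token}".encode()).decode()

    def get_json(url: str) -> dict:
        request = urllib.request.Request(
            url, headers={"Authorization": f"Basic {credentials}"}
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)

    return get_json


def paginate(get_json: Callable[[str], dict], url: str, key: str) -> Iterator[dict]:
    """Yield every item under `key`, following next-page links until none remain."""
    while url:
        page = get_json(url)
        yield from page.get(key, [])
        has_more = page.get("meta", {}).get("has_more")
        url = page.get("links", {}).get("next") if has_more else None


def iter_har_attachments(get_json: Callable[[str], dict], base_url: str) -> Iterator[dict]:
    """Yield attachment records for .har files found on any comment of any ticket."""
    tickets_url = f"{base_url}/api/v2/tickets.json?{PAGE_QUERY}"
    for ticket in paginate(get_json, tickets_url, "tickets"):
        comments_url = (
            f"{base_url}/api/v2/tickets/{ticket['id']}/comments.json?{PAGE_QUERY}"
        )
        for comment in paginate(get_json, comments_url, "comments"):
            for attachment in comment.get("attachments", []):
                if attachment.get("file_name", "").lower().endswith(".har"):
                    yield {
                        "ticket_id": ticket["id"],
                        "comment_id": comment["id"],
                        **attachment,
                    }
```

A caller would build the getter once, for example get_json = make_zendesk_getter(email, api_token), and then loop over iter_har_attachments(get_json, "https://your-subdomain.zendesk.com"), downloading each attachment and handing it to the scanning step. Filtering on the .har extension is a simplification; compressed or renamed files would need a content-based check.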

Once you’ve worked through this pattern, you’ll have learned how to scan file attachments in both tickets and comments in Zendesk. This pattern will ensure that any sensitive information in your customer support platform will be identified and protected, provided you take the necessary remediation actions.
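The redaction step can be sketched as a small Python function that masks character ranges in the raw HAR text. The shape of the findings used here (dictionaries with start and end keys, end exclusive) is a simplifying assumption for illustration and not Nightfall’s exact response format; map the real findings to ranges accordingly. Python strings are indexed by code point, so this sketch applies to character ranges. Byte ranges would need the same logic applied to the encoded bytes.

```python
import json


def redact_ranges(text: str, ranges: list[tuple[int, int]], mask: str = "*") -> str:
    """Overwrite each [start, end) character range with a single-character mask.

    The text keeps its length, so ranges stay valid in any order and may overlap.
    """
    chars = list(text)
    for start, end in ranges:
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = mask
    return "".join(chars)


# Made-up HAR fragment and a hypothetical finding that points at the token value.
har_text = json.dumps(
    {
        "log": {
            "entries": [
                {
                    "request": {
                        "headers": [
                            {"name": "Authorization", "value": "Bearer secret-token"}
                        ]
                    }
                }
            ]
        }
    }
)
token_start = har_text.index("secret-token")
findings = [{"start": token_start, "end": token_start + len("secret-token")}]

redacted = redact_ranges(har_text, [(f["start"], f["end"]) for f in findings])

# The range sits inside a JSON string value, so the result still parses as JSON.
print(json.loads(redacted)["log"]["entries"][0]["request"]["headers"][0]["value"])
```

Masking in place rather than deleting keeps the file the same length and preserves its JSON structure, so the redacted HAR can still be opened in standard tooling. That holds only while each range falls inside a string value and the mask character needs no JSON escaping.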

Key takeaways

The recent Okta data breach emphasizes the importance of safeguarding sensitive data in HAR files. Here are a few key points at a glance:

  • HAR files capture detailed interactions between a web browser and a website, and as such, they often contain sensitive information that could lead to a data breach.
  • Nightfall provides a seamless, AI-powered solution for discovering data at rest, inspecting SaaS data silos, scanning specific files, and programmatically redacting sensitive data.

In short? Nightfall helps to mitigate the risks associated with HAR files in order to protect businesses’ data while ensuring compliance every step of the way.

Try Nightfall's data-at-rest scanner for Zendesk to assess your risk today.
