Best Practices for Preventing Secrets & Credentials Leaks in GitHub

One of the core aspects of any information security program is maintaining the confidentiality and integrity of an organization’s data. Modern cloud environments can make this difficult, because security teams have to maintain visibility and manage controls across a wide variety of SaaS and cloud infrastructure systems. Among these systems, code repositories like GitHub can be a lesser-known source of secrets leakage. This is especially true for organizations that are new to the cloud, or that don’t have mature information security programs. There are, however, practices that organizations of any size can adopt to address this risk. Below we’ll go over the scope of the problem and discuss best practices that help limit secrets exposure.

Understanding the scope of credentials and secrets leakage

Code repositories (repos), like many other highly collaborative SaaS environments, create more opportunities for sensitive data to be exposed without warning or notice. Take, for instance, a code repo where both internal and external collaborators submit code. They can push commits at any time of day, and if no review process is in place, those commits may contain credentials and other sensitive tokens. Even if the repo is private, you may still want to strictly control which types of tokens appear in your codebase as part of your best practices.

Research such as the 2019 North Carolina State University study “How Bad Can It Git? Characterizing Secret Leakage in Public GitHub Repositories” has quantified just how common credentials and secrets exposure is within GitHub. The researchers found that thousands of keys leak from public repositories every day, with hardcoded cryptographic keys and API keys being critical sources of leakage. To address this long-running problem, GitHub has offered limited secret scanning of code pushed to public repositories, covering tokens for popular services like AWS, Azure, and Alibaba. GitHub notifies the service provider of any credentials leak and lets the provider decide how to address the issue.

Despite this, secrets leaks still occur on the platform. Earlier this year, a story broke about an AWS DevOps Cloud engineer who inadvertently made nearly a gigabyte of sensitive data public after making a commit to a personal repository. Another story from this year involved Canadian telecom company Rogers Communications, which had passwords and source code exposed on GitHub. Leakage isn’t limited to GitHub, though; for example, German automaker Daimler leaked Mercedes-Benz’s source code for smart car components through an unsecured GitLab server last month.

How can you prevent secrets leaks in GitHub?

Despite the scope of the problem, there are a variety of practices that organizations can adopt to begin reducing the risk of credentials and secrets being exposed within their codebase.

Standardize coding conventions and practices

In collaborative cloud environments with high volumes of activity, it’s very easy for organizations to neglect rules that regulate user behavior. This digital “housekeeping” is essential: when users have different ideas about what behavior is allowed within an environment, things can turn into the Wild West, where no one is responsible for cloud and data security. One essential rule is to standardize coding conventions by eliminating practices like hard-coding credentials, and to develop a consistent code review process that checks whether the designated practices have been followed.
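One common way to keep credentials out of source code is to read them from the environment at runtime. The following is a minimal sketch of that convention, not a prescribed implementation; the helper name `load_secret`, the `MissingSecretError` class, and the variable name `PAYMENTS_API_KEY` are hypothetical and chosen only for illustration.

```python
import os
from typing import Mapping, Optional


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def load_secret(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the secret stored under `name`, or raise if it is unset or empty.

    `env` defaults to the process environment; passing a plain dict makes the
    function easy to exercise in tests without touching real credentials.
    """
    source = os.environ if env is None else env
    value = source.get(name, "")
    if not value:
        # Fail loudly instead of falling back to a default baked into the code.
        raise MissingSecretError(f"environment variable {name} is not set")
    return value
```

Application code would then call something like `load_secret("PAYMENTS_API_KEY")` rather than embedding the key as a string literal, which also gives reviewers a simple rule to check for: no credential values in the diff.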

Make sure your production environment remains private

When it comes to secrets leakage, one piece of low-hanging fruit is permission settings. Your organization needs to maintain visibility into its production environments at all times and ensure that any associated code repos remain private. Within your GitHub org, make sure that org owners periodically review repo privacy settings so that repos which shouldn’t be public remain private.
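A periodic privacy review can be partly scripted. The sketch below assumes the GitHub REST API endpoint for listing an organization’s repositories and the `private` and `full_name` fields in its response; the function names, the token handling, and the idea of an approved-public list are illustrative assumptions rather than an official workflow.

```python
import json
import urllib.request
from typing import Dict, Iterable, List

GITHUB_API = "https://api.github.com"


def fetch_org_repos(org: str, token: str) -> List[Dict]:
    """Fetch every repository in `org`, requesting pages until one comes back empty."""
    repos: List[Dict] = []
    page = 1
    while True:
        request = urllib.request.Request(
            f"{GITHUB_API}/orgs/{org}/repos?per_page=100&page={page}",
            headers={
                "Accept": "application/vnd.github+json",
                "Authorization": f"Bearer {token}",
            },
        )
        with urllib.request.urlopen(request) as response:
            batch = json.load(response)
        if not batch:
            return repos
        repos.extend(batch)
        page += 1


def unexpected_public_repos(
    repos: Iterable[Dict], approved_public: Iterable[str]
) -> List[str]:
    """Return the sorted names of public repos that are not on the approved list.

    A repo with no `private` field is treated as public, so it gets flagged
    for a human to look at rather than silently skipped.
    """
    approved = set(approved_public)
    return sorted(
        repo["full_name"]
        for repo in repos
        if not repo.get("private", False) and repo["full_name"] not in approved
    )
```

An org owner could run `unexpected_public_repos(fetch_org_repos(org, token), approved)` on a schedule and review anything it returns. The token itself should come from the environment or a secrets manager, not from the script.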

Implement periodic reviews of your codebase

Reviewing code before it’s committed is pretty standard practice, but you may also want to standardize reviews of code after it’s been committed. Periodic code reviews give you an opportunity to ensure your codebase remains free of leakable secrets.
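Part of a periodic review can be automated with simple pattern matching. The sketch below is a deliberately small illustration and not how any particular scanning product works: it assumes two widely recognized formats (AWS access key IDs beginning with `AKIA`, and PEM private key headers), and the names `PATTERNS`, `Finding`, `scan_text`, and the `allow_list` parameter are hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple

# Illustrative patterns only; a real review would cover many more token types.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


class Finding(NamedTuple):
    line_number: int  # 1-based line where the match was found
    kind: str         # key into PATTERNS
    match: str        # the matched text


def scan_text(text: str, allow_list: Iterable[str] = ()) -> List[Finding]:
    """Return every pattern match in `text`, skipping exact values on the allow list."""
    allowed = set(allow_list)
    findings: List[Finding] = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                if match.group(0) not in allowed:
                    findings.append(Finding(line_number, kind, match.group(0)))
    return findings
```

Running `scan_text` over each file in a checkout gives reviewers a list of lines to inspect by hand. Regular expressions like these miss many secret formats and can produce false positives, so this kind of script supplements a review rather than replacing one.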

Make sure to use the right tools to manage secrets leakage within your code repos

With secrets leakage remaining a huge problem, many teams have turned to tools that help them mitigate leakage risk. For example, Nightfall Radar is a repository scanner that can connect to both public and private GitHub repositories via API to scan them for sensitive data. Using machine learning-based detectors, Radar can identify a variety of tokens like keys, certificates, and other secrets. Radar can also ignore designated tokens, such as those used in testing environments, once they’ve been added to an allow list. Finally, Radar scans can be scheduled to run automatically and periodically via workflows. Using Radar is pretty intuitive, and you can start for free at radar.nightfall.ai. If you want to learn more about the Radar platform, have a look at our Guide to Secrets Detection on GitHub or our Radar documentation page, where you can also sign up.

Nightfall Radar, while powerful, is meant to be used in tandem with the Nightfall DLP GitHub Action, which is designed specifically for scanning pull requests. Together, they give you historical scanning as well as the ability to scan new code for secrets. You can learn more about the Nightfall DLP GitHub Action here. Finally, if you want to see both platforms in action, schedule a demo with us below.
