GitHub key leaks and how to prevent them

Hundreds of thousands of tokens and cryptographic keys have been discovered on GitHub. We explain why this is bad and how to avoid a leak.


Recently, researchers at North Carolina State University discovered more than 100,000 projects on GitHub with tokens, cryptographic keys, and other confidential data stored in plain text. In total, more than half a million such items were publicly accessible, of which more than 200,000 were unique. What’s more, the tokens were issued by major companies such as Google, Amazon MWS, Twitter, Facebook, MailChimp, MailGun, Stripe, Twilio, Square, Braintree, and Picatic.

GitHub is a popular platform for collaborative software development. It is used for storing code in repositories with open or restricted access, working with colleagues, involving them in program testing, and reusing ready-made open-source components. It greatly simplifies and speeds up the creation of apps and services, so many programmers are happy to use it. Companies that build their software on open-source modules use it actively, as do companies that want to be transparent.

However, special care should be taken when uploading code to GitHub, and not all developers follow that advice.

What data got into the public domain

GitHub was found to be hosting blocks of openly available code containing tokens and keys sufficient to pass authorization and perform certain actions on behalf of users or apps. Among this unwittingly declassified information were:

  • Login credentials for administrator accounts on major websites,
  • API keys or tokens that allow the use of an API (a set of tools for interaction between various system components, for example, a program and a website),
  • Cryptographic keys, many of which are used for authentication in place of a password rather than in combination with one, so a single key is enough to gain access to many resources, including private networks.

Why leaked tokens and cryptographic keys are a risk

Unauthorized access to your accounts, even limited, poses a serious threat to your business. Following are some examples.

One way to misuse tokens published on GitHub is to send mass mailings and publish posts that appear to come from the company that leaked those tokens. For example, an intruder could gain access to a corporate website or a Facebook or Twitter account, and place a malicious post or phishing link there. Since official websites and accounts are generally considered to be reliable sources of information, the risk is high that many readers will assume the post or link is safe.

In addition, cybercriminals can phish everyone in your subscriber list (for example, if you use MailChimp). As in the previous scenario, the expectation here is that users will trust mail they signed up for from a bona fide company. Such attacks can seriously harm a company’s reputation and cause major damage in terms of lost clients and time spent on restoring normal operation.

Lastly, cybercriminals can simply use the billable features of a service, such as Amazon Web Services (AWS), at your expense. For example, blogger Luke Chadwick once received a message from Amazon that his key was publicly available on GitHub. A search led him to an old project that for some reason he had forgotten to close. When Chadwick logged into his Amazon account, he discovered $3,493 in pending charges. It turned out that an unauthorized user had gotten hold of the publicly available key and mined cryptocurrency using his account. In the end, Amazon reimbursed Chadwick’s losses. But remember that the tale doesn’t always end happily.

How private data ended up on GitHub

Analysis of the research results shows that it’s not only young and inexperienced programmers who leave confidential information in the public domain. For example, data providing access to the website of a large government institution was posted on GitHub by a developer with a 10-year track record.

Tokens and all kinds of keys get published in GitHub repos for various reasons. Authorization tools may be required for integrating an app with a particular service. When publishing code for testing, some programmers use valid keys instead of debug ones, and then just forget to remove this information from public access.

For example, Securosis analyst and CEO Rich Mogull posted on GitHub an app he was developing for a conference report. The program made calls to Amazon Web Services, and he stored all authorization data locally. However, to debug individual blocks of code, he created a test file containing several access keys. After debugging, Mogull simply forgot to remove them from this file. They were subsequently found by intruders, who ran up $500 in Amazon charges before the misuse was noticed.
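
One common way to reduce the chance of this kind of slip is to keep credentials out of source and test files altogether and read them from the environment at run time. The following is a minimal Python sketch of that idea, not a description of how the app in the example above was built; the helper name `load_secret` and the variable name `EXAMPLE_SERVICE_API_KEY` are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not configured."""


def load_secret(name, environ=None):
    """Return the secret stored under `name`, or raise if it is missing.

    `environ` defaults to the process environment; passing a plain dict
    makes the function easy to exercise in tests.
    """
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        # Fail loudly rather than falling back to a key embedded in the code.
        raise MissingSecretError(f"environment variable {name} is not set")
    return value
```

A caller would then write something like `api_key = load_secret("EXAMPLE_SERVICE_API_KEY")`, so the repository only ever contains the name of the variable, never its value.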

In addition, developers might simply be unaware of the risk of leaving valid tokens in GitHub repos and the need to pinpoint and delete (or replace) them before posting code there.
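
Pinpointing tokens before code is published can be partly automated. The sketch below, in Python, scans text for two well-known credential shapes: AWS access key IDs, which begin with `AKIA`, and PEM private key headers. It is an illustration under those assumptions rather than a complete scanner; the names `SECRET_PATTERNS` and `find_secrets` are hypothetical, and real credentials come in many more formats than these two patterns cover.

```python
import re

# Illustrative patterns only; a clean result does not prove a file is safe.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key block": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
}


def find_secrets(text):
    """Return (line_number, label) pairs for lines that match a pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, label))
    return findings
```

Running `find_secrets` over each file that is about to be committed, and refusing to publish when the list is not empty, gives developers a simple last check before code leaves their machine.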

How to protect your resources

  • Make your developers aware that uploading valid tokens and keys to open repositories is harmful and dangerous; programmers should understand that before placing code, they must confirm that it contains no secret data.
  • Have the product manager audit your company’s projects on GitHub to see if they contain confidential information, and if so, delete it; note that it must be removed thoroughly so nothing remains in the change history.
  • If any of the data your company stores on GitHub contains passwords, change them; there is no way of knowing if anyone has already viewed and saved the code.
  • Raise employee awareness of information security on an ongoing basis, so that the responsible use of GitHub and other tools and resources becomes second nature. Our Kaspersky Automated Security Awareness Platform will help you do this effectively and practically, without interrupting operations.
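
Deleting a key from the current version of a file does not remove it from the change history. As a rough way to check whether a known leaked value still appears in a repository's history, the Python sketch below calls git's "pickaxe" search (`git log -S`). It assumes git is installed and that `repo_path` points to a local clone; the function names `history_search_command` and `commits_touching` are hypothetical.

```python
import subprocess


def history_search_command(needle):
    """Build a git command listing commits that added or removed `needle`."""
    # -S<string> is git's "pickaxe" search; --all covers all refs
    # (branches, tags, and so on), and --format=%H prints only commit hashes.
    return ["git", "log", "--all", "--format=%H", "-S" + needle]


def commits_touching(repo_path, needle):
    """Return hashes of commits whose changes added or removed `needle`."""
    result = subprocess.run(
        history_search_command(needle),
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,  # raise if git reports an error
    )
    return result.stdout.split()
```

If `commits_touching` returns any hashes, the value is still reachable in the history. An empty result is a useful check but not proof of safety: copies in forks or earlier clones are outside its reach, which is why the exposed credentials themselves still have to be changed.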