3 reasons why GitHub timestamps shouldn’t be trusted

GitHub has become an indispensable tool for collaborating on software development and version control. With robust features and a huge community, it’s not surprising that developers have found clever ways to repurpose GitHub for unexpected use cases.

One use case we regularly hear about is people who use GitHub to create timestamped repos, files, and data. These timestamps are then used externally to show when a piece of code was developed, data was collected, or an experiment was conducted. However, when it comes to recording verifiable timestamps, GitHub might not be the best choice. Here’s why.

GitHub relies on local machine time for timestamping commits

The timestamps GitHub shows for commits are recorded by Git on the author’s machine, using that machine’s clock. If the developer’s clock is wrong, the timestamp will be wrong too. GitHub stores and displays these commit times but doesn’t verify their accuracy, so anyone can set their own timestamps. To trust the timestamps on GitHub, you must fully trust the repository’s author.

You don’t even need to change the system clock to create commits with arbitrary timestamps. Rewriting past commits and altering their timestamps is a routine operation supported by core Git tools: for example, the GIT_AUTHOR_DATE and GIT_COMMITTER_DATE environment variables, or the --date option of git commit, set the recorded dates directly. For a few specific examples of how this looks in practice, check out this GitHub repository (pictured below), which appears to have been created 55 years ago and includes some remarkably accurate predictions!

GitHub repo with "predictions" from 55 years ago

Here’s another fun example of a GitHub repo going back to the future.

GitHub repo with timestamps from 30 years ago
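To make the point concrete, here is a minimal Python sketch of why backdating is so easy. It assumes a repository that uses Git’s default SHA-1 object format, and the author identity and commit message are made up for illustration. The commit dates are plain text fields inside the commit object, and the commit ID is just a hash of that text, so nothing in the format ties the date to the real time.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # well-known id of Git's empty tree


def commit_object(timestamp: int, message: str) -> bytes:
    """Build the raw body of a root commit whose dates are whatever we pass in."""
    # Hypothetical identity; the number after it is seconds since 1970-01-01 UTC.
    who = f"Example Author <author@example.com> {timestamp} +0000"
    return (
        f"tree {EMPTY_TREE}\n"
        f"author {who}\n"
        f"committer {who}\n"
        f"\n"
        f"{message}\n"
    ).encode()


def commit_id(body: bytes) -> str:
    """Name the object the way Git does: SHA-1 over a 'commit <size>' header, a NUL byte, then the body."""
    header = f"commit {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Timestamp 0 is 1970-01-01 UTC, no clock changes required.
backdated = commit_object(0, "Predictions for the next 55 years")
print(backdated.decode())
print("commit id:", commit_id(backdated))
```

Changing the timestamp produces a different but equally valid commit ID, which is why a hosting service has no way to tell an honest date from an invented one.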

If a GitHub account is accidentally modified or compromised, the timestamps are too

GitHub timestamps are vulnerable to tampering if someone gains access to an account. Files can easily be re-uploaded with new timestamps, and since GitHub doesn’t verify them, they are as unreliable as any other self-reported data that can be changed later. This creates a potential cyberattack risk for anyone relying on the accuracy of a GitHub repo’s timestamps.

But even without a cyberattack, things can easily go wrong. Anyone with permission to modify a repository can also modify the repo’s timestamps. An accidental history rewrite, such as a rebase followed by a force-push, can replace the original timestamps, and the loss may go unnoticed until someone needs them.

Moving files across repositories resets their timestamps

If you copy files into another repository and commit them there, the new commits generally carry the time of the move rather than the time the files were created. Even when commits are re-created with tools such as git cherry-pick or git rebase, they receive a new committer date, although the author date is preserved. Either way, the original creation time becomes harder to track, and it becomes harder to maintain an accurate history and audit trail. For technical workflows that depend on precise timing, this inconsistency can introduce errors and complicate version control.
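The effect of re-creating a commit can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular migration tool: it assumes Git’s default SHA-1 object format, uses a made-up identity and message, and models only the case where the author date is kept and the committer date is replaced (as git cherry-pick does). Copying files and committing them afresh would replace both dates.

```python
import hashlib
from datetime import datetime, timezone

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # id of Git's empty tree


def commit_id(author_time: int, committer_time: int) -> str:
    """Return the SHA-1 id of a minimal root commit carrying the two given dates."""
    ident = "Example Author <author@example.com>"  # hypothetical identity
    body = (
        f"tree {EMPTY_TREE}\n"
        f"author {ident} {author_time} +0000\n"
        f"committer {ident} {committer_time} +0000\n"
        f"\n"
        f"Add experiment results\n"
    ).encode()
    return hashlib.sha1(f"commit {len(body)}".encode() + b"\0" + body).hexdigest()


def iso(ts: int) -> str:
    """Format a Unix timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()


created = 1_600_000_000  # when the work was originally committed (illustrative value)
moved = 1_700_000_000    # when it was re-created in another repository (illustrative value)

original = commit_id(created, created)
recreated = commit_id(created, moved)  # author date kept, committer date replaced

print("original :", original, "committed", iso(created))
print("recreated:", recreated, "committed", iso(moved))
```

The two commits hold the same content but have different IDs and different committer dates, so anyone looking only at the new repository sees the later date.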

What is to be done?

Are you wondering how to get reliable timestamps for your GitHub commits? Whether it’s for version control, IP protection, compliance, or communication, the native GitHub interface may not suffice. GitHub simply wasn’t built to securely track history and data provenance.

One approach corporations follow in high-stakes situations is using a timestamp authority (TSA) like DigiCert. In this approach, you are issued a digital certificate, you sign your file or data with that certificate, and you send the signature to a TSA, which sends you back a timestamped version of your signature, signed by the TSA. You must submit your file for a new timestamp every couple of years, as the TSA’s certificate periodically expires.
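One detail worth illustrating: TSAs that follow the RFC 3161 protocol do not receive your content itself. The request carries a hash of it (the "message imprint"), and the TSA signs that hash together with the current time. The Python sketch below shows only this hashing step, assuming SHA-256 as the digest and a made-up signature value; building the actual request and contacting a TSA are not shown.

```python
import hashlib


def message_imprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as hex; this digest, not the data, is what gets timestamped."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-in for the signature produced with your certificate.
signature = b"example signature bytes"

imprint = message_imprint(signature)
print("digest to send to the TSA:", imprint)
```

Because only the digest leaves your machine, the TSA never sees the underlying file, yet any later change to the file or signature would no longer match the timestamped digest.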

While you can use a TSA for verifiable timestamps, it can be a complex and costly process. A lighter and potentially more secure option is the validityBase web app or vBase API, which uses blockchains to create tamper-proof auditable timestamps of any data. With validityBase, you simply log in with your e-mail and moments later start making tamper-proof verifiable timestamps for your data.

Do you have any ideas on uses and approaches to verifiable timestamps, or do you need help navigating this issue? We’d love to hear from you.

Dan Averbukh