Financial data must be made point-in-time

Financial data is sometimes recorded in a bitemporal, or point-in-time, fashion, but the practice remains far less widespread than it should be.

Point-in-time (PIT) data is sometimes known as snapshot data. It differs from more basic data offerings in that PIT datasets record every datapoint, event, and observation, along with every revision to them. There are two time axes: the time of the event and the time when the data was recorded or revised. Thus, bitemporal data records both what was known and when it was known.
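The two time axes can be sketched as a minimal record type. The field names, dates, and values below are illustrative only, not a standard schema:

```python
from dataclasses import dataclass
from datetime import date

# A minimal bitemporal record: one row per observation *and* per revision.
# Nothing is ever overwritten; a revision is simply a new row with a
# later knowledge time (known_date) for the same event.
@dataclass(frozen=True)
class Observation:
    event_date: date   # when the underlying event occurred (e.g. "February activity")
    known_date: date   # when this value was published or revised
    value: float

# February economic activity: initial release on March 1, revised on April 1.
history = [
    Observation(date(2024, 2, 29), date(2024, 3, 1), 1.8),  # initial release
    Observation(date(2024, 2, 29), date(2024, 4, 1), 2.1),  # revision
]
```

Keeping both rows, rather than replacing 1.8 with 2.1 in place, is what preserves "what was known and when it was known."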

Non-PIT financial data can unexpectedly lose predictive validity

Imagine an analyst finds that, using government data covering some economic activity for February released on March 1st, she can construct a basket of stocks that will be profitable to own in March. This has the makings of a valuable trading strategy.

Now imagine that while the initial data release was on March 1st, the data was revised on April 1st. The analysis will be wrong if the analyst assumes she can trade in March using the revised data that only became available on April 1st. She may eagerly put the incorrect model into production, with severe costs for the firm and for her career.

The problem with non-PIT data is that one may not even know WHEN a particular piece of data became known or was last revised. Moreover, even if the time of the last revision is known, the original datapoint and its timestamp often won't be. This gap makes the data far less useful in any time-sensitive analysis, and most financial analysis is time-sensitive. For example, our analyst may have traded differently using the data in the March 1st release than in the April 1st revision.
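The remedy is to query by knowledge time: ask what was known as of a given date, not what the latest value happens to be. A minimal sketch of such an "as-of" lookup, using hypothetical `(event_date, known_date, value)` rows and made-up figures:

```python
from datetime import date

# Hypothetical bitemporal rows: (event_date, known_date, value).
history = [
    (date(2024, 2, 29), date(2024, 3, 1), 1.8),  # initial March 1 release
    (date(2024, 2, 29), date(2024, 4, 1), 2.1),  # April 1 revision
]

def as_of(rows, query_date):
    """Latest value per event, using only rows known on or before query_date."""
    latest = {}
    for event_date, known_date, value in sorted(rows, key=lambda r: r[1]):
        if known_date <= query_date:
            latest[event_date] = value
    return latest

# In mid-March, only the initial release was knowable.
print(as_of(history, date(2024, 3, 15)))  # {datetime.date(2024, 2, 29): 1.8}
```

Without the `known_date` column this query is impossible, which is exactly why non-PIT data cannot answer "what could I have acted on that day?"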

Financial data that isn't PIT lacks information on when it was actionable, a critical part of any practical application. A given economic indicator may be priceless on a Monday and worthless on a Tuesday once it has been incorporated into security prices. It thus matters deeply for interpreting a financial model, dataset, or analysis whether it was created 6 months ago, 5 months and 29 days ago, or yesterday. In most cases, the credibility, and thus value, that one can assign to financial data with an unknown date of creation is much lower than if all data and associated revisions are tracked carefully.

Non-PIT data introduces bias, leading to false conclusions

Using non-PIT data for financial analysis introduces a variety of potential biases. The most severe are hindsight bias, lookahead bias, and survivorship bias. Each can, and often does, fatally derail an otherwise valid-looking financial analysis.

Backtests are especially vulnerable. An accurate backtest relies on faithfully recreating past conditions, analysis, and the resulting actions. If the inputs to the process are biased, for example by assuming that a firm can trade on Monday using data or revisions published on Tuesday, the entire effort becomes unreliable at best and an expensive fiasco at worst.
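A backtest guards against this by filtering every input on its knowledge time before each simulated trade date. A minimal illustration, with hypothetical signal rows and made-up values:

```python
from datetime import date

# Hypothetical signal history: (event_date, known_date, signal_value).
signals = [
    (date(2024, 2, 29), date(2024, 3, 1), 0.5),   # initial release
    (date(2024, 2, 29), date(2024, 4, 1), -0.3),  # later revision
]

def visible_signals(rows, trade_date):
    """Return only rows whose knowledge time is on or before the trade date."""
    return [r for r in rows if r[1] <= trade_date]

# A simulated March 15 trade may see the 0.5 release, never the -0.3 revision.
march_view = visible_signals(signals, date(2024, 3, 15))
print([value for _, _, value in march_view])  # [0.5]
```

The same filter applied at every step of the simulation is what distinguishes a PIT backtest from one silently contaminated by lookahead bias.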

It may be tempting to give up and opt out of backtesting and historical simulation altogether. Why not test every model live and be done with it? Unfortunately, this is not a practical option in today's competitive financial markets. Because the value of predictive data decays rapidly, any delay in recognizing predictive value risks making the practitioner structurally uncompetitive. Consequently, PIT data and analysis become table stakes.

The ever-present compliance challenge

The importance of PIT data and reproducible analysis does not end with competitive imperatives. The heavy regulation of the financial industry means that various historical calculations often need to be reproducible for auditors and regulators. This becomes impossible if data is overwritten and revision history is not reliably saved.

Creating PIT data and the associated snapshots provides a clean base of data from which to perform calculations. These can include performance calculations, fee calculations, fair value measurements, reserve calculations, hedge effectiveness testing, foreign exchange adjustments, and many other financial computations that rely on real-time data and must be reproducible. Keeping a clean record of bitemporal PIT data can be immensely valuable in avoiding hot water with clients, auditors, and regulators.

validityBase makes PIT data easy to create, validate, and manage

We at vBase have dealt with the above challenges in our businesses for many years. Recent technical advances have opened new solutions to the problem of PIT data and analysis. Please contact us at hello@vbase.com for more information on how you can quickly start making any data, and any calculations using this data, auditably point-in-time and provably correct.
