Big Data Valuation: The Impact of Specificity of Complementary Goods on Data Reuse
by Kevin Garwood
Disclaimer: I am not an IP lawyer, an economist or an accountant. This series of articles on the topic of data valuation captures opinions I developed from taking an online course on IP Valuation. They are not meant to represent the views of my employers, past or present. This article concerns technical, legal or professional subject matter; it is meant to foster more dialogue about the subject and does not constitute legal advice.
I began writing a series of articles on data valuation by asking how the future economic value of Big Data assets could be assessed. I sought inspiration by enrolling in a course on IP Valuation through the IP Business Academy, and reflected on how valuation could apply to data rather than to patents.
In my previous article, I discussed the importance of achieving exclusive use rights for asset owners, the valuation concept of “complementary goods” and the importance of data management in supporting the value of Big Data assets.
In this article, I focus more on the character of complementary goods, which are viewed as required supporting factors in making an economic use scenario possible. The complementary goods needed to enable economic scenarios for intellectual assets tend to be quite specific. It is this specificity that may help limit the expectations of reusable data in ways that make it practical to do Big Data valuations.
The IP Value Anatomy: Valuation Object, Exploitation Scenario, Complementary Goods
The purpose, goal and audience of an IP valuation will help set the context for what portfolios of intellectual assets need to be assessed for their future economic benefit. The valuation object is the thing being valued and could be a patent, a copyright, a trademark, a database or a data set. The exploitation scenario is a technical term which describes how the valuation object could be used to generate some kind of future economic value. The exploitation scenario helps determine what complementary goods will be needed to make it realise value. A fourth attribute of IP Value may be the means of providing exclusive use protection for the asset’s owner. Several examples of IP value are shown in the figure below:
Each valuation object may have multiple exploitation methods, and each method may come with its own set of complementary goods which must be in place for the method to generate value. For example, if you are planning to produce copies of a patented invention, the activity may not actually generate any value unless you have the correct laboratory instruments or supply chain in place. If you choose to license the patent, it would generate value a different way and you would need a different set of specific complementary goods.
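For readers who think in code, the anatomy described above can be sketched as a small data model. This is a minimal illustration in Python and not something taken from the course material. The class names, the method names and the two complementary goods listed for the licensing scenario ("licensing agreement" and "licensee") are hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ExploitationScenario:
    """One way a valuation object could generate future economic value."""
    name: str
    complementary_goods: list[str] = field(default_factory=list)

    def is_feasible(self, available_goods: set[str]) -> bool:
        # A scenario can only realise value if every one of its
        # complementary goods is in place.
        return set(self.complementary_goods) <= available_goods


@dataclass
class ValuationObject:
    """The thing being valued: a patent, a trademark, a data set and so on."""
    name: str
    scenarios: list[ExploitationScenario] = field(default_factory=list)

    def feasible_scenarios(self, available_goods: set[str]) -> list[str]:
        # Keep only the scenarios whose complementary goods are all available.
        return [s.name for s in self.scenarios if s.is_feasible(available_goods)]


patent = ValuationObject(
    name="patented invention",
    scenarios=[
        ExploitationScenario(
            "produce copies", ["laboratory instruments", "supply chain"]
        ),
        # Hypothetical goods: the article does not say what licensing requires.
        ExploitationScenario(
            "license the patent", ["licensing agreement", "licensee"]
        ),
    ],
)

available = {"laboratory instruments", "supply chain"}
print(patent.feasible_scenarios(available))  # ['produce copies']
```

The sketch only captures the dependency between a scenario and its complementary goods. It says nothing about how much value a feasible scenario would generate, which is the job of the valuation itself.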
If we consider patents, once they have been granted, they will not change except under a few specific circumstances. The exploitation scenarios would be limited by the extent of the patent claims. Because the granted patent would change little, its use would be somewhat predictable, even if that use occurred within an unpredictable market. The more predictably an intellectual asset behaves in the future, the more compelling a forecast of its value would be.
However, if we consider data sets, they may split, merge, mutate or become obsolete rapidly during their expected lifetimes of use. If we consider a Big Data asset with thousands of fields and millions of records, we may think of ways we could chop it up to support specific scenarios. Each asset variant could evolve exploitation scenarios that are relevant during one period and not during another.
If we were doing a valuation exercise for a Big Data asset, we may initially welcome an asset that could have all sorts of possible uses. However, if we were actually compelled to articulate what they were, we might then also welcome something to help naturally limit our expectations of data reuse.
Without those constraining limits, we may become unable to cope with the complexity of possible uses and instead retreat into becoming emphatic about its potential without being able to articulate what it is. That approach can generate enthusiasm but may not support a valuation forecast that buyers or users of the asset would be confident working with.
Our ability to discriminate rather than to expound on the potential of Big Data assets may ultimately provide credibility for the exploitation scenarios that IP Valuators may have traditionally depended on. If this is true, then we may seek to emphasise how specialised the complementary goods are for generating and managing the data.
The Specificity of Complementary Goods
One of the most important comments I reflected on from the course materials was:
“The usability and the concrete benefit of immaterial economic goods such as IP are characterised by the high specificity of the dependence on complementary goods”.
As the instructor explained, the more abstract the IP, the greater the specificity the complementary goods tend to have. The variety of Big Data sets that have been created is too large to make many generalisations. However, let’s consider the specificity of complementary goods that can appear alongside some examples of scientific data sets.
A year of collisions in a single experiment at the Large Hadron Collider (LHC) can generate almost 1 million petabytes of data. The LHC facility is a very specialised complementary good which took a decade and $4.75 billion to build. A similar scenario involving expensive complementary goods would apply to many Big Data sets in astronomy.
A transcriptomics study of a rare Amazonian tree frog being exposed to cold might not generate large amounts of data. However, the data can only be produced if the rare frog is available, and its use may be limited to that one species of frog under those specific conditions.
Patients who use a specialised medical device may have provided their data to manufacturers to help those organisations create better offerings. However, the insights about a specific product may only exist because patients with specific health problems were available and because they consented to their data being used for that purpose. Even if the data could apply to many use cases, the specific nature of the product and regulatory restrictions could mean that insights derived from their data would be used in only a very limited range of activities.
In these three examples, the high specificity of complementary goods both limits the possibilities of data reuse and gives that data credibility for the uses that remain. The complexity of protocols and scientific equipment used to create data sets, and their compliance with regulations may provide assurances of value for those who would use them in line with what they were designed to support.
Even when scientific data sets would appear to support a broad range of activities, they may lack other complementary goods that would support confidence in their original use or proposed reuse contexts. Some data sets may lack information to support reproducibility. Others may show non-standard data processing or lack linking fields, either of which would limit how they could be combined with other data in new contexts.
I will revisit the topic of what characteristics would make data sets better support a broad range of exploitation scenarios. However, my main observation about the specificity of complementary goods is that it may limit our impression of having too many scenario possibilities to consider. This has the effect of making a concrete exploitation scenario easier to identify, but it may also limit the sense of mystique that comes from a data set with limitless possibilities.
We must also be careful about not being too prescriptive about how a Big Data asset may be used. There is value in the certainty of claiming that variables with known relationships in a data set can support an established use case. There is also value in the hope that other variables with unknown relationships may also provide future value. Bridging the certainty of the known with the hope in the unknown are the expectations that a data set will begin providing valuable insights once it passes a threshold value in multiple dimensions of Big Data scale. We must be able to demonstrate actual value in Big Data assets and hint at their other uses as well.
The Next Article
The next article is titled ‘Data Valuation: Cost, Market and Income Approaches’. I will move on from discussing the anatomy of IP Value to discussing what methods are used to support comparisons in IP Valuations, and what implications they may have when they are applied to data.
 Wurzer, Alexander. Certified University Course IP Valuation 1. IP Business Academy. https://ipbusinessacademy.org/certified-university-course-ip-valuation-i
 Patent Valuation — General principles for monetary patent valuation. English translation, DIN 77100:2011–5. https://www.beuth.de/en/standard/din-77100/140168931
 Large Hadron Collider set to resume work this month. The Economic Times (Apr 2015) https://economictimes.indiatimes.com/news/science/large-hadron-collider-set-to-resume-work-this-month/articleshow/46807266.cms?from=mdr