Big Data Valuation: Indicators and Determinants of Value

Jan 11, 2022

Disclaimer: I am not an IP lawyer, an economist or an accountant. This series of articles on data valuation captures opinions I developed from taking an online course on IP Valuation. They are not meant to represent the views of my employers, past or present. This article touches on technical, legal and professional subject matter; it is meant to foster more dialogue about the subject and does not constitute legal advice.

I began writing a series of articles on data valuation by asking how the future economic value of Big Data assets could be assessed. I sought inspiration by enrolling in a course on IP Valuation [1] through the IP Business Academy, and reflected on how valuation could apply to data rather than to patents.

In previous articles, I explored what defines the core of IP value and how valuation objects are compared. The most common comparison approaches are the cost approach, the market approach and the income (cash-flow) approach. The advantages and disadvantages of these approaches are well covered in the German standard DIN 77100: ‘Patent valuation - General principles for monetary patent valuation’ [2].

In this piece, I’ll make the case for supporting weaker comparison methods that are covered in the course but not in the standard. They involve identifying domain-specific indicators and determinants of value.

The Case for Considering the Least Preferable Comparison Approaches for Valuing Big Data Sets

Recalling the course materials [1], we learned about three main kinds of comparisons used for valuation:

· Direct comparisons: these depend on determinant factors of value in a value object

· Indirect comparisons: these depend on indicative factors of value in a value object

· Economic effect comparisons: eg: cost, market and income approaches

Valuations will favour using economic effect comparisons because they involve considering scenarios where the intellectual material is used to produce future economic benefits. These comparisons provide a compelling view of future value through comparisons of financial risks, market transactions or costs.

Direct and indirect comparisons rely on identifying characteristics of the intellectual material that can be associated with value. Direct comparisons depend on identifying determinants which would decisively determine the value of the intellectual material. Indirect comparisons depend on factors which may be strongly associated with value. The basic idea behind these comparisons is that if the intellectual material to be valued has the same characteristic as analogous material that can be associated with a known value, then we can suggest that the material for valuation may have a similar value.

These approaches have some obvious weaknesses. One problem is getting people to agree on which determinants or indicators to consider. As the course materials indicated, value is in the eye of the beholder. A second problem is that the estimates are not anchored to actual economic scenarios. A third problem is the assumption that two objects are comparable because they share common attributes: a banana and a canary are each yellow and lightweight, but sharing two characteristics does not make them comparable.

Despite the drawbacks, representing Big Data sets through indicators and determinants of value has multiple benefits:

· They can simplify comparisons between Big Data assets that would otherwise be too complicated to do.

· They make it feasible to value a large collection of Big Data assets.

· They can make Big Data assets appear less volatile over time.

The first benefit is best illustrated through an analogy. Consider the task of comparing the values of multiple kinds of laptop computers. The machines may have hundreds of detailed features which may be difficult to compare, but technical specifications can reduce the complexity of comparisons to a few dozen main features. Of those, a consumer may choose to consider ones they think are the most important such as price, memory, speed and weight. Making a comparison between two complex machines based on these desirability characteristics may provide a somewhat inaccurate result, but the simplification makes it possible for the human mind to compare them at all.

The second benefit follows from the first: reducing a complex valuation object to a simple one represented by desirability characteristics makes it practical for a valuation activity to handle large portfolios of Big Data assets.

The third benefit is possible because the attributes of comparison can remain the same even if the underlying data sets they represent change features, grow, split, combine and obsolesce. Instead of describing the variety of a data set by continually updating the exact list of fields, basic categories of fields can be used. Instead of reporting a constantly changing size for a data set, it may be more useful to show a growth rate.
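To make the growth-rate idea concrete, here is a minimal sketch that summarises a series of size snapshots as a single compound growth rate per period. The function name and the sample record counts are my own hypothetical choices, and a compound rate is only one of several ways a growth rate could be reported.

```python
def compound_growth_rate(sizes):
    """Average per-period growth rate between the first and last snapshot."""
    if len(sizes) < 2 or sizes[0] <= 0:
        raise ValueError("need at least two snapshots and a positive starting size")
    periods = len(sizes) - 1
    return (sizes[-1] / sizes[0]) ** (1 / periods) - 1


# Hypothetical quarterly record counts for one data set.
snapshots = [1_000_000, 1_100_000, 1_210_000]
rate = compound_growth_rate(snapshots)
print(f"{rate:.1%} growth per quarter")
```

The raw sizes change every quarter, but the reported attribute stays stable as long as the data set keeps growing at a similar pace.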

Identifying Generic Determinants of Value

What determinants of value would we all agree apply to all Big Data assets? It is unlikely we would identify traits that were universally applicable but we could try!

Perhaps it is easier to identify determinants that reflect the absence rather than presence of value. For example, consider a Big Data asset that describes individual people. The data set would undoubtedly be subject to some data protection regulations. Its various fields may or may not correlate with a high value. However, if it should but does not comply with data protection regulations, then the legitimate value of the data asset would presumably be zero.

A Big Data asset in Finance may need to be compliant with regulations related to fraud, personal data and taxation. A biological data asset may need to show evidence of ethics approval, compliance with data protection and compliance with regulations on handling the tissue or laboratory animals used to generate the data. In both cases, the places where the data are processed could require showing compliance with regulations in multiple jurisdictions. In both cases, the failure to demonstrate compliance would not just make the assets worth zero; it would turn them into liabilities.
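One way to picture a determinant of this kind is as a gate applied before any other estimate is considered. The sketch below is only an illustration under my own assumptions: the class, the field names, the compliance labels and the figures are all hypothetical, and the liability is flagged rather than quantified.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    name: str
    indicative_value: float  # hypothetical estimate produced by some other comparison
    required: set = field(default_factory=set)      # compliance regimes that apply
    demonstrated: set = field(default_factory=set)  # regimes with evidence on file


def gated_value(asset):
    """Return (value, missing): the value is zero if any required compliance is missing."""
    missing = sorted(asset.required - asset.demonstrated)
    if missing:
        # Assumption: the liability is flagged here, not priced.
        return 0.0, missing
    return asset.indicative_value, missing


# Hypothetical assets and compliance labels, for illustration only.
finance = DataAsset("finance-ledger", 100.0,
                    required={"fraud", "personal data", "taxation"},
                    demonstrated={"fraud", "personal data", "taxation"})
biology = DataAsset("tissue-study", 80.0,
                    required={"ethics approval", "data protection"},
                    demonstrated={"data protection"})

for asset in (finance, biology):
    value, missing = gated_value(asset)
    print(asset.name, value, missing)
```

The point of the design is that a determinant overrides everything else: the second asset reports zero no matter how attractive its other characteristics are.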

Identifying Generic Indicators of Value

Whereas a determinant needs to decisively conclude the value of a valuation object, an indicator needs to only show a strong association. At least two challenges present themselves:

· Identifying indicators that everyone would agree apply to all data sets.

· Identifying which combination of indicators should be used for comparisons.

Again, we can only try to resolve these valuation issues, but the experience may be enlightening.

VRIN Asset Indicators

One set of properties that could inspire comparisons with indicators comes from the VRIN Framework. It is part of the Resource-based View, which inventor Birger Wernerfelt defined in 1984: ‘a basis for the competitive advantage of a firm that lies primarily in the application of a bundle of valuable tangible or intangible resources at the firm’s disposal’ [3]. The characteristics of the acronym VRIN can be used to describe the strategic value of an asset:

· Valuable

· Rare

· Inimitable

· Non-substitutable

The first two characteristics are obvious, but the last two warrant more explanation. The Inimitable factor means that a resource will provide competitive advantage if it cannot be copied. The Non-substitutable factor means that a resource cannot be replaced by a different yet strategically equivalent alternative. In a comparison, you could use value, rarity, inimitability and non-substitutability as indicators of value.

Infonomics Intangible Asset Indicators

In his introductory course on Infonomics, Laney describes another set of interesting criteria that could be used to assess data assets [4]:

· Low marginal cost to duplicate an asset or its economic benefits

· High initial investment

· Economies of scale, where the more an intangible asset is produced, the cheaper it becomes

· Joint consumption, where multiple people can use the same asset at once

· Imperfect substitution, which like VRIN describes its scarcity or uniqueness

· Network effects, which he describes as the effect whereby the more people who use an asset, the more people it will attract.

In a valuation, perhaps marginal cost, initial investment level, economic scalability, joint consumption potential, substitutability and network effects could be value indicators to compare two valuation objects.

Stucke and Grunes Big Data Network Effect Indicators

Laney’s last category of Network effects can be broken down into more specific indicators. In their book “Big Data and Competition Policy”, Stucke and Grunes [5] identify network effects which specifically apply to Big Data and relate to market dominance.

Most of them relate to market dominance and cover use cases that involve social networks gathering data about their users and selling it to marketers. They don’t really fit the scientific computing use cases I’ve worked with, but they would be relevant to data sets that contain personal data about a user community.

But they provide some interesting food for thought, and they could be graded with values like ‘low’, ‘medium’ and ‘high’. I’ve recast their list into valuation indicators:

· Support level of direct network effects: degree to which the utility that users experience from a product increases as the number of other users in a network grows.

· Support level of indirect network effects: degree to which the utility of a product grows because it is being used by more people, which can draw more users

· Support level of scale-of-data effects: the more users contribute their data to a product, the more the product owner can improve the product, which then draws more users

· Support level of scope-of-data effects: the owner of a data product uses the variety of data to improve the data asset, which then draws more users.

· Support for spill-over effects: the data product is fed by a multiple-sided platform featuring users and service providers. As more users join, more services come on board to appeal to them which makes it appealing to yet more users.
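As a rough sketch of how graded indicators like these might be compared, the example below maps ‘low’, ‘medium’ and ‘high’ to numbers and measures how closely two graded profiles agree. The numeric mapping, the equal weighting of indicators, the indicator keys and the sample grades are all assumptions of mine rather than anything proposed by Stucke and Grunes.

```python
GRADES = {"low": 1, "medium": 2, "high": 3}

# Hypothetical keys for the five network effect indicators listed above.
INDICATORS = [
    "direct_network",
    "indirect_network",
    "scale_of_data",
    "scope_of_data",
    "spill_over",
]


def profile_similarity(a, b):
    """Agreement between two graded profiles: 1.0 is identical, 0.0 is maximally different."""
    max_gap = GRADES["high"] - GRADES["low"]
    total_gap = sum(abs(GRADES[a[key]] - GRADES[b[key]]) for key in INDICATORS)
    return 1 - total_gap / (max_gap * len(INDICATORS))


# Hypothetical grades for an asset with a known value and one to be valued.
reference_asset = dict.fromkeys(INDICATORS, "high")
candidate_asset = {
    "direct_network": "high",
    "indirect_network": "medium",
    "scale_of_data": "high",
    "scope_of_data": "low",
    "spill_over": "medium",
}

similarity = profile_similarity(reference_asset, candidate_asset)
print(f"similarity: {similarity:.2f}")
```

A high similarity would only suggest that the two assets may be comparable; as with the banana and the canary, shared grades do not prove shared value.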

Visconti’s Big Data Criteria

Another set of indicators may be the growing list of “V” characteristics associated with Big Data [6]. Among the Vs Visconti lists are:

· Volume: increasing amounts of data can help improve accuracy of data and forecasts

· Velocity: increasing speed can have a similar effect as increasing volume

· Variety: increasing variety of data can clarify an understanding of stakeholder needs (eg: richer user profiles can help make it easier to guess what they want)

· Veracity: increasing veracity of data means the data become more reliable and trustable [7]

· Validity: increasing validity of the data means the data are more accurate and correct for their intended use [6][7].

· Variability: increasing variability has multiple interpretations. It can describe an increase in the informative value of data [6], but it can also describe an increasing number of inconsistencies in data [7].

· Virality: increasing virality means data are propagated across networks more easily

· Visualisability: (Visconti calls this Visualisation) Increasing support for visualizing data means it is easier to turn information into user-accessible knowledge [6]

· Viscosity: increasing viscosity means it becomes easier to navigate the data [6]

The list rounds out to ten with ‘value’. I exclude it because it seems to be the product of considering all the other factors.

Support for Analyses

Another set of value indicators could describe whether a data set can support a specific kind of analysis technique. Perhaps it has a large enough volume or variety of data to support a particular kind of machine learning analysis. Capturing a Big Data set’s support for different types of analyses can reflect the value of enabling the processing activities that turn it from data into information.

A Market of Domain Specific Indicators

So far we have tried to identify generic indicators that may prove useful in valuing general-purpose data sets, or data sets that were never made with the intent of one day becoming data assets.

However, using domain-specific indicators to compare Big Data assets may support a more compelling valuation. Although they lack the credibility that comes from describing the assets being exploited in economic scenarios, they would make it seem more likely that the assets match the needs of a domain community and generate some kind of economic activity.

The more niche a domain community, the more likely it seems they would agree on indicators to use and the combination of them to apply in valuations. As the producers and consumers of big data exhibit narrowing interests, value would remain in the eye of the beholder. But, it seems more likely that the owners of the valuation object and the buyers of it would at least see eye to eye through a shared understanding of desirability traits.

The Next Article

The next and final piece in this series will be entitled ‘Big Data Valuation: A Pause in a Journey of Learning’. I will review some of the core thoughts I’ve developed from taking an IP valuation course and trying to apply it to data.

References

[1] Wurzer, Alexander. Certified University Course IP Valuation 1. IP Business Academy. https://ipbusinessacademy.org/certified-university-course-ip-valuation-i

[2] Patent Valuation — General principles for monetary patent valuation. English translation, DIN 77100:2011–5. https://www.beuth.de/en/standard/din-77100/140168931

[3] Famuyide S. VRIN Framework/VRIO Analysis. Business Analyst Learnings. (May 2017). https://www.businessanalystlearnings.com/ba-techniques/2017/5/1/vrin-frameworkvrio-analysis

[4] Laney, Doug. Infonomics 1: Business Information Economics and Data Monetization. (MOOC). Coursera. https://www.coursera.org/learn/infonomics-1

[5] Stucke, M. E., & Grunes, A. P. Introduction: big data and competition policy. Big Data and Competition Policy, Oxford University Press (2016).

[6] Visconti, M., & Weis. The valuation of digital intangibles. (2020). Springer International Publishing.

[7] Firican G. The 10 Vs of Big Data. (Feb 2017). TDWI. https://tdwi.org/articles/2017/02/08/10-vs-of-big-data.aspx