Big Data Valuation: Cost, Market and Income Approaches

Jan 3, 2022

Disclaimer: I am not an IP lawyer, an economist or an accountant. This series of articles on data valuation captures opinions I developed from taking an online course on IP Valuation. They are not meant to represent the views of my employers, past or present. This article touches on technical, legal and professional subject matter; its purpose is to foster more dialogue about the subject, and it does not constitute legal advice.

I began writing a series of articles on data valuation by asking how the future economic value of Big Data assets could be assessed. I sought inspiration by enrolling in a course on IP Valuation [1] through the IP Business Academy, and reflected on how valuation could apply to data rather than to patents.

In the previous three articles, I delved into the meaning of IP Value. The IP Value of an intellectual asset depends on the right fit among three components: a valuation object, an exploitation method and a set of complementary goods. I would further advocate a fourth factor: the means of supporting exclusive access rights for the owner.

Comparing Like with Like

In this article, I’ll review the main comparison principles for valuation and relate them to data valuation. In valuations where we compare the economic conversion of the IP we’re interested in with another valued piece of IP, we want to make sure we’re comparing like with like.

Ideally, we would want the valuation object, exploitation scenario and complementary goods all to be as similar as possible between what we’re valuing and what is already generating economic value. The comparison idea is best illustrated by examples of where these factors may differ:

· Different valuation objects. Two similar looking trademarks are both licensed to brand T shirts and mugs, but one trademark design symbolises a sports team and another symbolises a nature conservation group. Even valuation objects that seem similar may have very different contexts.

· Different exploitation methods. The author of a copyrighted story can either use the work to publish books for the public or sell the broadcasting rights to an interested party. The valuation object is the same but it would generate economic value in very different ways.

· Different sets of complementary goods. Two patented technologies help test whether a person has a virus. In addition, both technologies are expected to be put into assembly line production. However, the solution that relies on a cheap chemical strip indicator has a completely different supply chain than the solution that relies on a complex electronic device. Even if we assume the valuation objects are comparable in intent and the exploitation methods are similar, the things needed to realise value in practice could be very different.

The three constituent parts of IP value also seem reasonable to characterise the value of data. Again, the comparing like-with-like idea can be illustrated by differences:

· Different valuation objects. Two different customer databases describe transactions of an online gaming community. One is a database of 10,000 premium users, each of which describes online game purchases and a richly described user profile. The other is a database of a million anonymised standard users, but it only contains basic online game purchases.

· Different exploitation methods. A grocery store chain manages a large inventory database that records customer transactions. The company can decide to use it to make its supply chains more cost-efficient, or it can sell database access to marketers who could learn how competitive their products are. Generating economic benefits can mean either gaining money or saving money.

· Different sets of complementary goods. Two travel car companies collect data about the recreational habits of motorists that might be useful for accommodation providers. One company has a database that determines interest levels by GPS signals emitted from its fleet of cars. The other company has a database that holds feedback from telephone and online questionnaires of its car drivers. The first company may rely on an infrastructure of signal-emitting devices, whereas the second may rely on a workforce responsible for interviewing, data entry and curation.

The combination of valuation object, exploitation method and complementary goods seems equally applicable to both data and traditional forms of IP such as patents, copyrights and trademarks. For Big Data, it is unclear how many of its ‘Big Vs’ (e.g. volume, variety, velocity) have to be similar for two data sets to be considered the same kind of valuation object. Is it enough if they just cover the same theme of information, or do other aspects make them sufficiently different to be incomparable?

Principles for Comparison

If the IP value combination of valuation object, exploitation method and complementary goods tells us what we’re comparing, we next have to consider how comparisons should be done. The course [1] presented the following common kinds of comparisons used for valuations:

· Direct comparisons, which rely on value determinants

· Indirect comparisons, which rely on value indicators

· Effect-oriented comparisons, which rely on economic effects. Of these, the most common are the cost, market and income-based approaches.

The first two types, direct and indirect comparisons, are the subject of the next article. They focus on defining the characteristics of valuation objects that make them comparable, but their weakness is that they do not consider exploitation scenarios. Nevertheless, they may be useful for certain kinds of valuation scenarios and they may be particularly relevant to Big Data assets.

In this piece, we will focus on the effect-oriented comparisons which are widely used in asset valuations in general. From the course [1], the key characteristics of the three main economic effect approaches are:

1. The Income approach relies on comparing risks

2. The Market approach relies on comparing transactions

3. The Cost approach relies on comparing expenses

The purpose, goals and audience of an IP valuation will tend to determine which method is appropriate, but there is no one best approach for all scenarios. The power of the valuation context is illustrated in contextualised valuation questions listed in the Singapore government’s excellent guide on data valuation [2]:

· In the context of data produced specifically for a venture, would the value an organisation obtained from a data-sharing venture justify investment?

· In the context of data produced as a by-product of a main activity, what value could the organisation obtain by sharing the data?

· In the context of a data consumer considering a data source that has no alternative, what value would it bring to an organisation?

The guide uses each of the cost, market and income-based approaches to answer these same questions.

There is no all-purpose comparison method, and different methods can be used for different situations. In general, a preference for methods that best capture the idea of future value would mean that income would be most preferred, followed by market, and then cost.

However, the comparison method that is used will ultimately be determined by the availability of information used to support valuation [3]. For example, if there is not enough information about risks of future income streams, we might use the market approach. If there are no established data markets that could provide useful transaction data, we might end up using the cost method because information about expenses is more available.
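The fallback logic described above can be sketched in a few lines of Python. This is only an illustration: the function name `choose_approach` and its three boolean flags are hypothetical, and in practice the purpose and audience of the valuation would also influence the choice.

```python
def choose_approach(has_income_risk_info: bool,
                    has_market_transactions: bool,
                    has_cost_records: bool) -> str:
    """Return the most preferred approach that the available information supports.

    Preference order follows the text: income, then market, then cost.
    The flags are hypothetical simplifications of "availability of information".
    """
    if has_income_risk_info:
        return "income"
    if has_market_transactions:
        return "market"
    if has_cost_records:
        return "cost"
    return "none"  # not enough information for any of the three approaches


# No reliable income forecasts and no data market, but expenses are on record.
print(choose_approach(False, False, True))  # prints "cost"
```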

In the following sections, we will explore each of these comparison approaches in more detail and consider them in the context of Big Data assets.

Cost Approach

The cost method establishes the value of an asset by calculating the cost of developing a similar or identical IP asset internally or externally [4]. It assumes there is a direct relation between the cost incurred to develop a valuation object and its economic value [5]. Its two main variants are the reproduction cost method (also called the replication cost method) and the replacement cost method [5]. The reproduction cost method considers the total cost, at current prices, of producing a replica of the valuation object [4] and includes costs spent on any failed prototypes [5]. The replacement cost method considers the cost of creating something that has the same functionality using the current state of the art [4].
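To make the difference between the two variants concrete, here is a minimal sketch in Python. All cost categories and figures are invented for illustration, and the function names are my own; real valuations would involve many more cost items and adjustments such as obsolescence.

```python
def reproduction_cost(current_costs: dict, failed_prototype_costs: list) -> int:
    """Cost of rebuilding an exact replica at current prices,
    including money spent on failed prototypes."""
    return sum(current_costs.values()) + sum(failed_prototype_costs)


def replacement_cost(state_of_the_art_costs: dict) -> int:
    """Cost of building something with the same functionality
    using the current state of the art."""
    return sum(state_of_the_art_costs.values())


# Hypothetical figures for a data asset (currency units are arbitrary).
replica = {"collection": 120_000, "cleaning": 40_000, "storage": 15_000}
failed = [30_000, 10_000]
modern = {"collection": 60_000, "cleaning": 20_000, "storage": 5_000}

print(reproduction_cost(replica, failed))  # prints 215000
print(replacement_cost(modern))            # prints 85000
```

In this made-up case the reproduction (replication) cost is much higher than the replacement cost, because it carries the failed prototypes and older, more expensive ways of working.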

The cost approach is rarely used to value IP because the cost of developing something often doesn’t accurately reflect its value [5]. However, it may be useful when: establishing a minimum value for an IP asset [2]; deciding whether it is better to develop or buy an asset [6]; there is no economic activity to review, such as early technology that hasn’t yet produced revenue [4]; or there is insufficient information to support using the other two approaches [6].

Applying the cost approach to valuing Big Data assets has some interesting implications. One major consideration would be how the cost method would establish a minimum value in passive data collection efforts that featured a lot of underutilised data. Suppose a data producer invested in accumulating several themes of data with the informed hope that one day the data may reveal insights that are more than the sum of its parts. Further suppose that the rate at which data accumulated far outstripped the organisation’s capacity for analysing the data to produce outputs.

To the data producer, the reproduction (replication) cost method might seem more appealing because it might reflect the effort spent on mass data accumulation to support both questions that have been asked and those which may yet be asked.

To a data consumer, the replacement cost method might be appealing in that it would need to consider only the data that was used to answer specific current questions. A shared view of relevant cost between data producer and data consumer would need to include a shared appreciation of potential for underutilised data.

Another issue to consider is whether a Big Data asset could actually be substituted at all [7]. In light of all the Big Data ‘Vs’ characteristics such as variety, velocity, volume, veracity, viscosity, variability and virality [8], when would two Big Data assets be considered comparable valuation objects? If one asset cannot be replaced by another, the results of the cost approach may be considered unreliable [7].

Market Approach

The market approach is based on comparisons of prices that are paid for similar assets that are available in a market [4]. It assumes that valuation objects that create the same utility will have the same market price. For well-established kinds of objects, there may be a mature market from which transaction data can be obtained. Otherwise, price data may have to be obtained from individual transactions involving comparable assets [2].
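A rough sketch of the idea in Python follows. Using price per record as the unit of comparison, and taking the median across comparable transactions, are my own assumptions for illustration; the cited sources do not prescribe them, and the transaction figures are invented.

```python
from statistics import median


def market_value_estimate(comparables, subject_records):
    """Estimate value from comparable transactions.

    comparables: list of (price_paid, number_of_records) tuples (hypothetical).
    subject_records: number of records in the data set being valued.
    """
    # Normalise each transaction to a price per record.
    unit_prices = [price / records for price, records in comparables]
    # The median is used so that one unusual deal does not dominate.
    return median(unit_prices) * subject_records


# Three invented transactions for broadly similar data sets.
deals = [(50_000, 100_000), (90_000, 150_000), (40_000, 100_000)]
print(market_value_estimate(deals, 200_000))  # prints 100000.0
```

The sketch only works if the comparables really are comparable, which is exactly the assumption the market approach depends on.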

According to the DIN 77100 Standard for Patent Valuation, “… in most cases, it is not possible to conduct reliable value estimations on the basis of the market-oriented procedure” [6]. Often, the approach cannot be supported because there may not be enough information about transactions or a market [7]. Valuation objects such as patents that feature novelty may not have comparable utility with other more established market alternatives [5],[6]. Accurate comparisons of transaction data may need to consider whether comparable terms and conditions exist [4]. When transaction and market information is available, it may be a better indicator of the current mood of economic activity than a gauge of future economic activity [4].

The market approach has several uses. It can provide a valuation estimate to complement other approaches. It can be used in scenarios where the valuation objects feature incremental rather than monumental improvements over existing market alternatives [5]. It can provide inputs for the income method [4], and it can be useful when there are established markets [2].

For some data-rich economic sectors, the market approach may be more applicable to data sets than to patents. Patents have a built-in novelty requirement, whereas data sets do not need to be novel to generate economic value. In the realm of scientific computing, contract research organisations that perform standard scientific analyses likely have enough information about transactions and the market to support the market approach.

It seems that data sets that would support the market approach would share some characteristics. Ideally, they would be predictably structured, limited in coverage, compliant with community-based data exchange standards, and machine generated.

In contrast, many Big Data sets may exhibit different characteristics. Many would be unstructured and combine with vastly different types of data. Some would have Big Data dimensions that could make it difficult to place them within existing or new markets. Stander reflects on the paradox that the unique character of information which makes it valuable makes it difficult to find comparable markets [7].

Income Approach

The income approach estimates the value of an asset based on the amount of income it is expected to generate, adjusted to its current value. It is the most commonly used approach in IP valuations [4]. Three important factors that are established in the approach are: estimating the useful remaining economic life of the asset; determining an appropriate discount rate used to arrive at a present value; and isolating a cash flow that would be attributable to the valuation object [6].
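As a minimal sketch of the discounting step, the present value of a forecast can be computed as the sum of each year's cash flow divided by (1 + r)^t, where r is the discount rate and t is the year. The function name, the three-year remaining life, the cash flows and the 10% rate below are all hypothetical choices for illustration; they are not taken from the cited sources.

```python
def present_value(cash_flows, discount_rate):
    """Discount a list of yearly cash flows back to today.

    cash_flows[0] is assumed to arrive at the end of year 1,
    cash_flows[1] at the end of year 2, and so on.
    """
    return sum(cf / (1 + discount_rate) ** year
               for year, cf in enumerate(cash_flows, start=1))


# Hypothetical asset: 3 years of remaining useful life, 100 per year
# attributable to the asset, discounted at 10%.
print(round(present_value([100, 100, 100], 0.10), 2))  # prints 248.69
```

Each of the three factors in the text maps onto one input: the remaining economic life is the length of the list, the isolated cash flow is its contents, and the discount rate is the second argument.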

Although there are many variants of the income approach [5], the DIN 77100 standard describes two: the ‘incremental cash flow method’ and the ‘relief-from-royalty method’. The first involves calculating the patent-specific financial surplus from the economic benefit of using the patent. The second is based on estimating what an organisation would not have to pay in license fees to a third party for a similar kind of IP [6].
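The relief-from-royalty idea can be sketched as follows. This is a simplified illustration under my own assumptions: the function name, the revenue forecast, the 5% royalty rate and the 10% discount rate are all invented, and taxes and other adjustments that a formal valuation would consider are left out.

```python
def relief_from_royalty(revenues, royalty_rate, discount_rate):
    """Present value of the licence fees an owner avoids paying.

    revenues: forecast yearly revenues that depend on the asset,
              with the first entry arriving at the end of year 1.
    royalty_rate: hypothetical rate a third party would charge on revenue.
    """
    value = 0.0
    for year, revenue in enumerate(revenues, start=1):
        saved_fee = revenue * royalty_rate          # royalty not paid this year
        value += saved_fee / (1 + discount_rate) ** year
    return value


# Two-year invented forecast, 5% royalty, 10% discount rate.
print(round(relief_from_royalty([1_000_000, 1_200_000], 0.05, 0.10), 2))
# prints 95041.32
```

For a data asset, the hardest input to justify would probably be the royalty rate, since it has to come from licences for comparable data.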

The income approach has a number of weaknesses. New technologies may not yet have established markets that could be used to predict sales and future revenues [5]. The calculations used to predict a reliable future cash flow may be subjective, complex and expensive to perform [5]. The approach yields value estimates that are entirely dependent on forecasts of potential future earnings [7], and it may be difficult to isolate the future income stream contribution of the valuation object from the stream of the wider organisation. It may oversimplify the risks involved with future cash flows by using a single discount factor to estimate the present value of future income.

Of the three approaches, the income approach can provide the best estimate of economic value when it can be used. It is best suited to IP assets that can generate stable or predictable cash flows [4]. It can also be useful for estimating the upper limit an organisation should spend on developing a new product [7].

In the context of Big Data, it seems challenging to apply the income-based approach because the useful lifetime and evolutionary trajectory of a Big Data asset may be difficult to predict. Data sets can rapidly change character when they evolve new features, or when they split or combine with other data sets. Mergers between data-driven organisations can also generate powerful market dominance effects that change the perception of economic benefits [9].

Further Thoughts on the Economic Effect Approaches

One of the most important points from the course [1] was that the cornerstone of valuation is avoiding incommensurability — a situation where no common measure exists to compare objects. The need to establish comparability between the IP value of subject matter to be valued and the more established IP value of analogous subject matter seems reasonable.

Ideally it seems that IP values should be comparable in their valuation objects, exploitation scenarios and complementary assets. I would suggest that they may also need to have comparable assurances of exclusivity that would lend reliability and predictability to estimates.

When we consider comparing one set of aspects with another set, we can begin to consider all the various combinations which may make them different. When the valuation objects are Big Data sets, we may conclude that meaningful comparisons may require further alignment of Big Vs such as volume, variety and velocity.

Stander’s observations suggest that the uniqueness of a valuation object can give great value to an organisation but risk making the object non-comparable [7]. However, for patents at least, the novelty is deliberately cultivated. A patentable idea may have begun by chance, but by the time it has been reflected in a successful patent, its novelty will have been deliberately positioned in relation to other patents. The need to register patents and avoid infringements may make patent owners better able to articulate the markets from which those assets may generate cash flows.

Although the novelty of a successful patent is deliberate, the novelty of a data set may happen as a result of an inventive step, a deliberate and obvious step (e.g. systematic mass aggregation) or simply by chance. Unlike patents, data sets need no registration, so their creators can allow them to live out their life cycles behind organisation walls while servicing more recognised assets. Data sets do not even need to be unique to be useful, and in some cases, if they are simply published, they can negate the trade secret value of similar data sets managed by other organisations.

Once a patent has been granted, the utility of its design will likely remain static and therefore provide predictable use even if the markets within which it operates change. However, if a data set is valued today, it may evolve into something very different within its initial anticipated life span.

The life cycle of a data asset matters in assessing its future economic value. And, if it is so easy for two data sets to become non-comparable, we may need to seek a higher level of abstraction until we find ways in which they can become comparable.

The Next Article

The next piece is entitled ‘Big Data Valuation: Indicators and Determinants of Value’. We will explore valuation comparisons that rely on determinants and indicators of value. Generally, these approaches provide a poor predictor of future value because they do not describe an exploitation scenario. However, they may help save Big Data sets from becoming poorly comparable.

References

[1] Wurzer, Alexander. Certified University Course IP Valuation 1. IP Business Academy. https://ipbusinessacademy.org/certified-university-course-ip-valuation-i

[2] Guide to Data Valuation for Data Sharing. https://www.imda.gov.sg/-/media/Imda/Files/Programme/AI-Data-Innovation/Guide-to-Data-Valuation-for-Data-Sharing.pdf

[3] Getting smarter: a strategy for knowledge & innovation assets in the public sector. The Mackintosh report. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/978668/Getting_smarter_report_150421.pdf

[4] Module 11: IP Valuation. IP Panorama course materials. https://www.wipo.int/export/sites/www/sme/en/documents/pdf/ip_panorama_11_learning_points.pdf

[5] WIPO Course materials from Intellectual Property Management

[6] Patent Valuation — General principles for monetary patent valuation. English translation, DIN 77100:2011–5. https://www.beuth.de/en/standard/din-77100/140168931

[7] Stander, J. B. (2015). The modern asset: big data and information valuation (Doctoral dissertation, Stellenbosch: Stellenbosch University). https://scholar.sun.ac.za/handle/10019.1/97824

[8] Visconti, Roberto Moro. The Valuation of Digital Intangibles: Technology, Marketing and Internet. Springer Nature, 2020.

[9] Stucke, Maurice E., and Allen P. Grunes. “Introduction: big data and competition policy.” Big Data and Competition Policy, Oxford University Press (2016)