Big Data Valuation: The Value of Asset Synergies

Dec 13, 2021

by Kevin Garwood

Disclaimer: I am not an IP lawyer, an economist or an accountant. This series of articles on the topic of data valuation captures opinions I developed from taking an online course on IP Valuation. They are not meant to represent the views of my employers, past or present. This article, which concerns technical, legal or professional subject matter, is meant to foster more dialogue about the subject and does not constitute legal advice.

I began my series of articles on data valuation by asking how the future economic value of Big Data assets could be assessed. I sought inspiration by enrolling in a course on IP Valuation [1] through the IP Business Academy, and reflected on how valuation could apply to data rather than to patents. In my previous article, I outlined how doing mock valuations could help rationalise data collection activities for Big Data assets.

This post marks the first of three articles that delve into the anatomy of IP Value. I begin by considering what a portfolio of Big Data assets might mean. I then reflect on the synergies amongst IP items that are recognised as being important in more traditional IP valuations of patents, copyrights and trademarks. Finally, I consider how IP valuations tend to emphasise “exploitation scenarios”, which is the technical term used in valuations to describe how IP assets would generate value in practice.

To draw insights about data valuation from patent valuations, I’ve identified parts of the DIN 77100 standard [2] and tried to replace “patents” with “data sets”. My summary findings are as follows:

· A portfolio of electronic data sets may qualify for a sui generis database right that is not likely to be meaningful for a portfolio of patents

· The standard recognises the importance of accounting for synergies amongst items in a portfolio, which seems essential for doing big data valuations

· The standard tends to emphasise the applied value shown by concrete economic usage scenarios rather than the potential value of what assets could be used for.

The rest of the article seeks to cover each point in more detail.

Packaging Portfolio Data Sets as a Protectable Database

We can begin by considering what the target of the IP valuation activity would be. Substituting “data sets” for “patents”, DIN 77100 would read [2]:

“One or more [data sets] can be subject to [data set] valuation. Several related [data sets] can form a [data set] portfolio.”

At first glance, a simple substitution of terms would suggest a portfolio of data sets was similar to a portfolio of patents. However, two notable differences come to mind: the difference in the number of items between portfolios and the impact on IP value that a database would have in organising them electronically.

The relative ease of rapidly creating data sets suggests that a typical portfolio of data sets would have far more items than a portfolio of patents. A portfolio of patents could contain one, a dozen, a hundred or more items, but the expense in creating each one may limit their number. However, imagine now a portfolio of hundreds of data sets. As I’ll discuss in a later article, the sheer amount of effort needed to cover that many items could influence the types of comparison techniques that are used in valuation.

Apart from the number of portfolio items, there is a technical matter to consider of how they might be organised electronically. A portfolio of patents could be organised as an electronic folder full of patent word-processing documents. If there are enough patents in a portfolio, perhaps they could be organised as a database to make the task of finding or relating them easier. However, the presence of a database would not add significant new value to the patents, because the value of a patent resides in the legal protections that become vested in it after a rigorous application process.

Now consider the role a database could play in adding value to a collection of data sets. Although a single data set could have value in isolation, in many cases, much of its value may be vested in how it is related to other data sets. At least for structured data sets, databases can provide vital infrastructure for relating them in ways that significantly influence how their value would be realised in use cases.
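As a minimal sketch of what relating data sets can mean in practice, the following uses Python's built-in sqlite3 module to hold two small structured data sets in one database. The table names, columns and values are all hypothetical and chosen only for illustration. The point is that neither data set alone can answer the question being asked; the answer only exists once the database links them.

```python
import sqlite3

# Two hypothetical data sets: measurement sites, and readings taken at them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (site_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE readings (
        site_id INTEGER REFERENCES sites(site_id),
        pm25 REAL
    );
    INSERT INTO sites VALUES (1, 'North'), (2, 'South');
    INSERT INTO readings VALUES (1, 10.0), (1, 14.0), (2, 30.0);
""")

# "What is the average reading per region?" needs both data sets:
# readings has no region, and sites has no measurements.
rows = conn.execute("""
    SELECT s.region, AVG(r.pm25)
    FROM readings AS r
    JOIN sites AS s ON s.site_id = r.site_id
    GROUP BY s.region
    ORDER BY s.region
""").fetchall()

region_averages = dict(rows)
print(region_averages)
conn.close()
```

The shared key (here the assumed `site_id` column) is what carries the relationship, which is one reason the choice of data sets bundled into a database can shape the use cases envisioned for them.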

Why is the role of a database important to consider in managing a portfolio of data sets? The first reason is that a database can bundle data sets in a way that simplifies how they would be licensed, copied, installed and used. The choice and selection of data sets that would belong in a database could then influence what use cases were envisioned for them.

The second reason is that a database can provide a form of IP protection that a data set in isolation may not have. On its own, a data set is a form of intellectual material that may be protected by copyright or in some cases as part of a patent process. However, these protections only apply under certain conditions. For example, in most cases raw data generated by sensors would not be covered by copyright [3]. Typically, data sets wouldn’t be patented because they don’t apply to inventions, but there are a few exceptions [4]. Some forms of test data can be protected by provisions in international agreements such as TRIPS [5], but again it only applies to certain situations.

Whereas there is no specific IP right for just data [3][6], there are sui generis database rights that have been designed to protect the effort needed to create and maintain databases. This kind of right is not uniformly supported across the world and is mainly found in jurisdictions such as the EU, South Korea and Mexico [7].

Therefore, grouping data sets within a database appears to provide benefits: it promotes their value by linking them at a technical level; it simplifies their presence in a portfolio of assets; and, in some jurisdictions, it provides them a special kind of database right protection. None of these benefits appear to be relevant to using a database to maintain an electronic portfolio of patents.

Recognising the Value of Synergies

DIN 77100 recognises the synergies that may exist amongst portfolio items [2]:

“When valuing a [data set] portfolio, possible synergy effects arising from the combined effect of the [data set] within a portfolio may need to be taken into account…”

At first glance, the sentence seems equally applicable to patents and data sets. A portfolio of patents and a portfolio of data sets could both exhibit synergistic value. However, the value of portfolio item combinations may be easier to assess for patents. Estimates of future value may appear most reliable when both the number and content of portfolio items remain static over time. The relative ease of creating and changing data sets versus patents means patent synergies will likely be anticipated and data set synergies will likely be emergent.

It is the life cycles of patents and data sets that may account for this difference. A patent may change as it is drafted, but barring some circumstances that later allow amendment [8][9][10], a granted patent will remain static until it expires or is invalidated. For much of its typical lifetime of 20 years [11], a patent won’t change.

A data set can grow, change its content, merge, split, and become archived or deleted in a much shorter time frame. Data sets can also mutate for as long as they are maintained. Both kinds of asset can become obsolete. However, the obsolescence of technologies and use cases that may shorten the practical life of a data set may happen much faster than the regulatory and market-based obsolescence that can affect patents.

Emergent synergistic values seem like a central concept for valuations of big data sets. Each data set may have an actual or presumed value in isolation, but the way they proliferate in number, grow in content and link with each other seems to carry much of the hope for future value.
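To make the idea of synergy a little more concrete, here is a rough sketch in Python. It assumes, purely for illustration, that synergy can be expressed as the value of a portfolio used together minus the sum of the stand-alone values of its items; this is my own simplification and is not a formula taken from DIN 77100. The data set names and figures are hypothetical. The sketch also counts how many pairs of data sets could in principle be linked, which shows why anticipating every synergy becomes harder as a portfolio grows.

```python
from itertools import combinations


def pairwise_links(data_set_names):
    """Return every pair of data sets that could, in principle, be linked."""
    return list(combinations(sorted(data_set_names), 2))


def synergy(standalone_values, combined_value):
    """Assumed definition: combined value minus the sum of stand-alone values."""
    return combined_value - sum(standalone_values.values())


# Hypothetical stand-alone values, in arbitrary currency units.
standalone = {"customers": 40.0, "orders": 60.0, "web_logs": 20.0}

candidate_pairs = pairwise_links(standalone)
portfolio_synergy = synergy(standalone, combined_value=150.0)
print(candidate_pairs)
print(portfolio_synergy)

# The number of candidate pairs is n * (n - 1) / 2, so it grows much
# faster than the number of data sets in the portfolio.
pair_counts = {n: n * (n - 1) // 2 for n in (3, 10, 100)}
print(pair_counts)
```

A portfolio of a dozen patents has a few dozen pairings to think about, whereas a portfolio of a hundred data sets already has thousands, before considering combinations of three or more or the fact that the data sets themselves keep changing.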

The Value Emphasis on Application Rather than Potential

The valuation standard puts great emphasis on the need for an exploitation scenario [2]:

“The monetary value of the [data set] shall be determined on the basis of an exploitation scenario. The exploitation scenario describes the exploitation of the [data set] as the commercial implementation of the invention…”

Note that although the term exploitation tends to have general negative connotations, in IP valuation this is a technical term. This statement about monetary value seems sensible for all intangible assets, but different forms of intangible assets may vary in how obvious their commercial implementations may appear. Patents seem tailor-made to support practical economic uses because:

· They are designed to protect ideas with industrial application [12].

· The high costs of filing and managing them in multiple countries would necessitate a minimum economic return on investment [13].

Copyrighted material seems slightly more removed from commercial exploitation scenarios because:

· Authors may create a work without even being aware that copyright is an automatically granted right [14].

· The grant of copyright is free and does not require high registration costs that would motivate a minimum economic return [14].

· The underlying material that copyrights protect often comes from a creative desire to express something rather than a desire to solve an industrial problem.

· Copyright comes with a set of inalienable moral rights that could limit commercial exploitation scenarios regardless of whether the use cases generate economic value [15].

The creation of data sets seems governed by an even broader set of motivations than patents or copyrights. Data sets can become distanced from exploitation scenarios because they:

· Often do not require high initial registration or operational costs that would motivate a minimum economic return

· Do not have to satisfy criteria such as novelty, non-obviousness, or creative expression to justify their existence.

· May be too immature to demonstrate tangible value

· May have been created to produce societal, historical or scientific value but not economic value

· May often spend most of their lifetimes evolving to create value without being influenced by similar data being generated within the walls of other organisations.

· May not be immediately aligned with commercial exploitation scenarios because they seem poorly aligned with existing IP rights.

· May not be scrutinised for exploitation scenarios because data assets are poorly reported on accounting balance sheets

Big Data sets may become even more distanced from exploitation scenarios because their complexity and scale can make it difficult to identify an optimum set of economic use cases. Identifying only one scenario may undervalue a Big Data set, while articulating most of the possible scenarios may be impractical. In the face of this challenge to identify and develop representative use cases, stakeholders may opt to emphasise Big Data’s potential more than its immediate concrete use. The potential value drawn from deferred expectation would reside in two sources of hope:

· Unexpected benefits would occur once data become “Big Enough” to exhibit network effects of scale

· Unexpected benefits would occur once a data set is combined with other data sets

The standard’s emphasis on having concrete exploitation scenarios makes sense because it seems like the most effective way to convince someone that an asset could generate an economic benefit. However, it is probably far more difficult to identify and develop a set of commercial exploitation scenarios for a big data set than it is to do for a patent.
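The trade-off between naming one scenario and naming many can be sketched with a toy calculation. The Python below assumes a simple discounted cash flow view of each scenario, weighted by a guessed probability that the scenario is ever realised. This is an assumption made for illustration and is not presented as the method DIN 77100 prescribes. The scenario names, probabilities, cash flows and discount rate are all hypothetical, and the sketch further assumes that the scenarios can run side by side without cannibalising one another.

```python
def present_value(cash_flows, discount_rate):
    """Discount yearly cash flows (year 1, year 2, ...) back to today."""
    return sum(
        cash_flow / (1 + discount_rate) ** year
        for year, cash_flow in enumerate(cash_flows, start=1)
    )


def portfolio_value(scenarios, discount_rate):
    """Probability-weighted sum of the present value of each scenario."""
    return sum(
        s["probability"] * present_value(s["cash_flows"], discount_rate)
        for s in scenarios.values()
    )


# Hypothetical exploitation scenarios for one big data set.
scenarios = {
    "licence_to_partner": {"probability": 0.8, "cash_flows": [50.0, 50.0, 50.0]},
    "internal_forecasting": {"probability": 0.5, "cash_flows": [0.0, 80.0, 120.0]},
    "resale_of_aggregates": {"probability": 0.2, "cash_flows": [0.0, 0.0, 300.0]},
}
RATE = 0.10  # assumed yearly discount rate

single = portfolio_value(
    {"licence_to_partner": scenarios["licence_to_partner"]}, RATE
)
full = portfolio_value(scenarios, RATE)
print(f"one scenario:  {single:.1f}")
print(f"all scenarios: {full:.1f}")
```

Valuing only the most obvious scenario leaves the later, less certain ones out of the figure, while every extra scenario adds more guessed inputs that someone has to justify. That is one way of seeing why a big data set may be harder to pin to a convincing set of exploitation scenarios than a patent.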

The Next Article

My next article will be entitled “Big Data Valuation: The Importance of Market Protections, Complementary Goods and Data Asset Management”. In it, I will focus on the importance of data asset management, and the importance of complementary goods that contribute to producing data.

References

[1] Wurzer, Alexander. Certified University Course IP Valuation 1. IP Business Academy. https://ipbusinessacademy.org/certified-university-course-ip-valuation-i

[2] Patent Valuation — General principles for monetary patent valuation. English translation, DIN 77100:2011–5. https://www.beuth.de/en/standard/din-77100/140168931

[3] Barczewski, Maciej. Value of information: intellectual property, privacy and big data. №7. Peter Lang Publishing Group, 2018.

[4] Ahmad, Imran, Stepin, Nikita, Chou, Patrick, Suliman, Suzie. Where Data Meets IP (Sept 2021). https://www.dataprotectionreport.com/2021/09/where-data-meets-ip/

[5] Overview: the TRIPS Agreement. World Trade Organisation. https://www.wto.org/english/tratop_e/trips_e/intel2_e.htm

[6] Kilpatrick, Charlotte. How to protect big data. (Oct 2019). https://www.managingip.com/article/b1kbljy4tbktcm/how-to-protect-big-data

[7] Carroll, Michael. Sharing Research Data and Intellectual Property Law: A Primer. https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002235

[8] Revising a Patent, Justia. https://www.justia.com/intellectual-property/patents/revising-a-patent/

[9] Guidance: Amending your patent after grant. GOV.UK, https://www.gov.uk/guidance/amending-your-patent-after-grant

[10] Amendment — Changing a Patent Application After Filing, Albright IP, https://www.albright-ip.co.uk/patents/amendment-changing-a-patent-application-after-filing/

[11] Patents, WIPO. https://www.wipo.int/patents/en/

[12] Frequently Asked Questions: Patents, WIPO. https://www.wipo.int/patents/en/faq_patents.html

[13] de Andrade, Anthony. Twelve ways to manage global patent costs. https://www.wipo.int/wipo_magazine/en/2017/04/article_0007.html

[14] Obtaining IP Rights: Copyright, WIPO. https://www.wipo.int/sme/en/obtain_ip_rights/copyright.html

[15] Chavez, Javier Andre Murillo, Copyright and the value of moral rights, WIPO Magazine (2018), https://www.wipo.int/wipo_magazine/en/2018/04/article_0003.html