Big Data Valuation: The Importance of Market Protections, Complementary Goods and Data Asset Management

Dec 22, 2021

by Kevin Garwood

Disclaimer: I am not an IP lawyer, an economist or an accountant. This series of articles on the topic of data valuation captures opinions I developed from taking an online course on IP Valuation. They are not meant to represent the views of my employers, past or present. This article concerns technical, legal or professional subject matter; it is meant to foster more dialogue about the subject and does not constitute legal advice.

I began writing a series of articles on data valuation by asking how the future economic value of Big Data assets could be assessed. I sought inspiration by enrolling in a course on IP Valuation [1] through the IP Business Academy, and reflected on how valuation could apply to data rather than to patents.

In my previous article, I began the first of three pieces that delved into the anatomy of IP Value. I discussed how a portfolio of patents may differ from a portfolio of data sets. I also covered the value of synergies amongst portfolio items and the importance of clear applied value rather than opaque potential value of portfolio assets.

In this post, I again examine parts of the Patent Valuation standard DIN 77100 [2] and try to replace “patent” with “data set” to see how an IP Valuation standard that emphasizes patents would apply to data. Here, I begin to cover the implications of data’s awkward fit with exclusive use protections and of the concept of complementary goods in IP Value.

My summary findings in this piece are:

· The lack of consensus about data’s definition, its classification as property and its ability to be owned has an impact on the IP mechanisms that can be used to support exclusive use

· The ability to exclude others from using an intellectual asset seems key to underwriting the credibility of predictions for its future value in a valuation activity

· It seems likely that Big Data assets will take on starring roles rather than just supporting roles in IP valuations

· The DIN 77100 concept of “complementary goods” suggests that good data management practices would be critically important in supporting the value of data sets

The remainder of the article covers each point in more detail.

The Importance of Protections in the Market

As I discussed in an earlier post, “data” is difficult to define [3], and it is not something that is universally recognised as a legal form of “property” [4][5] or which has clear ownership rights [6]. There is no specific IP right for data sets [7][8] and they seem awkwardly protected by combinations of copyright, sui generis database rights and trade secrets [7][9]. The importance of these observations becomes apparent when we focus on the word “protected” in this part of the standard [2]:

“The exploitation scenario describes the exploitation of the patent as the commercial implementation of the invention protected by the patent in the market.”

Like the other traditional forms of intellectual property, a patent can both be a thing that is used to generate economic benefit and a thing that comes with standard protections that safeguard exclusive use for the owner. These kinds of assets are like intangible steel chests that help limit access to intangible intellectual material inside. The chests don’t have physical form but their wide support in international law creates a shared understanding of them as if they did [10].

If we replace “patent” with “data set”, we can see it doesn’t quite make sense: “The exploitation scenario describes the exploitation of the data set as the commercial implementation of the invention protected by the data set in the market.”

A data set is merely the intellectual material and our challenge is to find a box to put all of it in. If the data sets emphasise original and creative expression, then we can put them in a “copyright” box. If they are organised into a database, we may be able to put them into a “sui generis database right” box. Otherwise, we may have to put them in the “trade secret” box, which does not technically represent a form of IP right [8][9]. The awkward fit can lead us to envision a portfolio of data sets where some parts of them are in strong protected boxes and others might be lying unprotected on the ground beside them.
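The "which box?" reasoning in the paragraph above can be sketched as a small decision function. This is only an illustration of that paragraph's logic and not legal guidance: the `DataSet` fields and the `protection_boxes` function are hypothetical names, and real eligibility depends on jurisdiction and on facts that three flags cannot capture.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataSet:
    """Hypothetical description of a data set; all field names are assumptions."""
    name: str
    is_original_expression: bool  # emphasises original and creative expression
    is_organised_database: bool   # organised into a database
    is_kept_secret: bool          # reasonable steps are taken to keep it secret


def protection_boxes(ds: DataSet) -> List[str]:
    """Return the 'boxes' a data set might fit in, following the paragraph above."""
    boxes = []
    if ds.is_original_expression:
        boxes.append("copyright")
    if ds.is_organised_database:
        boxes.append("sui generis database right")
    # The trade secret box is treated as the fallback, as in the text.
    if not boxes and ds.is_kept_secret:
        boxes.append("trade secret")
    return boxes  # an empty list: lying unprotected on the ground


portfolio = [
    DataSet("curated_atlas", True, True, False),
    DataSet("raw_sensor_dump", False, False, True),
    DataSet("scratch_export", False, False, False),
]
for ds in portfolio:
    print(ds.name, protection_boxes(ds))
```

Running this over a portfolio makes the awkward fit visible: some items land in one or two boxes and others in none.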

In the IP valuation course, I wrote down this thought [1]:

“IP rights give the owner the ability to exclude third parties from using the outcomes of their creative work. Such forbidding rights do not represent a monetary value per se for a company. The monetary value only arises from the use of the exclusivity effect in a business model.”

I was left asking myself: do all valuation activities require that the intellectual material be protected by an exclusivity effect? If the answer is “yes”, then applying an IP valuation approach to data sets may require more thought than applying it to patents. The need for exclusivity would be the same for patents and data sets, but the means of achieving it with patents is far more obvious.

Data as Dependent or Independent Assets for Valuation

One of the most important concepts that emerged from the course [1] was the role of complementary goods. The standard defines their importance [2]:

“The exploitation scenario determines the complementary goods necessary for the implementation of the technical invention and for the exploitation of the [data set].”

Complementary goods could include laboratory equipment, different sources of domain expertise, data sets, specific laboratory materials and other patents. Indeed, some of the complementary goods may themselves be the subject of separate valuations. This part of the standard raises a question: Should data sets have a starring role or a supporting role in an IP valuation?

Data sets seem best suited for a supporting role in hypothesis-driven activities where the data is created in response to a single specific question. In this scenario, the data sets become subservient both to the question and to any IP products that may follow from its answer. The value of a data set could be estimated from its relative contribution to another established valuation rather than warranting a valuation of its own.

Data sets would earn more attention if they were created to answer one question and re-used to answer another. Their value could then be estimated by summing the value they contributed to each of the research questions they served. Estimation would be more complicated this way, but the data sets would still not warrant being the focus of a major valuation activity.
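As a rough sketch of that summing idea, suppose each research question a data set serves has its own valuation, and the data set is credited with a share of it. The figures, the `Contribution` type and the `data_set_value` function below are all hypothetical; DIN 77100 does not prescribe this calculation, and choosing the shares is the hard part that the code takes as given.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Contribution:
    """One research question a data set helped to answer (illustrative only)."""
    question: str
    valuation: float  # value attributed to the outcome of the question (made-up figure)
    share: float      # fraction of that value credited to the data set, between 0 and 1


def data_set_value(contributions: Iterable[Contribution]) -> float:
    """Sum the data set's relative contribution across every question it served."""
    return sum(c.valuation * c.share for c in contributions)


reuse = [
    Contribution("original hypothesis", 100_000.0, 0.10),
    Contribution("follow-up study", 40_000.0, 0.25),
]
print(data_set_value(reuse))
```

With a single entry in the list, the same function covers the supporting-role case, where the data set is valued only through its contribution to one established valuation.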

Big Data sets seem suited for a starring role in data-driven activities where they could answer multiple questions. Some questions would be supported now, whereas others would only be meaningful to ask when the data sets were combined with future data sources. In this scenario, Big Data sets will be perceived as having a life of their own, and could be part of a portfolio that warranted its own valuation activity. The rest of this discussion presumes that data sets are important enough to be the focus of an IP valuation.

The Importance of Managing Data as Assets

If data sets become the focus of an IP valuation, then data management practices and expertise in knowledge infrastructure become complementary goods for their exploitation scenarios. These goods become crucial for establishing the trustworthiness of the data and, in some cases, its ethical origins.

The face value of other forms of IP may be more obvious than the face value of data sets. Patents are designed to contain all the information necessary for someone skilled in the art to replicate an invention and to appreciate its context in a broader body of prior art [11]. Copyrights are designed to protect a form of original expression [12]. However, the face value of data sets may be more difficult to assess, especially if they lack novelty, originality or sufficient provenance to guide their use. To make a compelling case for the value of a data set, we may need to understand how, why and when it was created, as well as how it evolved and who used it.

One of the core aspects of data management is to track data sets through their lifecycle [13]. Adequate capture of metadata about the data sets as they are created and age can be critical in establishing their potential future exploitation scenarios. Whereas the originators of a patent leave behind enough information about their work to inspire future innovation, the originators of a data set can only achieve a similar outcome if their data is sufficiently documented. Without provenance, the value of a data set is reduced to whatever value is immediately obvious from individual values or records.
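One way to picture such lifecycle metadata is as a record that travels with the data set. The sketch below is a minimal illustration and not an implementation of any metadata standard; the class and field names are assumptions chosen to mirror the how, why, when and who questions raised above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class LifecycleEvent:
    """A single documented step in a data set's life (hypothetical schema)."""
    when: date
    who: str
    what: str  # e.g. "created", "cleaned", "reused"
    why: str


@dataclass
class DataSetRecord:
    name: str
    events: List[LifecycleEvent] = field(default_factory=list)

    def log(self, when: date, who: str, what: str, why: str) -> None:
        self.events.append(LifecycleEvent(when, who, what, why))

    def has_provenance(self) -> bool:
        """True if the record documents a creation event with a stated purpose."""
        return any(e.what == "created" and bool(e.why) for e in self.events)

    def users(self) -> List[str]:
        """Everyone recorded as having touched the data set, in sorted order."""
        return sorted({e.who for e in self.events})


record = DataSetRecord("regional_survey")
record.log(date(2021, 1, 5), "alice", "created", "baseline study")
record.log(date(2021, 6, 1), "bob", "reused", "follow-up question")
print(record.has_provenance(), record.users())
```

A record with no events fails the `has_provenance` check, which corresponds to the case where a data set's value is reduced to whatever is obvious from its individual records.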

Another important aspect of data management is monitoring compliance. In the case of regulated data, not capturing information about its compliance could turn a data asset into a liability that could virally degrade the value of other data assets that are combined with it. Evidence of regulatory compliance could mean demonstrating ethical use of personal data or showing that standardised processing has been applied to forms of industrial data.
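The "viral" effect can be illustrated with a toy model in which a combined data set is only as compliant as its least documented input. The names below are hypothetical and the rule is a deliberate simplification, not a statement of how any particular regulation works.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Asset:
    """A data asset that may be derived from other assets (illustrative model)."""
    name: str
    compliance_documented: bool = True
    inputs: List["Asset"] = field(default_factory=list)

    def is_compliant(self) -> bool:
        # An asset counts as compliant only if it and every input it was
        # built from have documented compliance.
        return self.compliance_documented and all(
            parent.is_compliant() for parent in self.inputs
        )


clean = Asset("sensor_readings")
undocumented = Asset("patient_records", compliance_documented=False)
merged = Asset("merged_cohort", inputs=[clean, undocumented])
report = Asset("summary_report", inputs=[merged])

for asset in (clean, merged, report):
    print(asset.name, asset.is_compliant())
```

Here one undocumented source is enough to mark both the merged data set and the report derived from it as non-compliant, even though the other source is clean.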

A third aspect of data management is ensuring that the data has been processed with adequate security controls. This is especially important with respect to data that may be protected through trade secrets. Whereas patents, trademarks and copyrights can all be protected through registration, a trade secret cannot.

WIPO identifies the general criteria for trade secrets: “… information must be commercially valuable because it is secret, be known only to a limited group of persons and be subject to reasonable steps taken by the rightful holder of the information to keep it secret, including the use of confidentiality agreements for business partners and employees.” [14]

The implications seem clear to me: data management practices would need to identify and isolate secret data sets, and legal expertise would be needed to manage how data is shared by various parties.

Software infrastructure is also important for showing the value of data sets, as they are often the end product of an analysis that relies on software. Demonstrating the care and attention put into developing and maintaining software could shape perceptions of the data sets it helps generate. Ideally, the metadata for data sets should include links to versioned snapshots of the software that produced them.
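A minimal sketch of such a link, assuming the software lives in a version-controlled repository: the metadata records the repository location and an exact commit identifier alongside the data set. The field names, the example URL and the commit string are placeholders, not real resources.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SoftwareSnapshot:
    """Pointer to the exact software version used (hypothetical fields)."""
    repository: str  # placeholder URL, not a real repository
    commit: str      # placeholder commit identifier


@dataclass(frozen=True)
class DataSetMetadata:
    data_set: str
    produced_by: SoftwareSnapshot


def to_json(meta: DataSetMetadata) -> str:
    """Serialise the metadata so it can be stored next to the data set."""
    return json.dumps(asdict(meta), indent=2, sort_keys=True)


meta = DataSetMetadata(
    data_set="results_v1",
    produced_by=SoftwareSnapshot("https://example.org/analysis.git", "0123abc"),
)
print(to_json(meta))
```

Pinning a specific commit rather than a branch name means the reference still points at the same code after the software has moved on.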

I have written these articles from the perspective of someone who has had a career working with knowledge infrastructure. Learning about the role of complementary goods in exploitation scenarios was important to me because it recognised the contributions of software developers, data managers, security experts and software infrastructure engineers that may not be obvious when one is just looking at an end product.

The Next Article

The next article is entitled “Big Data Valuation: The Impact of Specificity of Complementary Goods on Data Reuse”. I will discuss how the degree of specialisation in complementary goods used to support an economic use scenario could usefully limit the potential for re-use of Big Data assets.

References

[1] Wurzer, Alexander. Certified University Course IP Valuation 1. IP Business Academy. https://ipbusinessacademy.org/certified-university-course-ip-valuation-i

[2] Patent Valuation — General principles for monetary patent valuation. English translation, DIN 77100:2011–5.

[3] Borgman, Christine L. Big data, little data, no data: Scholarship in the networked world. MIT Press, 2016.

[4] Van Asbroeck, Benoit, Debussche, Julien, Cesar, Jasmien (Mar 2019). Big Data Issues and Opportunities: Data Ownership. Bird and Bird. https://www.twobirds.com/en/news/articles/2019/global/big-data-and-issues-and-opportunities-data-ownership

[5] McFarlane, Ben. Data Trusts and Defining Property (Oct 2019). https://www.law.ox.ac.uk/research-and-subject-groups/property-law/blog/2019/10/data-trusts-and-defining-property

[6] Ritter, J., & Mayer, A. (2017). Regulating data as property: a new construct for moving forward. Duke L. & Tech. Rev., 16, 220. https://scholarship.law.duke.edu/cgi/viewcontent.cgi?article=1320&context=dltr

[7] Barczewski, Maciej. Value of information: intellectual property, privacy and big data. №7. Peter Lang Publishing Group, 2018

[8] Intellectual property in a data-driven world (October 2019). https://www.wipo.int/wipo_magazine/en/2019/05/article_0001.html

[9] Debussche, Julien, Cesar, Jasmien. Big Data and Issues and Opportunities: Intellectual Property Rights (Mar 2019). https://www.twobirds.com/en/news/articles/2019/global/big-data-and-issues-and-opportunities-ip-rights

[10] Patent-related treaties administered by WIPO. https://www.wipo.int/patent-law/en/treaties.html

[11] Guidelines for Examination. European Patent Office, https://www.epo.org/law-practice/legal-texts/html/guidelines/e/g_iv_2.htm

[12] Requirements for Copyright Protection. Copyright Alliance. https://copyrightalliance.org/education/copyright-law-explained/copyright-basics/requirements-for-copyright-protection/

[13] Document, Discover, Interoperate. DDI Alliance. https://ddialliance.org/

[14] Trade Secrets, WIPO. https://www.wipo.int/tradesecrets/en/