
What is FAIR data?


What does FAIR stand for?

FAIR is an acronym for Findable, Accessible, Interoperable, and Reusable. These principles matter to anyone who wants their data to be reused, and who wants machines to be able to find and process that data automatically on people's behalf.

  • Findable data has been given an address, a persistent identifier, so others can locate it. Information about the data, or metadata, should be easy to find for both humans and computers.
  • Accessible data can be retrieved over a standard communication protocol, possibly including authentication and authorization.
  • Interoperable data is structured so that machines can work together and exchange it with minimal effort.
  • Reusable data is not just technically reusable; the organizational and legal arrangements for its use, such as licenses, are in place as well.
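As an illustration (not part of the FAIR principles themselves), the four properties can be read off a minimal, machine-readable metadata record. All identifiers, URLs, and field names below are hypothetical examples:

```python
import json

# A hypothetical metadata record illustrating the four FAIR aspects.
record = {
    # Findable: a globally unique, persistent identifier plus descriptive metadata.
    "@id": "https://example.org/dataset/42",
    "title": "Example sensor readings",
    # Accessible: a standard protocol (HTTPS) tells machines how to retrieve it.
    "distribution": {
        "downloadURL": "https://example.org/dataset/42/data.csv",
        "protocol": "https",
    },
    # Interoperable: terms are drawn from a shared, referenceable vocabulary.
    "vocabulary": "http://www.w3.org/ns/dcat",
    # Reusable: an explicit license and provenance make the terms of reuse clear.
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "provenance": "Collected in 2023 by the example project",
}

print(json.dumps(record, indent=2))
```

A record like this is easy for a human to read, yet regular enough for a machine to index and act on, which is the essence of machine-actionability.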


The ‘FAIR Guiding Principles for scientific data management and stewardship’ were published in Scientific Data in 2016 (Wilkinson et al., 2016) to help people make their data FAIR. Although FAIR was initiated predominantly by academic stakeholders aiming at scientific and scholarly data, new technologies are now opening up entirely different domains, which we address in this blog.

Why do we need FAIR data?

In science, the desire to cooperate freely, without technology (or people) being a barrier, is as old as science itself. Cooperation is a means to get more done in less time and to avoid doing work redundantly or, worse, inconsistently. Outside science, these desires, needs, and ambitions are no different. For example, an American study found that 92% of the job vacancies analyzed demanded digital skills (NSC, 2023).

Moreover, since the rise of information technology, the amount of data has grown exponentially, increasing tenfold over the past ten years (Statista, 2023). With the increasing power of computer processing, the need for automation grew as well, together with the ambition to have computers do analyses people could never do themselves (think of the current expectations of AI).

Another driver for FAIR data is the complexity that emerges when data can no longer be stored in a single database for technical and organizational reasons: there is too much data, serving too many different purposes, to stay manageable. To counter this ‘monolithic’ approach, there is currently a whole movement toward working in a distributed and federated way (using separate but cooperating databases and machines). As Wilkinson et al. (2016) already noted, a growing, less centralized data ecosystem makes data itself more diverse and increasingly requires it to become FAIR.

So what is the problem with data that is not FAIR by default? In other words, what makes data unFAIR?

What makes data unFAIR?

For computers to help people, people first need to help computers. It is people who make agreements on how to describe data and how computers should handle it. This is known as making data ‘machine-actionable’, and it is necessary because, by default, there is no common ground for understanding data between humans and machines. But why is this the case?

The biggest challenge is to instruct computers (or, more specifically, software applications) to process data in a common, FAIR way. There are barriers, however. Firstly, software applications arise in their own ‘vacuum’, and the need to share data outside that vacuum often comes at a later stage. Thus, most applications aren’t created with FAIR in mind from the start (Laces is!). Secondly, the way a specific application handles data is always optimized for its own specific processing. That is why unFAIR practices are common and FAIR principles aren’t a priority for software developers.

In conclusion, most software applications keep their metadata within their own locating systems (unfindable), use their own authentication and authorization mechanisms (inaccessible), give their own meaning to their data and data formats (not interoperable), and have no common ways of organizing the creation, maintenance, and use of data (not reusable). That makes data, most of the time, unFAIR.

From degrees of FAIRness to requirements

As GO FAIR puts it: there are degrees of FAIRness. The FAIR Guiding Principles are high-level guidelines. FAIR is not a standard nor a specific technology or solution. So, if FAIR doesn’t prescribe solutions, what solutions are already available that tick the FAIR boxes?

To be Findable and Accessible, data resources need to be identified, made searchable, and offered over a standard communication protocol for computers, including authentication and authorization procedures where needed. Furthermore, that technology has to be open, free, and universally implementable. For data to be Interoperable, it needs to carry all the meaning an application requires to query and understand (interpret and process) it.
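As a sketch of the ‘standard communication protocol’ requirement, the snippet below prepares an HTTP request for a hypothetical identifier, using content negotiation (the Accept header) so a machine can ask for machine-readable metadata; the optional bearer token covers the authentication and authorization case. The identifier is an assumption for illustration, and no request is actually sent:

```python
from urllib.request import Request

# Hypothetical persistent identifier for a data resource.
IDENTIFIER = "https://example.org/dataset/42"

def metadata_request(uri, token=None):
    """Build a request asking for machine-readable (JSON-LD) metadata.

    Content negotiation via the Accept header is the standard, open
    mechanism; the optional bearer token handles authentication and
    authorization, as the FAIR principles allow.
    """
    headers = {"Accept": "application/ld+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return Request(uri, headers=headers)

req = metadata_request(IDENTIFIER, token="secret")
print(req.get_header("Accept"))  # application/ld+json
```

Because both HTTPS and the Accept mechanism are open, free, and universally implementable web standards, any application can retrieve the same resource the same way.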

Last but certainly not least: to be Reusable, data needs to be created with its reuse in mind. The community of users and the data ecosystem should agree on all information the community as a whole needs in order to use the data, such as license terms and clear descriptions of context and provenance. Making data Reusable also requires people to determine who does what, when, and how. Because these are organizational rather than technical choices, software can only support people in that process.

What technology is available that fits these requirements?

Linked FAIR Data?

The most evident technology standards that tick these boxes are the web standards of the W3C, the dominant standards accepted and managed by the web community and its standards body. In particular, these are the web standards that go beyond communication for mere websites and aim to enable all (types of) data to become FAIR. They are gathered in a technology called Linked Data.

Linked Data is, in essence, a combination of existing and additional agreements on identifying data online, making it retrievable, and allowing both humans and machines to describe and interpret all the meaning necessary for understanding its contents.

By publishing data as Linked Data, many FAIR boxes are ticked by default: it makes data Findable, Accessible, Interoperable, and, when done right, Reusable. It does require software developers either to translate and transform their proprietary data into Linked Data or, increasingly, to take FAIRness into account from the start.
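To make that translation step concrete, here is a minimal sketch (using hypothetical example namespaces, not a full RDF library) of turning a proprietary record into Linked Data triples in the W3C N-Triples form, where the subject and predicates are web identifiers and the values are literals:

```python
# A proprietary, application-specific record.
record = {"id": "42", "name": "Bridge B-17", "builtIn": "1978"}

# Hypothetical namespaces; in practice these would point to shared,
# published vocabularies so other applications can interpret the terms.
DATA = "https://example.org/asset/"
VOCAB = "https://example.org/vocab/"

def to_ntriples(rec):
    """Translate a flat record into N-Triples: one '<s> <p> "o" .' line per field."""
    subject = f"<{DATA}{rec['id']}>"
    lines = []
    for key, value in rec.items():
        if key == "id":
            continue  # the id becomes the subject, not a property
        lines.append(f'{subject} <{VOCAB}{key}> "{value}" .')
    return "\n".join(lines)

print(to_ntriples(record))
```

Every term in the output resolves to a web address, so a receiving application can look up what ‘name’ or ‘builtIn’ means instead of guessing, which is exactly the common ground that unFAIR data lacks.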

By using Laces, users are not burdened with the complexity of implementing FAIR Principles but are enabled to manage, publish, and share FAIR data from scratch using off-the-shelf tooling.

Curious about the Laces solutions? Schedule a free demo with one of our experts. It will take about 45 minutes, no strings attached.


