Getting value from your data shouldn’t be that difficult


The potential impact of the ongoing global data explosion continues to excite the imagination. A 2018 report estimated that every person generates an average of 1.7 MB of data per second, every day – and annual data creation has more than doubled since then and is expected to double again by 2025. The McKinsey Global Institute estimates that skillful use of big data could generate an additional $3 trillion in economic activity, enabling applications as varied as self-driving cars, personalized healthcare, and traceable food supply chains.

But all this data flooding into systems also creates confusion about how to find, use, manage, and share it legally, securely, and efficiently. Where did a particular set of data come from? Who owns what? Who is allowed to see certain things? Where does the data reside? Can it be shared? Can it be sold? Can people see how it is being used?

As data applications multiply and become more ubiquitous, the producers, consumers, owners, and stewards of data are finding they have no playbook to follow. Consumers want to connect with data they can trust so they can make the best possible decisions. Producers need tools to share their data securely with the people who need it. But the technology platforms are lacking, and there is no real common source of truth to connect the two sides.

How do we find data? When should we move it?

In a perfect world, data would flow freely, like a utility accessible to all. It could be packaged up and sold like a raw material. It could be viewed easily, without complication, by anyone authorized to see it. Its origins and movements could be traced, removing any worry about malicious use somewhere along the line.

Today’s world, of course, does not work that way. The massive data explosion has created a long list of issues and opportunities, and together they make sharing pieces of information a tricky proposition.

Because data is created almost everywhere, both inside and outside an organization, the first challenge is to identify what is being collected and how to organize it so that it can be found.

A lack of transparency and sovereignty over stored and processed data and infrastructure raises trust issues. Moving data from multiple technology stacks into centralized locations is expensive and inefficient today. The absence of open metadata standards and widely available application programming interfaces can hamper access to, and consumption of, data. The prevalence of industry-specific data ontologies can make it hard for people outside a sector to take advantage of new data sources. And with many stakeholders involved and existing data services difficult to access, sharing is hard without a clear governance model.

Europe is in the lead

Despite the problems, large-scale data-sharing projects are being undertaken. One, backed by the European Union and a non-profit group, is creating an interoperable data exchange called Gaia-X, where businesses can share data under the protection of strict European data-privacy laws. The exchange is designed to be a vehicle for data sharing across industries and an information repository for artificial intelligence (AI), analytics, and Internet of Things data services.

Hewlett Packard Enterprise recently announced a solution framework to support the participation of companies, service providers, and community organizations in Gaia-X. The data space platform, currently in development and built on open and cloud standards, aims to democratize access to data, data analytics, and AI by making them more accessible to domain experts and everyday users. It provides a place where domain experts can more easily identify trustworthy datasets and securely perform analytics on operational data – without always requiring the costly movement of data to centralized locations.

By using this framework to integrate sophisticated data sources into their IT landscapes, businesses will be able to provide data transparency at scale, so that everyone – data scientist or not – knows what data they have, how to access it, and how it is being used, in real time.

Data-sharing initiatives are also high on business agendas. One important priority facing businesses is verifying the data used to train their internal AI and machine-learning models. AI and machine learning are already widely used across enterprises and industries to drive continuous improvement in everything from product development to recruiting to manufacturing. And we’re just getting started. IDC predicts that the global artificial intelligence market will grow from $328 billion in 2021 to $554 billion in 2025.

To unlock the true potential of AI, governments and businesses need to better understand the collective lineage of all the data feeding these models. How do AI models make their decisions? Do they harbor biases? Are they reliable? Have untrusted parties had access to, or altered, the data on which an enterprise trained its model? Connecting data producers to data consumers more transparently and efficiently can help answer some of these questions.
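As a loose illustration of the bookkeeping these questions imply – every name below is hypothetical, not a reference to any particular product’s API – a team might record where a training dataset came from along with a content hash, so it can later check whether the data was altered:

```python
import hashlib
import json
from datetime import datetime, timezone

def fingerprint(path: str) -> str:
    """Compute a SHA-256 hash of a file's contents, reading in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_lineage(path: str, source: str, owner: str) -> dict:
    """Capture where a training dataset came from and what it contained."""
    return {
        "dataset": path,
        "source": source,                    # upstream producer or exchange
        "owner": owner,
        "sha256": fingerprint(path),         # detects later tampering
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

# Toy dataset so the example is self-contained.
with open("train.csv", "w") as f:
    f.write("feature,label\n0.1,0\n0.9,1\n")

# Snapshot the lineage record before training...
record = record_lineage("train.csv", "supplier-feed", "data-science-team")
print(json.dumps(record, indent=2))

# ...and verify the data is unchanged before reusing it.
assert record["sha256"] == fingerprint("train.csv"), "training data was modified"
```

A real lineage system would track far more – transformations, access logs, consent – but even a minimal record like this makes the “who touched what” question answerable.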

Building data maturity

Businesses won’t figure out how to unlock all their data overnight. But they can prepare to take advantage of technologies and management concepts that foster a data-sharing mentality. They can make sure they develop the maturity to consume or share data strategically and efficiently, rather than on an ad hoc basis.

Data producers can prepare for wider dissemination of their data by taking a series of steps. First, they need to understand where their data is and how to collect it. Then they need to make sure the people consuming the data have access to the right data sets at the right times. That is the starting point.
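Pictured as a bare-bones sketch – the class and method names here are hypothetical, not any vendor’s catalog interface – that starting point amounts to recording where each dataset lives and who may consume it:

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    """Minimal record of one dataset: where it lives and who may use it."""
    name: str
    location: str                    # e.g. an object-store URI or table name
    owner: str
    consumers: set[str] = field(default_factory=set)

class DataCatalog:
    """Toy catalog: producers register datasets, consumers look them up."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def grant(self, name: str, consumer: str) -> None:
        self._entries[name].consumers.add(consumer)

    def locate(self, name: str, consumer: str) -> str:
        entry = self._entries[name]
        if consumer not in entry.consumers:
            raise PermissionError(f"{consumer} is not cleared to read {name}")
        return entry.location

catalog = DataCatalog()
catalog.register(CatalogEntry("telemetry-2021", "s3://example-bucket/telemetry", "ops-team"))
catalog.grant("telemetry-2021", "analytics-team")
print(catalog.locate("telemetry-2021", "analytics-team"))
```

Real catalogs layer on search, metadata standards, and audit trails, but the core contract is the same: producers register and grant, consumers discover and connect.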

Then comes the harder part. Once a data producer has consumers – who may be inside or outside its own organization – those consumers need a way to connect to the data. That is both an organizational and a technological challenge. Many organizations want tight control over sharing data with other organizations. Democratizing data – at a minimum, making it findable within organizations – is a matter of organizational maturity. How do you deal with that?

Companies feeding the automotive industry actively share data with suppliers, partners, and subcontractors. It takes a lot of parts – and a lot of coordination – to assemble a car. Partners readily share information on everything from engines to tires to web feeds for repairs. An automotive data space can serve more than 10,000 providers. But other industries may be more insular. Some large companies may not want to share sensitive information even within their own network of business units.

Creating a data mentality

Companies on both sides of the consumer-producer continuum can improve their data-sharing mentality by asking themselves the following strategic questions:

  • If a company is building AI and machine-learning solutions, where do its teams get their data? How do they connect to that data? And how do they track its history to ensure its reliability and provenance?
  • If the data has value for others, what revenue path is the team pursuing today to extend that value, and how will it be managed?
  • If a company already exchanges or monetizes data, could it enable a broader range of services across multiple platforms – on-premises and in the cloud?
  • For organizations that need to share data with providers, how do those providers stay coordinated on the same datasets and updates today?
  • Do producers want to copy their data out, or have consumers bring models to them? Datasets can be so large that they cannot practically be replicated. Should a company instead host software developers on its own platform, where its data already resides, and move models in and out?
  • How can employees in a data-consuming department influence the upstream practices of data producers within their organization?

Taking action

The data revolution is creating business opportunities – along with a great deal of confusion about how to find, collect, manage, and draw insights from data in a strategic way. Data producers and data consumers are increasingly disconnected from one another. HPE is building a platform that spans on-premises environments and public clouds, using open source as a foundation and solutions such as the HPE Ezmeral software platform to provide the common ground both sides need to make the data revolution work for them.

Read the original Enterprise.nxt article.

This content was produced by Hewlett Packard Enterprise. It was not written by MIT Technology Review.


