Data User Guide | EHDS4all Framework

You want to use health data but are unsure where to start or which ideas have the greatest potential. This guide supports you throughout the process. It follows a structured framework that leads you from the exploration of available health data to the prioritization of promising ideas and use cases. By following this framework, you can systematically identify relevant data sources, explore potential opportunities, and assess which concepts create the most value so that health data can be used in a meaningful and responsible way.

The guide also provides knowledge, methods and practical tools that help you explore opportunities, develop ideas and make informed decisions about using health data responsibly and with impact.

What is the European Health Data Space (EHDS)?

The European Health Data Space (EHDS) is a new EU regulation (Regulation (EU) 2025/327) that changes how you can access and work with electronic health data across Europe. While it introduces new governance structures and technical standards, it also opens up new opportunities to discover, access and use health data in a structured and legally secure way. This creates new possibilities for developing data-driven research projects, innovative services and new forms of collaboration across institutions and countries.

Through the EHDS, you can discover and access health data more transparently across countries and institutions. Instead of navigating fragmented national systems, you will increasingly interact with structured access mechanisms and federated infrastructures that enable cross-border data use under clearly defined conditions.

At the same time, the EHDS maintains strong safeguards for privacy and security by requiring controlled access procedures and secure processing environments.

For an official overview of the regulation and its implementation timeline, see: https://health.ec.europa.eu/ehealth-digital-health-and-care/european-health-data-space-regulation-ehds_en

In practice, the EHDS changes how you discover, access and analyze health data. Instead of relying on isolated data sources or informal data sharing arrangements, you will increasingly work within structured governance frameworks and controlled data environments.

This creates new opportunities to develop data-driven research projects, services and collaborations based on health data. At the same time, working with such data requires responsible handling, including compliance with established data protection principles and the use of secure environments for data analysis.

Navigating this emerging landscape can be challenging. The EHDS4All framework helps you understand how to explore available data sources, develop meaningful data-driven initiatives and reflect on both the opportunities and responsibilities that come with working with health data.

Understanding the EHDS helps you recognize where real opportunities for working with health data exist and how you can realistically develop data-driven initiatives. By understanding how health data can be accessed and used within the emerging European data landscape, you can better identify feasible data sources, anticipate access requirements and design initiatives that align with existing rules and infrastructures.

Within the EHDS4All framework, the EHDS is therefore not treated as a separate compliance topic. Instead, it serves as an underlying reference point that helps you explore the health data landscape, develop meaningful concepts and make informed decisions throughout the framework journey.

How the framework is structured...

The EHDS4All framework guides you through a structured development journey consisting of five consecutive phases. Each phase helps you move from an initial idea towards a more concrete, reflected and prioritized data-driven initiative. Along the way, you explore relevant health data, generate use cases, specify promising ideas, assess effort and value, and identify which initiatives are worth pursuing further.

In the first phase, you build an overview of the health data landscape relevant to your organization. You identify and compare potential data sources, reflect on what types of health data may be relevant for your context and examine how existing data availability could open up new directions for action. This helps you move from a vague interest in health data to a more informed understanding of which data resources may be worth considering further.

In the second phase, you build on this overview and develop initial use case ideas. You connect observed organizational challenges, needs or opportunities with the data sources identified in the previous phase and translate them into potential data-driven initiatives. Structured ideation methods help you broaden your perspective, combine different impulses and formulate use cases that are specific enough to be developed further.

In the specification phase, you work out the most promising use case ideas in more detail. You clarify what the initiative is meant to achieve, who it is relevant for, what role the data plays and what form the initiative could take, for example as a study, a data-driven service or a data product. This phase helps you turn broad use case ideas into more concrete concepts with a clearer purpose and application context.

In the fourth phase, once concepts have been specified, you assess them in a more structured way. You reflect on implementation effort from different angles, such as data-related requirements, technical development and organizational implications, and you consider the potential value the initiative could create. This helps you understand not only what seems attractive in principle, but also what appears realistic and worthwhile in practice.

In the final phase, you compare the developed concepts and decide which ones should be pursued further. You bring together your reflections on feasibility, value and strategic relevance to create a transparent basis for decision making. This helps you identify the initiatives that offer the strongest fit for your organization and provides a clear starting point for next steps.
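
The comparison in this final phase can be sketched as a simple weighted scoring. The following Python example is only a minimal illustration and not part of the EHDS4All framework itself: the criterion weights, the 1 to 5 scale, and the concept names and scores are hypothetical assumptions chosen to show the mechanics.

```python
from dataclasses import dataclass

# Hypothetical weights for the three criteria named in the text; they sum to 1.
WEIGHTS = {"feasibility": 0.4, "value": 0.4, "strategic_relevance": 0.2}


@dataclass
class Concept:
    name: str
    scores: dict  # criterion -> score on an assumed scale from 1 (low) to 5 (high)


def weighted_score(concept: Concept) -> float:
    """Return the weighted sum of a concept's criterion scores."""
    return sum(WEIGHTS[c] * concept.scores[c] for c in WEIGHTS)


def prioritize(concepts: list) -> list:
    """Sort concepts from highest to lowest weighted score."""
    return sorted(concepts, key=weighted_score, reverse=True)


# Illustrative concepts and scores (invented for this example).
concepts = [
    Concept("Readmission study",
            {"feasibility": 4, "value": 3, "strategic_relevance": 2}),
    Concept("Scheduling service",
            {"feasibility": 2, "value": 5, "strategic_relevance": 4}),
]

for c in prioritize(concepts):
    print(f"{c.name}: {weighted_score(c):.2f}")
```

A score of this kind does not replace the discussion among decision makers. It mainly makes the assumptions behind a ranking visible, so that changing a weight shows directly how the order of concepts shifts.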

Conceptual framing and EHDS alignment

The EHDS4All framework helps you turn insights from research on data ecosystems, data governance and data-driven innovation into concrete health data initiatives. At the same time, it helps you work within the European Health Data Space by integrating key EHDS principles such as structured data access and secure data processing directly into how you explore data, develop use cases and assess implementation options.

What can be achieved by using the framework

By working through the EHDS4All framework, you develop a structured understanding of how to design and develop data-driven initiatives within your organizational context. The framework enables you to…

The framework does not provide predefined solutions. Instead, it offers a transparent development journey that helps users understand where they stand, what they can achieve in each phase and how to move from early exploration towards well-founded, EHDS-aligned initiatives.

Methods & Analytical Tools

01. Clarifying the Purpose of Data Use

Before assessing data, you should first clarify why you want to use the data. Different usage logics lead to different expectations regarding value creation, data quality and governance requirements. If you do not clarify the intended purpose early on, you risk applying inappropriate evaluation criteria.

Research-Oriented Use
If you want to use health data for scientific studies, the main value lies in generating reliable knowledge and answering well-defined research questions. To achieve this, you need datasets that support methodological robustness. This includes completeness, internal validity, consistent variable definitions and transparent documentation of data provenance. Routine health data often originate from operational systems and were not designed for research purposes, which requires careful quality assessment (Hersh et al., 2013). From a governance perspective, you must ensure that research use complies with regulatory requirements for secondary use and secure processing environments (Kahn et al., 2016).
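
A first quality check of this kind can be sketched in a few lines of Python. The example below computes completeness per variable, one of the quality aspects named above. It is a minimal sketch under stated assumptions: the function name, the variable names and the sample records are hypothetical, and treating `None` and empty strings as missing is a simplification that real datasets may not follow.

```python
def completeness(records, variables):
    """Return, per variable, the share of records with a non-missing value.

    Assumption: a value counts as missing if it is None or an empty string.
    """
    total = len(records)
    if total == 0:
        return {v: 0.0 for v in variables}
    return {
        v: sum(1 for r in records if r.get(v) not in (None, "")) / total
        for v in variables
    }


# Invented sample records for illustration only.
records = [
    {"age": 54, "diagnosis_code": "I10", "hba1c": 6.1},
    {"age": 61, "diagnosis_code": "", "hba1c": None},
    {"age": None, "diagnosis_code": "E11", "hba1c": 7.4},
    {"age": 47, "diagnosis_code": "E11", "hba1c": None},
]

report = completeness(records, ["age", "diagnosis_code", "hba1c"])
for variable, share in report.items():
    print(f"{variable}: {share:.0%} complete")
```

Completeness is only one dimension. Internal validity and consistent variable definitions usually require domain-specific checks that cannot be reduced to a single ratio.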

Model Training and Advanced Analytics
If you plan to train predictive models or develop AI-based solutions, the primary value lies in improving analytical performance and generating scalable insights. This requires large, well-structured datasets with sufficient granularity, standardization and representativeness. Fragmented or poorly harmonized data limit model performance and reduce generalizability. You therefore need to assess whether the available data support robust modeling and whether interoperability standards and metadata enable reproducibility and long-term usability (Rajkomar et al., 2018). Governance considerations include responsible data use, transparency of model development and compliance with regulatory requirements for AI systems.

Service or Product Development
If you want to develop digital services, clinical decision tools or operational optimization solutions, the main value lies in improving processes or creating new data-driven services. In this context, you should evaluate whether the data support real operational use. Relevant quality aspects include interoperability with existing systems, update frequency and integration into organizational workflows. From a governance perspective, you must ensure that data access, maintenance and long-term availability remain sustainable and compliant with existing regulatory requirements (Batini & Scannapieco, 2016).

Data Product or Intermediary Models
If you aim to monetize data or act as a data intermediary, the main value lies in creating economic opportunities through controlled data access and trusted data exchange. You therefore need to assess whether your data assets provide distinctive value for external actors and whether they can support scalable data services or partnerships. Data quality requirements often focus on reliability, standardization and trustworthiness rather than purely analytical precision. Governance implications become particularly important in this scenario, as you must establish clear access rules, contractual frameworks and trusted mechanisms for sharing data across organizational boundaries (Jones & Tonetti, 2020).

Clarifying the intended use logic helps you align value expectations, quality requirements and governance implications before you begin evaluating specific datasets.
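
The four use logics described above can be summarized as a small lookup structure. The following Python sketch condenses the value focus, quality aspects and governance points from the text. The dictionary keys and the function name are hypothetical labels introduced for this illustration and are not terms defined by the framework.

```python
# Condensed summary of the four use logics; keys are hypothetical labels.
USE_LOGICS = {
    "research": {
        "value": "reliable knowledge for well-defined research questions",
        "quality": ["completeness", "internal validity",
                    "consistent variable definitions", "documented provenance"],
        "governance": ["secondary-use compliance",
                       "secure processing environment"],
    },
    "model_training": {
        "value": "analytical performance and scalable insights",
        "quality": ["granularity", "standardization", "representativeness"],
        "governance": ["responsible data use", "transparent model development",
                       "AI regulation compliance"],
    },
    "service_or_product": {
        "value": "improved processes or new data-driven services",
        "quality": ["interoperability", "update frequency",
                    "workflow integration"],
        "governance": ["sustainable access", "maintenance",
                       "long-term availability"],
    },
    "data_product_or_intermediary": {
        "value": "economic opportunities through controlled access",
        "quality": ["reliability", "standardization", "trustworthiness"],
        "governance": ["access rules", "contractual frameworks",
                       "trusted sharing mechanisms"],
    },
}


def evaluation_checklist(use_logic: str) -> list:
    """Return the quality and governance points to review for one use logic."""
    profile = USE_LOGICS[use_logic]
    return ([f"Quality: {item}" for item in profile["quality"]]
            + [f"Governance: {item}" for item in profile["governance"]])


for line in evaluation_checklist("research"):
    print(line)
```

Writing the use logic down in this form before looking at specific datasets makes it harder to apply criteria that belong to a different purpose.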

02. Clarifying the Purpose of Evaluation

Even when the use logic is clear, you should also clarify why you are evaluating a dataset. Different evaluation purposes lead to different conclusions about the same data. For example, a dataset that offers little short-term revenue potential may still hold strategic or generative value for your organization.

Evaluating Revenue or Market Potential
If you treat data as economic assets, you should first consider whether the dataset can generate external value. This includes assessing market demand, competitive differentiation and the ability to scale usage. In data ecosystems, value often arises from controlled access, trusted governance and network effects rather than from the intrinsic production cost of the data itself. At the same time, you may need to consider potential access fees or data acquisition costs if data holders charge for providing access. The key question therefore becomes: Can these data generate sustainable economic value under the existing regulatory and governance conditions?

Evaluating Strategic Impact
Even if immediate monetization is not the primary goal, datasets may still strengthen your organization’s long-term position. Data can improve analytical capabilities, support better decision making or enable collaboration across organizational boundaries. Structured data governance and systematic data use can therefore contribute to organizational maturity and strategic resilience (Otto, 2011). The key question is: How can these data strengthen our long-term capabilities and organizational positioning?

Evaluating Generative Potential
In data space environments, an additional dimension becomes important: generativity. Generativity describes the ability of digital resources to enable ongoing recombination, adaptation and unexpected innovation (Yoo et al., 2010). Some datasets create value not only through their immediate use but through their potential to combine with other data sources or support future services and collaborations. You should therefore consider whether the dataset enables new forms of innovation, partnerships or ecosystem participation over time.

Evaluating Expected Costs
Finally, you should assess the effort required to work with the data. Within the EHDS context, data access pathways, documentation requirements and the use of secure processing environments influence implementation costs (European Commission, 2025). In practice, this may include harmonization effort, compliance requirements, technical integration and ongoing maintenance. A realistic assessment of these factors helps you estimate the organizational and economic effort required to use the dataset effectively.
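
A rough cost estimate can be sketched by separating one-off effort from recurring effort. The Python example below is a minimal illustration: the split into one-off and yearly items, the function name and all figures are hypothetical assumptions, and a real estimate would depend on the access conditions that apply to the specific dataset.

```python
def estimate_total_cost(one_off: dict, yearly: dict, years: int) -> float:
    """Sum one-off costs plus recurring yearly costs over the given horizon."""
    if years < 0:
        raise ValueError("years must not be negative")
    return sum(one_off.values()) + years * sum(yearly.values())


# Hypothetical figures in EUR, invented for illustration only.
one_off = {"harmonization": 20_000, "technical_integration": 15_000}
yearly = {"compliance": 5_000, "maintenance": 8_000, "access_fees": 3_000}

total = estimate_total_cost(one_off, yearly, years=3)
print(f"Estimated cost over 3 years: {total:,} EUR")
```

Keeping the cost items separate, rather than entering a single total, makes it easier to see which factor dominates and where a different access pathway might reduce effort.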

References

Batini, C., & Scannapieco, M. (2016). Data and information quality: Dimensions, principles and techniques. Springer.

European Commission. (2025). European Health Data Space regulation. Publications Office of the European Union.

Hersh, W. R., Weiner, M. G., Embi, P. J., Logan, J. R., Payne, P. R. O., Bernstam, E. V., Lehmann, H. P., Hripcsak, G., Hartzog, T. H., Cimino, J. J., Saltz, J. H., & AMIA EHR 2020 Task Force. (2013). Caveats for the use of operational electronic health record data in comparative effectiveness research. Medical Care, 51(8 Suppl. 3), S30–S37.

Jones, C. I., & Tonetti, C. (2020). Nonrivalry and the economics of data. American Economic Review, 110(9), 2819–2858.

Kahn, M. G., Callahan, T. J., Barnard, J., Bauck, A. E., Brown, J., Davidson, B. N., Estiri, H., Goerg, C., Holve, E., Johnson, S. G., Liaw, S. T., Hamilton Lopez, M., Meeker, D., Ong, T. C., Ryan, P. B., Shang, N., Weiskopf, N. G., Weng, C., Zozus, M. N., & Schilling, L. (2016). A harmonized data quality assessment terminology and framework for the secondary use of electronic health record data. eGEMs, 4(1), 1244.

Otto, B. (2011). Organizing data governance: Findings from the telecommunications industry and consequences for large service providers. Communications of the Association for Information Systems, 29, Article 3.

Rajkomar, A., Dean, J., & Kohane, I. (2018). Machine learning in medicine. New England Journal of Medicine, 380(14), 1347–1358.

Yoo, Y., Henfridsson, O., & Lyytinen, K. (2010). The new organizing logic of digital innovation: An agenda for information systems research. Information Systems Research, 21(4), 724–735.