SciTransfer
PROCESS · Project

Ready-Made Software to Process Massive Datasets Too Big for Current Systems

Digital · Tested · TRL 6

Imagine your company collects so much data that no single computer — or even a cluster of them — can handle it fast enough. That's the "exascale" problem. PROCESS built open-source software that lets organizations crunch enormous datasets by orchestrating computing power across multiple data centres, almost like a traffic controller for supercomputers. They tested it on real problems: analyzing medical records, optimizing airline ticket pricing, and mapping global disaster risks.
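The "traffic controller" idea above can be sketched in miniature: a coordinator splits a dataset into chunks, farms them out to independent workers (standing in for separate data centres), and merges the partial results. This is a purely illustrative sketch, not the actual PROCESS API; the chunking scheme and worker function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    # Stand-in for the real analysis a single compute site would run.
    return sum(chunk)

def orchestrate(dataset, n_sites=4):
    # Split the dataset into one chunk per "site", dispatch the chunks
    # in parallel, then combine the partial results at the coordinator.
    chunks = [dataset[i::n_sites] for i in range(n_sites)]
    with ThreadPoolExecutor(max_workers=n_sites) as pool:
        partials = pool.map(process_chunk, chunks)
    return sum(partials)

if __name__ == "__main__":
    data = list(range(1_000_000))
    # The orchestrated result matches a single-machine computation.
    print(orchestrate(data) == sum(data))
```

In the real system the "sites" are remote HPC or cloud resources and the hard problems are data movement, scheduling, and fault tolerance rather than the merge step, but the coordinator/worker shape is the same.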

By the numbers
5
large-scale data service prototypes built and validated
EUR 2,972,250
EU funding invested in development
8
consortium partners across 6 countries
26
total project deliverables produced
3
industry use cases tested (medical, airline, disaster risk)
The business problem

What needed solving

Companies in healthcare, aviation, and risk analytics are drowning in data that exceeds the capacity of their current computing infrastructure. Processing terabytes or petabytes of information — patient records, flight booking patterns, satellite disaster imagery — requires orchestrating resources across multiple supercomputing centres, which is technically complex and expensive to build from scratch.

The solution

What was built

The project built 5 exascale data service prototypes, progressing from a first prototype through alpha and beta releases to a first packaged software version. Deliverables include a data orchestration system, an integrated user interface portal, and benchmark workflows — all open source and tested on real-world use cases in medical informatics, airline revenue management, and disaster risk reduction.

Audience

Who needs this

Airlines and travel platforms running large-scale revenue optimization
Hospital networks and health data analytics companies processing massive patient datasets
Reinsurance and catastrophe modelling firms working with global disaster data
Research data centres needing to orchestrate exascale workloads
Cloud service providers looking to offer managed big-data-as-a-service
Business applications

Who can put this to work

Airlines & Travel
enterprise
Target: Airlines and revenue management providers

If you are an airline or travel platform dealing with massive booking and pricing datasets that overwhelm your current analytics — this project developed exascale data services tested specifically on airline revenue management. The open-source tools orchestrate processing across distributed computing resources, letting you run pricing optimization models on datasets too large for conventional systems.

Healthcare & Medical Informatics
enterprise
Target: Hospital networks and health data analytics firms

If you are a healthcare organization struggling to process large volumes of patient records, imaging data, or genomic datasets — this project built and validated data service prototypes using medical informatics as one of its 5 real-world pilot applications. The packaged software release handles data orchestration across high-performance computing centres without requiring deep HPC expertise.

Insurance & Disaster Risk
enterprise
Target: Reinsurance companies and catastrophe modelling firms

If you are a reinsurance or risk analytics company needing to process global disaster data for risk modelling — this project specifically tested its exascale data platform on open data for global disaster risk reduction. The 8-partner consortium across 6 countries built tools that integrate heterogeneous data sources at scales current solutions cannot handle.

Frequently asked

Quick answers

What would this cost us to adopt?

PROCESS used an open-source strategy, meaning the software releases are freely available. Your costs would be infrastructure (HPC or cloud computing resources) and integration effort. The project invested EUR 2,972,250 in EU funding across 8 partners to develop the platform over 3 years.

Can this handle our data volumes at industrial scale?

Yes — the entire point was exascale computing: processing datasets at scales beyond the capacity of a single machine or cluster. The project delivered 5 large-scale data service prototypes validated in real-world settings, including industry pilot deployments. Use cases ranged from medical data to airline revenue optimization and disaster risk mapping.

What about IP and licensing?

The project explicitly chose an open-source strategy to maximise uptake and reuse. This means you can adopt and modify the software without licensing fees. Check the project repository and deliverables for specific license terms (typically Apache or similar for EU-funded open-source projects).

How hard is it to integrate with our existing systems?

PROCESS was designed with a modular architecture and mature software engineering practices specifically to minimise setup and maintenance effort. The project delivered a user interface portal and data orchestration tools to ease the learning curve for the broadest possible range of users.

Is this still maintained after the project ended in 2020?

The project ended in October 2020. Based on available project data, long-term maintenance depends on the consortium partners, particularly Ludwig-Maximilians-Universität München as coordinator. The open-source codebase may have community contributions, but you should verify current activity on the project website.

What's the timeline to get this running?

The project produced alpha and beta releases of the data services, followed by a first packaged version of the PROCESS software. This suggests a deployable product, but integration timelines depend on your infrastructure. The modular design was intended to reduce deployment time.

Consortium

Who built it

The consortium of 8 partners across 6 countries (Germany, Netherlands, Spain, Poland, Switzerland, Slovakia) is heavily academic — 5 universities and 1 research organization versus only 2 industry partners (25% industry ratio), with zero SMEs. This is typical for infrastructure-level computing research. The coordinator, Ludwig-Maximilians-Universität München, is a top German research university. The low industry presence means the technology was validated primarily in research settings, and commercial adoption would benefit from a systems integrator or cloud provider picking up the open-source tools. The geographic spread across Central and Western Europe gives good coverage of HPC centres but limited direct links to commercial end-users.

How to reach the team

The coordinator is Ludwig-Maximilians-Universität München (Germany). Use SciTransfer's coordinator lookup to find the project lead's contact details.

Next steps

Talk to the team behind this work.

Want to explore whether PROCESS tools can solve your big data bottleneck? SciTransfer can connect you directly with the development team and help assess fit for your use case.