If you are a financial institution dealing with the challenge of selecting the right data integration and processing tools for compliance reporting — this project developed an open benchmarking platform tested on 1PB of real industry data across 4 domains. Instead of relying on vendor claims, you can run standardized tests to compare how well different tools handle your data volumes before committing to expensive licenses.
Open Benchmarking Platform That Tests How Well Your Big Data Tools Actually Perform
Imagine you're buying a car but there's no crash-test rating, no fuel-economy sticker — you just have to trust the dealer. That's what companies face when choosing big data processing software: no independent, standardized way to compare tools. HOBBIT built an open benchmarking platform — like a Consumer Reports for linked-data technology — where vendors and buyers can run real-world tests on 1 petabyte of industry data from 4 different sectors. The results are public, machine-readable, and repeatable, so you actually know what you're paying for.
What needed solving
Companies investing in big data processing tools have no independent, standardized way to compare performance before buying. Vendor benchmarks are self-serving, and running your own tests is expensive and inconsistent. This means procurement decisions worth hundreds of thousands of euros are based on marketing claims rather than verified performance data.
What was built
The project built an open, cloud-based benchmarking platform for testing big linked data tools across the entire data processing lifecycle. Deliverable D2.2.1 provided the first working version of the HOBBIT Platform, together with a user manual for integrating new benchmarks, tested on 1PB of real industry data from 4 domains. In total, 52 deliverables were produced.
Who can put this to work
If you are an e-commerce company struggling with product data quality across suppliers and catalogs — this project built modular benchmarks that test every step of the big linked data lifecycle, from ingestion to querying. You can evaluate which data-linking software actually performs at scale with your real data patterns, rather than discovering bottlenecks after deployment.
If you are a health-data platform integrating patient records, clinical trials, and research datasets from multiple sources — this project created cloud-based evaluation infrastructure that benchmarks data processing tools under realistic conditions. With 10 consortium partners from 6 countries validating the platform, you get vendor-neutral performance data to guide procurement decisions.
Quick answers
How much would it cost to use the HOBBIT benchmarking platform?
The HOBBIT platform was designed as open and publicly available. The project's exit strategy involved creating a membership association sustained by subscriptions from industry and academia. Based on available project data, specific pricing was not published, but the open-source nature means the core platform can be accessed without license fees.
Can this handle data at the scale our company needs?
The platform was built and tested on approximately 1 petabyte of real industry-relevant data from 4 different domains. The architecture relies on cloud infrastructure specifically to ensure scalability. This is industrial-grade volume, not a lab demo.
Who owns the intellectual property and can we license it?
HOBBIT was funded as an EU Research and Innovation Action (RIA) with €3,718,250 in public funding. The platform and benchmarks were designed to be open and publicly available, with code accessible online. Based on available project data, the IP follows standard EU open-access provisions for publicly funded research.
Is this still actively maintained after the project ended in 2018?
The project's exit strategy was a membership association, to be created after the second project year and sustained by subscriptions. The project website (project-hobbit.eu) was established for ongoing access. Based on available project data, long-term maintenance depends on the association's continued activity.
How difficult is it to integrate our own benchmarks into the platform?
The platform was specifically designed to be modular and easily extensible. Deliverable D2.2.1 includes a user manual for the integration of new benchmarks by third parties. The architecture supports adding custom benchmarks for any step of the data processing lifecycle.
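To make "modular and easily extensible" concrete, the sketch below mimics the lifecycle a benchmark covers: a data generator produces records, a task generator derives queries with known gold answers, the system under test responds, and an evaluation module computes KPIs. All function names here are illustrative assumptions, not the platform's actual API; in the real platform these components run as separate containers wired together by the orchestration layer.

```python
# Illustrative sketch of a HOBBIT-style modular benchmark lifecycle.
# All names are hypothetical; the actual platform runs each component
# in its own container and exchanges data over message queues.

def data_generator(n):
    """Produce synthetic records (stand-in for real domain data)."""
    return [{"id": i, "value": i * 2} for i in range(n)]

def task_generator(records):
    """Turn each record into a lookup task with a known gold answer."""
    return [{"task_id": r["id"], "query": r["id"], "gold": r["value"]}
            for r in records]

def system_under_test(query, store):
    """The tool being benchmarked: here, a trivial key-value lookup."""
    return store.get(query)

def evaluation_module(tasks, answers):
    """Score answers against the gold standard and compute an accuracy KPI."""
    correct = sum(1 for t, a in zip(tasks, answers) if a == t["gold"])
    return {"tasks": len(tasks), "correct": correct,
            "accuracy": correct / len(tasks)}

def run_benchmark(n=100):
    """Wire the four components together for one benchmark run."""
    records = data_generator(n)
    store = {r["id"]: r["value"] for r in records}
    tasks = task_generator(records)
    answers = [system_under_test(t["query"], store) for t in tasks]
    return evaluation_module(tasks, answers)

result = run_benchmark()
print(result)  # → {'tasks': 100, 'correct': 100, 'accuracy': 1.0}
```

Because each stage only depends on the previous stage's output, a third party can swap in its own data generator or evaluation module without touching the rest, which is the extensibility property the deliverable's user manual addresses.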
Which industries has this been validated in?
The project assembled real industry-relevant data from 4 different domains at launch, with plans to extend through collaborations. The consortium included 5 industry partners and 5 research organizations across 6 countries. Based on available project data, specific domain names were not listed in the objective summary.
Are there compliance or regulatory benefits to using standardized benchmarks?
The platform produces human- and machine-readable public periodic reports, creating auditable performance records. For industries facing data-processing regulations, having independent benchmark results provides documented evidence of tool capabilities. Based on available project data, no specific regulatory certifications were mentioned.
Who built it
The HOBBIT consortium brought together 10 partners from 6 countries (Belgium, Switzerland, Germany, Greece, Poland, UK), evenly split between 5 industry partners and 5 research organizations. The coordinator, INFAI in Germany, is a recognized applied informatics institute. Having 5 industry partners signals that real-world data needs drove the platform design, not just academic curiosity. However, only 1 partner is classified as an SME, so the consortium leaned toward established organizations; that is important context for any SME considering adoption, since the tool was primarily shaped by larger players' data challenges.
- INSTITUT FUR ANGEWANDTE INFORMATIK (INFAI) EV · Coordinator · DE
- INTERUNIVERSITAIR MICRO-ELECTRONICA CENTRUM · participant · BE
- IDRYMA TECHNOLOGIAS KAI EREVNAS · participant · EL
- NATIONAL CENTER FOR SCIENTIFIC RESEARCH "DEMOKRITOS" · participant · EL
- AGT GROUP (R&D) GMBH · participant · DE
INFAI (Institut für Angewandte Informatik) in Germany coordinated the project. Use SciTransfer's coordinator lookup service to get the right contact.
Talk to the team behind this work.
Want to know if HOBBIT benchmarks are relevant to your data stack? SciTransfer can assess fit and arrange a direct introduction to the research team.