If you are a genomics company struggling with the growing cost of storing and processing sequencing data, this project developed prototype data structures that compress and index biological sequences more efficiently. With DNA data volumes growing faster than hardware performance, these tools could reduce the compute time and storage needed to deliver genetic test results to patients.
Faster Software Tools for Processing Massive Genomic and Search Data
Imagine trying to read every book in a library in one afternoon. That's roughly what computers face when processing DNA data from modern sequencing machines. The data is growing faster than computers are speeding up, so the old ways of searching and organizing it can't keep up. BIRDS brought together experts in bioinformatics and information retrieval (the technology behind internet search) from 4 continents to design smarter ways of compressing, indexing, and searching enormous biological datasets. Think of it as inventing a better filing system so you can find any patient's genetic info in seconds instead of hours.
What needed solving
Companies processing large-scale genomic or biological data face a growing gap: DNA sequencing technology produces data faster than computer hardware can keep up with. Storage costs balloon, analysis takes too long, and the promise of personalized medicine stalls because the infrastructure can't handle the data volumes efficiently.
What was built
The project produced integrated research prototypes (work package 5, WP5) that combine bioinformatics and information retrieval techniques for compressing, indexing, and searching biological data. Across 9 deliverables in total, the team developed new data structures and algorithms designed to handle genome-scale datasets more efficiently than existing approaches.
Who needs this
Who can put this to work
If you are a pharma R&D team spending weeks running genome-scale analyses for drug target identification, this project built integrated research prototypes combining bioinformatics and information retrieval techniques. Faster data processing means shorter discovery cycles and lower compute infrastructure costs for your pipeline.
If you are a software company dealing with indexing and searching massive repetitive data collections, this project's algorithms for compressing and navigating large-scale data were designed for exactly this challenge. The techniques developed across 8 partner institutions in 6 countries could be adapted to improve search performance in your products.
Quick answers
What would it cost to license or use this technology?
BIRDS was funded under MSCA-RISE with EUR 648,000, primarily for researcher exchanges and knowledge sharing. The project objective mentions cooperation with an SME software development company to bring results to market, but specific licensing terms or pricing are not available in the project data. You would need to contact the coordinator at Universidade da Coruña to discuss commercial terms.
Can this work at industrial scale with real production data?
The project produced integrated research prototypes (a WP5 deliverable), but there is no evidence of industrial-scale deployment or stress testing with production-level data volumes. The algorithms were designed to handle genome-scale datasets, but moving from research prototypes to production systems would require further engineering and validation.
Who owns the intellectual property?
IP from MSCA-RISE projects is typically shared among the consortium partners under their grant agreement. With 8 partners across 6 countries (Spain, Portugal, Finland, Japan, Chile, and Australia), IP arrangements may be complex. The coordinating university (Universidade da Coruña) would be the first point of contact for licensing discussions.
Is this still being developed or has the project ended?
BIRDS ran from 2016 to 2019 and is now closed. However, the international research network it established may still be active, and the algorithms and prototypes developed could be available for further development or commercialization. The project website (birdsproject.eu) may have updates on post-project activities.
How hard would it be to integrate these tools into our existing systems?
Based on available project data, the outputs are research prototypes rather than plug-and-play software products. Integration would likely require collaboration with the research teams to adapt the data structures and algorithms to your specific use case. The project's EuroSciVoc tags indicate work on software development, DNA, genomes, and proteins.
Were any industry partners involved in testing?
The consortium included 1 industry partner out of 8 total partners. The objective specifically mentions cooperation with an SME software development company to bring research results to market, but detailed results of that industry collaboration are not available in the project data.
Who built it
The BIRDS consortium is academic-heavy: 6 out of 8 partners are universities, with just 1 industry partner and 1 research organization, giving a low industry ratio of 12.5%. The geographic spread is impressive, with 6 countries across 4 continents (Australia, Chile, Spain, Finland, Japan, Portugal), but this reflects the networking nature of MSCA-RISE rather than a market-driven partnership. No SMEs are listed among the partners despite the objective mentioning SME cooperation. For a business buyer, this means the technology is firmly in academic hands and would need significant commercial development effort to become a usable product.
- UNIVERSIDADE DA CORUNA · Coordinator · ES
- HELSINGIN YLIOPISTO · participant · FI
- KOKURITSU DAIGAKU HOJIN KYUSHU DAIGAKU · partner · JP
- UNIVERSITY OF MELBOURNE · partner · AU
- UNIVERSIDAD DE CHILE · partner · CL
- INESC ID - INSTITUTO DE ENGENHARIA DE SISTEMAS E COMPUTADORES, INVESTIGACAO E DESENVOLVIMENTO EM LISBOA · participant · PT
- UNIVERSIDAD DE CONCEPCION · partner · CL
Universidade da Coruña (Spain) — contact their computer science or bioinformatics department for licensing or collaboration inquiries
Talk to the team behind this work.
Want to explore whether BIRDS data compression algorithms could cut your genomic data processing costs? SciTransfer can arrange a direct introduction to the research team and help assess fit for your use case.