Protein Repeat Analysis Tools for Faster Drug Target Discovery and Biotech Design
If you are a pharma company spending months identifying viable drug targets: this project developed software for automatic tandem repeat protein annotation and homology modeling, plus database APIs that cross-link into InterPro and UniProt. Your target discovery pipeline can therefore screen repeat protein families faster and under a unified classification, reducing redundant analysis across internal teams.
Many proteins in our bodies are built from repeating building blocks, like beads on a string. Understanding these repeating patterns helps scientists figure out what a protein does and what goes wrong in diseases. REFRACT brought together 16 teams across 12 countries to build better software that identifies, classifies, and models these protein repeats. Think of it as building a search engine and quality-control toolkit specifically for one of the trickiest parts of protein biology.
What needed solving
Companies in pharma, biotech, and diagnostics spend significant time and resources characterizing repeat proteins — a class of proteins involved in many diseases and biotechnology applications. Existing tools are fragmented, use different classification systems, and lack benchmarking, making it hard to compare results or build reliable automated pipelines.
What was built
The project built 9 specialized software tools including automatic protein repeat annotation, homology modeling, sequence and structure alignment, and homolog search. They also deployed database APIs, curated Pfam models, cross-linked data into InterPro, and created a benchmarking platform to compare detection methods.
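For teams wondering what comparing detection methods involves in practice, the idea can be sketched in a few lines of Python. This is a minimal illustration and not REFRACT's benchmarking platform or its actual metrics: the function names, the (start, end) region format, and the example data are all hypothetical. It scores each detector residue by residue against a reference annotation, using standard precision, recall, and F1.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical region format: 1-based, inclusive (start, end) residue positions.
Region = Tuple[int, int]


def residues(regions: Iterable[Region]) -> Set[int]:
    """Expand (start, end) regions into the set of residue positions they cover."""
    covered: Set[int] = set()
    for start, end in regions:
        covered.update(range(start, end + 1))
    return covered


def residue_scores(predicted: Iterable[Region], reference: Iterable[Region]) -> Dict[str, float]:
    """Per-residue precision, recall, and F1 of predicted regions vs. a reference."""
    pred, ref = residues(predicted), residues(reference)
    true_pos = len(pred & ref)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up data: one reference annotation and two hypothetical detectors.
reference = [(10, 40), (60, 90)]
detector_a = [(10, 40)]            # finds only the first repeat region
detector_b = [(5, 45), (60, 90)]   # finds both, but over-extends the first

scores_a = residue_scores(detector_a, reference)
scores_b = residue_scores(detector_b, reference)

for name, scores in (("detector_a", scores_a), ("detector_b", scores_b)):
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

In this toy case the first detector is precise but misses half of the annotated residues, while the second finds everything at the cost of some over-prediction. Putting every method on the same reference and the same metrics is what makes results comparable across tools.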
Who can put this to work
If you are a biotech company engineering repeat proteins for therapeutic or industrial use: this project built alignment tools for both sequence- and structure-based comparison of tandem repeat proteins, along with benchmarking metrics across detection methods. These tools let you evaluate engineered repeat protein variants against known families with higher confidence, cutting trial and error in the design cycle.
If you are a bioinformatics platform provider looking to strengthen your protein annotation capabilities: this project delivered open database APIs, curated Pfam hidden Markov models for repeat proteins, and a consensus benchmarking platform as part of its 9 specialized software deliverables. Integrating these into your platform adds a specialized module your competitors likely lack.
Quick answers
What would it cost to access these tools and databases?
The project outputs — databases, APIs, and software — were developed under a publicly funded MSCA-RISE grant (EUR 2,226,400) and integrated into open resources like InterPro and Pfam. Most tools are likely freely available for academic use, but commercial licensing terms would need to be discussed directly with the consortium coordinator at the University of Padova.
Can these tools work at industrial scale for large protein libraries?
The project delivered database APIs and software for automatic annotation, homology modeling, and sequence database searching, suggesting batch processing capability. However, the consortium had only 1 industry partner out of 16, so large-scale industrial validation may be limited. Performance benchmarking data is available through their dedicated benchmarking platform.
What is the IP situation — can we license this?
With 11 university and 4 research partners, IP is likely held by academic institutions. The project explicitly aimed for propagation into ELIXIR core data resources and possible industrial exploitation, suggesting openness to licensing. Contact the coordinator at the University of Padova for specific terms.
How mature are these software tools?
The project ran for over 5 years (2019–2024) and produced 23 deliverables, including 9 demonstrated software tools and databases. These include deployed APIs and cross-linked data in established resources like InterPro. The tools are functional research software, not yet packaged as commercial products.
Can these tools integrate with our existing bioinformatics pipeline?
Yes — the project specifically built database APIs and cross-linked data into widely used resources like InterPro and UniProt. The software covers sequence search, alignment, annotation, and modeling, which are standard pipeline components. Integration effort would depend on your existing infrastructure.
Is there regulatory relevance for these protein tools?
Based on available project data, the tools focus on protein characterization and classification rather than direct regulatory compliance. However, better protein annotation supports regulatory submissions for biologics by providing more thorough target characterization and structural understanding.
Who built it
REFRACT is a heavily academic consortium: 11 universities and 4 research organizations with just 1 industry partner (6% industry ratio) and zero SMEs. The 16 partners span 12 countries including 7 EU member states and 9 Latin American institutions, making it geographically diverse but research-oriented. For a business looking to adopt these tools, this means the science is solid and peer-reviewed, but you would be among the first commercial users — there is no established industry adoption pathway yet, and commercial support infrastructure would need to be built from scratch with the academic teams.
- UNIVERSITA DEGLI STUDI DI PADOVA · coordinator · IT
- JOHANNES GUTENBERG-UNIVERSITAT MAINZ · participant · DE
- UNIVERSIDAD NACIONAL DE LA PLATA · partner · AR
- EUROPEAN MOLECULAR BIOLOGY LABORATORY · participant · DE
- AGENCIA ESTATAL CONSEJO SUPERIOR DE INVESTIGACIONES CIENTIFICAS · participant · ES
- PONTIFICIA UNIVERSIDAD CATOLICA DEL PERU · partner · PE
- PONTIFICIA UNIVERSIDAD CATOLICA DE CHILE · partner · CL
- UNIVERSIDAD NACIONAL DE QUILMES · partner · AR
- UNIVERSIDAD PERUANA CAYETANO HEREDIA · partner · PE
- CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE CNRS · participant · FR
- ZURCHER HOCHSCHULE FUR ANGEWANDTE WISSENSCHAFTEN · participant · CH
- UNIVERSIDAD NACIONAL AUTONOMA DE MEXICO (UNAM) · partner · MX
- STOCKHOLMS UNIVERSITET · participant · SE
The coordinator is Universita degli Studi di Padova in Italy. SciTransfer can facilitate an introduction to discuss licensing or collaboration.
Talk to the team behind this work.
Want to explore how REFRACT's protein analysis tools could strengthen your drug discovery or protein engineering pipeline? SciTransfer can connect you with the research team and help evaluate fit for your specific use case.