📊 Project

KI-FOR 5363 DeSBi (Fusing Deep Learning and Statistics towards Understanding Structured Biomedical Data)

Institution: Humboldt-Universität zu Berlin
Category: Project
Website: https://desbi.de/

Short Description

The project develops methods for statistically testing conditional independence on structured data such as images and tabular data. It uses deep learning to embed the data and combines these embeddings with nonparametric tests. The target audience is biomedical researchers who analyze complex multimodal datasets. Universities benefit from improved statistical inference and model validation for high-dimensional data.
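To make the idea of "embed first, then test" concrete, the following is a minimal sketch and not the project's DNCIT implementation. It assumes a fixed random projection as a stand-in for a pretrained deep embedding, and it swaps in a simple linear residual-permutation test for the nonparametric tests the project studies. The function names `embed`, `residualize`, and `cit_pvalue` are hypothetical.

```python
import numpy as np


def embed(images, dim=8, seed=0):
    """Hypothetical stand-in for a deep embedding: a fixed random projection."""
    rng = np.random.default_rng(seed)
    flat = images.reshape(len(images), -1)
    proj = rng.standard_normal((flat.shape[1], dim)) / np.sqrt(flat.shape[1])
    return flat @ proj


def residualize(a, z):
    """Remove the linear effect of the confounders z (plus an intercept) from a."""
    design = np.column_stack([np.ones(len(z)), z])
    coef, *_ = np.linalg.lstsq(design, a, rcond=None)
    return a - design @ coef


def cit_pvalue(x_emb, y, z, n_perm=500, seed=0):
    """Permutation p-value for 'embedding independent of y given z'.

    Assumption: confounding by z is linear, so permuting residuals is
    only approximately valid. This is a simplification for illustration.
    """
    rng = np.random.default_rng(seed)
    rx = residualize(x_emb, z)  # shape (n, dim)
    ry = residualize(y, z)      # shape (n,)

    def stat(r):
        # Sum of squared cross-products between r and each embedding residual.
        return float(np.sum((rx.T @ r) ** 2))

    observed = stat(ry)
    null = np.array([stat(rng.permutation(ry)) for _ in range(n_perm)])
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)
```

In this sketch, `x_emb` would be the output of `embed` applied to an image array of shape `(n, height, width)`, `y` a trait vector of length `n`, and `z` an `(n, k)` matrix of confounders. A small p-value suggests that the image embedding carries information about `y` beyond what `z` explains.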

General Description

Thematic Classification

Subject Areas

Computer Science
Medicine
Statistics
Biology
Machine Learning
Artificial Intelligence
Biomathematics
Computational Biology
Neuroinformatics
Genomics
Image Processing
Biostatistics
Data Science

Research Fields

Deep Learning
Statistics
Medical Imaging
Genomics
Causal Inference
Time Series Analysis
Computational Biology
Explainability of Artificial Intelligence (Explainable AI)
Image Analysis
Structured Biomedical Data
Conditional Independence Tests
Transfer Learning
Uncertainty Quantification
Confounding Control
Genome-Wide Association Studies (GWAS)
Neural Networks
Data Integration (Multimodality)

Specializations

Development of conditional independence tests (CITs) for structured data such as images
Use of deep learning for data embedding in multimodal datasets
Application of statistical tests as key instruments for inference in multimodal datasets
Ensuring control of the type-I error rate in large hypothesis sets
Increasing statistical power through transfer learning and optimally learned embeddings
Development of efficient algorithms and user-friendly software for application in biomedicine
Application to large biomedical datasets such as the UK Biobank
Integration into other projects (P2, P4, P7) for visual explanation and analysis of data

Keywords

P1: Deep conditional independence tests
Conditional Independence Tests
Deep Learning
Image Analysis
Multimodal Data
Statistical Inference
UK Biobank
Transfer Learning
Nonparametric Testing
Data Embedding

Funding

Funding Provider: -
Funding Program: KI-FOR 5363
Funding Reference: KI-FOR 5363
Funding Period: 2023 - 2027
Project Volume: -

Team & Partners

Project Leadership

Prof. Dr. Sonja Greven (Humboldt-Universität zu Berlin)
Prof. Dr. Christoph Lippert (University of Potsdam / Hasso Plattner Institute)

Involved Persons

Marco Simnacher (PhD Candidate)
Hani Park (PhD Candidate)
Xiangnan Xu (Postdoc)
Clara Hoffmann (PhD Student)
Dilyara Bareeva (PhD Student)
Jim Berend (PhD Student)
Lorenz Hufe (PhD Student)
Sahar Iravani (Postdoc)
Masoumeh Javanbakhat (Postdoc)
Georg Keilbar (Postdoc)
Piotr Komorowski (PhD Student)
Wei-Cheng Lai (PhD Candidate)
Gabriel Nobis (PhD Candidate)
Roshan Rane (PhD Candidate)
Moritz Seiler (PhD Candidate)
Manuel Pfeuffer (PhD Candidate)
Paulo Yanez Sarmiento (PhD Candidate)
Hadya Yassin (PhD Candidate)
Claudia Winklmayr (PhD Candidate)
Maximilian Dreyer (PhD Student)
Eshant English (PhD Student)
Maarten Jung (PhD Student)
Marta Lemanczyk (PhD Student)
Alexander Rakowski (PhD Student)
Sepideh Saran (PhD Student)
Juliana Schneider (PhD Student)
Ekkehard Schnoor (Postdoc)

Affiliated Institutions

External Partners

Humboldt-Universität zu Berlin
University of Potsdam
Hasso Plattner Institute (HPI)
Max Delbrück Center for Molecular Medicine (MDC)
Karlsruher Institut für Technologie (KIT)
Charité – Universitätsmedizin Berlin
Fraunhofer Heinrich Hertz Institute (Fraunhofer HHI)
Technische Universität Berlin

Project Contents

Goals

Development of conditional independence tests (CITs) for structured data such as images through the use of deep learning for data embedding
Ensuring Type-I error control in statistical tests for multimodal datasets in the biomedical domain
Improving statistical power through transfer learning, optimally learned embeddings, and tailored CITs for these embeddings
Provision of efficient algorithms and user-friendly software for application to large biomedical datasets such as the UK Biobank
Application of the tests in further projects (P2, P4, P7) for visual explanation and analysis of multimodal data

Work Packages

P1: Deep conditional independence tests with an application to imaging genetics
P2: Visual explanations for statistical tests
P3: Explainable AI for microscopy image analysis
P4: Explainable AI for functional genomics
P5: Sparse and robust explanations for structured data
P6: Uncertainty quantification in biomedical deep learning
P7: Causal inference with multimodal data

Methods

Deep nonparametric conditional independence tests (DNCITs)
Embedding maps for feature representation extraction
Pruned layer-wise relevance propagation (LRP) for sparse explanations
Transfer learning for optimal embeddings
Adversarially learned penalty for feature subspace independence
Metadata-guided Feature Disentanglement (MFD)
Procedurally generated dataset (Arctique) for uncertainty quantification
Online visualization tool with Con-score metric (DeepRepViz)
Virtual inspection layers for time series data
Reactive model correction via conditional bias suppression (R-ClArC)
Gradient penalization in latent space for bias unlearning
Concept Activation Vectors (CAVs) with pattern-based computation
DualView for post-hoc data attribution via surrogate modeling
Regression in quotient metric spaces (e.g., square-root-velocity framework)
Model guidance via explanations to turn classifiers into segmentation models
PURE method for disentangling polysemantic neurons via relevant circuits
Explaining predictive uncertainty through second-order effects (CovLRP, CovGI)
Reveal to Revise (R2R) framework for iterative bias correction
Right for the right reasons paradigm for weakly supervised segmentation
Understanding model decisions via prototypical concept-based explanations
TransferGWAS for genome-wide association studies on high-dimensional imaging data

Expected Outcomes

Development of conditional independence tests (CITs) for structured data such as images, based on deep learning-based data embeddings
Ensuring Type-I error control across a large set of null hypotheses of conditional independence
Increasing statistical power through transfer learning, optimally learned embeddings, and powerful CITs specifically tailored for these embeddings
Provision of efficient algorithms and user-friendly software for application in research
Validation of the methods using the UK Biobank dataset to assess applicability to large-scale biomedical datasets
Integration of the tests into visual explanation methods (P2) as well as application in projects P4 and P7
Provision of sample size and power calculations for CITs to enable researchers to plan experiments based on scientific questions
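The outcomes on Type-I error control and on sample size and power calculations can be illustrated with a small Monte Carlo simulation. The sketch below is an assumption-laden illustration, not the project's software: it uses a classical partial-correlation test with the Fisher z-transform in place of a deep CIT, a linear Gaussian data-generating model, and the hypothetical function names `partial_corr_pvalue` and `rejection_rate`. With `effect=0` the rejection rate estimates the Type-I error; with `effect>0` it estimates power at sample size `n`.

```python
import math

import numpy as np


def partial_corr_pvalue(x, y, z):
    """Two-sided p-value for zero partial correlation of x and y given z."""
    design = np.column_stack([np.ones(len(z)), z])
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))
    k = z.shape[1]
    # Fisher z-transform: approximately standard normal under the null.
    stat = math.atanh(r) * math.sqrt(len(x) - k - 3)
    return math.erfc(abs(stat) / math.sqrt(2))


def rejection_rate(n, effect, alpha=0.05, n_sim=300, seed=0):
    """Fraction of simulated datasets in which the test rejects at level alpha."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        z = rng.standard_normal((n, 2))           # confounders
        x = z[:, 0] + rng.standard_normal(n)      # stand-in for one embedding feature
        y = z[:, 1] + effect * x + rng.standard_normal(n)
        rejections += partial_corr_pvalue(x, y, z) < alpha
    return rejections / n_sim


if __name__ == "__main__":
    # Compare the null case with a moderate effect at several sample sizes.
    for n in (50, 100, 200, 400):
        print(n, rejection_rate(n, effect=0.0), rejection_rate(n, effect=0.2))
```

A researcher planning a study could scan `n` in this way until the estimated power for a plausible effect size reaches a target such as 0.8, while checking that the rejection rate under `effect=0.0` stays close to `alpha`.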

Contact

Contact Person: Eliza Mandieva, Project Coordinator
Email: eliza.mandieva@hu-berlin.de
Project Website: https://desbi.de/

Recorded: 2026-01-14
Source: https://desbi.de/
