📊 Project

KI-FOR 5363 DeSBi (Fusing Deep Learning and Statistics towards Understanding Structured Biomedical Data)

Humboldt-Universität zu Berlin

Institution: Humboldt-Universität zu Berlin
Category: Project
Website: https://desbi.de/

Short Description

The project develops methods for testing conditional independence on structured data such as images and tabular information. Deep learning models embed the data, and nonparametric tests are then applied to the embeddings. The target audience is biomedical researchers analyzing complex multimodal datasets. Universities benefit from improved statistical inference and model validation on high-dimensional data.
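As a rough illustration of the idea of testing conditional independence on embeddings, the sketch below checks whether an embedding is independent of an outcome given confounders. It is not the project's own test: it swaps in a simple linear residualization followed by a permutation test, which is only approximate. The function names (`residualize`, `embedding_cit`), the synthetic data, and the assumption that embeddings are already computed are all hypothetical.

```python
import numpy as np


def residualize(a, z):
    """Remove the linear effect of z (plus intercept) from each column of a."""
    design = np.column_stack([np.ones(len(z)), z])
    coef, *_ = np.linalg.lstsq(design, a, rcond=None)
    return a - design @ coef


def embedding_cit(emb, y, z, n_perm=500, seed=0):
    """Approximate permutation p-value for H0: emb independent of y given z.

    Assumption: confounding by z is adjusted for linearly, which is a
    simplification chosen for this sketch.
    """
    rng = np.random.default_rng(seed)
    r_emb = residualize(emb, z)
    r_y = residualize(y.reshape(-1, 1), z).ravel()

    def stat(e, v):
        # Sum of squared covariances between each embedding dimension and v.
        return float(np.sum((e.T @ v / len(v)) ** 2))

    observed = stat(r_emb, r_y)
    null = np.array([stat(r_emb, rng.permutation(r_y)) for _ in range(n_perm)])
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


# Synthetic stand-ins: z = confounders, emb = precomputed image embeddings.
rng = np.random.default_rng(1)
n = 400
z = rng.normal(size=(n, 2))
emb = z @ rng.normal(size=(2, 8)) + rng.normal(size=(n, 8))
y_null = z @ np.array([1.0, -1.0]) + rng.normal(size=n)  # depends on z only
y_alt = y_null + emb[:, 0]                               # also depends on emb

p_null = embedding_cit(emb, y_null, z)
p_alt = embedding_cit(emb, y_alt, z)
print(f"p-value under H0: {p_null:.3f}, under dependence: {p_alt:.3f}")
```

In this toy setup the p-value for `y_alt` should be small, while the p-value for `y_null` carries no evidence against conditional independence.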

General Description

-


Thematic Classification

Subject Areas

Computer Science
Medicine
Statistics
Biology
Machine Learning
Artificial Intelligence
Biomathematics
Computational Biology
Neuroinformatics
Genomics
Image Processing
Biostatistics
Data Science

Research Fields

  • Deep Learning
  • Statistics
  • Medical Imaging
  • Genomics
  • Causal Inference
  • Time Series Analysis
  • Computational Biology
  • Explainability of Artificial Intelligence (Explainable AI)
  • Image Analysis
  • Structured Biomedical Data
  • Conditional Independence Tests
  • Transfer Learning
  • Uncertainty Quantification
  • Confounding Control
  • Genome-Wide Association Studies (GWAS)
  • Neural Networks
  • Data Integration (Multimodality)

Specializations

  • Development of conditional independence tests (CITs) for structured data such as images
  • Use of deep learning for data embedding in multimodal datasets
  • Application of statistical tests as key instruments for inference in multimodal datasets
  • Ensuring control of the Type-I error rate across large sets of hypotheses
  • Increasing statistical power through transfer learning and optimally learned embeddings
  • Development of efficient algorithms and user-friendly software for application in biomedicine
  • Application to large biomedical datasets such as the UK Biobank
  • Integration into other projects (P2, P4, P7) for visual explanation and analysis of data

Keywords

  • P1: Deep conditional independence tests
  • Conditional Independence Tests
  • Deep Learning
  • Image Analysis
  • Multimodal Data
  • Statistical Inference
  • UK Biobank
  • Transfer Learning
  • Nonparametric Testing
  • Data Embedding

Funding

Funding Provider: -
Funding Program: KI-FOR 5363
Funding Reference: KI-FOR 5363
Funding Period: 2023 - 2027
Project Volume: -


Team & Partners

Project Leadership

  • Prof. Dr. Sonja Greven (Humboldt-Universität zu Berlin)
  • Prof. Dr. Christoph Lippert (University of Potsdam / Hasso Plattner Institute)

Involved Persons

  • Marco Simnacher (PhD Candidate)
  • Hani Park (PhD Candidate)
  • Xiangnan Xu (Postdoc)
  • Clara Hoffmann (PhD student)
  • Dilyara Bareeva (PhD student)
  • Jim Berend (PhD student)
  • Lorenz Hufe (PhD student)
  • Sahar Iravani (Postdoc)
  • Masoumeh Javanbakhat (Postdoc)
  • Georg Keilbar (Postdoc)
  • Piotr Komorowski (PhD student)
  • Wei-Cheng Lai (PhD Candidate)
  • Gabriel Nobis (PhD Candidate)
  • Roshan Rane (PhD Candidate)
  • Moritz Seiler (PhD Candidate)
  • Manuel Pfeuffer (PhD Candidate)
  • Paulo Yanez Sarmiento (PhD Candidate)
  • Hadya Yassin (PhD Candidate)
  • Claudia Winklmayr (PhD Candidate)
  • Maximilian Dreyer (PhD student)
  • Eshant English (PhD student)
  • Maarten Jung (PhD student)
  • Marta Lemanczyk (PhD student)
  • Alexander Rakowski (PhD student)
  • Sepideh Saran (PhD student)
  • Juliana Schneider (PhD student)
  • Ekkehard Schnoor (Postdoc)

Affiliated Institutions

-

External Partners

Humboldt-Universität zu Berlin, University of Potsdam, Hasso Plattner Institute (HPI), Max Delbrück Center for Molecular Medicine (MDC), Karlsruher Institut für Technologie (KIT), Charité – Universitätsmedizin Berlin, Fraunhofer Heinrich Hertz Institute (Fraunhofer HHI), Technische Universität Berlin


Project Contents

Goals

  • Development of conditional independence tests (CITs) for structured data such as images through the use of deep learning for data embedding
  • Ensuring Type-I error control in statistical tests for multimodal datasets in the biomedical domain
  • Improving statistical power through transfer learning, optimally learned embeddings, and tailored CITs for these embeddings
  • Provision of efficient algorithms and user-friendly software for application to large biomedical datasets such as the UK Biobank
  • Application of the tests in further projects (P2, P4, P7) for visual explanation and analysis of multimodal data

Work Packages

  • P1: Deep conditional independence tests with an application to imaging genetics
  • P2: Visual explanations for statistical tests
  • P3: Explainable AI for microscopy image analysis
  • P4: Explainable AI for functional genomics
  • P5: Sparse and robust explanations for structured data
  • P6: Uncertainty quantification in biomedical deep learning
  • P7: Causal inference with multimodal data

Methods

  • Deep nonparametric conditional independence tests (DNCITs)
  • Embedding maps for feature representation extraction
  • Pruned layer-wise relevance propagation (LRP) for sparse explanations
  • Transfer learning for optimal embeddings
  • Adversarially learned penalty for feature subspace independence
  • Metadata-guided Feature Disentanglement (MFD)
  • Procedurally generated dataset (Arctique) for uncertainty quantification
  • Online visualization tool with Con-score metric (DeepRepViz)
  • Virtual inspection layers for time series data
  • Reactive model correction via conditional bias suppression (R-ClArC)
  • Gradient penalization in latent space for bias unlearning
  • Concept Activation Vectors (CAVs) with pattern-based computation
  • DualView for post-hoc data attribution via surrogate modeling
  • Regression in quotient metric spaces (e.g., square-root-velocity framework)
  • Model guidance via explanations to turn classifiers into segmentation models
  • PURE method for disentangling polysemantic neurons via relevant circuits
  • Explaining predictive uncertainty through second-order effects (CovLRP, CovGI)
  • Reveal to Revise (R2R) framework for iterative bias correction
  • Right for the right reasons paradigm for weakly supervised segmentation
  • Understanding model decisions via prototypical concept-based explanations
  • TransferGWAS for genome-wide association studies on high-dimensional imaging data

Expected Outcomes

  • Development of conditional independence tests (CITs) for structured data such as images, based on deep learning-based data embeddings
  • Ensuring Type-I error control across large sets of null hypotheses of conditional independence
  • Increasing statistical power through transfer learning, optimally learned embeddings, and powerful CITs specifically tailored for these embeddings
  • Provision of efficient algorithms and user-friendly software for application in research
  • Validation of the methods using the UK Biobank dataset to assess applicability to large-scale biomedical datasets
  • Integration of the tests into visual explanation methods (P2) as well as application in projects P4 and P7
  • Provision of sample size and power calculations for CITs to enable researchers to plan experiments based on scientific questions
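The sample size and power calculations mentioned above can be illustrated with a simulation-based sketch. It does not reproduce the project's calculations: it estimates power by Monte Carlo for a plain linear partial-correlation test with a Fisher z approximation, used here as a stand-in for a CIT. The data-generating model, the effect sizes, and the function names (`partial_corr_pvalue`, `estimate_power`, `smallest_n`) are assumptions made for illustration.

```python
import math

import numpy as np


def partial_corr_pvalue(x, y, z):
    """Two-sided p-value for zero partial correlation of x and y given z."""
    design = np.column_stack([np.ones(len(z)), z])
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))
    k = z.shape[1]  # number of conditioning variables
    stat = math.atanh(r) * math.sqrt(len(x) - k - 3)  # Fisher z statistic
    return math.erfc(abs(stat) / math.sqrt(2))        # normal approximation


def estimate_power(n, effect, n_sim=300, alpha=0.05, seed=0):
    """Fraction of simulated datasets of size n in which H0 is rejected."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        z = rng.normal(size=(n, 2))
        x = z @ np.array([0.5, -0.5]) + rng.normal(size=n)
        # effect = 0 means x and y are conditionally independent given z.
        y = z @ np.array([0.3, 0.3]) + effect * x + rng.normal(size=n)
        rejections += partial_corr_pvalue(x, y, z) < alpha
    return rejections / n_sim


def smallest_n(effect, target=0.8, grid=(50, 100, 200, 400, 800)):
    """Smallest sample size on the grid reaching the target power, else None."""
    for n in grid:
        if estimate_power(n, effect) >= target:
            return n
    return None


print("estimated Type-I error at n=200:", estimate_power(200, effect=0.0))
print("estimated power at n=400, effect=0.5:", estimate_power(400, effect=0.5))
print("smallest n for 80% power at effect=0.1:", smallest_n(0.1))
```

Setting `effect=0.0` gives an empirical check of Type-I error control, and scanning a grid of sample sizes gives a crude planning tool for a chosen effect size.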

Contact

Contact Person: Eliza Mandieva, Project Coordinator
Email: eliza.mandieva@hu-berlin.de
Project Website: https://desbi.de/


Recorded: 2026-01-14
Source: https://desbi.de/