- Python
- NumPy
- SciPy
- SymPy
- TkInter
Task
In everyday clinical practice, access to complete, continuous patient data is often severely limited. For rare disease courses or specific therapies in particular, case numbers are often too small, and data are rarely recorded continuously over the entire course of the disease. These gaps make it difficult to draw reliable conclusions about the success of medical interventions, particularly in evidence-based studies that rely on meaningful comparison groups.
The starting point of the project was therefore the idea of developing a flexible framework for generating so-called virtual patients. Virtual patients serve as simulated models of real disease courses and make it possible to test medical hypotheses even when real data are fragmented or insufficient.
The plan was to generate, from fragmented datasets – whether real or synthetically generated – a parametric model that captures the temporal evolution of relevant clinical variables. This model was to support different model types (e.g., functions or differential equations), perform parameter fitting automatically, and allow predictions for arbitrary time points. In addition, a method was envisioned to quantify the uncertainty of the estimated model parameters.
Over the course of the project, the focus shifted to developing a modular fitting app with a graphical user interface. The app implements key components of the original goal, in particular model fitting, extensibility, and user friendliness. This created a solid foundation on which future extensions, such as the simulation of virtual patient data, can build.
Approach
The technical core of the project is an interactive, user-friendly application for fitting arbitrary mathematical models to real or synthetic data. We focused on a modular architecture and a clearly structured graphical interface so that users without deeper programming knowledge can also fit models.
At the beginning of the project we faced a fundamental question: how can a mathematical function entered by the user be turned into executable code? To solve this, we relied on the SymPy library, which can convert symbolic expressions into executable Python functions. These functions were then fitted to the provided data using SciPy. Pandas was used for data processing, and the graphical interface was implemented with CustomTkinter.
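The path from a typed formula to fitted parameters can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes `sympy.lambdify` for the conversion and `scipy.optimize.curve_fit` for the fit, and the helper name `build_model`, the example formula, and the synthetic data are all hypothetical.

```python
import numpy as np
import sympy as sp
from scipy.optimize import curve_fit


def build_model(expression, variable="t"):
    """Turn a formula string into a NumPy-callable model (hypothetical helper)."""
    expr = sp.sympify(expression)
    x = sp.Symbol(variable)
    # Every free symbol other than the independent variable is a fit parameter.
    params = sorted(expr.free_symbols - {x}, key=lambda s: s.name)
    func = sp.lambdify([x, *params], expr, modules="numpy")
    return func, [p.name for p in params]


# Synthetic, noise-free data from a known exponential decay (illustration only).
t = np.linspace(0.0, 5.0, 30)
y = 2.0 * np.exp(-0.7 * t)

model, names = build_model("a * exp(-k * t)")
# popt holds the fitted values in the order of `names`; pcov is their covariance.
popt, pcov = curve_fit(model, t, y, p0=[1.0, 1.0])
fitted = dict(zip(names, popt))
```

Sorting the parameters by name gives a stable argument order, so fitted values can be mapped back to the symbols the user typed. Note that `sympify` evaluates its input string, so it should only be used on input from trusted users.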
A particular challenge was validating and interpreting user input: the program automatically checks whether the entries are sensible and complete before a computation starts. More complex errors, such as incomplete or malformed data or unsuitable model variables, are also caught and reported with understandable error messages. The goal was robust behavior: instead of crashing, the program gives the user comprehensible guidance.
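A validation step of this kind could look roughly like the sketch below. The concrete checks, the `InputError` class, and the function name `validate_inputs` are assumptions made for illustration; the checks the app actually performs are not specified here.

```python
import pandas as pd
import sympy as sp


class InputError(ValueError):
    """Hypothetical error type for problems the user can fix."""


def validate_inputs(data, x_col, y_col, expression, variable):
    """Check data and formula before any fit is attempted."""
    for col in (x_col, y_col):
        if col not in data.columns:
            raise InputError(f"Column '{col}' was not found in the data.")
        if not pd.api.types.is_numeric_dtype(data[col]):
            raise InputError(f"Column '{col}' must contain only numbers.")
        if data[col].isna().any():
            raise InputError(f"Column '{col}' contains missing values.")
    try:
        expr = sp.sympify(expression)
    except (sp.SympifyError, SyntaxError, TypeError) as exc:
        raise InputError(f"The formula could not be parsed: {expression!r}") from exc
    names = {s.name for s in expr.free_symbols}
    if variable not in names:
        raise InputError(f"The formula does not use the variable '{variable}'.")
    n_params = len(names) - 1
    if len(data) <= n_params:
        raise InputError(
            f"{n_params} parameters need more than {len(data)} data points."
        )
    return expr
```

Raising one dedicated exception type lets the interface catch it in a single place and show the message to the user, while unexpected errors can still be logged separately.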
Another key feature is visualization: the residuals of the fit are displayed graphically, giving the user an intuitive way to assess model quality.
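The quantities behind such a residual display are simple to compute. The sketch below is an assumption about what one might report alongside the plot (RMSE and R² are not mentioned in the project description), and `residual_summary` is a hypothetical helper.

```python
import numpy as np


def residual_summary(y_observed, y_predicted):
    """Return the residuals plus two common goodness-of-fit summaries."""
    y_observed = np.asarray(y_observed, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    residuals = y_observed - y_predicted
    ss_res = float(np.sum(residuals**2))
    # Total variation of the data around its mean; zero for constant data,
    # in which case R^2 is undefined and this sketch would divide by zero.
    ss_tot = float(np.sum((y_observed - y_observed.mean()) ** 2))
    return {
        "residuals": residuals,
        "rmse": float(np.sqrt(ss_res / residuals.size)),
        "r_squared": 1.0 - ss_res / ss_tot,
    }


summary = residual_summary([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Residuals that scatter randomly around zero suggest an adequate model, whereas a visible trend or curvature points to a systematic misfit that a single summary number can hide.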
One aspect we are proud of is the carefully designed logging architecture. We implemented custom handlers and formatters that log information systematically at all relevant points in the program flow. The logs are stored in structured form and can optionally be viewed directly on the command line in debug mode. When an error occurs, this makes it quick to trace where the process failed, which is a great help for later maintenance and extension.
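With Python's standard `logging` module, custom handlers and formatters of this kind can be sketched as shown below. The class names, the logger name `fitting_app`, the `step` field, and the in-memory storage are all hypothetical; the project's actual handlers and storage format are not described here.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps formatted records in memory."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


class StepFormatter(logging.Formatter):
    """Hypothetical formatter that tags each message with a pipeline step."""

    def format(self, record):
        # `step` is supplied through the `extra` argument of the logging call.
        step = getattr(record, "step", "general")
        return f"{record.levelname} [{step}] {record.getMessage()}"


def make_logger(debug=False):
    logger = logging.getLogger("fitting_app")
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.propagate = False
    store = ListHandler()
    store.setFormatter(StepFormatter())
    logger.addHandler(store)
    if debug:
        # In debug mode, additionally echo every record to the console.
        console = logging.StreamHandler()
        console.setFormatter(StepFormatter())
        logger.addHandler(console)
    return logger, store


logger, store = make_logger(debug=False)
logger.info("parsed model", extra={"step": "parse"})
logger.error("fit did not converge", extra={"step": "fit"})
```

Tagging each record with the step that produced it is one way to make a failure traceable to a specific point in the program flow.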
Even though we did not achieve our originally formulated goal – the complete generation of virtual patient cohorts – the tool we developed provides a strong foundation for it. It enables flexible fitting of mathematical models to data and can be extended with components for data synthesis and simulation with comparatively little effort.
Our particular focus on modularity, user friendliness, and both extensibility and maintainability makes the project not only functionally convincing but also sustainable in the long term for more complex medical applications.