Invited speakers

RSSB2024 has the pleasure of hosting the following keynote speakers:

  • Eva Ceulemans (KU Leuven)
  • Andrea Meilán (Universidad Carlos III de Madrid)
  • Annette Kopp-Schneider (German Cancer Research Center)
  • François Portier (ENSAI)
  • Ardo van den Hout (University College London)
  • Achim Zeileis (Universität Innsbruck)

Eva Ceulemans is Full Professor of Quantitative Psychology and Individual Differences at KU Leuven in Belgium. During the first 15 years of her research career, Dr. Ceulemans published methodological work on clustering, multiway and multiset analysis, regression models, and model selection. Since around 2014 she has focused on methods for time series analysis, investigating, among other topics, the pitfalls of dynamic network models, retrospective change point detection for flagging changes in means or correlations across time, prospective monitoring of ESM data by means of statistical process control to flag the imminent onset of psychopathology, sample size planning for ESM studies, and extensions of these methods to dyadic data from romantic partners or mother-child pairs. Applying these methods, she has contributed empirical results on the psychology of emotions, early detection of depression, attachment, and parenting.
https://www.kuleuven.be/wieiswie/en/person/00031954

Eva's abstract

The EWMA control chart for real-time detection of developing mood disorders: Key principles, optimization of performance, and applications

Statistical methods that can accurately detect, in real time, early signs of developing mood disorders in intensive longitudinal data (e.g., experience sampling method (ESM) data, where people regularly report on their momentary feelings and thoughts using a smartphone app) are much needed, as they would allow preventive intervention to stop an episode from occurring or to mitigate its severity. Statistical process control (SPC) procedures, originally developed for monitoring production processes, seem promising and statistically sound methods to achieve this goal. SPC procedures capture the natural variation present in a set of in-control data (i.e., the baseline), which is used to establish control limits. Afterwards, incoming data are compared to the in-control distribution to test whether and when they go out of control (i.e., beyond the control limits).

We start this talk by introducing SPC and explaining why we selected the exponentially weighted moving average (EWMA) method as our SPC method of choice. Next, we demonstrate that ESM data violate some crucial EWMA assumptions and discuss how we dealt with these violations by monitoring day averages rather than individual measurement occasions. Focusing on day statistics also offers a neat solution to the detection of variance changes, by applying EWMA to day statistics of variability. To illustrate the added value of the method, we examine whether the recurrence of depression can be accurately foreseen by applying the EWMA procedure, presenting results for 41 formerly depressed patients who were in remission and discontinuing antidepressant medication.

Finally, we look into the baseline problem. One of the biggest challenges in applying the EWMA procedure to ESM data is the amount of in-control data needed for optimal performance: at least 50 days. Clearly, it is not trivial to obtain such a large amount of in-control data from a single person, so we investigate several potential solutions.
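
As a rough illustration of the monitoring logic, the base R sketch below applies the EWMA recursion to simulated day averages; the smoothing weight lambda, the limit width L, and all data are illustrative assumptions, not the tuned values from the talk.

```r
set.seed(1)

## Baseline: 50 in-control days, each day average computed from 10 ESM beeps
baseline <- replicate(50, mean(rnorm(10, mean = 0, sd = 1)))
mu0    <- mean(baseline)   # in-control mean of the day averages
sigma0 <- sd(baseline)     # in-control SD of the day averages

## Monitoring phase: the day averages drift upwards after day 20
new_days <- c(replicate(20, mean(rnorm(10, mean = 0,   sd = 1))),
              replicate(15, mean(rnorm(10, mean = 0.8, sd = 1))))

lambda <- 0.2   # smoothing weight (illustrative)
L      <- 2.7   # control-limit width (illustrative)

## EWMA recursion: z_t = lambda * x_t + (1 - lambda) * z_{t-1}, z_0 = mu0
z <- numeric(length(new_days))
z_prev <- mu0
for (t in seq_along(new_days)) {
  z[t]   <- lambda * new_days[t] + (1 - lambda) * z_prev
  z_prev <- z[t]
}

## Time-varying control limits around the in-control mean
se  <- sigma0 * sqrt(lambda / (2 - lambda) *
                       (1 - (1 - lambda)^(2 * seq_along(new_days))))
ucl <- mu0 + L * se
lcl <- mu0 - L * se

which(z > ucl | z < lcl)   # days flagged as out-of-control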


Andrea Meilán is a Visiting Professor in the Department of Statistics at Universidad Carlos III de Madrid. Her research interests focus on the development of statistical methodologies for nonparametric density and regression estimation, and on goodness-of-fit tests for models involving more complex data, such as directional, functional, or spatial data.
https://researchportal.uc3m.es/display/inv48240

Andrea's abstract

Polyspherical data analysis: kernel density estimation and its applications

Polyspherical data are observations recorded as directions within a product space of hyperspheres. The polysphere encompasses important particular cases, such as the circle, the sphere, and the torus. In this work, we introduce a kernel density estimator tailored to this type of data. We derive the main asymptotic properties of the estimator, including its mean squared error, asymptotic normality, and optimal bandwidths. We delve into the kernel theory underlying the estimator and provide bandwidth selectors for practical use. Additionally, we present some applications of the kernel density estimator, which is used to analyze the morphology of a sample of infants’ hippocampi embedded in a high-dimensional polysphere via skeletal representations (s-reps).
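
For intuition, here is a minimal base R sketch of kernel density estimation on the circle, the simplest polysphere, with a von Mises kernel; the concentration kappa plays the role of an inverse bandwidth and is an arbitrary choice here, not one of the data-driven selectors discussed in the talk.

```r
## Von Mises kernel density estimate on the circle:
## f_hat(theta) = (1/n) * sum_i exp(kappa * cos(theta - theta_i)) / (2*pi*I_0(kappa))
vm_kde <- function(theta, data, kappa) {
  norm_const <- 2 * pi * besselI(kappa, nu = 0)
  sapply(theta, function(th) mean(exp(kappa * cos(th - data)) / norm_const))
}

## Toy sample: two clusters of directions on the circle
set.seed(1)
angles <- c(rnorm(100, mean = pi / 4,     sd = 0.3),
            rnorm(100, mean = 5 * pi / 4, sd = 0.3)) %% (2 * pi)

grid <- seq(0, 2 * pi, length.out = 200)
dens <- vm_kde(grid, angles, kappa = 10)

sum(dens) * diff(grid[1:2])   # should be close to 1 (density integrates to 1)
```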


Annette Kopp-Schneider was trained as a mathematician. She is Head of the Division of Biostatistics at the German Cancer Research Center. As the trial statistician for the INFORM2 (Individualized therapy For Relapsed Malignancies in childhood) trial series, she is interested in Bayesian clinical trial design for very small trials. She is especially interested in characterizing frequentist properties of Bayesian trial designs.
https://www.dkfz.de/en/biostatistics/staff/kopp-schneider.html

Annette's abstract

Borrowing from external information in clinical trials: methods, benefits and limitations

When trials can only be performed with small sample sizes, as, for example, in precision medicine, where patient cohorts are defined by a specific combination of biomarker and targeted therapy, borrowing information from historical data is currently discussed as an approach to improving the efficiency of the trial. In this context, borrowing information is often also referred to as evidence synthesis or extrapolation, and the external data may be historical data or another source of co-data.

A number of approaches for borrowing from external data have been proposed that dynamically discount the amount of information transferred, based on the discrepancy between the external and current data. The robust mixture prior (Schmidli et al., 2014) is a popular method: a weighted mixture of an informative prior incorporating the external information and a more dispersed prior that addresses potential prior-data conflict and robustifies the analysis. As an alternative, the power prior approach discounts the external data in the informative prior by raising its likelihood to a weight parameter. An empirical Bayes approach for estimating the weight parameter from the similarity of the external and current data has been proposed by Gravestock and Held (2017). The compromise decision approach (Calderazzo et al., 2024) relates the amount of borrowing to the type I error rate inflation one is willing to tolerate.
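
To make the mixture mechanics concrete, the base R sketch below implements a two-component robust mixture prior for a normal mean with known sampling standard deviation, in the spirit of Schmidli et al. (2014); all numerical values are illustrative assumptions, not settings from the talk.

```r
## Prior: w * N(m1, s1^2)       (informative component from external data)
##      + (1 - w) * N(m2, s2^2) (vague, robust component)
## Conjugate update for a normal mean with known sampling SD `sigma`.
post_mixture <- function(xbar, n, sigma, w, m1, s1, m2, s2) {
  se <- sigma / sqrt(n)
  upd <- function(m, s) {        # posterior of one normal component
    v <- 1 / (1 / s^2 + 1 / se^2)
    list(mean = v * (m / s^2 + xbar / se^2), sd = sqrt(v))
  }
  c1 <- upd(m1, s1); c2 <- upd(m2, s2)
  ## Mixture weights update via the marginal likelihood of each component
  ml1 <- dnorm(xbar, m1, sqrt(s1^2 + se^2))
  ml2 <- dnorm(xbar, m2, sqrt(s2^2 + se^2))
  w1  <- w * ml1 / (w * ml1 + (1 - w) * ml2)
  list(weights = c(w1, 1 - w1),
       means   = c(c1$mean, c2$mean),
       sds     = c(c1$sd, c2$sd))
}

## Data consistent with the external information: informative component dominates
post_mixture(xbar = 0.1, n = 20, sigma = 1, w = 0.8,
             m1 = 0, s1 = 0.2, m2 = 0, s2 = 2)

## Prior-data conflict: the weight shifts to the robust component
post_mixture(xbar = 1.5, n = 20, sigma = 1, w = 0.8,
             m1 = 0, s1 = 0.2, m2 = 0, s2 = 2)
```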

We will discuss the frequentist operating characteristics (FOC) of trials using these adaptive borrowing approaches, evaluating, e.g., the type I error rate and power. Use of the robust mixture prior requires selecting the mixture weight and the mean and variance of the robust component, and we will discuss the impact of these choices on the FOC. The concept of prior effective sample size (ESS) facilitates the quantification and communication of prior information by equating it to a sample size. When prior information arises from historical observations, the traditional approach identifies the ESS with a historical sample size, a measure that is independent of the currently observed data and thus does not capture the actual loss of information induced by the prior in case of prior-data conflict. The effective current sample size of a prior (Wiesenfarth and Calderazzo, 2020) is therefore introduced, which relates the prior's impact to the number of (virtual) samples from the current data model. All aspects discussed show that, from the frequentist perspective, borrowing cannot be beneficial for every possible true parameter value (Kopp-Schneider et al., 2020). However, benefits can be obtained if prior information is reliable and consistent.
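
Building on the sketch above, a short simulation can approximate one frequentist operating characteristic: the type I error rate of a posterior-probability test under borrowing. The decision threshold and the optimistic external mean are again illustrative assumptions.

```r
## Type I error rate under H0: true mean 0, informative prior component at 0.5
set.seed(1)
reject <- replicate(5000, {
  xbar <- mean(rnorm(20, mean = 0, sd = 1))       # data generated under H0
  post <- post_mixture(xbar, n = 20, sigma = 1, w = 0.8,
                       m1 = 0.5, s1 = 0.2, m2 = 0, s2 = 2)
  ## Posterior probability that the mean exceeds 0; "reject" if > 0.975
  p_gt0 <- sum(post$weights * pnorm(0, post$means, post$sds, lower.tail = FALSE))
  p_gt0 > 0.975
})
mean(reject)   # typically well above the nominal 0.025 with this optimistic prior
```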

Calderazzo S, Wiesenfarth M, Kopp-Schneider A (2024). Robust incorporation of historical information with known type I error rate inflation. Biometrical Journal 66(1), 2200322.

Gravestock I, Held L (2017). Adaptive power priors with empirical Bayes for clinical trials. Pharmaceutical Statistics 16, 349-360.

Kopp-Schneider A, Calderazzo S, Wiesenfarth M (2020). Power gains by using external information in clinical trials are typically not possible when requiring strict type I error control. Biometrical Journal 62(2), 361-374.

Schmidli H, Gsteiger S, Roychoudhury S, O'Hagan A, Spiegelhalter D, Neuenschwander B (2014). Robust meta-analytic-predictive priors in clinical trials with historical control information. Biometrics 70(4), 1023-1032.

Wiesenfarth M, Calderazzo S (2020). Quantification of prior impact in terms of effective current sample size. Biometrics 76(1), 326-336.


François Portier obtained his PhD in Applied Mathematics at the University of Rennes under the supervision of Bernard Delyon. He was a Postdoctoral Researcher at UCLouvain from 2013 to 2016, working with Johan Segers and Ingrid Van Keilegom. He then joined Télécom Paris as an Assistant Professor until 2021. Currently, he serves as an Associate Professor at ENSAI (Rennes – Bretagne) and is a member of CREST. His research interests include statistical learning theory and Monte Carlo methods. He is the head of the international Master’s program in Smart Data Science at ENSAI and an associate editor of the Electronic Journal of Statistics, as well as of Computational Statistics & Data Analysis.
https://ensai.fr/equipe/portier-francois/

François' abstract

Non-parametric covariate shift adaptation with conditional sampling

Many existing covariate shift adaptation methods estimate sample weights to be used in the risk estimation in order to mitigate the gap between the source and the target distribution. However, non-parametrically estimating the optimal weights typically involves computationally expensive hyper-parameter tuning that is crucial to the final performance. In this work, we propose a new non-parametric approach to covariate shift adaptation that avoids estimating weights and has no hyper-parameter to be tuned. Our basic idea is to label the unlabeled target data according to the k-nearest neighbors in the source dataset. Our analysis indicates that setting k = 1 is an optimal choice, so, unlike other non-parametric methods, no hyper-parameter needs to be tuned. Moreover, our method achieves a running time quasi-linear in the sample size with a theoretical guarantee, to the best of our knowledge for the first time in the literature. Our results include sharp rates of convergence for estimating the joint probability distribution of the target data. In particular, the variance of our estimators has the same rate of convergence as in standard parametric estimation, despite their non-parametric nature. Our numerical experiments show that the proposed method brings a drastic reduction in running time with accuracy comparable to that of state-of-the-art methods.
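
The core labelling idea can be sketched in a few lines of base R: assign each unlabelled target point the response of its single nearest source neighbour (k = 1) and fit the downstream model on the pseudo-labelled target sample. The data-generating process and the polynomial model below are illustrative stand-ins, not the experiments from the talk.

```r
set.seed(1)

## Labelled source sample and unlabelled target sample with shifted covariates
n_s <- 500; n_t <- 300
x_source <- rnorm(n_s, mean = 0)
y_source <- sin(x_source) + rnorm(n_s, sd = 0.2)
x_target <- rnorm(n_t, mean = 1)                  # covariate shift

## 1-nearest-neighbour labelling: no weights, no hyper-parameter to tune
nn_index <- sapply(x_target, function(x) which.min(abs(x - x_source)))
y_pseudo <- y_source[nn_index]

## Fit the downstream model on the pseudo-labelled target data
fit <- lm(y_pseudo ~ poly(x_target, 3))
summary(fit)$r.squared
```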


Ardo van den Hout works in the Department of Statistical Science at UCL in the United Kingdom. He studied mathematics and philosophy at the University of Nijmegen and completed a PhD in social statistics at the University of Utrecht in 2004. Most of his current research concerns methods for the analysis of longitudinal data in biostatistics. He has published on multi-state models and on joint models.
https://www.ucl.ac.uk/statistics/people/ardovandenhout

Ardo's abstract

Two-dimensional spline models for multi-state processes

Multi-state models are routinely used in research where change of status over time is of interest. In epidemiology and medical statistics, for example, the models are used to describe health-related processes over time, where status is defined by a disease or a condition. In social statistics and in demography, the models are used to study processes such as region of residence, work history, or marital status.

The first part of the talk will be an introduction to continuous-time multi-state survival models. I will discuss longitudinal data requirements, the link with stochastic processes, and maximum likelihood inference. An important distinction is whether or not exact times are observed for transitions between the states.
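
As a concrete illustration of the link with stochastic processes, the sketch below obtains transition probabilities for a three-state illness-death model from a transition intensity matrix Q via P(t) = exp(Qt). The intensities are invented for illustration, and the matrix exponential comes from the expm package (an assumption; base R has no built-in matrix exponential).

```r
library(expm)

## Generator (intensity) matrix: rows sum to zero; state 3 (dead) is absorbing
Q <- rbind(c(-0.15,  0.10,  0.05),   # healthy -> ill, healthy -> dead
           c( 0.00, -0.20,  0.20),   # ill -> dead
           c( 0.00,  0.00,  0.00))   # dead (absorbing)

## Transition probability matrix over t time units: P(t) = exp(Q * t)
P5 <- expm(Q * 5)
round(P5, 3)   # e.g. P5[1, 3] = P(dead at t = 5 | healthy at t = 0)
```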

In the second part of the talk, I will discuss two-dimensional P-spline models for continuous-time multi-state processes with a large number of ordered states. The numerical ordering of states is used to model transition hazards using penalised spline regression.


Achim Zeileis is Professor of Statistics at Universität Innsbruck. His main area of research is open-source statistical software, often combining classical parametric methods with flexible data-driven modeling. He is co-editor-in-chief of the open-access Journal of Statistical Software and an ordinary member of the R Foundation.
https://www.zeileis.org/

Achim's abstract

R/exams: A One-for-All Exams Generator

The open-source package “exams” for R provides a one-for-all approach to automatic exam generation, tying together various open-source packages (in R and beyond). R/exams is based on individual exercises for multiple-choice or single-choice questions, numeric or text answers, or combinations of these. Each exercise is written in R/Markdown or R/LaTeX and contains the question/solution along with random numbers, text snippets, plots/diagrams, R output, or individualized datasets. The exercises can be combined into exams and easily rendered into a number of output formats, including PDFs for classical written exams (with automatic evaluation), import formats for various learning management systems (including Moodle, Canvas, or Blackboard), live voting, and custom output (in PDF, HTML, Docx, …). In addition to a software overview, some potential applications are discussed, focusing on formative and/or summative assessments in statistics courses.
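
By way of illustration, a typical R/exams workflow might look like the sketch below, assuming the demo exercises shipped with the package (e.g. deriv.Rmd, swisscapital.Rmd); the exam and output names are illustrative.

```r
library(exams)

## An exam is simply a collection of exercise templates, one per question
my_exam <- c("deriv.Rmd", "swisscapital.Rmd")

## Three randomized PDF versions for a classical written exam
exams2pdf(my_exam, n = 3, name = "written-exam")

## The same exercises exported as Moodle XML for a learning management system
exams2moodle(my_exam, n = 3, name = "moodle-exam")
```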