Learning Strong Inference Models in Small Data Domains

Sarah Ostadabbas


April 13, 2022 / 1:00-2:00 p.m. EDT

Expeditions in Experiential AI

Recent efforts in machine learning (especially the new waves of deep learning introduced in the last decade) have obliterated records for regression and classification tasks that had previously seen only incremental accuracy improvements. Many other fields would benefit significantly from machine learning (ML)-based inference, but data collection or labeling in those fields is expensive. In these domains (i.e., small data domains), the challenge we now face is how to achieve the same performance with less data. Many applications will benefit from a strong inference framework with deep structure that will: (i) work with limited labeled training samples; (ii) integrate explicit (structural or data-driven) domain knowledge into the inference model as editable priors to constrain the search space; and (iii) maximize the generalization of learning across domains. My research aims to explore a generalized ML approach to the small data problem that leverages existing research and fills in key gaps with original work.

There are two basic approaches to reducing data needs during model training: (1) decrease inference model learning complexity via data-efficient machine learning, and (2) incorporate domain knowledge into the learning pipeline through the use of data-driven or simulation-based generative models. In this talk, I present my recent work on merging the benefits of these two approaches to enable the training of robust and accurate (i.e., strong) inference models that can be applied to real-world problems with limited data. My plan to achieve this aim is structured in four research thrusts: (i) introduction of physics- and/or data-driven computational models, referred to here as weak generators, to synthesize enough labeled data in an adjacent domain; (ii) design and analysis of unsupervised domain adaptation techniques to close the gap between the domain-adjacent and domain-specific data distributions; (iii) combined use of the weak generator, a weak inference model, and an adversarial framework to refine the domain-adjacent dataset by employing a set of unlabeled domain-specific data; and (iv) development and analysis of co-labeling/active learning techniques to select the most informative datasets to refine and adapt the weak inference model into a strong inference model in the target application.
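
As a rough illustration of the active learning idea in thrust (iv), the sketch below ranks unlabeled samples by the entropy of a model's predicted class probabilities and picks the most uncertain ones for labeling. Entropy-based uncertainty sampling is a generic, widely used selection rule and is only an assumption here; the abstract does not say which criterion the speaker's work uses. The function names and the probability values are hypothetical.

```python
import math


def prediction_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_informative(unlabeled_probs, k):
    """Return the indices of the k samples with the highest prediction entropy."""
    ranked = sorted(
        range(len(unlabeled_probs)),
        key=lambda i: prediction_entropy(unlabeled_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical class probabilities produced by a weak inference model
# on four unlabeled target-domain samples (illustrative values only).
weak_model_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
]

# Indices of the two samples that would be sent for labeling first.
query_indices = select_most_informative(weak_model_probs, k=2)
print(query_indices)
```

In this toy setting, the samples the weak model is least sure about are the ones queried, so a small labeling budget is spent where it is most likely to change the model.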



Sarah Ostadabbas is an assistant professor in the Department of Electrical and Computer Engineering and director of the Augmented Cognition Laboratory. Her research focuses on enhancing human information-processing capabilities through the design of adaptive interfaces via physical, physiological, and cognitive state estimation.

Ostadabbas has developed augmented and virtual reality tools for both the assessment and enhancement portions of interfaces, based on rigorous models adaptively parameterized using machine learning and computer vision algorithms.

Her work extends to medical and military applications in the small data domain, where data collection and labeling are expensive, individualized, and protected by stringent privacy or classification laws. She has developed learning frameworks with deep structures that work with limited labeled training samples. Her work has also involved integrating domain knowledge into prior learning and synthetic data augmentation, and maximizing generalized learning across domains by learning invariant representations.

Ostadabbas has co-authored more than 70 peer-reviewed journal and conference articles. Her research has received funding from the National Science Foundation, MathWorks, Amazon Web Services, Biogen, and NVIDIA. Within the Institute of Electrical and Electronics Engineers (IEEE), she is a member of the Computer Society, Women in Engineering, the Signal Processing Society, the Engineering in Medicine & Biology Society, and the Young Professionals group. She serves on the International Society for Virtual Rehabilitation and the Association for Computing Machinery Special Interest Group on Computer-Human Interaction. She has helped organize workshops on topics ranging from multimodal data fusion to deep learning in small data.

Ostadabbas is now an associate editor of IEEE’s Transactions on Biomedical Circuits and Systems journal, serves on the editorial boards of both IEEE’s Sensors Letters and the Digital Biomarkers journals, and has been technical and session chair for several signal processing and machine learning conferences. She completed her postdoctoral research at Georgia Institute of Technology after earning her doctoral degree at the University of Texas at Dallas.