Dandelion

Faster Modeling with Small Data

The Problem

A team of modelers faced a problem: only a small amount of data was available for modeling an important phenomenon they were targeting. They needed a model they could rely on in production despite the limited training data. As it was, adapting the model to changing conditions took weeks.

The Analysis

We constructed a robust model by taking the following steps:

  • Identifying a small number of conditions of the phenomenon that could be modeled separately.
  • Modeling each condition with a different parameterization drawn from a common family.
  • Performing a statistical analysis to determine how much training data was sufficient for reliable estimation of each parameterization (a sketch of this per-condition setup follows the list).
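
The case study keeps the phenomenon, the conditions, and the model family anonymous, so the following is only a minimal sketch of the approach, assuming a simple Gaussian parameterization as the common family and a bootstrap estimate of parameter stability as the data-sufficiency analysis. The condition names, sample sizes, and tolerance are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_condition(samples):
    """Fit one parameterization from the assumed common family (Gaussian)."""
    return {"mean": float(np.mean(samples)), "std": float(np.std(samples, ddof=1))}

def sufficiency_check(samples, n_boot=1000, rel_tol=0.10):
    """Bootstrap the estimate of the mean and flag whether the sample size
    yields a relative standard error below the assumed tolerance."""
    boot_means = [
        np.mean(rng.choice(samples, size=len(samples), replace=True))
        for _ in range(n_boot)
    ]
    rel_se = np.std(boot_means, ddof=1) / (abs(np.mean(samples)) + 1e-12)
    return rel_se <= rel_tol, rel_se

# Hypothetical per-condition datasets (the real conditions are anonymized).
conditions = {
    "condition_a": rng.normal(10.0, 2.0, size=40),
    "condition_b": rng.normal(25.0, 5.0, size=15),
}

for name, samples in conditions.items():
    params = fit_condition(samples)
    ok, rel_se = sufficiency_check(samples)
    print(name, params, f"rel. std. error={rel_se:.3f}", "OK" if ok else "needs more data")
```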

The Solution

We provided an easy-to-operate modeling algorithm appropriate for the small amount of data available, along with these components:

  • A training component, appropriate for the common family of the model.
  • An inference component, appropriate for the same family.
  • A health monitoring component that reflected the sufficiency of the available training data (see the sketch following this list).
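
Again purely as an illustration, since the delivered components are described only at a high level here, the three pieces might be bundled as below, reusing the assumed Gaussian family. The class names, the min_samples threshold, and the quantile-based inference are assumptions for the sketch, not the team's actual implementation.

```python
from dataclasses import dataclass, field
import numpy as np
from scipy.stats import norm

@dataclass
class ConditionModel:
    """One parameterization from the assumed common family (Gaussian)."""
    mean: float
    std: float
    n_samples: int

@dataclass
class ModelingService:
    """Illustrative bundle of the three delivered components:
    training, inference, and training-data health monitoring."""
    min_samples: int = 30                      # assumed sufficiency threshold
    models: dict = field(default_factory=dict)

    def train(self, condition: str, samples: np.ndarray) -> None:
        """Training component: fit the condition's parameterization."""
        self.models[condition] = ConditionModel(
            mean=float(np.mean(samples)),
            std=float(np.std(samples, ddof=1)),
            n_samples=len(samples),
        )

    def predict(self, condition: str, quantile: float = 0.5) -> float:
        """Inference component: a quantile of the fitted Gaussian (0.5 gives the mean)."""
        m = self.models[condition]
        return float(norm.ppf(quantile, loc=m.mean, scale=m.std))

    def health(self) -> dict:
        """Health-monitoring component: does each condition have enough data?"""
        return {c: m.n_samples >= self.min_samples for c, m in self.models.items()}

# Example usage with hypothetical data.
rng = np.random.default_rng(1)
svc = ModelingService()
svc.train("condition_a", rng.normal(10.0, 2.0, size=50))
print(svc.predict("condition_a", quantile=0.9), svc.health())
```

The point of the structure is that all three components lean on the same common family: training and inference share one parameterization per condition, and health monitoring simply reports whether each condition's data meets the sufficiency criterion from the analysis.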

The Impact

The team received a robust modeling solution they could adapt to new conditions within days instead of weeks, making them markedly more productive. The model's predictions in production were consistently strong, which allowed the team to showcase their product with confidence to a series of prospective clients by adapting the model to each client's conditions.
