FAQ: frequently asked questions
Where can I learn more about the algorithm used?
How do I get the data?
Use learner.interpolated_on_grid(), optionally with an argument n to specify the number of points.
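A minimal sketch of how this can look (assuming a Learner2D, whose interpolated_on_grid() returns the grid axes together with the interpolated values; the ring function below is illustrative):

```python
import numpy as np

import adaptive

def ring(xy):
    x, y = xy
    return x + np.exp(-((x**2 + y**2 - 0.75**2) ** 2) / 0.01)

learner = adaptive.Learner2D(ring, bounds=[(-1, 1), (-1, 1)])
adaptive.runner.simple(learner, goal=lambda l: l.npoints >= 100)

# Interpolate the scattered samples onto a regular grid;
# `n` controls the number of grid points.
xs, ys, zs = learner.interpolated_on_grid(n=50)
```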
How do I learn more than one value per point?
My runner failed, how do I get the error message?
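One way to inspect the failure, sketched under the assumption that you hold a Runner instance called runner and that the default asyncio-based Runner exposes its underlying asyncio.Task as runner.task:

```python
print(runner.status())     # e.g. "failed" rather than "running" or "finished"
runner.task.print_stack()  # asyncio's standard way to print the task's traceback
```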
Why can I not use a lambda with a learner?
When using the Runner, the learner's function is evaluated in different Python processes. The function therefore needs to be serialized (pickled) and sent to those other processes, and lambdas cannot be pickled. Instead you can probably use functools.partial to accomplish what you want to do.
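A minimal sketch of the pattern (the function f, its exponent parameter, and the learner set-up are illustrative, not part of the FAQ):

```python
import functools

import adaptive

def f(x, exponent=2):
    return x**exponent

# g = lambda x: f(x, exponent=3)   # fails: lambdas cannot be pickled
g = functools.partial(f, exponent=3)  # a picklable equivalent

learner = adaptive.Learner1D(g, bounds=(-1, 1))
# The goal callable runs in the main process and is never pickled,
# so a lambda is fine here; only the learner's function is sent to
# the worker processes.
runner = adaptive.Runner(learner, goal=lambda l: l.loss() < 0.01)
```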
How do I run multiple runners?
Check out Adaptive Scheduler, which solves the problem of needing to run more learners than a single runner can handle. It easily scales to tens of thousands of cores.
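A sketch of what that can look like, loosely following adaptive-scheduler's quickstart (slurm_run, its defaults, and the h/offset set-up are assumptions that may differ per version and cluster; check the Adaptive Scheduler documentation for the real API):

```python
import functools

import adaptive
import adaptive_scheduler

def h(x, offset=0):
    return x + offset

offsets = [0.1 * i for i in range(10)]
learners = [
    adaptive.Learner1D(functools.partial(h, offset=offset), bounds=(-1, 1))
    for offset in offsets
]
fnames = [f"data/learner_{i}" for i in range(len(learners))]

# Each learner becomes its own job on the cluster (here SLURM).
run_manager = adaptive_scheduler.slurm_run(learners, fnames)
run_manager.start()
```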
What is the difference with FEM?
The main difference with FEM (the Finite Element Method) is that FEM requires globally updating the mesh at every step.
For Adaptive, we want to be able to parallelize the function evaluation, which requires an algorithm that can quickly return a new suggested point. This means that, to minimize the time Adaptive spends on adding newly calculated points to the data structure, we only want to update the data of the points that are close to the new point.
What is the difference with Bayesian optimization?
Indeed, there are similarities between what Adaptive does and Bayesian optimization: the choice of new points is based on the previous ones.
There is a tunable algorithm for performing this selection, and the easiest way to formulate this algorithm is by defining a loss function.
Bayesian optimization is a perfectly fine algorithm for choosing new points within Adaptive. As an experiment we have interfaced scikit-optimize and implemented a learner that just wraps it.
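For reference, a sketch of that wrapper in use (SKOptLearner forwards its keyword arguments to skopt.Optimizer, so the exact options, and the noisy test function F, are assumptions that may vary by version; scikit-optimize must be installed):

```python
import numpy as np

import adaptive

def F(x, noise_level=0.1):
    return np.sin(5 * x) * (1 - np.tanh(x**2)) + np.random.randn() * noise_level

# A learner that simply wraps a skopt.Optimizer; the keyword
# arguments below are passed through to scikit-optimize.
learner = adaptive.SKOptLearner(
    F,
    dimensions=[(-2.0, 2.0)],
    base_estimator="GP",
    acq_optimizer="sampling",
    n_initial_points=10,
)
adaptive.runner.simple(learner, goal=lambda l: l.npoints >= 40)
```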
However, there are important reasons why Bayesian optimization doesn't cover all the needs. Often our aim is to explore the function, not to minimize it. Further, Bayesian optimization is most often combined with Gaussian processes, because it is then possible to compute the posterior exactly and formulate a rigorous optimization strategy. Unfortunately, Gaussian processes are computationally expensive and won't be useful with tens of thousands of points. Adaptive is much more simple-minded: it relies only on the local properties of the data, rather than fitting it globally.
We’d say that Bayesian modeling is good for really computationally expensive data, regular grids for really cheap data, and local adaptive algorithms are somewhere in the middle.
Missing a question that you think belongs here? Let us know.