Vectorization with emcee solver #83
A minimal example based on the linear regression integration test:

```python
import numpy as np

from probeye.definition.inverse_problem import InverseProblem
from probeye.definition.forward_model import ForwardModelBase
from probeye.definition.sensor import Sensor
from probeye.definition.likelihood_model import GaussianLikelihoodModel
from probeye.inference.emcee.solver import EmceeSolver

# ============================================================================ #
#                              Set numeric values                              #
# ============================================================================ #

# Options
n_steps = 200
n_initial_steps = 100
n_walkers = 20

# 'true' value of a, and its normal prior parameters
a_true = 2.5
mean_a = 2.0
std_a = 1.0

# 'true' value of b, and its normal prior parameters
b_true = 1.7
mean_b = 1.0
std_b = 1.0

# 'true' value of additive error sd, and its uniform prior parameters
sigma = 0.5
low_sigma = 0.0
high_sigma = 0.8

# the number of generated experiments and the seed for random numbers
n_tests = 50
seed = 1

# ============================================================================ #
#                           Define the Forward Model                           #
# ============================================================================ #

class LinearModel(ForwardModelBase):
    def interface(self):
        self.parameters = [{"a": "m"}, "b"]
        self.input_sensors = Sensor("x")
        self.output_sensors = Sensor("y", std_model="sigma")

    def response(self, inp: dict) -> dict:
        # this method *must* be provided by the user
        x = inp["x"]
        m = inp["m"]
        b = inp["b"]
        return {"y": m * x + b}

    def jacobian(self, inp: dict) -> dict:
        x = inp["x"]  # vector
        one = np.ones((len(x), 1))
        return {"y": {"x": None, "m": x.reshape(-1, 1), "b": one}}

# ============================================================================ #
#                         Define the Inference Problem                         #
# ============================================================================ #

problem = InverseProblem("Linear regression (AME)")

problem.add_parameter(
    "a",
    "model",
    tex="$a$",
    info="Slope of the graph",
    prior=("normal", {"mean": mean_a, "std": std_a}),
)
problem.add_parameter(
    "b",
    "model",
    info="Intersection of graph with y-axis",
    tex="$b$",
    prior=("normal", {"mean": mean_b, "std": std_b}),
)
problem.add_parameter(
    "sigma",
    "likelihood",
    domain="(0, +oo)",
    tex=r"$\sigma$",
    info="Standard deviation of zero-mean additive model error",
    prior=("uniform", {"low": low_sigma, "high": high_sigma}),
)

# add the forward model to the problem
linear_model = LinearModel("LinearModel")
problem.add_forward_model(linear_model)

# ============================================================================ #
#                    Add test data to the Inference Problem                    #
# ============================================================================ #

# data generation; normal likelihood with constant variance around each point
np.random.seed(seed)
x_test = np.linspace(0.0, 1.0, n_tests)
y_true = linear_model.response(
    {linear_model.input_sensor.name: x_test, "m": a_true, "b": b_true}
)[linear_model.output_sensor.name]
y_test = np.random.normal(loc=y_true, scale=sigma)

# add the experimental data
problem.add_experiment(
    "TestSeries_1",
    fwd_model_name="LinearModel",
    sensor_values={
        linear_model.input_sensor.name: x_test,
        linear_model.output_sensor.name: y_test,
    },
)

# ============================================================================ #
#                           Add likelihood model(s)                            #
# ============================================================================ #

# add the likelihood model to the problem
problem.add_likelihood_model(
    GaussianLikelihoodModel(
        prms_def="sigma",
        experiment_name="TestSeries_1",
        model_error="additive",
        name="SimpleLikelihoodModel",
    )
)

# ============================================================================ #
#                      Solve problem with inference engine                     #
# ============================================================================ #

emcee_solver = EmceeSolver(
    problem,
    show_progress=True,
)
inference_data = emcee_solver.run_mcmc(
    n_walkers=n_walkers,
    n_steps=n_steps,
    n_initial_steps=n_initial_steps,
    vectorize=True,
)
```
I looked into this, and I saw that quite a few changes are necessary to make this feature possible. I also think the code would get more complicated, because many more ifs and for-loops would be needed in a lot of methods and functions. So it is possible, but at the cost of additional code complexity. If there is no urgent need, I would not implement this at the moment.
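To illustrate the kind of shape-dependent branching this refers to, here is a hypothetical sketch (not probeye's actual implementation; the function signature and the `(low, high)` domain representation are assumptions) of a domain check that would have to accept both a single parameter vector and a `(n_walkers, n_theta)` batch:

```python
import numpy as np

def check_parameter_domains(theta, domains):
    """Return a boolean per walker: True where all parameters are in-domain.

    theta:   shape (n_theta,) for a single walker, or (n_walkers, n_theta)
             when the sampler runs with vectorize=True.
    domains: list of (low, high) tuples, one per parameter (open intervals).
    """
    theta = np.atleast_2d(theta)  # unify both cases to (n_walkers, n_theta)
    lows = np.array([d[0] for d in domains])
    highs = np.array([d[1] for d in domains])
    # broadcasting compares every walker's vector against the bounds at once
    inside = (theta > lows) & (theta < highs)
    return inside.all(axis=1)  # one boolean per walker
```

Vectorizing via broadcasting like this avoids explicit per-walker loops, but every function along the call chain would still need to agree on whether it receives one vector or a batch.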
Emcee allows for vectorized calls to the log-likelihood function by setting `vectorize=True` in the `EnsembleSampler`. With this option enabled, the input to the log-likelihood function is a 2D array of `theta` values of size `(n_walkers x n_theta)`. Doing this in probeye results in an error in `check_parameter_domains` (see the minimal example above). Having this functionality could be useful for: