cometspec.mcmc¶
MCMC fitting kernel for the multi-isotopologue fluorescence model.
- cometspec.mcmc.mcmc_fitting(data=None, window=None, *, pumping=None, isotopologues='12C14N', systems=None, linelists=None, include_rotations=True, include_deltaJ0_parity_mix=True, require_X_only_for_rot=True, priors=None, nwalkers=50, nsteps=1000, n_cores=None, lsf=None, lsf_method=None, make_plots=False, progress=True, A_min=10000.0, a=3, velocity_kms=0.0, delta_lambda_A=0.0, init_logQ=None, init_logQ_by_iso=None, init_T=300.0, init_T_by_iso=None, init_v_kms=0.0, init_v_kms_by_iso=None, init_dlam=0.0, init_dlam_by_iso=None, init_logN=None, init_logN_by_iso=None, init_sigma=None, init_sigma1=None, init_sigma2=None, init_sigma_G=None, init_fwhm_L=None, init_ratio=None, fig_file=None, wave_col='WAVE', flux_col='FLUX_STACK', error_col='ERR_STACK', continuum_col='CONTINUUM', omega=None, verbose=True, pruning=True, N_Model=20000, config=None)[source]¶
Run MCMC fitting for the fluorescence model in a wavelength window.
This routine builds the model for one or more isotopologues, applies an optional line-spread function (LSF), and samples posterior distributions for the parameters provided in
priors.This is the key function for fitting. It can be used standalone, but is used by
cometspec.fluorescence.FluorescenceModelas its default fitting method.Defaults and selector behavior¶
isotopologuesdefaults to"12C14N".systemsdefaults toNone, which maps to["BX00", "AX_dv1"]. Only used for CN; ignored for other species.- Accepted string selectors for
systemsinclude: "both"/"bx+ax"/"bxax"->["BX00", "AX_dv1", "AX_dv2", "AX_dv3"]"all"->["ALL"]Extremely high computation cost; not recommended."bx","b-x","bx(0,0)","bx00","bx_00","b_x_00"->["BX00"]"ax"/"a-x"->["AX_dv1", "AX_dv2"]"ax(dv=1)"/"ax_dv1"->["AX_dv1"]"ax(dv=2)"/"ax_dv2"->["AX_dv2"]"ax(dv=3)"/"ax_dv3"->["AX_dv3"]
- Accepted string selectors for
emcee
nwalkers=50andnsteps=1000by default.Collisions are gated by
logQ: if neither a per-isologQ_{iso}nor a sharedlogQprior is given it will fall back toinit_logQ, and ifinit_logQorinit_logQ_by_isoare also not given for an isotopologue, that isotopologue is treated as collisionless. Other collision controls default toinclude_deltaJ0_parity_mix=True(to allow \(\Delta J = 0\)) andrequire_X_only_for_rot=True(to enable just collisions to the lower electronic state found).It is recommended to set explicitly
include_rotations=Falseif no collisions is the desired behavior.Pumping-shift controls default to
velocity_kms=0.0anddelta_lambda_A=0.0. These values are used when computing \(J_\nu\) for the radiative rates and they are different fromv_kmsanddlamparameters of a model which are shifts of the output spectrum lines.Fallback parameter values starts with
init_check the default behavior here if this function is used directly, check them incometspec.fluorescence.FluorescenceModelif this function is used via the model’s default fitting method.If parameters are not present in the priors, they will fall to the
init_default values (if provided) or to the hardcoded defaults in the model_flux function (e.g. T=300K, v_kms=0, etc). This means that if you want to fit for a parameter but don’t provide a prior for it, it will not be fitted and instead will use the fallback value. This allows you to control which parameters are fitted and which are fixed without having to change the code of this function.
Note
Priors consist on a dict mapping parameter name to a (min, max) tuple defining the uniform prior range for that parameter. The possible keys are:
"logN","logQ","T","dlam","v_kms", ‘logN_<isotopologue>’, ‘logQ_<isotopologue>’, ‘T_<isotopologue>’, ‘dlam_<isotopologue>’, ‘v_kms_<isotopologue>’ (any iso not found falls back to <parameter>) and lsf priors,sigma,sigma1,sigma2,sigma_G,fwhm_L,ratio.- Parameters:
data (
Any) – Observed spectrum table or DataFrame which is going to be fitted.window (
tuple[float,float]) – Wavelength fitting window(min_A, max_A).pumping (
Any) – Pumping spectrum withWAVEandFLUXcolumns.isotopologues (
strorSequence[str], optional, default"12C14N") – One or more isotopologue labels.systems (
strorSequence[str], optional, defaultNone) – CN system selector(s). IfNone, uses["BX00", "AX_dv1"].linelists (
pandas.DataFrameordict[str,pd.DataFrame]orSequence[pd.DataFrame], optional, defaultNone) – Optional normalized line-list DataFrame or isotopologue mapping.priors (
dict[str,tuple[float,float]]) – Parameter prior ranges. Required.nwalkers (
int, optional, default50) – Number of walkers.nsteps (
int, optional, default1000) – Number of MCMC steps.n_cores (
int, optional, defaultNone) – Number of CPU cores used to parallelize the per-step walker likelihood evaluations through themultiprocesslibrary (a drop-in replacement for the standardmultiprocessingthat usesdillfor serialization).Noneor1keeps the sampler single-threaded (the prior default behavior). Values>1spawn aspawn-based pool that emcee uses to evaluate walkers in parallel; the value is clamped toos.cpu_count(). Thespawnstart method is used on every platform so the pool is safe on macOS (Apple Accelerate’s GCD-backed BLAS deadlocks or segfaults when a multi-threaded process forks) as well as on Linux and Windows. Usingmultiprocess(withdill) instead of stockmultiprocessingmeans user-providedlsfcallables defined as lambdas, closures, or top-level functions in a Jupyter notebook cell are serialized correctly — stock pickle would silently fail on those and hang the sampler. Each worker has its BLAS thread pool pinned to 1 viathreadpoolctlto avoid thread oversubscription. The tqdm progress bar (whenprogress=True) keeps working under both modes.Important
When calling this function with
n_cores > 1from a.pyscript (not a Jupyter notebook), the call must be guarded byif __name__ == "__main__":. Eachspawnworker re-imports the user’s script, and without the guard the call would re-execute recursively; Python detects this and aborts with aRuntimeError. No guard is needed in notebooks or whenn_coresisNone/1.Note
Whether
n_cores>1is faster thann_cores=1depends on how expensive each likelihood evaluation is, which scales with the line-list size — driven byA_min, the number of isotopologues and systems, and the window width. Multiprocessing adds a fixed IPC overhead per emcee step (pickle + pipe roundtrip per walker half). When eachlnprobcall is slow (large line list — e.g.A_min=1e4for CN with B-X + A-X, or multi-isotopologue runs) the per-step cost dwarfs IPC andn_cores=4typically gives a ~3-4x speedup. When each call is cheap (small line list — e.g.A_min≥ 1e6 with a single isotopologue, or a narrow window with few transitions) IPC overhead can dominate andn_cores=1may end up equal or faster. If unsure, time one short run at each setting on your problem before committing to a long MCMC.lsf (
Callable[[numpy.ndarray],numpy.ndarray], optional, defaultNone) – Optional custom LSF callable.lsf_method (
str, optional, defaultNone) – Built-in LSF method name. It must be one of"2Gauss","Gauss_Lorentz","Gauss", or"Lorentz". Required iflsfis not provided.include_rotations (
bool, optional, defaultTrue) – Enable rotational collisions in the rate matrix.omega (
float, optional, defaultNone) – Optional aperture solid angle in sr.config (
MCMCFitConfig, optional, defaultNone) – Optional grouped configuration. Any field set on this object overrides the corresponding individual keyword argument. Fields left atcometspec.config.UNSET(the default) are ignored, so callers that pass individual kwargs not in the config are unaffected. Seecometspec.config.MCMCFitConfig.
- Other Parameters:
include_deltaJ0_parity_mix (
bool, optional, defaultTrue) – Allow parity-changingDelta J = 0collisions.require_X_only_for_rot (
bool, optional, defaultTrue) – Restrict collisions to the lower electronic state (auto-detected as thelower_eslabel with the smallest minimumE_lower_cm1; works for any spectroscopic notation).make_plots (
bool, optional, defaultFalse) – Generate diagnostic plots (fit, corner and traces).progress (
bool, optional, defaultTrue) – Show emcee progress output.A_min (
float, optional, default1e4) – Minimum Einstein A threshold for line lists. User provided line lists are not filtered by A_ula (
float, optional, default3) – Stretch-move parameter for emcee.velocity_kms (
float, optional, default0.0) – Velocity shift used when evaluating pumping J_nu.delta_lambda_A (
float, optional, default0.0) – Additive wavelength shift used when evaluating pumping J_nu.init_logQ (
float, optional, defaultNone) – FallbacklogQvalue used by every isotopologue when nologQprior is sampled and no per-iso entry is given.Nonedisables collisions for any isotopologue not covered byinit_logQ_by_iso.init_logQ_by_iso (
dict[str,floatorNone], optional, defaultNone) – Per-isotopologue fallbacklogQmap. Each value may beNoneto force that isotopologue to be collisionless. Isotopologues not present in the map fall back toinit_logQ.init_T (
float, optional, default300.0) – Global fallback temperature whenTis not sampled and no per-iso entry is given.init_T_by_iso (
dict[str,float], optional, defaultNone) – Per-isotopologue fallback temperature map. For each isotopologue,init_T_by_iso[iso]is used when neitherT_{iso}norTappears in the sampled parameters. Isotopologues not in this map fall back toinit_T.init_v_kms (
float, optional, default0.0) – Global fallback emission velocity shift whenv_kmsis not sampled and no per-iso entry is given.init_v_kms_by_iso (
dict[str,float], optional, defaultNone) – Per-isotopologue fallback emission velocity shift. Falls back toinit_v_kmsfor isotopologues not present in the map.init_dlam (
float, optional, default0.0) – Global fallback emission wavelength shift whendlamis not sampled and no per-iso entry is given.init_dlam_by_iso (
dict[str,float], optional, defaultNone) – Per-isotopologue fallback additive wavelength shift. Falls back toinit_dlamfor isotopologues not present in the map.init_logN (
float, optional, defaultNone) – Global fallbacklogNwhen neitherlogN_{iso}norlogNappears in the sampled parameters and no per-iso entry is given.init_logN_by_iso (
dict[str,float], optional, defaultNone) – Per-isotopologue fallbacklogNmap. For each isotopologue,init_logN_by_iso[iso]is used when neitherlogN_{iso}norlogNare sampled. Falls back toinit_logNfor isotopologues not present in the map.init_sigma (
float, optional, defaultNone) – Fallback Gaussian sigma when"sigma"is not sampled.init_sigma1 (
float, optional, defaultNone) – Fallbacksigma1for"2Gauss"when not sampled.init_sigma2 (
float, optional, defaultNone) – Fallbacksigma2for"2Gauss"when not sampled.init_sigma_G (
float, optional, defaultNone) – Fallbacksigma_Gfor"Gauss_Lorentz"when not sampled.init_fwhm_L (
float, optional, defaultNone) – Fallback Lorentzian FWHM when not sampled.init_ratio (
float, optional, defaultNone) – Fallback mixture ratio when not sampled.fig_file (
str, optional, defaultNone) – Base path for output figures.wave_col (
str, optional, default"WAVE") – Wavelength column indata.flux_col (
str, optional, default"FLUX_STACK") – Flux column indata.error_col (
str, optional, default"ERR_STACK") – Uncertainty column indata.continuum_col (
str, optional, default"CONTINUUM") – Continuum column indata.pruning (
bool, optional, defaultTrue) – Apply posterior pruning.N_Model (
int, optional, default20000) – Number of elements in the model grid.
- returns:
dict[str,Any]– Dictionary with posterior summaries, samples, and model envelopes. Keys:param_keys(listofstr) – Ordered list of sampled parameter names, matching the column order ofsamples_pruned.median_params(dict[str,float]) – Posterior median for each parameter.up_errors_params(dict[str,float]) – Upper 1-sigma error (p84 - p50) for each parameter.low_errors_params(dict[str,float]) – Lower 1-sigma error (p50 - p16) for each parameter.samples_pruned(numpy.ndarrayoffloat, shape(N_samples, ndim)) – Pruned and burn-in-removed posterior samples.lnprob_pruned(numpy.ndarrayoffloat, shape(N_samples,)) – Log-probability for each pruned sample.model_wave(numpy.ndarrayoffloat, shape(N_Model,)) – Wavelength grid over the fitting window.median_model(numpy.ndarrayoffloat, shape(N_Model,)) – Model flux evaluated at the posterior median parameters.model_p16(numpy.ndarrayoffloat, shape(N_Model,)) – 16th percentile of the model ensemble (lower 1-sigma envelope).model_p84(numpy.ndarrayoffloat, shape(N_Model,)) – 84th percentile of the model ensemble (upper 1-sigma envelope).best_model(numpy.ndarrayoffloat, shape(N_Model,)) – Model flux evaluated at the highest-likelihood sample.
- raises ValueError:
If priors or required parameters are inconsistent.
- raises ValueError:
If required columns are missing from the data.
Notes
If extreme prior values cause the rate-matrix solver to fail (
LinAlgErrorfrom a numerically degenerate matrix, typically due to NaN/Inf collision rates at very high Q),solve_with_normalization_fastreturns NaN populations.lnlikethen detects a non-finite model and returns-inf, so the walker step is rejected gracefully rather than crashing.
- cometspec.mcmc.H_CGS¶
Planck constant, in erg*s.