cometspec.linelist
Line-list parsing and normalization routines.
Routines
normalize_cn_systems_arg() – Normalize user-friendly CN system selectors to canonical tokens.
from_user_linelist() – Convert a user line list into the normalized transition schema.
make_sym() – Build a symmetry label.
from_cn_brooke() – Convert a Brooke CN line list (e.g. from load_cn_linelist()) to the normalized schema.
filter_cn_systems() – Filter a Brooke CN line list by system, wavelength, and A (Einstein coefficient) threshold.
load_default_transitions() – Load and normalize packaged transitions per isotopologue.
resolve_linelists_with_defaults() – Resolve user-supplied linelists, filling in defaults for any missing isotopologues.
default_linelist_source() – Return the file path that would be loaded for a given isotopologue from packaged defaults.
linelist_origins() – Return a mapping of isotopologue to source description for a set of linelists.
attach_pumping_and_labels() – Attach pumping information and human-friendly labels to a transition table. This ensures that the solar pumping information is correctly associated with each transition.
- cometspec.linelist.normalize_cn_systems_arg(systems)
Translate user-friendly CN band-system selectors into canonical tokens.
This is the input-parser for any function that needs to know which CN band system(s) to operate on. It accepts a variety of human-friendly spellings (case-insensitive, with or without dashes/parentheses) and maps each one to a fixed set of internal tokens used downstream. A sequence of selectors is also accepted; results are flattened and deduplicated while preserving order.
The canonical (output) tokens are:
"BX00" – \(B^{2}\Sigma^{+} \to X^{2}\Sigma^{+}\) violet system, \((v', v'') = (0, 0)\) band (~388 nm).
"AX_dv0" – \(A^{2}\Pi \to X^{2}\Sigma^{+}\) red system, \(\Delta v = |v' - v''| = 0\) sequence.
"AX_dv1" – \(A^{2}\Pi \to X^{2}\Sigma^{+}\) red system, \(\Delta v = |v' - v''| = 1\) sequence.
"AX_dv2" – A–X red system, \(\Delta v = 2\) sequence.
"AX_dv3" – A–X red system, \(\Delta v = 3\) sequence.
"XX" – All X–X transitions.
"ALL" – All transitions in the line list. Using this token results in extremely long computation times.
Recognized input forms (all matched case-insensitively after stripping):
None – default selection, returns ["BX00", "AX_dv1"].
"both", "bx+ax", "bxax" – the violet \((0,0)\) band plus the \(\Delta v = 1\), \(2\), and \(3\) red sequences.
"all" – returns ["ALL"].
"bx", "b-x", "bx(0,0)", "bx00", "bx_00", "b_x_00" – the violet \((0,0)\) band.
"ax", "a-x" – the \(\Delta v = 1\) and \(\Delta v = 2\) red sequences.
"ax(dv=0)", "ax_dv0" – A–X \(\Delta v = 0\) only.
"ax(dv=1)", "ax_dv1" – A–X \(\Delta v = 1\) only.
"ax(dv=2)", "ax_dv2" – A–X \(\Delta v = 2\) only.
"ax(dv=3)", "ax_dv3" – A–X \(\Delta v = 3\) only.
"xx" – all X–X transitions.
Any other string – passed through unchanged as a single-element list, letting the caller handle (or reject) unknown tokens.
A sequence (list, tuple, …) of any of the above – each element is normalized recursively, results are concatenated, and duplicates are removed while preserving first-occurrence order.
- Parameters:
systems (str or sequence of str, optional) – Band-system selector(s). See the list of recognized forms above.
- Returns:
list of str – Canonical token list, in first-occurrence order of the input and with duplicates removed.
Examples
>>> normalize_cn_systems_arg(None)
['BX00', 'AX_dv1']
>>> normalize_cn_systems_arg("both")
['BX00', 'AX_dv1', 'AX_dv2', 'AX_dv3']
>>> normalize_cn_systems_arg("BX")
['BX00']
>>> normalize_cn_systems_arg(["bx", "ax_dv1", "bx"])  # dedup, order preserved
['BX00', 'AX_dv1']
>>> normalize_cn_systems_arg("unknown")
['unknown']
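The normalization rules above can be sketched as a small alias table plus a recursive flatten-and-deduplicate step. This is a minimal illustration, not the package's implementation: the name normalize_systems_sketch and the _ALIASES table are hypothetical and were reconstructed only from the recognized forms documented here.

```python
# Hypothetical alias table rebuilt from the documented input forms;
# the real cometspec implementation may be organized differently.
_RED_123 = ["AX_dv1", "AX_dv2", "AX_dv3"]
_ALIASES = {
    "both": ["BX00", *_RED_123],
    "bx+ax": ["BX00", *_RED_123],
    "bxax": ["BX00", *_RED_123],
    "all": ["ALL"],
    "ax": ["AX_dv1", "AX_dv2"],
    "a-x": ["AX_dv1", "AX_dv2"],
    "xx": ["XX"],
}
for _name in ("bx", "b-x", "bx(0,0)", "bx00", "bx_00", "b_x_00"):
    _ALIASES[_name] = ["BX00"]
for _n in range(4):
    _ALIASES[f"ax(dv={_n})"] = [f"AX_dv{_n}"]
    _ALIASES[f"ax_dv{_n}"] = [f"AX_dv{_n}"]


def normalize_systems_sketch(systems=None):
    """Return canonical tokens for a selector or a sequence of selectors."""
    if systems is None:
        return ["BX00", "AX_dv1"]  # documented default selection
    if isinstance(systems, str):
        key = systems.strip().lower()
        # Unknown strings are passed through unchanged as a one-element list.
        return list(_ALIASES.get(key, [systems]))
    tokens = []
    for item in systems:
        for token in normalize_systems_sketch(item):
            if token not in tokens:  # keep the first occurrence only
                tokens.append(token)
    return tokens


print(normalize_systems_sketch(["ax", "both"]))
```

Because duplicates are dropped at first occurrence, the order of the selectors in the input decides the order of the tokens in the output.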
- cometspec.linelist.from_user_linelist(df, *, lam_col, A_col, upper_id_col, lower_id_col, g_upper_col, g_lower_col, lower_es_col=None, lower_v_col=None, lower_J_col=None, lower_sym_col=None, E_lower_cm1_col=None)
Convert a user line list into the normalized transition schema.
- Parameters:
df (pandas.DataFrame) – Input line list table.
lam_col (str) – Wavelength column in vacuum \(\AA\).
A_col (str) – Einstein \(A\) coefficient column in \(\mathrm{s}^{-1}\).
upper_id_col (str) – Upper-state identifier column.
lower_id_col (str) – Lower-state identifier column.
g_upper_col (str) – Upper-state degeneracy column.
g_lower_col (str) – Lower-state degeneracy column.
lower_es_col (str, optional, default None) – Optional lower electronic-state column.
lower_v_col (str, optional, default None) – Optional lower vibrational-level column.
lower_J_col (str, optional, default None) – Optional lower rotational-level column.
lower_sym_col (str, optional, default None) – Name of an optional column holding a composite lower-state spin-orbit/parity label. For Brooke-style line lists this is typically the concatenation of the lower-state \(F''\), \(p''\), and \(eS''\) columns, which together identify the fine-structure/parity sublevel within its electronic state.
E_lower_cm1_col (str, optional, default None) – Optional lower-state energy column in \(\mathrm{cm}^{-1}\). These values are used to compute the energy difference \(\Delta E\) between a pair of levels for collisions.
- Returns:
pandas.DataFrame – Normalized transition table. The output contains E_cm1 and, optionally, E_lower_cm1. These are distinct quantities: E_cm1 is the transition energy (derived from the line wavelength), whereas E_lower_cm1 is the energy of the lower state relative to the ground state.
- Raises:
ValueError – If required columns are missing or values are invalid.
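The difference between E_cm1 and E_lower_cm1 can be made concrete with a short calculation. The sketch below is illustrative only: the helper transition_wavenumber_cm1 is hypothetical, and the wavelength and lower-state energy are made-up numbers, not values from any packaged line list.

```python
def transition_wavenumber_cm1(lambda_vac_A):
    """Convert a vacuum wavelength in Angstrom to a wavenumber in cm^-1.

    1 Angstrom = 1e-8 cm, so 1 / lambda[cm] = 1e8 / lambda[Angstrom].
    """
    if lambda_vac_A <= 0:
        raise ValueError("wavelength must be positive")
    return 1.0e8 / lambda_vac_A


# Illustrative numbers only.
lam_A = 3880.0        # line wavelength in vacuum Angstrom
E_lower_cm1 = 50.0    # lower-state energy above the ground state

E_cm1 = transition_wavenumber_cm1(lam_A)  # energy carried by the transition
E_upper_cm1 = E_lower_cm1 + E_cm1         # upper-state energy above ground

print(f"E_cm1 = {E_cm1:.3f} cm^-1, E_upper = {E_upper_cm1:.3f} cm^-1")
```

E_cm1 depends only on the line wavelength, while E_lower_cm1 places the lower level on an absolute energy scale, which is what a level-to-level \(\Delta E\) requires.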
- cometspec.linelist.make_sym(F, p, use_omega=False, es=None)
Build a compact CN-style symmetry label.
- cometspec.linelist.from_cn_brooke(df, *, lam_col='lambda_vac_A_from_Cal', A_col='A', use_omega_labels=False, E_lower_col="E''")
Convert a Brooke CN line list (e.g. the output of load_cn_linelist()) to the normalized schema.
- Parameters:
df (pandas.DataFrame) – Brooke-format CN line list.
lam_col (str, optional, default "lambda_vac_A_from_Cal") – Wavelength column in vacuum Angstrom.
A_col (str, optional, default "A") – Einstein A coefficient column.
use_omega_labels (bool, optional, default False) – Use Omega labels for A-state symmetry tags.
E_lower_col (str, optional, default "E''") – Lower-state energy column in \(\mathrm{cm}^{-1}\).
- Returns:
pandas.DataFrame – Normalized transition table. Each row is one rovibronic transition.
lambda_vac_A (float) – Vacuum wavelength in Å.
A_ul (float) – Einstein \(A\) coefficient (spontaneous emission rate), in \(\mathrm{s}^{-1}\).
upper_id (str) – String key identifying the upper level, formatted as ES|v=V|J=J|sym=S.
lower_id (str) – String key identifying the lower level, formatted as ES|v=V|J=J|sym=S.
g_upper (float) – Upper-level degeneracy.
g_lower (float) – Lower-level degeneracy.
E_cm1 (float) – Transition energy in \(\mathrm{cm}^{-1}\), computed as \(1/\lambda_{\mathrm{vac}}\) with \(\lambda_{\mathrm{vac}}\) expressed in cm.
lower_es (str) – Lower electronic state label (e.g. X).
lower_v (float) – Lower vibrational quantum number \(v''\).
lower_J (float) – Lower rotational quantum number \(J''\).
lower_sym (str) – Lower-level symmetry tag (e/f parity and \(\Omega\) component).
E_lower_cm1 (float) – Lower-state energy in \(\mathrm{cm}^{-1}\), taken directly from E_lower_col. These values are used to compute the energy difference \(\Delta E\) between a pair of levels for collisions.
- Raises:
ValueError – If required columns are missing or contain invalid values.
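The ES|v=V|J=J|sym=S key format used for upper_id and lower_id can be built and split again with plain string operations. The helpers below are hypothetical and only illustrate the layout; how the package formats numeric values (for example 0 versus 0.0) and the symmetry tag "F1e" used here are assumptions.

```python
def make_level_id(es, v, J, sym):
    """Build a level key with the documented layout ES|v=V|J=J|sym=S."""
    return f"{es}|v={v}|J={J}|sym={sym}"


def parse_level_id(level_id):
    """Split a level key back into its labelled fields (all kept as strings)."""
    es, *fields = level_id.split("|")
    parsed = {"es": es}
    for field in fields:
        # partition() splits on the first "=" only, so values may contain "=".
        key, _, value = field.partition("=")
        parsed[key] = value
    return parsed


key = make_level_id("X", 0, 1.5, "F1e")  # "F1e" is an illustrative tag
print(key)
print(parse_level_id(key))
```

Keeping the whole level description in one string makes the identifier usable directly as a dictionary key or a DataFrame join column.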
- cometspec.linelist.filter_cn_systems(df_all, *, systems=None, lambda_min_A=2990.001, lambda_max_A=10009.998, A_min=10000.0, lam_col='lambda_vac_A_from_Cal')
Filter a Brooke CN line list by system, wavelength, and A (Einstein coefficient) threshold.
- Parameters:
df_all (pandas.DataFrame) – Full Brooke/Sneden CN table.
systems (str or Sequence[str], optional, default None) – System selector(s) accepted by normalize_cn_systems_arg().
lambda_min_A (float, optional, default 2990.001) – Minimum wavelength in Angstrom.
lambda_max_A (float, optional, default 10009.998) – Maximum wavelength in Angstrom.
A_min (float, optional, default 1e4) – Minimum Einstein A threshold, or None to disable.
lam_col (str, optional, default "lambda_vac_A_from_Cal") – Wavelength column name.
- Returns:
pandas.DataFrame– Filtered CN line list.
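The wavelength window and Einstein A threshold can be illustrated on plain dictionaries instead of a DataFrame. This is a hedged sketch: filter_rows_sketch is a hypothetical stand-in, the inclusive bounds and the >= comparison against A_min are assumptions, and the band-system selection is omitted because the Brooke columns it relies on are not described here.

```python
def filter_rows_sketch(rows, *, lambda_min_A=2990.001, lambda_max_A=10009.998,
                       A_min=1.0e4, lam_col="lambda_vac_A_from_Cal", A_col="A"):
    """Keep rows inside the wavelength window and at or above the A threshold."""
    kept = []
    for row in rows:
        lam = row[lam_col]
        if not (lambda_min_A <= lam <= lambda_max_A):  # assumed inclusive
            continue
        if A_min is not None and row[A_col] < A_min:   # None disables the cut
            continue
        kept.append(row)
    return kept


# Made-up rows for illustration only.
rows = [
    {"lambda_vac_A_from_Cal": 3883.4, "A": 1.2e7},   # strong, in range
    {"lambda_vac_A_from_Cal": 3883.9, "A": 5.0e2},   # weak, in range
    {"lambda_vac_A_from_Cal": 12000.0, "A": 3.0e5},  # strong, out of range
]

print(len(filter_rows_sketch(rows)))              # default threshold
print(len(filter_rows_sketch(rows, A_min=None)))  # threshold disabled
```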
- cometspec.linelist.load_default_transitions(*, isotopologues='12C14N', systems=None, A_min=10000.0, lambda_min_A=2990.001, lambda_max_A=10009.998, use_omega_labels=False, line_paths=None)
Load and normalize packaged default transitions per isotopologue. The supported labels are “12C2”, “12C13C”, “13C2”, “12C14N”, “13C14N”, “12C15N”, and “Fe”. For CN, an isotopologue that is not found falls back to “12C14N”. Any label containing “Fe” loads the fe_normalized.csv file. For C2, an isotopologue that is not found causes the call to fail. If CN is chosen, the band systems to include can be selected with the systems argument. The default selection is BX(0,0) and AX(Δv=±1). The systems argument takes one or more of the following strings:
“both” or “bx+ax”: BX(0,0), AX(Δv=±1), AX(Δv=±2) and AX(Δv=±3)
“all”: all systems in the Brooke line list (including minor ones; this leads to extremely high computation times)
“bx”, “b-x”, “bx(0,0)”, “bx00”, “bx_00” or “b_x_00”: BX(0,0) only
“ax” or “a-x”: AX(Δv=±1) and AX(Δv=±2)
“ax(dv=0)”, “ax_dv0”: AX(Δv=0) only
“ax(dv=1)”, “ax_dv1”: AX(Δv=±1) only
“ax(dv=2)”, “ax_dv2”: AX(Δv=±2) only
“ax(dv=3)”, “ax_dv3”: AX(Δv=±3) only
“xx”: all X–X transitions
The references for each line list are given at the end [1] [2] [3] [4] [5]. Additional filters applied to the line lists are explained in [6].
Note
Rows in the line lists with missing or invalid values in any of the required columns are dropped.
- Parameters:
isotopologues (str or Sequence[str], optional, default "12C14N") – One or more isotopologue labels.
systems (str or Sequence[str], optional, default None) – CN system selector(s).
A_min (float, optional, default 1e4) – Minimum Einstein A threshold.
lambda_min_A (float, optional, default 2990.001) – Minimum wavelength in Angstrom.
lambda_max_A (float, optional, default 10009.998) – Maximum wavelength in Angstrom.
use_omega_labels (bool, optional, default False) – Use Omega labels for A-state symmetry tags.
line_paths (dict[str, str], optional, default None) – Optional mapping of isotopologue to explicit file path.
- Returns:
dict[str, pandas.DataFrame] – Dictionary mapping isotopologue label to normalized transition table. The keys are exactly the entries in isotopologues; the values are DataFrames with the same schema as described in from_cn_brooke() or from_user_linelist().
References
- cometspec.linelist.resolve_linelists_with_defaults(linelists, iso_list, *, systems=None, A_min=10000.0, lambda_min_A=2990.001, lambda_max_A=10009.998, use_omega_labels=False, line_paths=None)
Match user-supplied line lists to a list of isotopologues, loading packaged defaults for any isotopologue left without one. When the line lists are given positionally and there are fewer line lists than isotopologues, the remaining isotopologues are loaded from the default line lists. To mix custom and default line lists in this form, order iso_list so that the isotopologues with provided line lists come first and those without come last; otherwise the pairing will be wrong.
Resolution rules:
linelists is None -> every isotopologue is loaded from packaged defaults via load_default_transitions().
Single pandas.DataFrame -> assigned to iso_list[0]; the remaining isotopologues fall back to defaults.
dict mapping isotopologue label to DataFrame -> entries are used for matching labels in iso_list; any label not present in the dict falls back to defaults. Keys not in iso_list are ignored.
Sequence (list/tuple) of DataFrames -> positional pairing with the first len(linelists) entries of iso_list; the remainder fall back to defaults.
Loading a default for an isotopologue without a packaged file (e.g. "COH") raises ValueError from load_default_transitions().
- Parameters:
linelists (pandas.DataFrame or dict[str, pandas.DataFrame] or Sequence[pandas.DataFrame] or None) – User-supplied line list(s). See the resolution rules above.
iso_list (Sequence[str]) – Isotopologue labels, in the order they should be returned. Each label is matched against the user-supplied line lists (if any) according to the resolution rules above, and any isotopologue without a user-supplied line list is loaded from the packaged defaults.
systems (str or Sequence[str], optional, default None) – CN system selector(s) for default CN line lists. See normalize_cn_systems_arg() for accepted forms.
A_min (float, optional, default 1e4) – Minimum Einstein A threshold for default line lists, or None to disable.
lambda_min_A (float, optional, default 2990.001) – Minimum wavelength in Angstrom for default line lists.
lambda_max_A (float, optional, default 10009.998) – Maximum wavelength in Angstrom for default line lists.
use_omega_labels (bool, optional, default False) – Use Omega labels for A-state symmetry tags in default CN line lists.
line_paths (dict[str, str], optional, default None) – Optional mapping of isotopologue to explicit file path, used when loading default line lists.
- Returns:
dict[str, pandas.DataFrame] – {iso: DataFrame} ordered exactly as iso_list.
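The four resolution rules can be sketched without pandas by treating each line list as an opaque object. In this illustration resolve_sketch is hypothetical and load_default is a placeholder callable standing in for load_default_transitions(). Ignoring surplus entries when a sequence is longer than iso_list is an assumption, since that case is not documented.

```python
from collections.abc import Mapping


def resolve_sketch(linelists, iso_list, load_default):
    """Pair user line lists with isotopologues; fill the rest from defaults."""
    if linelists is None:
        user = {}
    elif isinstance(linelists, Mapping):
        # Only labels present in iso_list are used; other keys are ignored.
        user = {iso: linelists[iso] for iso in iso_list if iso in linelists}
    elif isinstance(linelists, (list, tuple)):
        # Positional pairing; zip() stops at the shorter of the two inputs.
        user = dict(zip(iso_list, linelists))
    else:
        # A single table is assigned to the first isotopologue.
        user = {iso_list[0]: linelists}
    # Build the result in iso_list order so the output order matches it.
    return {iso: user[iso] if iso in user else load_default(iso)
            for iso in iso_list}


def fake_default(iso):
    """Placeholder loader used only for this illustration."""
    return f"default:{iso}"


isos = ["12C14N", "13C14N"]
print(resolve_sketch(["user-12C14N"], isos, fake_default))
print(resolve_sketch({"13C14N": "user-13C14N"}, isos, fake_default))
```

The dict form is independent of ordering, so it is the less error-prone choice when only some isotopologues have custom line lists.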
- cometspec.linelist.default_linelist_source(iso)
Return the file path that would be loaded for iso from packaged defaults.
- Parameters:
iso (str) – Isotopologue label.
- Returns:
str – File path that would be loaded for iso from packaged defaults.
- Raises:
ValueError – If iso does not match any supported default pattern (CN-like, C2-like, or containing "Fe"). The packaged default line lists cover 12C14N, 13C14N, 12C15N, 12C2, 13C2, 12C13C, and any label containing "Fe".
- cometspec.linelist.linelist_origins(linelists, iso_list, *, line_paths=None)
Return a per-isotopologue origin string (file) for the configured line lists.
Mirrors the resolution rules of resolve_linelists_with_defaults():
Entries supplied by the user (DataFrame, dict entry, or positional list slot) are reported as "custom (user-provided)".
Entries with an explicit override in line_paths are reported as that path.
Otherwise the path returned by default_linelist_source() is used.
No data are loaded; only the origin is determined.
- Parameters:
linelists (pandas.DataFrame or dict[str, pandas.DataFrame] or Sequence[pandas.DataFrame] or None) – User-supplied line list(s). See the resolution rules in resolve_linelists_with_defaults().
iso_list (Sequence[str]) – Isotopologue labels, in the order they should be returned. Each label is matched against the user-supplied line lists (if any) according to the resolution rules above, and any isotopologue without a user-supplied line list is assigned the origin of the packaged default.
line_paths (dict[str, str], optional, default None) – Optional mapping of isotopologue to explicit file path, used for reporting the origin of any isotopologue without a user-supplied line list. If an isotopologue is present in this dict, its origin is reported as the corresponding path instead of the default path returned by default_linelist_source(). This is intended for cases where the user has provided a custom path.
- Returns:
dict[str, str] – Mapping of isotopologue label to origin string (e.g. file path). The keys are exactly the entries in iso_list; the values are determined according to the resolution rules above.
- cometspec.linelist.attach_pumping_and_labels(df, pumping, *, line_v_kms=0.0, line_dlam_A=0.0, lsf_for_Jnu=None, lam_col='lambda_vac_A')
Attach the solar flux incident on the comet at each line wavelength to a transition table.
- Parameters:
df (pandas.DataFrame) – Normalized transition DataFrame.
pumping (Any) – Pumping spectrum with WAVE and FLUX columns.
line_v_kms (float, optional, default 0.0) – Doppler velocity shift applied to line wavelengths, in km/s.
line_dlam_A (float, optional, default 0.0) – Additive wavelength shift in Angstrom.
lsf_for_Jnu (Callable[[numpy.ndarray], numpy.ndarray], optional, default None) – Optional kernel used to average flux around each line.
lam_col (str, optional, default "lambda_vac_A") – Input wavelength column name in df.
- Returns:
astropy.table.Table – Astropy table with wavelength, frequency, flux-at-line, J_nu and the original DataFrame columns.
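One way to picture the two wavelength shifts and the flux lookup is the sketch below. It is not the package's algorithm. The non-relativistic Doppler formula, the sign convention (positive velocity moves the line to longer wavelengths), applying the additive shift after the Doppler shift, and the use of linear interpolation are all assumptions. Both helper names are hypothetical, and the optional lsf_for_Jnu kernel averaging is not shown.

```python
from bisect import bisect_left

C_KMS = 299792.458  # speed of light in km/s


def shifted_wavelength_A(lam_A, line_v_kms=0.0, line_dlam_A=0.0):
    """Apply an assumed non-relativistic Doppler shift, then an additive shift."""
    return lam_A * (1.0 + line_v_kms / C_KMS) + line_dlam_A


def flux_at_line(wave, flux, lam_A):
    """Linearly interpolate flux at lam_A; wave must be sorted ascending."""
    if not (wave[0] <= lam_A <= wave[-1]):
        raise ValueError("line wavelength outside the pumping spectrum")
    i = bisect_left(wave, lam_A)
    if wave[i] == lam_A:
        return flux[i]
    w0, w1 = wave[i - 1], wave[i]
    t = (lam_A - w0) / (w1 - w0)
    return flux[i - 1] + t * (flux[i] - flux[i - 1])


# Made-up pumping spectrum (WAVE in Angstrom, FLUX in arbitrary units).
WAVE = [3880.0, 3881.0, 3882.0]
FLUX = [1.0, 3.0, 2.0]

lam = shifted_wavelength_A(3881.0, line_dlam_A=0.5)
print(lam, flux_at_line(WAVE, FLUX, lam))
```

Since the solar spectrum has deep absorption lines, even a small velocity shift can change the flux seen by a given transition noticeably, which is why the shift is applied before the flux is sampled.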