API Documentation
emulator.py
This module houses the emulator for the opacity mixing.
- class opac_mixer.emulator.Emulator(opac, prange_opacset=(1e-06, 1000, 30), trange_opacset=(100, 10000, 30), filename_data=None)[source]
The supervisor that handles the training and evaluation of the opacity emulator.
- __init__(opac, prange_opacset=(1e-06, 1000, 30), trange_opacset=(100, 10000, 30), filename_data=None)[source]
Construct the emulator class.
- Parameters:
- opac: opac_mixer.read.ReadOpac
A list of input opacity readers. The reader may already be set up, but does not need to be; otherwise the emulator performs the setup itself.
- prange_opacset: array(3)
optional, the range to which the reader should interpolate the pressure grid to (lower, upper, num_points).
- trange_opacset: array(3)
optional, the range to which the reader should interpolate the temperature grid to (lower, upper, num_points).
- filename_data: str
The filename to which the training and testing data are saved.
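As a rough illustration of what a `(lower, upper, num_points)` triple describes, the sketch below expands the default `prange_opacset` and `trange_opacset` into explicit grids. It assumes logarithmic spacing for pressure and linear spacing for temperature, mirroring the defaults described for `setup_temp_and_pres`. The helper names `lin_grid` and `log_grid` are hypothetical and not part of opac_mixer.

```python
import math

def lin_grid(lower, upper, num_points):
    """Evenly spaced points from lower to upper (both included)."""
    step = (upper - lower) / (num_points - 1)
    return [lower + i * step for i in range(num_points)]

def log_grid(lower, upper, num_points):
    """Points evenly spaced in log10 from lower to upper (both included)."""
    exponents = lin_grid(math.log10(lower), math.log10(upper), num_points)
    return [10.0 ** e for e in exponents]

# Default triples taken from the constructor signature.
prange_opacset = (1e-06, 1000, 30)
trange_opacset = (100, 10000, 30)

pressures = log_grid(*prange_opacset)     # assumption: log spacing
temperatures = lin_grid(*trange_opacset)  # assumption: linear spacing
print(len(pressures), len(temperatures))
```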
- setup_scaling(input_scaling=None, output_scaling=None, inv_output_scaling=None)[source]
(optional) Change the callback functions for the scaling of input and output. Defaults are given as opac_mixer.scalings.default_<name>. See opac_mixer/utils/scalings.py for inspiration.
- Parameters:
- input_scaling: function or None
The function to use for input scaling. If None, use opac_mixer.scalings.default_input_scaling
- output_scaling: function or None
The function to use for output scaling. If None, use opac_mixer.scalings.default_output_scaling
- inv_output_scaling: function or None
The function to use for inverting the output scaling. If None, use opac_mixer.scalings.default_inv_output_scaling
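Whatever scaling is chosen, the inverse callback has to undo the output scaling, otherwise predictions cannot be mapped back to physical values. The sketch below checks that round trip for a log10 scaling. The callback signatures are not documented here, so the single-argument form and both function names are assumptions made for illustration only.

```python
import math

def log_output_scaling(y):
    """Hypothetical output scaling: compress values with log10."""
    return [math.log10(v) for v in y]

def log_inv_output_scaling(y_scaled):
    """Inverse of log_output_scaling: map scaled values back."""
    return [10.0 ** v for v in y_scaled]

kappas = [1e-4, 2.5e-1, 3.0e3]  # made-up opacity values
roundtrip = log_inv_output_scaling(log_output_scaling(kappas))
print(all(math.isclose(a, b) for a, b in zip(kappas, roundtrip)))
```

If the real callbacks accept compatible arguments, such a pair would be handed over as `setup_scaling(output_scaling=..., inv_output_scaling=...)`.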
- setup_sampling_grid(approx_batchsize=800000.0, extra_abus=None, bounds=None)[source]
Set up the sampling grid. Sampling along MMR and pressure is in logspace. Sampling along temperature is in linspace.
- Parameters:
- approx_batchsize: int
Total number of sampling points. Needs to be a power of 2 for Sobol sampling.
- bounds: dict or None
The lower and upper bounds for sampling. Shape: {'species': (lower, upper)}. The key can be either a species name in opac.spec, or p and T for pressure and temperature. Any missing values fall back to opac_mixer.emulator.DEFAULT_MMR_RANGES for MMRs, opac_mixer.emulator.DEFAULT_PRANGE for pressure, and opac_mixer.emulator.DEFAULT_TRANGE for temperature.
- extra_abus: array(num_sample, ls, lp, lt)
Extra abundances (MMRs) used for the training data. Could be, e.g., a grid of equilibrium chemistry abundances.
- Returns:
- input_data (array(batchsize, opac.lg, opac.ls)):
The sampled input data to train/test the emulator. The input_data consists of kappas(g) for each species.
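Two details of the parameters above can be sketched in plain Python: choosing a batch size that is a power of 2, and filling missing `bounds` entries from defaults. Both helpers are hypothetical, and the default ranges shown are placeholders rather than the real values in `opac_mixer.emulator`. Whether the library rounds `approx_batchsize` itself is not stated here.

```python
import math

def nearest_power_of_two(n):
    """Round a positive number to the nearest power of two (in log2 space)."""
    return 2 ** round(math.log2(n))

def fill_bounds(user_bounds, defaults):
    """Use the user's (lower, upper) where given, the defaults elsewhere."""
    merged = dict(defaults)
    merged.update(user_bounds or {})
    return merged

# Placeholder defaults for illustration only; the real values live in
# opac_mixer.emulator.DEFAULT_MMR_RANGES, DEFAULT_PRANGE and DEFAULT_TRANGE.
defaults = {"H2O": (1e-10, 1e-2), "p": (1e-6, 1e3), "T": (100.0, 1e4)}

bounds = fill_bounds({"T": (500.0, 3000.0)}, defaults)
batchsize = nearest_power_of_two(800000.0)
print(batchsize, bounds["T"], bounds["p"])
```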
- setup_mix(test_size=0.2, split_seed=None, do_parallel=True)[source]
Set up the mixer and generate the training and test data.
- Parameters:
- test_size: float
fraction of data used for testing
- split_seed: int
A seed to be used for shuffling training and test data before splitting
- do_parallel: bool
Whether to create the data in parallel
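The role of `test_size` and `split_seed` can be illustrated with a seeded shuffle followed by a split. This is a generic sketch, not the splitting routine used by opac_mixer; `split_indices` is a hypothetical helper.

```python
import random

def split_indices(num_samples, test_size=0.2, split_seed=None):
    """Shuffle sample indices with a seed, then split into train and test."""
    indices = list(range(num_samples))
    random.Random(split_seed).shuffle(indices)
    num_test = int(round(test_size * num_samples))
    return indices[num_test:], indices[:num_test]

# The same seed reproduces the same split.
train_a, test_a = split_indices(10, test_size=0.2, split_seed=42)
train_b, test_b = split_indices(10, test_size=0.2, split_seed=42)
print(len(train_a), len(test_a), test_a == test_b)
```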
- load_data(filename=None, test_size=None, split_seed=None, use_split_seed=True)[source]
Load the training and test data from an h5 file.
- Parameters:
- filename: str
optional, can be set either here or in the constructor. Make sure the filename comes without the npy suffix
- test_size: float
optional, use a different test size than the one loaded
- split_seed: int
optional, use a different seed to shuffle data before splitting training and testing data
- use_split_seed: bool
optional, if true, it will just use the provided or loaded split seed, else it will create a new random one
- setup_model(model=None, filename=None, load=False, learning_rate=0.001, hidden_units=None, verbose=True, **model_kwargs)[source]
Setup the emulator model and train it. Note: This will reset all previously trained weights in keras models.
- Parameters:
- model: sklearn compatible model
(optional): a model to learn. Needs to be constructed already. Uses DeepSet by default.
- filename: str or None
optional, A filename to save the model
- load: bool
optional, load a -pretrained- model instead of constructing one
- fit(*args, **kwargs)[source]
Train the model.
- Parameters:
- args:
Whatever you want to pass to the model to fit
- kwargs:
Whatever you want to pass to the model to fit
- predict(X, *args, **kwargs)[source]
Predict using the trained model.
- Parameters:
- X: array(num_samples, opac.lg, opac.ls)
The values you want predictions for
- args:
Whatever you want to pass to the model for prediction
- kwargs:
Whatever you want to pass to the model for prediction
- score(validation_set=None)[source]
Print some metrics for the training and test data.
- Parameters:
- validation_set: list(X_test, y_test)
Validation set to be used instead of (self.X_test, self.y_test). Note the dimensions of X_test: array(num_samples, opac.lg, opac.ls) and y_test: array(num_samples, opac.lg).
- export(path, file_format='exorad')[source]
Export the weights
- Parameters:
- path: str
path where the weights should be stored
- file_format: str
the format in which the weights should be stored. Can be either exorad or numpy.
- plot_predictions(validation_set=None)[source]
Plot the predictions vs the true values
- Parameters:
- validation_set: list(X_test, y_test)
Validation set to be used instead of (self.X_test, self.y_test). Note the dimensions of X_test: array(num_samples, opac.lg, opac.ls) and y_test: array(num_samples, opac.lg).
read.py
The module that houses the k-table grid model and the read-in capabilities.
- class opac_mixer.read.ReadOpac(ls, lp, lt, lf, lg)[source]
The opacity reader base class
The reader class only needs to define a read in function and pass important metadata to the constructor of the parent class. That’s it.
The constructor (__init__) needs to call the parent constructor with the following arguments:
ls (int): number of species that are read in
lp (array(ls)): array that holds the number of pressure grid points for each species
lt (array(ls)): array that holds the number of temperature grid points for each species
lf (array(ls)): array that holds the number of frequency grid points for each species
lg (array(ls)): array that holds the number of $g$ grid points for each species
Note that we require `lf[0]==lf[i]` and `lg[0]==lg[i]` for all i in the number of species.
The read in function (read_opac) has to fill the following arrays:
self.spec (array(ls)): array holding the names of the opacity species
self.T (array(ls, max(lt))): array holding the temperature in K at which the k-table grid is defined
self.p (array(ls, max(lp))): array holding the pressure values in bar at which the k-table grid is defined
self.bin_edges (array(ls, lf[0]+1)): array holding the wave number ($1/\lambda$) values in 1/cm of the edges of the wavenumber grid at which the k-table grid is defined
self.bin_center (array(ls, lf[0])): array holding the wave number ($1/\lambda$) values in 1/cm of the center of the wavenumber grid at which the k-table grid is defined.
self.weights (array(ls, lg[0])): array holding the weights of the k-tables (see below for conversion from $g$ values)
self.kcoeff (array(ls, max(lp), max(lt), lf[0], lg[0])): array holding the actual values of the k-table grid in cm2/g.
Note that the data arrays are initialized with space up to the maximum number of temperature and pressure grid points, hence the max(lt) and max(lp).
Note that we need weights instead of g-values. The conversion between the two can be done using these two functions from mix.py: compute_ggrid(w, Ng) and compute_weights(g, Ng).
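To make the relation between weights and $g$ values concrete, the sketch below converts in both directions. It assumes each $g$ value sits at the midpoint of its weight interval on the cumulative grid; that convention and the function names are assumptions, and the functions are stand-ins rather than the `compute_ggrid` and `compute_weights` implementations from mix.py.

```python
import math

def ggrid_from_weights(weights):
    """Place each g value at the midpoint of its weight interval."""
    g_values, cumulative = [], 0.0
    for w in weights:
        g_values.append(cumulative + 0.5 * w)
        cumulative += w
    return g_values

def weights_from_ggrid(g_values):
    """Invert ggrid_from_weights: recover the interval widths."""
    weights, cumulative = [], 0.0
    for g in g_values:
        w = 2.0 * (g - cumulative)
        weights.append(w)
        cumulative += w
    return weights

weights = [0.1, 0.2, 0.3, 0.4]  # made-up weights that sum to 1
g_values = ggrid_from_weights(weights)
recovered = weights_from_ggrid(g_values)
print(g_values, recovered)
```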
- __init__(ls, lp, lt, lf, lg)[source]
Construct the reader. Initialize all arrays.
- Parameters:
- ls: int
number of species that are read in
- lp: array(ls)
array that holds the number of pressure grid points for each species
- lt: array(ls)
array that holds the number of temperature grid points for each species
- lf: array(ls)
array that holds the number of frequency grid points for each species
- lg: array(ls)
array that holds the number of $g$ grid points for each species
- setup_temp_and_pres(temp=None, pres=None)[source]
Interpolate k coeffs to different pressure and temperature values.
- Parameters:
- temp: optional, array-like
A 1D temperature array (K) to which the k-table grid should be interpolated. If not set, it will use a linspace grid between the maximum and minimum found in the temperature grids.
- pres: optional, array-like
A 1D pressure array (bar) to which the k-table grid should be interpolated. If not set, it will use a logspace grid between the maximum and minimum found in the pressure grids.
- Note that, right now, it takes the values outside of the defined range to be the last defined values.
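The edge behaviour in the note above (values outside the defined range take the last defined value) corresponds to clamped interpolation. A minimal one-dimensional sketch follows, assuming linear interpolation between grid points; the actual interpolation scheme in opac_mixer may differ, and `interp_clamped` is a hypothetical helper.

```python
import bisect

def interp_clamped(x, xs, ys):
    """Linear interpolation on sorted xs; hold the edge value outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    hi = bisect.bisect_right(xs, x)
    lo = hi - 1
    frac = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + frac * (ys[hi] - ys[lo])

temps = [500.0, 1000.0, 2000.0]  # made-up temperature grid in K
kappa = [1.0, 3.0, 7.0]          # made-up k coefficients

inside = interp_clamped(750.0, temps, kappa)   # between grid points
below = interp_clamped(100.0, temps, kappa)    # below the grid: first value
above = interp_clamped(5000.0, temps, kappa)   # above the grid: last value
print(inside, below, above)
```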
- plot_opac(pres, temp, spec, ax=None, **plot_kwargs)[source]
Simple plotting routine of the opacity.
- Parameters:
- pres: float
pressure at which the opacity is to be plotted, will pick closest lower point
- temp: float
temperature at which the opacity is to be plotted, will pick closest lower point
- spec: str
name of species to plot
- ax: matplotlib ax
optional, matplotlib ax object on which the plot should be placed
- plot_kwargs:
everything else will be just passed to the plotting routine
- Returns:
- lines: list
list of line plots
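The "closest lower point" selection described for `pres` and `temp` can be sketched with a binary search on an ascending grid. The helper below is hypothetical; in particular, returning index 0 for values below the grid is an assumption, not documented behaviour.

```python
import bisect

def closest_lower_index(grid, value):
    """Index of the largest grid point <= value (0 if value is below the grid)."""
    return max(bisect.bisect_right(grid, value) - 1, 0)

pressures = [1e-3, 1e-2, 1e-1, 1.0, 10.0]  # made-up ascending grid in bar
idx = closest_lower_index(pressures, 0.5)
print(idx, pressures[idx])
```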
mix.py
Housing the mixing methods.
There are two mixers: CombineOpacIndividual and CombineOpacGrid
CombineOpacIndividual: – takes arbitrary abundances and temperatures, pressures for each species – slow
CombineOpacGrid: – takes arbitrary abundances but keeps temperatures, pressures for each species from underlying grid – fast
The current implementation of the Emulator builds on CombineOpacGrid, since it is faster.
- class opac_mixer.mix.CombineOpacGrid(opac)[source]
A class for mixing arbitrary abundances but keeps temperatures, pressures for each species from underlying grid
- add_single(input_data, method='RORR')[source]
mix one kgrid
- Parameters:
- input_data: array(ls, lp, lt) or dict:
The mass mixing ratios for every pressure-temperature grid point for all species. The mmr could be a dictionary of species names {spec_i: mmr_i for spec_i in self.opac.spec}
- method: str
Can be RORR, or linear. The mixing method to be used
- Returns:
- kout: array(lp,lt,lf,lg)
The mixed k tables
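The difference between the two `method` options can be sketched for two species at a single pressure, temperature and frequency point. The sketch reads RORR as random overlap with resorting and rebinning, and reads "linear" as a mass-weighted sum at equal $g$; both readings are assumptions. The code is a textbook-style illustration, not the implementation in opac_mixer.mix, and all names and numbers are made up.

```python
import bisect

def interp_clamped(x, xs, ys):
    """Linear interpolation on sorted xs, holding edge values outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    hi = bisect.bisect_right(xs, x)
    lo = hi - 1
    frac = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + frac * (ys[hi] - ys[lo])

def midpoints(weights):
    """g value at the centre of each weight interval."""
    g_values, cumulative = [], 0.0
    for w in weights:
        g_values.append(cumulative + 0.5 * w)
        cumulative += w
    return g_values

def mix_linear(k1, k2, x1, x2):
    """Linear mix: add the mass-weighted k values at equal g."""
    return [x1 * a + x2 * b for a, b in zip(k1, k2)]

def mix_random_overlap(k1, k2, x1, x2, weights):
    """Random overlap for two species, then resort and rebin onto the g grid."""
    # 1. Every pair (i, j) contributes a mixed k value with weight w_i * w_j.
    pairs = sorted(
        (x1 * a + x2 * b, wa * wb)
        for a, wa in zip(k1, weights)
        for b, wb in zip(k2, weights)
    )
    k_sorted = [k for k, _ in pairs]
    # 2. Resort: the cumulative pair weights define g for the sorted k values.
    g_pairs = midpoints([w for _, w in pairs])
    # 3. Rebin: interpolate the sorted k values onto the original g grid.
    return [interp_clamped(g, g_pairs, k_sorted) for g in midpoints(weights)]

weights = [0.25, 0.25, 0.25, 0.25]  # made-up weights
k1 = [1.0, 2.0, 4.0, 8.0]           # made-up k coefficients, species 1
k2 = [0.5, 1.0, 3.0, 9.0]           # made-up k coefficients, species 2

mixed_linear = mix_linear(k1, k2, 0.5, 0.5)
mixed_rorr = mix_random_overlap(k1, k2, 0.5, 0.5, weights)
print(mixed_linear)
print(mixed_rorr)
```

In this sketch the two methods agree when one species has a constant (grey) opacity across $g$, and differ otherwise, because the linear sum treats the two $g$ grids as perfectly correlated while random overlap treats them as uncorrelated.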
- add_batch(input_data, method='RORR')[source]
mix the kgrid multiple times.
- Parameters:
- input_data: array(batchsize, ls, lp, lt) or dict
The mass mixing ratios for every pressure-temperature grid point for all species. The mmr could be a dictionary of species names {spec_i: mmr_i for spec_i in self.opac.spec}
- method: str
Can be RORR, or linear. The mixing method to be used
- Returns:
- kout: array(batchsize, lp,lt,lf,lg)
The mixed k tables
- add_batch_parallel(input_data, method='RORR', **pool_kwargs)[source]
Parallel version of add_batch
- Parameters:
- input_data: array(batchsize, ls, lp, lt) or dict
The mass mixing ratios for every pressure-temperature grid point for all species. The mmr could be a dictionary of species names {spec_i: mmr_i for spec_i in self.opac.spec}
- method: str
Can be RORR, or linear. The mixing method to be used
- pool_kwargs: dict
anything else that may be of interest for the multiprocessing.Pool instance (e.g., pool size, etc.)
- Returns:
- kout: array(batchsize, lp,lt,lf,lg)
The mixed k tables