Cramer_Distance

class turbustat.statistics.Cramer_Distance(cube1, cube2, noise_value1=-inf, noise_value2=-inf)

Bases: object

Compute the Cramer distance between two data cubes. The data cubes are flattened spatially to give 2D objects. We clip off empty channels and keep only the top quartile in the remaining channels.
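The flattening described above can be illustrated with a small NumPy sketch. It is not the class's implementation: the function name `top_quartile_intensity`, the (channel, y, x) axis order, and the reading of "top quartile" as the brightest quarter of the pixels in each channel are all assumptions made for illustration.

```python
import numpy as np


def top_quartile_intensity(cube, noise_value=-np.inf):
    """Flatten a (channel, y, x) cube to 2D and keep the brightest
    quarter of the values in each channel (hypothetical helper)."""
    cube = np.asarray(cube, dtype=float)
    # Flatten the two spatial axes: one row per spectral channel.
    flat = cube.reshape(cube.shape[0], -1)
    # Mask non-finite values and values at or below the noise level.
    flat = np.where(np.isfinite(flat) & (flat > noise_value), flat, np.nan)
    # Clip off channels that contain no valid data.
    flat = flat[np.isfinite(flat).any(axis=1)]
    # Keep a quarter of the pixels per channel, and at least one value.
    n_keep = max(flat.shape[1] // 4, 1)
    # Sort each channel in descending order; NaNs end up last.
    ordered = -np.sort(-flat, axis=1)
    return ordered[:, :n_keep]
```

The result has one row per non-empty channel, which is the kind of 2D object the distance is then computed on.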

Parameters:

cube1 : numpy.ndarray or astropy.io.fits.PrimaryHDU or SpectralCube

First cube to compare.

cube2 : numpy.ndarray or astropy.io.fits.PrimaryHDU or SpectralCube

Second cube to compare.

noise_value1 : float, optional

Noise level in the first cube.

noise_value2 : float, optional

Noise level in the second cube.

data_format : str, optional

Method to arrange the cube into 2D. Only ‘intensity’ is currently implemented.

Methods Summary

`cramer_statistic`([n_jobs])	Applies the Cramer Statistic to the datasets.
`distance_metric`([normalize, n_jobs])	Simple wrapper that keeps this class consistent with the coding convention used throughout the rest of this project.
`format_data`([data_format, seed, normalize])	Rearrange data into a 2D object using the given format.

Methods Documentation

cramer_statistic(n_jobs=1)

Applies the Cramer Statistic to the datasets.

Parameters:

n_jobs : int, optional

Sets the number of cores to use to calculate pairwise distances. Default is 1.
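As a rough illustration of what a Cramer statistic built from pairwise distances looks like, the sketch below implements the standard two-sample form with Euclidean distances. The function names are hypothetical, treating each row as one observation is an assumption, and this single-process version ignores `n_jobs`; it is not presented as the code this method runs.

```python
import numpy as np


def pairwise_euclidean(a, b):
    """Euclidean distance between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def cramer_statistic_sketch(x, y):
    """Two-sample Cramer statistic for 2D arrays with one observation
    per row (hypothetical helper, assumed row convention)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m, n = x.shape[0], y.shape[0]
    # Mean distance between observations of the two samples.
    between = pairwise_euclidean(x, y).mean()
    # Mean distance within each sample, zero diagonal included.
    within_x = pairwise_euclidean(x, x).mean()
    within_y = pairwise_euclidean(y, y).mean()
    return (m * n / (m + n)) * (between - 0.5 * within_x - 0.5 * within_y)
```

Two identical samples give a statistic of zero, and the value grows as the samples move apart.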

distance_metric(normalize=True, n_jobs=1)

Simple wrapper that keeps this class consistent with the coding convention used throughout the rest of this project.

Parameters:

normalize : bool, optional

See Cramer_Distance.format_data.

n_jobs : int, optional

See Cramer_Distance.cramer_statistic.

format_data(data_format='intensity', seed=13024, normalize=True, **kwargs)

Rearrange data into a 2D object using the given format.

Parameters:

data_format : {‘intensity’, ‘spectra’}, optional

The method used to construct the data matrix. The default is ‘intensity’, which picks the brightest values in each channel. The other option is ‘spectra’, which picks the N brightest spectra to compare.

seed : int, optional

Seed for the random sampling. When the two data sets differ in size, the larger one is randomly sampled to match the size of the other.

normalize : bool, optional

Forces the data sets into the same interval, removing the effect of different ranges of intensities (or whatever unit the data traces).

kwargs :

Passed to _format_data.
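The roles of `seed` and `normalize` can be sketched as follows. This is an illustration under stated assumptions rather than the method's actual code: the helper names are hypothetical, the sampling is assumed to act on columns, and "the same interval" is assumed to mean [0, 1].

```python
import numpy as np


def min_max_scale(data):
    """Rescale an array to [0, 1], ignoring NaNs (hypothetical helper)."""
    lo, hi = np.nanmin(data), np.nanmax(data)
    if hi == lo:
        # A constant array carries no range information.
        return np.zeros_like(data)
    return (data - lo) / (hi - lo)


def match_and_normalize(data1, data2, seed=13024, normalize=True):
    """Subsample the columns of the larger 2D array to match the
    smaller one, then optionally rescale both (hypothetical helper)."""
    data1 = np.asarray(data1, dtype=float)
    data2 = np.asarray(data2, dtype=float)
    # A fixed seed makes the random subsample reproducible.
    rng = np.random.RandomState(seed)
    n1, n2 = data1.shape[1], data2.shape[1]
    if n1 > n2:
        data1 = data1[:, rng.choice(n1, size=n2, replace=False)]
    elif n2 > n1:
        data2 = data2[:, rng.choice(n2, size=n1, replace=False)]
    if normalize:
        # Put both data sets on the same interval.
        data1 = min_max_scale(data1)
        data2 = min_max_scale(data2)
    return data1, data2
```

Calling the function twice with the same seed returns the same subsample, so a distance computed from the result is repeatable.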