PDF_Distance

class turbustat.statistics.PDF_Distance(img1, img2, min_val1=-inf, min_val2=-inf, do_fit=True, normalization_type=None, nbins=None, weights1=None, weights2=None, bin_min=None, bin_max=None)

Bases: object

Calculate the distance between two arrays using their probability density functions (PDFs).

Note

Pre-computed PDF class instances cannot be passed to PDF_Distance, because the data must be normalized and both PDFs must use the same set of histogram bins.
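The requirement in the note above can be illustrated with a minimal sketch. This is not TurbuStat's implementation: the helper names (`standardize`, `shared_bin_edges`, `binned_pdf`) are hypothetical, the zero-mean, unit-variance scaling is an assumed choice of normalization, and plain Python is used only so the example runs without dependencies.

```python
import statistics


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def shared_bin_edges(data1, data2, nbins):
    """Return nbins + 1 equally spaced edges spanning both data sets."""
    lo = min(min(data1), min(data2))
    hi = max(max(data1), max(data2))
    width = (hi - lo) / nbins
    return [lo + i * width for i in range(nbins + 1)]


def binned_pdf(values, edges):
    """Histogram `values` on `edges`, normalized so the bin heights sum to 1."""
    nbins = len(edges) - 1
    lo, hi = edges[0], edges[-1]
    counts = [0] * nbins
    for v in values:
        if lo <= v <= hi:
            # Clamp the index so that v == hi falls in the last bin.
            idx = min(int((v - lo) / (hi - lo) * nbins), nbins - 1)
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Two data sets with the same shape but very different scales.
data1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
data2 = [10.0, 30.0, 20.0, 60.0, 50.0, 40.0]

std1, std2 = standardize(data1), standardize(data2)
edges = shared_bin_edges(std1, std2, nbins=3)
pdf1, pdf2 = binned_pdf(std1, edges), binned_pdf(std2, edges)
```

Because `data2` is a rescaled, reordered copy of `data1`, the two binned PDFs coincide once both are standardized and histogrammed on the same edges. With NumPy arrays, `numpy.histogram` called with a common `bins` array serves the same purpose as `binned_pdf`.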

Parameters:

img1 : numpy.ndarray or astropy.io.fits.PrimaryHDU or astropy.io.fits.ImageHDU or spectral_cube.Projection or spectral_cube.Slice or SpectralCube: Array (1-3D).
img2 : numpy.ndarray or astropy.io.fits.PrimaryHDU or astropy.io.fits.ImageHDU or spectral_cube.Projection or spectral_cube.Slice or SpectralCube: Array (1-3D).
min_val1 : float, optional: Minimum value to keep in img1.
min_val2 : float, optional: Minimum value to keep in img2.
do_fit : bool, optional: Enables fitting a lognormal distribution to each data set.
normalization_type : {“normalize”, “normalize_by_mean”}, optional: See `data_normalization`.
nbins : int, optional: Manually set the number of bins to use for creating the PDFs.
weights1 : numpy.ndarray or astropy.io.fits.PrimaryHDU or astropy.io.fits.ImageHDU or spectral_cube.Projection or spectral_cube.Slice or SpectralCube, optional: Weights to be used with img1.
weights2 : numpy.ndarray or astropy.io.fits.PrimaryHDU or astropy.io.fits.ImageHDU or spectral_cube.Projection or spectral_cube.Slice or SpectralCube, optional: Weights to be used with img2.
bin_min : float, optional: Minimum value to use for the histogram bins after normalization is applied.
bin_max : float, optional: Maximum value to use for the histogram bins after normalization is applied.

Methods Summary

`compute_ad_distance`(self)	Compute the distance using the Anderson-Darling test.
`compute_hellinger_distance`(self)	Compute the Hellinger distance between the two PDFs.
`compute_ks_distance`(self)	Compute the distance using the Kolmogorov-Smirnov (KS) test.
`compute_lognormal_distance`(self)	Compute the combined t-statistic for the difference in the widths of the fitted lognormal distributions.
`distance_metric`(self[, statistic, verbose, …])	Calculate the distance.

Methods Documentation

compute_ad_distance(self): Compute the distance using the Anderson-Darling test.

compute_hellinger_distance(self): Compute the Hellinger distance between the two PDFs.
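As a rough illustration of the quantity involved, the sketch below computes the textbook Hellinger distance between two discrete PDFs defined on the same bins. The function name `hellinger_distance` is hypothetical, and the normalization used by TurbuStat (for example, any bin-width factor) may differ from this standard form.

```python
import math


def hellinger_distance(pdf1, pdf2):
    """Hellinger distance between two discrete PDFs on identical bins.

    Both inputs are sequences of non-negative bin probabilities that
    each sum to 1. The result lies between 0 (identical) and 1 (no overlap).
    """
    if len(pdf1) != len(pdf2):
        raise ValueError("Both PDFs must be defined on the same bins.")
    squared_diff = sum(
        (math.sqrt(p) - math.sqrt(q)) ** 2 for p, q in zip(pdf1, pdf2)
    )
    return math.sqrt(squared_diff / 2.0)


# Identical PDFs give zero distance; non-overlapping PDFs give the maximum.
identical = hellinger_distance([0.2, 0.5, 0.3], [0.2, 0.5, 0.3])
disjoint = hellinger_distance([1.0, 0.0], [0.0, 1.0])
```

The bin-by-bin comparison only makes sense when both PDFs share the same bins, which is why the class builds them itself.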

compute_ks_distance(self): Compute the distance using the Kolmogorov-Smirnov (KS) test.
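For orientation, the two-sample KS statistic is the largest absolute difference between the empirical cumulative distribution functions of the two samples. The following dependency-free sketch shows that definition; `ks_statistic` is a hypothetical helper, not the routine this class calls. In practice, a library function such as `scipy.stats.ks_2samp` would normally be used and also returns a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample1, sample2):
    """Two-sample KS statistic: the maximum gap between the two ECDFs."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    max_gap = 0.0
    # The ECDFs only change at observed values, so checking those suffices.
    for x in s1 + s2:
        cdf1 = bisect_right(s1, x) / n1  # fraction of sample1 <= x
        cdf2 = bisect_right(s2, x) / n2  # fraction of sample2 <= x
        max_gap = max(max_gap, abs(cdf1 - cdf2))
    return max_gap


same = ks_statistic([1, 2, 3], [1, 2, 3])
separated = ks_statistic([1, 2, 3], [4, 5, 6])
overlapping = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```

Unlike the Hellinger distance, this statistic works on the sample values directly and does not depend on the choice of histogram bins.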

compute_lognormal_distance(self): Compute the combined t-statistic for the difference in the widths of the fitted lognormal distributions.
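A common form for a combined t-statistic of this kind divides the absolute difference of the two fitted widths by their uncertainties added in quadrature. The sketch below assumes that form; the function name `width_t_statistic`, its arguments, and the numbers are illustrative only and are not taken from TurbuStat.

```python
import math


def width_t_statistic(width1, width1_err, width2, width2_err):
    """t-statistic for the difference between two fitted widths.

    Assumes independent fits, so the uncertainties add in quadrature.
    """
    combined_err = math.sqrt(width1_err ** 2 + width2_err ** 2)
    return abs(width1 - width2) / combined_err


# Hypothetical fitted lognormal widths and their standard errors.
t_stat = width_t_statistic(0.50, 0.03, 0.62, 0.04)
```

A larger value indicates that the two widths differ by more than their fit uncertainties can explain. This distance is only meaningful when `do_fit` is enabled, since it relies on the fitted parameters.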

distance_metric(self, statistic='all', verbose=False, plot_kwargs1={'color': 'b', 'marker': 'D', 'label': '1'}, plot_kwargs2={'color': 'g', 'marker': 'o', 'label': '2'}, save_name=None)

Calculate the distance. Note: the data are standardized before comparison so that the distance is calculated on a common scale.

Parameters:

statistic : {‘all’, ‘hellinger’, ‘ks’, ‘lognormal’}: Which measure of distance to use.
labels : tuple, optional: Sets the labels in the output plot.
verbose : bool, optional: Enables plotting.
plot_kwargs1 : dict, optional: Pass kwargs to `plot` for `dataset1`.
plot_kwargs2 : dict, optional: Pass kwargs to `plot` for `dataset2`.
save_name : str, optional: Save the figure when a file name is given.