➥ Tip! Refine or expand your search. Authors are sometimes listed as 'Smith, J. K.' instead of 'Smith, John' so it is useful to search for last names only. Note this is currently a simple phrase search.
WeirdestGalaxies finds the weirdest galaxies in the Sloan Digital Sky Survey (SDSS) by using a basic outlier detection algorithm. It uses an unsupervised Random Forest (RF) algorithm to assign a similarity measure (or distance) between every pair of galaxy spectra in the SDSS. It then uses the distance matrix to find the galaxies that have the largest distance, on average, from the rest of the galaxies in the sample, and defined them as outliers.
PRF (Probabilistic Random Forest) is a machine learning algorithm for noisy datasets. The PRF is a modification of the long-established Random Forest (RF) algorithm, and takes into account uncertainties in the measurements (i.e., features) as well as in the assigned classes (i.e., labels). To do so, the Probabilistic Random Forest (PRF) algorithm treats the features and labels as probability distribution functions, rather than as deterministic quantities.