Best Practices for Software Registries and Repositories

by Alejandra Gonzalez-Beltran, Alice Allen, Allen Lee, Daniel Garijo, Thomas Morrell, SciCodes Consortium

This post is cross-posted on the SciCodes website, the US Research Software Sustainability Institute blog, the UK Software Sustainability Institute blog, and the FORCE11 blog.

Software is a fundamental element of the scientific process, and cataloguing scientific software is helpful to enable software discoverability. During the years 2019-2020, the Task Force on Best Practices for Software Registries of the FORCE11 Software Citation Implementation Working Group worked to create Nine Best Practices for Scientific Software Registries and Repositories. In this post, we explain why scientific software registries and repositories are important, why we wanted to create a list of best practices for such registries and repositories, the process we followed, what the best practices include, and what the next steps for this community are.

Why are scientific software registries and repositories important?

Scientific software registries and repositories support identifying and finding software, provide information for software citation, foster long-term preservation and reuse of computational methods, and ultimately, improve research reproducibility and replicability.

Why did we write these guidelines?

Managers of scientific software registries and repositories have been working independently to run their services and provide useful information and tools to users in different communities. The Best Practices for Software Registries Task Force participants had different perspectives representing a heterogeneous set of resources, but came together for the common goal of creating a list of best practices for scientific software registries. These shared practices help to raise awareness of software as a research output, enable credit for software creators, and guide curators working on software catalogues through the steps to consider when setting up their software registries. In the longer term, we hope to improve the interoperability of the software metadata supported by different services.

The goals that we considered for writing the guidelines were:

  • to have a minimal number of best practices that are easy for repository managers to adopt
  • to be broadly applicable to most or all of our resources
  • to be descriptive on a meta level, not prescriptive, and focused on what the best practices should do or provide, not on what a suggested policy or element should specifically say.

What are the best practices?

Our guidelines, listed below, provide an overview of the key points to take into consideration when creating a software registry. They are:

  • Provide a public scope statement (examples)
  • Provide guidance for users
  • Provide guidance to software contributors
  • Establish an authorship policy (examples)
  • Share your metadata schema (examples)
  • Stipulate conditions of use (examples)
  • State a privacy policy (examples)
  • Provide a retention policy (examples)
  • Disclose your end-of-life policy (examples)

Our pre-print offers more explanation about each guideline and a longer list of implementations that we found when we were doing our work on these practices.

What process did we follow to produce the guidelines?

Representatives from numerous software registries and repositories were involved in the FORCE11 Software Citation Implementation Working Group (SCIWG). Alice Allen proposed that we form a task force within the SCIWG for writing up some best practices for the registries and repositories, and with acceptance by the co-chairs of the SCIWG and interest from relevant people, the Task Force on Best Practices for Software Registries was formed. Initially, we gathered information from members of this Task Force to learn more about each resource and to identify some of our overlapping interests. We then identified potential best practices based on prior issues we experienced running our services and discussed what each potential practice might include or exclude.

Through iterative deliberations, we determined which of the potential practices were the most broadly applicable. With generous funding from the Alfred P. Sloan Foundation, we hosted a workshop for scientific registries and repositories, part of which was devoted to gathering final consensus around the Best Practices. The workshop included registries that were not part of the Task Force, resulting in a broader set of contributions to the final list.

What are the next steps for the group?

Our goal is to continue our efforts by implementing these practices more uniformly in our own registries and repositories and reducing the burdens of adoption. We have created SciCodes, a consortium of scientific software registries and repositories, which is now defining the next priorities to tackle, such as tracking the impact of good metadata, improving interoperability between registries, and making our metadata more discoverable by search engines and services such as Google Scholar, ORCID, and discipline indexes. We are also sharing tools and ideas in a series of presentations that are recorded and available for viewing on the SciCodes website, so please check them out!

July 2021 additions to the ASCL

Thirty codes were added to the ASCL in July:

AlignBandColors: Inter-color-band image alignment tool
ART: A Reconstruction Tool
Balrog: Astronomical image simulation
Chem-I-Calc: Chemical Information Calculator
cosmic_variance: Cosmic variance calculator

FoF-Halo-finder: Halo location and size
GUBAS: General Use Binary Asteroid Simulator
HERMES: High-Energy Radiative MESsengers
K2-CPM: Causal Pixel Model for K2 data
K2mosaic: Mosaic Kepler pixel data

Kd-match: Correspondences of objects between two catalogs through pattern matching
KeplerPORTS: Kepler Planet Occurrence Rate Tools
light-curve: Light curve analysis toolbox
MCPM: Modified CPM method
nimbus: A Bayesian inference framework to constrain kilonova models

PHL: Persistent_Homology_LSS
PlaSim: Planet Simulator
PMN-body: Particle Mesh N-body code
PyCactus: Post-processing tools for Cactus computational toolkit simulation data
PyROA: Modeling quasar light curves

ReionYuga: Epoch of Reionization neutral Hydrogen field generator
RePrimAnd: Recovery of Primitives And EOS framework
ROA: Running Optimal Average
shapelens: Astronomical image analysis and shape estimation framework
shear-stacking: Stacked shear profiles and tests based upon them

Skylens++: Simulation package for optical astronomical observations
Skymapper: Mapping astronomical survey data on the sky
snmachine: Photometric supernova classification
SpArcFiRe: SPiral ARC FInder and REporter
TRINITY: Dark matter halos, galaxies and supermassive black holes empirical model

June 2021 additions to the ASCL

Forty codes were added to the ASCL in June:

ATES: ATmospheric EScape
atmos: Coupled climate–photochemistry model
aztekas: GRHD numerical code
baofit: Fit cosmological data to measure baryon acoustic oscillations
BiFFT: Fast estimation of the bispectrum

BiHalofit: Fitting formula of non-linear matter bispectrum
CalPriorSNIa: Effective calibration prior on the absolute magnitude of Type Ia supernovae
CoMover: Bayesian probability of co-moving stars
crowdsource: Crowded field photometry pipeline
DarkSirensStat: Measuring modified GW propagation and the Hubble parameter

DM_statistics: Statistics of the cosmological dispersion measure (DM)
dopmap: Fast Doppler mapping program
ehtplot: Plotting functions for the Event Horizon Telescope
EMBERS: Experimental Measurement of BEam Responses with Satellites
FRBSTATS: A web-based platform for visualization of fast radio burst properties

GLEMuR: GPU-based Lagrangian mimEtic Magnetic Relaxation
IRAGNSEP: Spectral energy distribution fitting code
Kadath: Spectral solver
Katu: Interaction of particles in plasma simulator
KOBE: Kepler Observes Bern Exoplanets

Lemon: Linear integral Equations’ Monte carlo solver based On the Neumann solution
MakeCloud: Turbulent GMC initial conditions for GIZMO
Maneage: Managing data lineage
Marvin: Data access and visualization for MaNGA
ModeChord: Primordial scalar and tensor power spectra solver

MultiModeCode: Numerical exploration of multifield inflation models
picca: Package for Igm Cosmological-Correlations Analyses
PORTA: POlarized Radiative TrAnsfer
PyDoppler: Wrapper for Doppler tomography software
Pyshellspec: Binary systems with circumstellar matter

QuasarNET: CNN for redshifting and classification of astrophysical spectra
RedPipe: Reduction Pipeline
redvsblue: Quasar and emission line redshift fitting
simple_reg_dem: Differential Emission Measures in the solar corona
simqso: Simulated quasar spectra generator

so_noise_models: Simons Observatory N(ell) noise models
StarcNet: Convolutional neural network for classifying galaxy images into morphological classes
STaRS: Sejong Radiative Transfer through Raman and Rayleigh Scattering with atomic hydrogen
ztf-viewer: SNAD ZTF data releases object viewer
ZWAD: ZTF anomaly detection

ADASS Prize for an Outstanding Contribution to Astronomical Software

First awarded in 2020, the Astronomical Data Analysis Software and Systems (ADASS) Prize for an Outstanding Contribution to Astronomical Software is presented every year at the annual conference. Eligible candidates are the developers of astronomical software ranging from those that have stood the test of time to those that are new and cutting edge. Depending on the software and the nomination, the award is given to either a group or an individual. Nominations can be for a single program, a package, or a library.

Nominations for this year’s ADASS Software Prize are due by midnight UTC on June 18. After that date, the Program Organizing Committee (POC) will review the nominations and descriptions and determine the winner. The winning software author, or a representative of the winning team, will be invited to give a talk at ADASS this year, have their ADASS conference fee waived, and receive a plaque.

May 2021 additions to the ASCL

Twenty-two codes were added to the ASCL in May:

ATARRI: A TESS Archive RR Lyrae Classifier
BHPToolkit: Black Hole Perturbation Toolkit
BlackBOX: BlackGEM and MeerLICHT image reduction software
ClaRAN: Classifying Radio sources Automatically with Neural networks
COMPAS: Rapid binary population synthesis code

CUDAHM: MCMC sampling of hierarchical models with GPUs
encore: Efficient isotropic 2-, 3-, 4-, 5- and 6-point correlation functions
Kepler’s Goat Herd: Solving Kepler’s equation via contour integration
MCALF: Velocity information from spectral imaging observations
MeerCRAB: Transient classifier using a deep learning model

orvara: Orbits from Radial Velocity, Absolute, and/or Relative Astrometry
PAP: PHANGS-ALMA pipeline
PDM2: Phase Dispersion Minimization
PFITS: Spectra data reduction
Pyrat Bay: Python Radiative Transfer in a Bayesian framework

PyTorchDIA: Difference Image Analysis tool
RandomQuintessence: Integrate the Klein-Gordon and Friedmann equations with random initial conditions
SISPO: Imaging simulator for small solar system body missions
SpheCow: Galaxy and dark matter halo dynamical properties
TesseRACt: Tessellation-based Recovery of Amorphous halo Concentrations

The Sequencer: Detect one-dimensional sequences in complex datasets
ZOGY: Python implementation of proper image subtraction

April 2021 additions to the ASCL

Thirty-one codes were added to the ASCL in April:

Bagpipes: Bayesian Analysis of Galaxies for Physical Inference and Parameter EStimation
cmblensplus: Cosmic microwave background tools
CTR: Coronal Temperature Reconstruction
dense_basis: Dense Basis SED fitting

EPIC5: Lindblad orbits in ovally perturbed potentials
Freeture: Free software to capTure meteors
GAMMA: Relativistic hydro and local cooling on a moving mesh
GGchem: Fast thermo-chemical equilibrium code

globalemu: Global (sky-averaged) 21-cm signal emulation
hera_opm: The HERA Online Processing Module
Hilal-Obs: Authentication algorithm for new moon visibility report
LaFuLi: NASA Langley Fu-Liou radiative transfer code

LAPACK: Linear Algebra PACKage
Librarian: The HERA Librarian
linemake: Line list generator
lofti_gaiaDR2: Orbit fitting with Gaia astrometry

Mo’Astro: MongoDB framework for observational astronomy
OpacityTool: Dust opacities for disk modeling
OpTool: Command-line driven tool for creating complex dust opacities
pfits: PSRFITS-format data file processor

Posidonius: N-Body simulator for planetary and/or binary systems
PyBird: Python code for biased tracers in redshift space
RadioFisher: Fisher forecasting for 21cm intensity mapping and spectroscopic galaxy surveys
RJObject: Reversible Jump Objects

Skye: Equation of state for fully ionized matter
Skyoffset: Sky offset optimization and mosaicing toolkit
SpaceHub: High precision few-body and large scale N-body simulations
Spectractor: Spectrum extraction tool for slitless spectrophotometry

SpectRes: Simple spectral resampling
SSSpaNG: Stellar Spectra as Sparse Non-Gaussian Processes
TES: Terrestrial Exoplanet Simulator

March 2021 additions to the ASCL

Thirty-one codes were added to the ASCL in March:

21cmDeepLearning: Matter density map extractor
ARTIS: 3D Monte Carlo radiative transfer code for supernovae
Astro-Fix: Correcting astronomical bad pixels in Python
AstroNet-Triage: Neural network for TESS light curve triage

AstroNet-Vetting: Neural network for TESS light curve vetting
Carsus: Atomic database for astronomy
CARTA: Cube Analysis and Rendering Tool for Astronomy
CRIME: Cosmological Realizations for Intensity Mapping Experiments

DarkEmulator: Cosmological emulation code for halo clustering statistics
DIAPHANE: Library for radiation and neutrino transport in hydrodynamical simulations
DRAKE: Relic density in concrete models prediction
GalacticDNSMass: Bayesian inference determination of mass distribution of Galactic double neutron stars

GalLenspy: Reconstruction of mass profile in disc-like galaxies from the gravitational lensing effect
ggm: Gaussian gradient magnitude filtering of astronomical images
hfs_fit: Atomic emission spectral line hyperfine structure fitting
LPF: Real-time detection of transient sources in radio data streams

nestle: Nested sampling algorithms for evaluating Bayesian evidence
PION: Computational fluid-dynamics package for astrophysics
Pyedra: Python implementation for asteroid phase curve fitting
PyPion: Post-processing code for PION simulation data

QuickCBC: Rapid and reliable inference for binary mergers
RAiSERed: Analytic AGN model based code for radio-frequency redshifts
redshifts: Spectroscopic redshifts search tool
satcand: Orbital stability and tidal migration constraints for KOI exomoon candidates

schNell: Fast calculation of N_ell for GW anisotropies
Silo: Saving scientific data to binary disk files
spalipy: Detection-based astronomical image registration
SparseBLS: Box-Fitting Least Squares implementation for sparse data

SUPERNU: Radiative transfer code for explosive outflows using Monte Carlo methods
TFF: Template Fourier Fitting
TransitFit: Exoplanet transit fitting package for multi-telescope datasets

MTU Colloquium talk on Schrödinger’s code: Opening the computational box

On Thursday, March 18, I am giving the physics colloquium at Michigan Technological University (MTU), which has hosted the Astrophysics Source Code Library since the ASCL’s inception in 1999. Despite having worked on the ASCL for nearly eleven years, I’ve never been to MTU; though I wish the visit could be in person, the talk will be presented virtually over Zoom. The presentation abstract is below, as is a link to the slides and links for all of the citations and resources mentioned in the talk.

Abstract: Though computational methods are widely used in many disciplines, many researchers do not share the source code they develop, making it difficult to replicate and reuse the work. This presentation will cover the changing landscape that includes funders’ requirements, policy changes for existing journals, community resources, and more, that make it easy to release and archive codes to ensure they are available to support the research they enabled, improve the reproducibility of science, increase confidence in research, and meet new requirements made by funders and journals in many disciplines. It will also cover how the Astrophysics Source Code Library (ASCL), which has been working since 1999 to improve the transparency of research by registering open codes used in research, has made it possible for software to be cited as a first-order research object, and how researchers can garner credit for their codes by having them cited correctly and improve papers by including citations for the computational methods that enabled the research.

Slides (PDF)

Journals

Astronomy and Computing (A&C)
Computational Astrophysics and Cosmology (ComAC)
Computing and Software for Big Science
Computer Physics Communications (CPC)
Journal of Open Research Software (JORS)
SoftwareX
Journal of Open Source Software (JOSS)

Change leaders and guidelines

CITATION file format (CFF)
CodeMeta
FAIR principles
FORCE11/FORCE11 Software Citation Principles
Software Sustainability Institute
Working toward Sustainable Software for Science: Practice and Experiences (WSSSPE)

Social coding sites and archival services

Bitbucket
DOE CODE; more information
Figshare
GitHub
Software Heritage
Zenodo

Other resources

Asclepias
arXiv/arXiv Next Generation
CiteAs
DataCite
Software licensing resources | Licensing Astrophysics Codes special session at AAS 225
Papers with Code

Cited sources (in order of appearance)

Goble (2014)
Ince, Hatton, & Graham-Cumming (2012)
Allen, Teuben, & Ryan (2018)
Ryan, Allen, & Teuben (2019); Data and code
Collberg, Proebsting, & Warren (2014), PDF
Howison & Bullard (2016)
Mangul et al (2018)
Zorotovic, Schreiber, & Parsons (2014)
Smart (2018)
Neupane et al (2019); Vice article
Barba (2019)
DOE policy
DOE policy FAQ
NASA ROSES policy
NSF policy
Nature Portfolio policy
Science policy
AAS Journals policy

February 2021 additions to the ASCL

Thirty codes were added to the ASCL in February:

BALRoGO: Bayesian Astrometric Likelihood Recovery of Galactic Objects
binaryoffset: Detecting and correcting the binary offset effect in CCDs
CMasher: Scientific colormaps for making accessible, informative plots
DaMaSCUS-SUN: Dark Matter Simulation Code for Underground Scatterings – Sun Edition
EqTide: Equilibrium Tide calculations

extinction: Dust extinction laws
ForwardDiff: Forward mode automatic differentiation for Julia
GalRotpy: Parametrize the rotation curve and gravitational potential of disk-like galaxies
GLEAM: Galaxy Line Emission and Absorption Modeling
hardCORE: Exoplanet core radius fractions calculator

HUAYNO: Hierarchically split-Up AstrophYsical N-body sOlver N-body code
lensingGW: Lensing of gravitational waves
Lightbeam: Simulate light through weakly-guiding waveguides
mirkwood: SED modeling using machine learning
MOSAIC: Multipole operator generator for Fast Multipole Method operators

MST: Minimum Spanning Tree algorithm for identifying large-scale filaments
Multi_CLASS: Cross-tracer angular power spectra of number counts using CLASS
MUSE-PSFR: PSF reconstruction for MUSE WFM-AO mode
nway: Bayesian cross-matching of astronomical catalogs
OPUS: Interoperable access to analysis and simulation codes

Piff: PSFs In the Full FOV
Pixell: Rectangular pixel map manipulation and harmonic analysis library
polgraw-allsky: All-sky almost-monochromatic gravitational-wave pipeline
PyAutoFit: Classy probabilistic programming
PyFstat: Continuous gravitational-wave data analysis

RASSINE: Normalizing 1D stellar spectra
spinOS: SPectroscopic and INterferometric Orbital Solution finder
ThumbStack: Map and profile stacking pipeline
viscm: Analyzing colormaps and creating new colormaps
X-PSI: X-ray Pulse Simulation and Inference

January 2021 additions to the ASCL

Eighteen codes were added to the ASCL in January:

3LPT-init: Initial conditions with third-order Lagrangian perturbation for cosmological N-body simulations
apogee: Tools for APOGEE data
Avocado: Photometric classification of astronomical transients and variables with biased spectroscopic samples
BAYES-LOSVD: Bayesian framework for non-parametric extraction of the LOSVD
cFS: core Flight System

Curvit: Create light curves from UVIT data
DarpanX: X-ray reflectivity of multilayer mirrors
Eigentools: Tools for studying linear eigenvalue problems
EphemMatch: Ephemeris matching of DR25 TCEs, KOIs, and EBs for false positive identification
Mask galaxy: Machine learning pipeline for morphological segmentation of galaxies

Nigraha: Find and evaluate planet candidates from TESS light curves
Octo-Tiger: HPX parallelized 3-D hydrodynamic code for stellar mergers
ptemcee: A parallel-tempered version of emcee
pyUPMASK: Unsupervised clustering method for stellar clusters
PyXspec: Python interface to XSPEC spectral-fitting program

radiowinds: Radio Emission from Stellar Winds
stratsi: Stratified streaming instability
whereistheplanet: Predicting positions of directly imaged companions