The ASCL at AAS 235

The ASCL is participating in the American Astronomical Society (AAS) meeting that started yesterday in Honolulu, Hawai’i. We have two events, both on Sunday, January 5:

Best ways to let others know how to cite your research software
January 5; Poster 109.12
Software citation is good for research transparency and reproducibility, and maybe, if you work it right, for your CV, too. You can get credit and recognition through citations for your code! This presentation highlights several powerful methods for increasing the probability that use of your research software will be cited, and cited correctly. The presentation covers how to create codemeta.json and CITATION.cff files automagically from Astrophysics Source Code Library (ASCL, ascl.net) entries, how to edit and use these files, the value of including such files on your code site(s), and efforts underway in astronomy and other fields to improve software citation and credit.
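To give a rough idea of what these two files contain, here is a minimal Python sketch that writes both from a single record. It is not the ASCL's own tooling: the `entry` dictionary, its field names, and the example values are hypothetical, and the output covers only a handful of the fields that the CodeMeta 2.0 and Citation File Format 1.2.0 specifications define.

```python
import json

# Hypothetical record: these field names and values are illustrative
# assumptions, not the ASCL's actual schema or export format.
entry = {
    "ascl_id": "0000.000",
    "title": "ExampleCode: Illustrative analysis tool",
    "authors": [{"family": "Doe", "given": "Jane"}],
    "repository": "https://example.org/examplecode",
}


def build_codemeta(entry):
    """Return a minimal CodeMeta 2.0 description as a dict."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": entry["title"],
        "identifier": "ascl:" + entry["ascl_id"],
        "codeRepository": entry["repository"],
        "author": [
            {"@type": "Person", "givenName": a["given"], "familyName": a["family"]}
            for a in entry["authors"]
        ],
    }


def build_citation_cff(entry):
    """Return minimal CITATION.cff text (Citation File Format 1.2.0)."""
    quote = json.dumps  # a JSON string is also a valid YAML double-quoted scalar
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        "title: " + quote(entry["title"]),
        "authors:",
    ]
    for a in entry["authors"]:
        lines.append("  - family-names: " + quote(a["family"]))
        lines.append("    given-names: " + quote(a["given"]))
    lines += [
        "identifiers:",
        "  - type: other",
        "    value: " + quote("ascl:" + entry["ascl_id"]),
        "repository-code: " + quote(entry["repository"]),
    ]
    return "\n".join(lines) + "\n"


codemeta_text = json.dumps(build_codemeta(entry), indent=2)
cff_text = build_citation_cff(entry)
```

Saving `codemeta_text` as codemeta.json and `cff_text` as CITATION.cff in the top level of a code's repository is the usual placement; both files can then be edited by hand to add a preferred citation, a license, or other details.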

The Future and Future Governance of the Astrophysics Source Code Library
January 5, 2:00 PM – 3:30 PM; HCC – Room 301B
Over the past ten years, the Astrophysics Source Code Library (ASCL, ascl.net) has grown from a small repository holding about 40 codes, with hand-coded HTML pages maintained by one person, to a resource with citable entries on over 2000 codes and a modern, user- and editor-friendly database structure, maintained by a small group of volunteers. With its 20th anniversary now behind it, it’s time to look at the resource and its governance and management. Does its current structure best serve the astro community? What changes would you like to see to its governance? We don’t know the answers to these and other questions! Please join us for an open discussion on the resource and what a new governance model for the ASCL might be.

November additions to the ASCL

Twenty-four codes were added to the ASCL in November 2019:

ATHOS: A Tool for HOmogenizing Stellar parameters
ATLAS: Turning Dopplergram images into frequency shift measurements
CLUSTEREASY: Lattice simulator for evolving interacting scalar fields in an expanding universe on parallel computing clusters
comb: Spectral line data reduction and analysis package

FFTLog-and-beyond: Generalized FFTLog algorithm
frbpoppy: Fast radio burst population synthesis in Python
Fruitbat: Fast radio burst redshift estimation
HeatingRate: Radioactive heating rate and macronova (kilonova) light curve

HLattice: Scalar fields and gravity simulator for the early universe
IDG: Image Domain Gridding
LATTICEEASY: Lattice simulator for evolving interacting scalar fields in an expanding universe
MARTINI: Mock spatially resolved spectral line observations of simulated galaxies

miluphcuda: Smooth particle hydrodynamics code
MORDI: Massively-Overlapped Ring-Diagram Inversion
OpenSPH: Astrophysical SPH and N-body simulations and interactive visualization tools
OrbWeaver: Galaxy/(sub)halo orbital processing tool

PLAN: A Clump-finder for Planetesimal Formation Simulations
planetplanet: General photodynamical code for exoplanet light curves
PypeIt: Python spectroscopic data reduction pipeline
TreeFrog: Construct halo merger trees and compare halo catalogs

uvplot: Interferometric visibilities plotter
VELOCIraptor-STF: Six-dimensional Friends-of-Friends phase space halo finder
WhereWolf: Galaxy/(sub)Halo ghosting tool for N-body simulations
Zeltron: Explicit 3D relativistic electromagnetic Particle-In-Cell code

A workshop for scientific software registries and repositories

I am involved in several efforts, in addition to the ASCL, to improve recognition and credit for software authors; one such effort is the FORCE11 Software Citation Implementation Working Group (SCIWG), in which several software registries and repositories are involved. These resources, along with others not part of the SCIWG, have formed a Repository Best Practice Task Force, which has held monthly conference calls this year to collaboratively develop a list of best practices for such resources. This has also been an excellent vehicle for enabling people who run these resources to share information about managing software registries and working with software authors, researchers, and journal editors to improve software citation.

Thanks to funding from the Sloan Foundation, members of this Task Force and other software resources are coming together in a Scientific Software Registry Collaboration Workshop to demonstrate unique aspects of our respective services, discuss challenges and share solutions to common issues that arise in managing our resources, finalize a list of best practices for our resources, and work cooperatively to speed adoption of the CodeMeta and/or Citation File Format standards. The workshop has been organized by the Caltech Library and ASCL, and takes place at the University of Maryland (College Park) this coming Wednesday and Thursday (November 13-14). It includes presentations by software registry managers and subject matter experts, break-out sessions for collaborative work, and group discussion.

I’m happy to say we are able to provide remote access to most of the plenary portions of the workshop through Webex; links on the workshop agenda identify the sessions available over Webex. As the workshop has an element of unconferencing, it’s possible that additional portions of the workshop will be suitable for Webex and if so, we will update the agenda accordingly. In addition, we will have someone live-scribing the event; a link to the Google Doc for these notes will be added to the agenda webpage before the workshop begins.

A major focus of this workshop is to discuss and finalize the best practices that have been identified so far in our monthly conference calls. A draft list of the practices (PDF) is available for download below; these are the practices we will be working on in break-out groups during the workshop. Links to the Google Docs we will be using for these breakout sessions are listed on the agenda; this offers another way for anyone interested to see the work being done in this meeting.

I have wanted to meet with others doing work similar to what I do on the ASCL for a long time, and am very grateful to Tom Morrell, Mike Hucka, and Stephen Davison from Caltech Libraries for partnering with me to organize this workshop, and to Josh Greenberg at the Sloan Foundation for thinking this workshop was a good idea and funding the project. My thanks to all of them!

Draft list of Best Practices for research software registries (pdf)

October additions to the ASCL

Twenty-two codes were added to the ASCL in October 2019:

a3cosmos-gas-evolution: Galaxy cold molecular gas evolution functions
ANNz2: Estimating photometric redshift and probability density functions using machine learning methods
AOtools: Adaptive optics modeling and analysis toolkit
AOTOOLS: Reduce IR images from Adaptive Optics
ChainConsumer: Corner plots, LaTeX tables and plotting walks

Cobaya: Bayesian analysis in cosmology
DM_phase: Algorithm for correcting dispersion of radio signals
E0102-VR: Virtual Reality application to visualize the optical ejecta in SNR 1E 0102.2-7219
ECLIPS3D: Linear wave and circulation calculations
EMERGE: Empirical ModEl for the foRmation of GalaxiEs

exoplanet: Probabilistic modeling of transit or radial velocity observations of exoplanets
GetDist: Monte Carlo sample analyzer
LEO-Py: Likelihood Estimation of Observational data with Python
MarsLux: Illumination Mars maps generator
MiSTree: Construct and analyze Minimum Spanning Tree graphs

OCD: O’Connell Effect Detector using push-pull learning
orbitize: Orbit-fitting for directly imaged objects
PEXO: Precise EXOplanetology
PINK: Parallelized rotation and flipping INvariant Kohonen maps
PreProFit: Pressure Profile Fitter for galaxy clusters in Python

qnm: Kerr quasinormal modes, separation constants, and spherical-spheroidal mixing coefficients calculator
TLS: Transit Least Squares

September additions to the ASCL

Fourteen codes were added to the ASCL in September 2019:

AREPO: Cosmological magnetohydrodynamical moving-mesh simulation code
Auto-multithresh: Automated masking for clean
ChempyMulti: Multi-star Bayesian inference with Chempy
CLOVER: Convolutional neural network spectra identifier and kinematics predictor
EBHLIGHT: General relativistic radiation magnetohydrodynamics with Monte Carlo transport

EPOS: Exoplanet Population Observation Simulator
fgivenx: Functional posterior plotter
HADES: Hexadecapolar Analysis for Dust Estimation in Simulations (of CMB B-mode thermal dust emission)
HISS: HI spectra stacker
MultiColorFits: Colorize and combine multiple fits images for visually aesthetic scientific plots

RascalC: Fast code for galaxy covariance matrix estimation
SecularMultiple: Hierarchical multiple system secular evolution model
TPI: Test Particle Integrator
WVTICs: SPH initial conditions using Weighted Voronoi Tesselations

ASCL research poster at ADASS XXIX

This presentation covers research on software authorship and citation, which we carried out between July and September 2019. We examined codes authored by three or fewer people (“short author list” codes) and codes authored by institutional teams, to determine how many codes in the ASCL can be attributed to one of these categories. Utilizing ADS data, we measured the number of citations per authorship category. We carried out further research to determine whether we could infer software usage and code usage statistics from the number of citations to code description papers. Our research shows that citations to code description papers are not a reliable proxy for software usage.

P. Wesley Ryan, Astrophysics Source Code Library

Download poster (PDF)

August additions to the ASCL

Twenty-five codes were added to the ASCL in August 2019:

actsnclass: Active learning for supernova photometric classification
Analysator: Quantitative analysis of Vlasiator files
BEAST: Bayesian Extinction And Stellar Tool
bias_emulator: Halo bias emulator
dips: Detrending periodic signals in timeseries

DustCharge: Charge distribution for a dust grain
EBAI: Eclipsing Binaries with Artificial Intelligence
FastCSWT: Fast directional Continuous Spherical Wavelet Transform
FIRST Classifier: Automated compact and extended radio sources classifier
GBKFIT: Galaxy kinematic modeling

Gramsci: GRAph Made Statistics for Cosmological Information
JPLephem: Jet Propulsion Lab ephemerides package
MAESTROeX: Low Mach number stellar hydrodynamics code
Molsoft: Molonglo Telescope Observing Software
MosfireDRP: MOSFIRE Data Reduction Pipeline

NuRadioMC: Monte Carlo simulation package for radio neutrino detectors
oscode: Oscillatory ordinary differential equation solver
PyRADS: Python RADiation model for planetary atmosphereS
PYSAT: Python Satellite Data Analysis Toolkit
QAC: Quick Array Combinations front end to CASA

QLF: Luminosity function analysis code
SNAPDRAGONS: Stellar Numbers And Parameters Determined Routinely And Generated Observing N-body Systems
TRISTAN-MP: TRIdimensional STANford – Massively Parallel code
Vlasiator: Hybrid-Vlasov simulation code
YMW16: Electron-density model

(per apparent established practice)

I’ve set a goal of bringing the number of entries missing preferred citation information to under 1000, though that might be just beyond possible. When I started this process, there were 1284 entries without a preferred citation; I’ve examined the software sites and documentation of 150+ of these codes so far and have found explicit citation information for just over 14% of these.
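A quick back-of-the-envelope check shows why the goal may be just out of reach. The sketch below uses only the figures quoted above, takes "150+" as exactly 150 and "just over 14%" as exactly 14%, and assumes (purely for illustration) that the same hit rate would hold across every entry not yet examined.

```python
MISSING_AT_START = 1284  # entries without a preferred citation at the outset
GOAL = 1000              # target: fewer than this many entries still missing
EXAMINED = 150           # "150+" in the text; 150 is used as a lower bound
HIT_RATE = 0.14          # "just over 14%"; 14% is used as a lower bound


def entries_needed(missing, goal):
    """Entries that must gain a preferred citation to drop below the goal."""
    return missing - (goal - 1)


def projected_remaining(missing, hit_rate):
    """Entries still missing if the hit rate held across all of them."""
    return missing - round(missing * hit_rate)


needed = entries_needed(MISSING_AT_START, GOAL)
found_so_far = round(EXAMINED * HIT_RATE)
remaining = projected_remaining(MISSING_AT_START, HIT_RATE)

print("entries needed to get under 1000:", needed)
print("found among the first 150:", found_so_far)
print("projected entries still missing:", remaining)
```

Under these assumptions, explicit citation statements alone would not bring the count below 1000, which is where a second source of preferred citations becomes useful.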

In general, we include a preferred citation in an ASCL record when a code’s site or documentation explicitly states what should be cited (“cite [code] with this [ASCL entry/article/DOI/etc.]”). We don’t assume a paper listed under “References” or “Articles” is intended for citation, though that may be the intent of some authors; others list these papers because the code is built upon others’ work or because the papers include research that used the software.

In some cases, a software package has no citations to its ASCL record and numerous citations (> 25, let’s say) to a code description paper, even though the download site or repo does not specify how the software should be cited. Allowing this “apparent established practice” of citation to substitute for an explicit statement, and listing the description paper as the preferred citation, seems fair to me, and valuable to those who want to do the right thing by citing a software package but don’t find guidance for how to do so on the code’s site.

We very much prefer that authors provide explicit information on their preferred citation for their programming work, but where they don’t, and where there is an apparent established practice of citation, we will now list that citation method as the preferred citation in the ASCL entry. So far, this inferred information has been added to 15 ASCL entries.
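The decision rule described above can be written out as a short Python sketch. This is an illustration of the policy as stated in this post, not code the ASCL runs: the `EntryCitations` class, its field names, and the example values are hypothetical, and the threshold of 25 is the informal "let's say" figure from the text rather than a fixed cutoff.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MIN_PAPER_CITATIONS = 25  # informal threshold from the text ("> 25, let's say")


@dataclass
class EntryCitations:
    """Hypothetical summary of what is known about one ASCL entry."""
    explicit_citation: Optional[str]   # stated on the code's site or docs, if any
    ascl_record_citations: int         # citations to the ASCL record itself
    description_paper: Optional[str]   # code description paper, if one exists
    description_paper_citations: int   # citations to that paper


def preferred_citation(e: EntryCitations) -> Tuple[Optional[str], str]:
    """Return (citation, basis) following the policy sketched in the post."""
    # An explicit statement from the authors always wins.
    if e.explicit_citation:
        return e.explicit_citation, "explicit"
    # Otherwise infer from usage: nobody cites the ASCL record, but many
    # papers cite the code description paper.
    if (
        e.description_paper
        and e.ascl_record_citations == 0
        and e.description_paper_citations > MIN_PAPER_CITATIONS
    ):
        return e.description_paper, "apparent established practice"
    # No basis for listing a preferred citation.
    return None, "none"


example = EntryCitations(None, 0, "Doe (2010) description paper", 40)
print(preferred_citation(example))
```

Keeping the basis alongside the citation makes it easy to tell inferred entries apart from those backed by an explicit author statement, so an inferred one can be replaced as soon as the authors state a preference.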

[Image: partial screenshot showing the location of the link to suggest a change or addition to an ASCL entry]

Do you want to discuss different software citation methods before selecting a preferred method? Did I get your software’s preferred citation wrong or miss it entirely? If so, please let me know via email or the Suggest a change link at the bottom of your code’s ASCL entry.

Codes past, still present (updated 8/21/2019)

At lunch yesterday, I was asked in what year the earliest code the ASCL has was written (or was first created). I didn’t know off the top of my head, but thought probably in the late 70s. (The earliest I ever pursued was from the 60s, IIRC, & though I found a working email address for the woman who wrote it, which was amazing in itself, she no longer had the code, alas.)

But the question got me to wondering, so in a quick look, here’s what I found: three codes that were initially created in 1978:

Cloudy (ascl:9910.001)
AIPS (ascl:9911.003)
ADIPLS (ascl:1109.002)

All of these have undergone further development and are still in use, as indicated by citations to them in papers published this year.

Are these the most long-lived codes we have? Are there codes that were started even before 1978 that are still in use? Probably. Maybe part of the Starlink (ascl:1110.012) code base? Something else?

If you know of one or can find one in the ASCL with a history that goes back further than 1978, please let us know in the replies.


UPDATE, August 21, 2019


[Image: screenshot of tweet about the STARS code and its origins in the early 1970s]

Yes! There is one code that goes back even further, to 1971. Warrick Ball (@warrickball), a postdoc at the University of Birmingham (U.K.), replied on Twitter that the stellar evolution code STARS (ascl:1107.008) got its start in 1971, and the 1972 article which describes the code is listed in the ASCL entry for it. The code is still in use and was cited earlier this year. There’ll be dark chocolate heading Dr. Ball’s way as soon as the weather cools off; kudos to him for finding the answer to this question!

July additions to the ASCL

Thirty-two codes were added to the ASCL in July 2019:

Astro-SCRAPPY: Speedy Cosmic Ray Annihilation Package in Python
astrodendro: Astronomical data dendrogram creator
beamconv: Cosmic microwave background detector data simulator
CMDPT: Color Magnitude Diagrams Plot Tool
Dewarp: Distortion removal and on-sky orientation solution for LBTI detectors

GaussPy: Python implementation of the Autonomous Gaussian Decomposition algorithm
GaussPy+: Gaussian decomposition package for emission line spectra
GIST: Galaxy IFU Spectroscopy Tool
healvis: Radio interferometric visibility simulator based on HEALpix maps
intensitypower: Spectrum multipoles modeler

MCRGNet: Morphological Classification of Radio Galaxy Network
MGB: Interactive spectral classification code
molly: 1D astronomical spectra analyzer
OMNICAL: Redundant calibration code for low frequency radio interferometers
Plonk: Smoothed particle hydrodynamics data analysis and visualization

POCS: PANOPTES Observatory Control System
PRISM: Probabilistic Regression Instrument for Simulating Models
pyGTC: Parameter covariance plots
pyuvdata: Pythonic interface to interferometric data sets
REVOLVER: REal-space VOid Locations from suVEy Reconstruction

ROHSA: Separation of diffuse sources in hyper-spectral data
RVSpecFit: Radial velocity and stellar atmospheric parameter fitting
SARA-PPD: Preconditioned primal-dual algorithm for radio-interferometric imaging
sbpy: Small-body planetary astronomy
schwimmbad: Parallel processing pools interface

Skyfield: High precision research-grade positions for planets and Earth satellites generator
SPAM: Hu-Sawicki f(R) gravity imprints search
StePar: Inferring stellar atmospheric parameters using the EW method
TurbuStat: Turbulence statistics in spectral-line data cubes
Wōtan: Stellar detrending methods

XDF-GAN: Mock astronomical survey generator
ZChecker: Zwicky Transient Facility moving target checker for short object lists