No, limits on references in prestigious journals are not harming software citation

In June, I was invited to participate in a one-day workshop as a member of an expert panel for The Open Source Software Health Index Project. The subject of software citation came up at lunch with other panel members, and someone suggested that because of the limit on references in prestigious publications, citations for software may be dropped to make room for article citations. This surprised me, since I know that several highly-regarded journals have published articles on the importance of research software, have edited their author guidelines to include more and better information on citing software properly, and have improved how citations to ASCL entries, for example, are treated to ensure their proper capture and tracking by indexers.

So I wrote to editors at a number of prestigious publications such as Nature and Science to ask whether their publications might consider exempting software citations from the reference limits. The prompt replies stated that there is no need to do so: there is room for essential references. Even where there are (soft) limits on the number of references in the main text of the print journal, references are unlimited in the online supplementary materials, the reference list appears in full on the website (the version that has the most readers), and all are picked up (or at least made available for ingestion) as citations in bibliographic databases.

Here is a case in point: this Science paper was printed with a limited number of references, but all 113 appear in the online version, and 92 of them were captured by ADS. Those not captured by ADS include one of the four software references, which is only a link to a website, and other references that are similarly not formatted well for tracking or are to resources ADS does not ingest.

[Image: Science citation list screenshot showing the end of the reference list for 113 references]

[Image: ADS screenshot showing that 92 references were captured]

I’m very pleased (and relieved!) to know that the commitment to having code cited well carries over to practice and that limiting citations in print format, when this might occur, does not appear to inhibit or restrict software citation.

Who writes the codes that make our research sing?

[Image: Pie chart showing 66% of consolidated citations of ASCL codes are to codes with 1-3 authors; team-developed codes account for 7% of consolidated citations]

We were asked recently how many of our entries were attributed to one, two, or three authors. Would you guess that over a third of the codes in the ASCL, 35%, have only one author? Codes with 1-3 attributed authors, which we dubbed “short author list” codes, account for 68% of our entries. We ended up writing a short paper, published by Research Notes of the AAS (RNAAS), about authorship and citation numbers for team and short author list codes. It was a quick look and we hope to look more deeply into this; if you’d like to do the same, you can download our public data in JSON and find the code that we used for consolidating citations on GitHub.
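For anyone who wants to repeat the author-count tally on the downloaded JSON, here is a minimal sketch of how it might be done. The record layout is an assumption: the `authors` field name and the sample entries below are hypothetical and are not taken from the actual ASCL export, and this is not the citation-consolidation code mentioned above.

```python
import json
from collections import Counter

# Hypothetical sample standing in for the downloaded JSON; the real export
# may use different field names and a different structure.
SAMPLE_JSON = """
[
  {"title": "CodeA", "authors": ["Author One"]},
  {"title": "CodeB", "authors": ["Author One", "Author Two"]},
  {"title": "CodeC", "authors": ["A", "B", "C", "D", "E"]},
  {"title": "CodeD", "authors": ["Author One"]}
]
"""


def author_count_shares(entries, short_max=3):
    """Return (share with exactly one author, share with 1..short_max authors)."""
    # Tally how many entries have each author-list length.
    counts = Counter(len(entry["authors"]) for entry in entries)
    total = sum(counts.values())
    if total == 0:
        return 0.0, 0.0
    single = counts[1] / total
    short = sum(n for size, n in counts.items() if 1 <= size <= short_max) / total
    return single, short


entries = json.loads(SAMPLE_JSON)
single_share, short_share = author_count_shares(entries)
print(f"single-author: {single_share:.0%}, short author list: {short_share:.0%}")
```

With the four made-up entries above, half have a single author and three quarters fall in the "short author list" group; pointing `json.loads` at the real file (and adjusting the field name to match it) would be the next step.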

June 2019 additions to the ASCL

Twenty-two codes were added to the ASCL in June 2019:

Astroalign: Asterism-matching alignment of astronomical images
Blimpy: Breakthrough Listen I/O Methods for Python
centerRadon: Center determination code in stellar images
FREDDA: A fast, real-time engine for de-dispersing amplitudes
GPUVMEM: Maximum Entropy Method (MEM) GPU algorithm for radio astronomical image synthesis

Kalman: Forecasts and interpolations for ALMA calibrator variability
limb-darkening: Limb-darkening coefficients generator
Lizard: An extensible Cyclomatic Complexity Analyzer
LIZARD: Particle initial conditions for cosmological simulations
mcfit: Multiplicatively Convolutional Fast Integral Transforms

MEGAlib: Medium Energy Gamma-ray Astronomy library
MORPHEUS: A 3D Eulerian Godunov MPI-OpenMP hydrodynamics code with multiple grid geometries
Morpheus: Pixel-level analysis of astronomical image data
OIT: Nonconvex optimization approach to optical-interferometric imaging
PandExo: Instrument simulations for exoplanet observation planning

PlasmaPy: Core Python package for plasma physics
PyA: Python astronomy-related packages
pyLIMA: Microlensing modeling package
PyMORESANE: Python MOdel REconstruction by Synthesis-ANalysis Estimators
T-RECS: Tiered Radio Extragalactic Continuum Simulation

The Exo-Striker: Transit and radial velocity interactive fitting tool for orbital analysis and N-body simulations
turboSETI: Python-based SETI search algorithm

Please meet Assistant Editor Catherine Gosmeyer!

Catherine (Katie) Gosmeyer recently joined our editorial staff as an Assistant Editor. She’s a Scientific Software Developer at NASA Goddard Space Flight Center and works on the ICESat-2 calibration team; formerly she was an analyst at Space Telescope Science Institute (STScI) working on the Hubble and James Webb Space Telescopes. Katie is very interested in open source software and improving research; after a presentation I gave at GSFC, she approached me about volunteering for the ASCL as an editor. Yes!! She has dived right in, and joins Associate Editor Kimberly DuPrie, who is at STScI, and me in creating and editing ASCL entries. Welcome, Katie!

Understanding data: Visualisation, machine learning, and reproducibility

The ASCL has once again partnered with others on a Special Session at EWASS. This year’s Special Session (SS34) is titled Understanding data: Visualisation, machine learning, and reproducibility, and will be held on Tuesday, 25 June, in Room 3. Not at EWASS? Follow the session on Twitter at #ewass19ss34.

Full information, including abstracts for the presentations listed below, can be found in the detailed interactive program; look for the sessions in yellow and labeled SS34a, SS34b, and SS34c.

Tuesday, 25 June, 9:00 in Room 3, chaired by Rein Warmels
Reproducibility in computer-aided research by Konrad Hinsen
Publishing associated data: Challenges & opportunities by Pierre Ocvirk
FAIR data in astronomy by Mark Allen
Template for reproducible, shareable & achievable research by Mohammad Akhlaghi
These talks are followed by an open discussion moderated by David Valls-Gabaud.

Tuesday, 25 June, 14:30 in Room 3, chaired by Amruta Jaodand
High-performance machine learning in Astrophysics by Simon Portegies Zwart
Machine learning for the SKA by Anna Scaife
SuperNNova: Open-source, deep learning photometric time-series classifier by Anais Möller
Transfer learning for radio galaxy classification by Hongming Tang
Unsupervised classification of galaxy spectra and interpretability by Didier Fraix-Burnet

Tuesday, 25 June, 16:30 in Room 3, chaired by John Wenskovitch
Visual Analytics of Data in Astronomy by Johanna Schmidt
Visual analytics algorithms for multidimensional astronomical data by Dany Vohl
Pulsar to Person (P2P): Data Visualization & Sonification to Experience the Universe by John Wenskovitch
Lightning talks for e-Posters
These talks are followed by an open discussion moderated by the session chair.

This Special Session was organized by:
Rachael Ainsworth (UManchester)
Mohammad Akhlaghi (Instituto de Astrofísica de Canarias)
Amruta Jaodand (ASTRON)
David Valls-Gabaud (Observatoire de Paris)
Rein Warmels (ESO)
John Wenskovitch (Virginia Tech)
Alice Allen (ASCL/UMD)

Open Digital Infrastructure in Astrophysics

I spent two days last week at the Open Digital Infrastructure in Astrophysics meeting at the Kavli Institute for Theoretical Physics (KITP) at UC Santa Barbara. This meeting featured presentations on open-knowledge digital infrastructure projects, the communities around them, their metrics for success, funding, diversity efforts, and plans for sustainability. Yeah, we’re talking code, a lot of code, and code projects, too, from Astropy to yt, and data, and efforts that support openness and research transparency.

Open data presentations were given on:

STScI data, which includes JWST, Hubble, and PanSTARRS data, and the discovery and analysis software for these archives, by Arfon Smith
SDSS Data Infrastructure, by Joel Brownstein
LSST Transients data, by Federica Bianco
Open gravitational wave data and software tools for these data, by Duncan Brown

[Image: Photograph of the Kavli Institute for Theoretical Physics]

These software projects were represented at the meeting:

Astropy, by Kelle Cruz
ATHENA++, by Jim Stone
Einstein Toolkit, by Philipp Mösta
emcee, by Daniel Foreman-Mackey
GYRE, by Rich Townsend
JETFIT, by Andrew MacFadyen
MESA Project, by Frank Timmes
TOM Toolkit and the AEON Network, by Rachel Street
yt, by Matt Turk

Other open digital resources presented were:

Journal of Open Source Software, by Arfon Smith
R astrostatistics, by Gwendolyn Eadie
Astrophysics Source Code Library, by yours truly

The meeting hashtag was #OpenAstroInfra, and many of the presentations were live tweeted. They were also video recorded and the podcasts are available on the KITP media page for the meeting, as are most of the slide decks. Participants of the co-located “Better Stars, Better Planets: Exploiting the Stellar-Exoplanetary Synergy” and “The New Era of Gravitational-Wave Physics and Astrophysics” programs were encouraged to attend, and we had a raven or two trying to have lunch with us as well.

Each of the presentations had about 15 minutes devoted to questions and discussion about the project highlighted. In two of these discussion sessions, the presenters were asked whether they were concerned about “improper use” of a code: sometimes people who are not well-schooled in the theory or science underlying a software package use the code incorrectly, arrive at results that are dodgy or downright wrong, and in a few cases (I know of only one) then claim the software is in error. This fear has been given as the reason some software authors do not release their code. I was cheering in my head at Jim Stone’s response to this question the first time it came up; he stated that there is so much benefit to making the code available that a potential improper use should not stop release. (YES!!!) He went on to say, as did others in the room, that science will correct the record (YES!!!!!). I could not agree more with these replies, and it was great to hear these sentiments from others.

This was my first visit to KITP, and what a wonderful introduction to the institution it was! So many excellent projects, and so much exciting work being done in the open! My thanks to organizers Frank Timmes, Lars Bildsten, and Rich Townsend for inviting the ASCL to participate, and to the Sloan and Ford Foundations for funding the meeting.

ASCL presentation slides

May 2019 additions to the ASCL

Twenty-seven codes were added to the ASCL in May 2019:

Astrocut: Tools for creating cutouts of TESS images
Bandmerge: Merge data from different wavebands
beamModelTester: Model evaluation for fixed antenna phased array radio telescopes
Binospec: Data reduction pipeline for the Binospec imaging spectrograph
CASI-2D: Convolutional Approach to Shell Identification – 2D

ClusterPyXT: Galaxy cluster pipeline for X-ray temperature maps
evolstate: Assign simple evolutionary states to stars
FastPM: Scaling N-body Particle Mesh solver
Fermitools: Fermi Science Tools
Fitsverify: FITS file format-verification tool

Grizli: Grism redshift and line analysis software
HAOS-DIPER: HAO Spectral Diagnostic Package For Emitted Radiation
LensCNN: Gravitational lens detector
LensQuEst: CMB Lensing QUadratic Estimator
MMIRS-DRP: MMIRS Data Reduction Pipeline

NAPLES: Numerical Analysis of PLanetary EncounterS
ODEPACK: Ordinary differential equation solver library
PICASO: Planetary Intensity Code for Atmospheric Scattering Observations
Prospector: Stellar Population inference from spectra and SEDs
Py4CAtS: PYthon for Computational ATmospheric Spectroscopy

PyPDR: Chemistry, thermal balance, and molecular excitation code
Q3C: A PostgreSQL package for spatial queries and cross-matches of large astronomical catalogs
rPICARD: Radboud PIpeline for the Calibration of high Angular Resolution Data
SEDPY: Modules for storing and operating on astronomical source spectral energy distribution
SICON: Stokes Inversion based on COnvolutional Neural networks

SPARK: K-band Multi Object Spectrograph data reduction
THALASSA: Orbit propagator for near-Earth and cislunar space

April 2019 additions to the ASCL

Thirty codes were added to the ASCL in April 2019:

AutoBayes: Automatic design of customized analysis algorithms and programs
CausticFrog: 1D Lagrangian Simulation Package
CDAWeb: Coordinated Data Analysis Web
CGS: Collisionless Galactic Simulator
CLEAR: CANDELS Ly-alpha Emission at Reionization processing pipeline and library

covdisc: Disconnected covariance of 2-point functions in large-scale structure of the Universe
deproject: Deprojection of two-dimensional annular X-ray spectra
dfitspy: A dfits/fitsort implementation in Python
digest2: NEO binary classifier
ehtim: Imaging, analysis, and simulation software for radio interferometry

EightBitTransit: Calculate light curves from pixel grids
eleanor: Extracted and systematics-corrected light curves for TESS-observed stars
FortesFit: Flexible spectral energy distribution modelling with a Bayesian backbone
GALAXY: N-body simulation software for isolated, collisionless stellar systems
JVarStar: Variable Star Analysis Library

nbodykit: Massively parallel, large-scale structure toolkit
nudec_BSM: Neutrino Decoupling Beyond the Standard Model
OoT: Out-of-Transit Light Curve Generator
Properimage: Image coaddition and subtraction
pyRSD: Accurate predictions for the clustering of galaxies in redshift-space in Python

rate: Reliable Analytic Thermochemical Equilibrium
repack: Repack and compress line-transition data
SARAH: SUSY and non-SUSY model builder and analyzer
SBGAT: Small Bodies Geophysical Analysis Tool
simuTrans: Gravity-darkened exoplanet transit simulator

SMILI: Sparse Modeling Imaging Library for Interferometry
Specstack: A simple spectral stacking tool
sxrbg: ROSAT X-Ray Background Tool
TP2VIS: Total Power Map to Visibilities
Vevacious: Global minima of one-loop effective potentials generator

Prize established for astronomy research software at UMD

This week, we saw another example of the importance of software in astronomy research. We are pleased to announce the establishment of an annual software prize from the UMD Astronomy Department for research software written by a registered undergraduate or graduate student while the student was at the University of Maryland, College Park. The source code of the software must be open and assigned an acceptable open source license.

The prize consists of a certificate and a cash award, which will be presented at the Astronomy Department annual award ceremony. We reserve the right to withhold the prize if the criteria are not sufficiently met.

More details will be available later; if you would like additional information, please contact us.

Peter Teuben, teuben@astro.umd.edu
Alice Allen, aallen@ascl.net

February and March 2019 additions to the ASCL

Twelve codes were added to the ASCL in February 2019:

dyPolyChord: Super fast dynamic nested sampling with PolyChord
ExPRES: Exoplanetary and Planetary Radio Emissions Simulator
GraviDy: Gravitational Dynamics
LiveData: Data reduction pipeline

LPNN: Limited Post-Newtonian N-body code for collisionless self-gravitating systems
PINT: High-precision pulsar timing analysis package
PyMF: Matched filtering techniques for astronomical images
Radynversion: Solar atmospheric properties during a solar flare

RPFITS: Routines for reading and writing RPFITS files
SNTD: Supernova Time Delays
Specutils: Spectroscopic analysis and reduction
SpecViz: 1D Spectral Visualization Tool

And sixteen codes were added to the ASCL in March 2019:

allesfitter: Flexible star and exoplanet inference from photometry and radial velocity
AsPy: Aspherical fluctuations on the spherical collapse background
brutifus: A Python module to post-process datacubes from integral field spectrographs
DAVE: Discovery And Vetting of K2 Exoplanets

GalIMF: Galaxy-wide Initial Mass Function
Galmag: Computing realistic galactic magnetic fields
HelioPy: Heliospheric and planetary physics library
ICSF: Intensity Conserving Spectral Fitting

NFWdist: Density, distribution function, quantile function and random generation for the 3D NFW profile
NIFTy5: Numerical Information Field Theory v5
PLATON: PLanetary Atmospheric Transmission for Observer Noobs
PRF: Probabilistic Random Forest

SimSpin: Kinematic analysis of galaxy simulations
SIXTE: Simulation of X-ray Telescopes
SPICE: Observation Geometry System for Space Science Missions
SpiceyPy: Python wrapper for the NAIF C SPICE Toolkit