Author Archives: Alice Allen

ASCL research poster at ADASS XXIX

This presentation covers research on software authorship and citation, which we carried out between July and September 2019. We examined codes authored by three or fewer people (“short author list” codes) and codes authored by institutional teams, to determine how many codes in the ASCL can be attributed to one of these categories. Using ADS data, we measured the number of citations per authorship category. We carried out further research to determine whether we could infer software usage statistics from the number of citations to code description papers. Our research shows that citations to code description papers are not a reliable proxy for software usage.

P. Wesley Ryan, Astrophysics Source Code Library

Download poster (PDF)

August additions to the ASCL

Twenty-five codes were added to the ASCL in August 2019:

actsnclass: Active learning for supernova photometric classification
Analysator: Quantitative analysis of Vlasiator files
BEAST: Bayesian Extinction And Stellar Tool
bias_emulator: Halo bias emulator
dips: Detrending periodic signals in timeseries

DustCharge: Charge distribution for a dust grain
EBAI: Eclipsing Binaries with Artificial Intelligence
FastCSWT: Fast directional Continuous Spherical Wavelet Transform
FIRST Classifier: Automated compact and extended radio sources classifier
GBKFIT: Galaxy kinematic modeling

Gramsci: GRAph Made Statistics for Cosmological Information
JPLephem: Jet Propulsion Lab ephemerides package
MAESTROeX: Low Mach number stellar hydrodynamics code
Molsoft: Molonglo Telescope Observing Software
MosfireDRP: MOSFIRE Data Reduction Pipeline

NuRadioMC: Monte Carlo simulation package for radio neutrino detectors
oscode: Oscillatory ordinary differential equation solver
PyRADS: Python RADiation model for planetary atmosphereS
PYSAT: Python Satellite Data Analysis Toolkit
QAC: Quick Array Combinations front end to CASA

QLF: Luminosity function analysis code
SNAPDRAGONS: Stellar Numbers And Parameters Determined Routinely And Generated Observing N-body Systems
TRISTAN-MP: TRIdimensional STANford – Massively Parallel code
Vlasiator: Hybrid-Vlasov simulation code
YMW16: Electron-density model

(per apparent established practice)

I’ve set a goal of bringing the number of entries missing preferred citation information to under 1000, though that might be just beyond possible. When I started this process, there were 1284 entries without a preferred citation; I’ve examined the software sites and documentation of 150+ of these codes so far and have found explicit citation information for just over 14% of these.
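
A rough back-of-the-envelope check of why that goal may be out of reach, using only the figures quoted above. It assumes, purely for illustration, that the 14% success rate seen in the first 150+ entries holds for all 1284 entries, which nothing above establishes; the function names are made up for this sketch.

```python
def entries_needed(missing_now: int, target_below: int) -> int:
    """How many entries must gain a preferred citation to get under the target."""
    return missing_now - (target_below - 1)


def projected_finds(missing_total: int, hit_rate: float) -> float:
    """Expected number of explicit citations found if hit_rate held throughout."""
    return missing_total * hit_rate


# Figures quoted in the post; the 0.14 rate is "just over 14%", so this is a floor.
needed = entries_needed(1284, 1000)
projected = projected_finds(1284, 0.14)

print(f"needed: {needed}, projected at 14%: {projected:.0f}")
```

Under that assumption, explicit statements alone would fall short of the number needed, which is consistent with the goal being "just beyond possible".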

In general, we include a preferred citation in an ASCL record when a code’s site or documentation explicitly states what should be cited (“cite [code] with this [ASCL entry/article/DOI/etc.]”). We don’t assume a paper listed under “References” or “Articles” is intended for citation. That may be the intent of some authors, but others list these papers because a code is built upon others’ work or because the papers report research that used the software.

In some cases, a software package has no citations to its ASCL record and numerous citations (> 25, let’s say) to a code description paper, even though the download site or repo does not specify how the software should be cited. Allowing this “apparent established practice” of citation to substitute for an explicit statement and listing the description paper as the preferred citation seems fair to me, and valuable to those who want to do the right thing by citing a software package but don’t find guidance for how to do so on the code’s site.
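
As a minimal sketch, the rule of thumb described above could be written as follows. This is one reading of the description, not the ASCL's actual tooling: the `CodeRecord` fields, the function name, and the hard threshold of 25 (the post only says "> 25, let's say") are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed cutoff; the post gives "> 25, let's say" rather than a fixed rule.
ESTABLISHED_PRACTICE_MIN_CITATIONS = 25


@dataclass
class CodeRecord:
    """Illustrative stand-in for an ASCL entry; field names are hypothetical."""
    name: str
    explicit_citation: Optional[str]  # what the code's site says to cite, if anything
    description_paper: Optional[str]  # identifier of a code description paper, if any
    ascl_citations: int               # citations to the ASCL record
    paper_citations: int              # citations to the description paper


def preferred_citation(rec: CodeRecord) -> Optional[str]:
    """Return the citation to list as preferred, or None if none can be determined."""
    # An explicit statement from the authors always takes precedence.
    if rec.explicit_citation:
        return rec.explicit_citation
    # Otherwise fall back on apparent established practice: nobody cites the
    # ASCL record, and many papers cite the description paper.
    if (
        rec.description_paper
        and rec.ascl_citations == 0
        and rec.paper_citations > ESTABLISHED_PRACTICE_MIN_CITATIONS
    ):
        return rec.description_paper
    return None
```

For example, a record with no explicit statement, zero ASCL citations, and 40 citations to its description paper would get the description paper as its preferred citation, while one with only 10 such citations would be left without one.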

We very much prefer that authors provide explicit information on their preferred citation for their programming work, but where they don’t, and where there is an apparent established practice of citation, we will now list that citation method as the preferred citation in the ASCL entry. So far, this inferred information has been added to 15 ASCL entries.
Partial screenshot showing location of link to suggest a change or addition to an ASCL entry

Do you want to discuss different software citation methods before selecting a preferred method? Did I get your software’s preferred citation wrong or miss it entirely? If so, please let me know via email or the Suggest a change link at the bottom of your code’s ASCL entry.

Codes past, still present (updated 8/21/2019)

At lunch yesterday, I was asked in what year the earliest code the ASCL has was written (or was first created). I didn’t know off the top of my head, but thought probably in the late 70s. (The earliest I ever pursued was from the 60s, IIRC, and though I found a working email address for the woman who wrote it, which was amazing in itself, she no longer had the code, alas.)

But the question got me to wondering, so in a quick look, here’s what I found: three codes that were initially created in 1978:

Cloudy (ascl:9910.001)
AIPS (ascl:9911.003)
ADIPLS (ascl:1109.002)

All of these have undergone further development and are still in use, as indicated by citations to them in papers published this year.

Are these the most long-lived codes we have? Are there codes that were started even before 1978 that are still in use? Probably. Maybe part of the Starlink (ascl:1110.012) code base? Something else?

If you know of one or can find one in the ASCL with a history that goes back further than 1978, please let us know in the replies.


UPDATE, August 21, 2019


Screenshot of tweet about the STARS code and its origins in the early 1970s

Yes! There is one code that goes back even further, to 1971. Warrick Ball (@warrickball), a postdoc at the University of Birmingham (U.K.), replied on Twitter that the stellar evolution code STARS (ascl:1107.008) got its start in 1971, and the 1972 article which describes the code is listed in the ASCL entry for it. The code is still in use and was cited earlier this year. There’ll be dark chocolate heading Dr. Ball’s way as soon as the weather cools off; kudos to him for finding the answer to this question!

July additions to the ASCL

Thirty-two codes were added to the ASCL in July 2019:

Astro-SCRAPPY: Speedy Cosmic Ray Annihilation Package in Python
astrodendro: Astronomical data dendrogram creator
beamconv: Cosmic microwave background detector data simulator
CMDPT: Color Magnitude Diagrams Plot Tool
Dewarp: Distortion removal and on-sky orientation solution for LBTI detectors

GaussPy: Python implementation of the Autonomous Gaussian Decomposition algorithm
GaussPy+: Gaussian decomposition package for emission line spectra
GIST: Galaxy IFU Spectroscopy Tool
healvis: Radio interferometric visibility simulator based on HEALpix maps
intensitypower: Spectrum multipoles modeler

MCRGNet: Morphological Classification of Radio Galaxy Network
MGB: Interactive spectral classification code
molly: 1D astronomical spectra analyzer
OMNICAL: Redundant calibration code for low frequency radio interferometers
Plonk: Smoothed particle hydrodynamics data analysis and visualization

POCS: PANOPTES Observatory Control System
PRISM: Probabilistic Regression Instrument for Simulating Models
pyGTC: Parameter covariance plots
pyuvdata: Pythonic interface to interferometric data sets
REVOLVER: REal-space VOid Locations from suVEy Reconstruction

ROHSA: Separation of diffuse sources in hyper-spectral data
RVSpecFit: Radial velocity and stellar atmospheric parameter fitting
SARA-PPD: Preconditioned primal-dual algorithm for radio-interferometric imaging
sbpy: Small-body planetary astronomy
schwimmbad: Parallel processing pools interface

Skyfield: High precision research-grade positions for planets and Earth satellites generator
SPAM: Hu-Sawicki f(R) gravity imprints search
StePar: Inferring stellar atmospheric parameters using the EW method
TurbuStat: Turbulence statistics in spectral-line data cubes
Wōtan: Stellar detrending methods

XDF-GAN: Mock astronomical survey generator
ZChecker: Zwicky Transient Facility moving target checker for short object lists

No, limits on references in prestigious journals are not harming software citation

In June, I was invited to participate in a one-day workshop as a member of an expert panel for The Open Source Software Health Index Project. The subject of software citation came up at lunch with other panel members, and someone suggested that because of the limit on references in prestigious publications, citations for software may be dropped to make room for article citations. This surprised me, since I know that several highly-regarded journals have published articles on the importance of research software, have edited their author guidelines to include more and better information on citing software properly, and have improved how citations to ASCL entries, for example, are treated to ensure their proper capture and tracking by indexers.

So I wrote to editors at a number of prestigious publications such as Nature and Science to ask whether their publications might consider exempting software citations from the reference limits. The prompt replies stated that there is no need to do so: there is room for essential references, and even if there are (soft) limits on the number of references in the main text in the print journal, they are unlimited in the online supplementary materials, the reference list appears in full on the website (the version that has the most readers), and all are picked up (or at least made available for ingestion) as citations in bibliographic databases.

Here is a case in point: this Science paper was printed with a limited number of references, but all 113 appear in the online version, and 92 of them were captured by ADS. Those not captured by ADS include one of the four software references, which is only a link to a website, and other references that are similarly not formatted well for tracking or are to resources ADS does not ingest.

Science citation list screenshot showing end of reference list for 113 references

ADS screenshot showing that 92 references were captured

I’m very pleased (and relieved!) to know that the commitment to having code cited well carries over to practice and that limiting citations in print format, when this might occur, does not appear to inhibit or restrict software citation.

Who writes the codes that make our research sing?

Pie chart showing 66% of consolidated citations of ASCL codes are to codes with 1-3 authors; team-developed codes account for 7% of consolidated citations

We were asked recently how many of our entries were attributed to one, two, or three authors. Would you guess that over a third of the codes in the ASCL — 35% — have only one author? Codes with 1-3 authors attributed, what we dubbed “short author list” codes, account for 68% of our entries. We ended up writing a short paper, published by Research Notes of the AAS (RNAAS), about authorship and citation numbers for team and short author list codes. It was a quick look and we hope to look more deeply into this; if you’d like to do the same, you can download our public data in JSON and find the code that we used for consolidating citations on GitHub.
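
For anyone who wants to repeat the author-count tally, here is a minimal sketch in Python. It is not the code we used. The JSON layout (a list of objects, each with an `authors` list) is an assumption for illustration and may not match the public data file, and the sample entries are made up.

```python
import json
from typing import Dict, List, Tuple


def short_author_share(entries: List[Dict], max_authors: int = 3) -> Tuple[float, float]:
    """Return (fraction with exactly one author, fraction with 1..max_authors authors)."""
    counts = [len(entry["authors"]) for entry in entries]
    total = len(counts)
    if total == 0:
        return 0.0, 0.0
    single = sum(1 for c in counts if c == 1)
    short = sum(1 for c in counts if 1 <= c <= max_authors)
    return single / total, short / total


# Made-up sample standing in for the downloaded JSON file (hypothetical schema).
sample_json = """
[
  {"title": "code-a", "authors": ["A. Author"]},
  {"title": "code-b", "authors": ["B. Author", "C. Author"]},
  {"title": "code-c", "authors": ["D. Author", "E. Author", "F. Author"]},
  {"title": "code-d", "authors": ["G", "H", "I", "J", "K"]}
]
"""

entries = json.loads(sample_json)
single_share, short_share = short_author_share(entries)
print(f"single author: {single_share:.0%}, short author list: {short_share:.0%}")
```

To run it on real data, replace `sample_json` with the contents of the downloaded file and adjust the `authors` key to whatever the file actually uses.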

June 2019 additions to the ASCL

Twenty-two codes were added to the ASCL in June 2019:

Astroalign: Asterism-matching alignment of astronomical images
Blimpy: Breakthrough Listen I/O Methods for Python
centerRadon: Center determination code in stellar images
FREDDA: A fast, real-time engine for de-dispersing amplitudes
GPUVMEM: Maximum Entropy Method (MEM) GPU algorithm for radio astronomical image synthesis

Kalman: Forecasts and interpolations for ALMA calibrator variability
limb-darkening: Limb-darkening coefficients generator
Lizard: An extensible Cyclomatic Complexity Analyzer
LIZARD: Particle initial conditions for cosmological simulations
mcfit: Multiplicatively Convolutional Fast Integral Transforms

MEGAlib: Medium Energy Gamma-ray Astronomy library
MORPHEUS: A 3D Eulerian Godunov MPI-OpenMP hydrodynamics code with multiple grid geometries
Morpheus: Pixel-level analysis of astronomical image data
OIT: Nonconvex optimization approach to optical-interferometric imaging
PandExo: Instrument simulations for exoplanet observation planning

PlasmaPy: Core Python package for plasma physics
PyA: Python astronomy-related packages
pyLIMA: Microlensing modeling package
PyMORESANE: Python MOdel REconstruction by Synthesis-ANalysis Estimators
T-RECS: Tiered Radio Extragalactic Continuum Simulation

The Exo-Striker: Transit and radial velocity interactive fitting tool for orbital analysis and N-body simulations
turboSETI: Python-based SETI search algorithm

Please meet Assistant Editor Catherine Gosmeyer!

Catherine (Katie) Gosmeyer recently joined our editorial staff as an Assistant Editor. She’s a Scientific Software Developer at NASA Goddard Space Flight Center and works on the ICESat-2 calibration team; formerly she was an analyst at Space Telescope Science Institute (STScI) working on the Hubble and James Webb Space Telescopes. Katie is very interested in open source software and improving research; after a presentation I gave at GSFC, she approached me about volunteering for the ASCL as an editor. Yes!! She has dived right in, and joins Associate Editor Kimberly DuPrie, who is at STScI, and me in creating and editing ASCL entries. Welcome, Katie!

Understanding data: Visualisation, machine learning, and reproducibility

The ASCL has once again partnered with others on a Special Session at EWASS. This year’s Special Session (SS34) is titled Understanding data: Visualisation, machine learning, and reproducibility, and will be held on Tuesday, 25 June, in Room 3. Not at EWASS? Follow the session on Twitter at #ewass19ss34.

Full information, including abstracts for the presentations listed below, can be found in the detailed interactive program; look for the sessions in yellow and labeled SS34a, SS34b, and SS34c.

Tuesday, 25 June, 9:00 in Room 3, chaired by Rein Warmels
Reproducibility in computer-aided research by Konrad Hinsen
Publishing associated data: Challenges & opportunities by Pierre Ocvirk
FAIR data in astronomy by Mark Allen
Template for reproducible, shareable & achievable research by Mohammad Akhlaghi
These talks are followed by an open discussion moderated by David Valls-Gabaud.

Tuesday, 25 June, 14:30 in Room 3, chaired by Amruta Jaodand
High-performance machine learning in Astrophysics by Simon Portegies Zwart
Machine learning for the SKA by Anna Scaife
SuperNNova: Open-source, deep learning photometric time-series classifier by Anais Möller
Transfer learning for radio galaxy classification by Hongming Tang
Unsupervised classification of galaxy spectra and interpretability by Didier Fraix-Burnet

Tuesday, 25 June, 16:30 in Room 3, chaired by John Wenskovitch
Visual Analytics of Data in Astronomy by Johanna Schmidt
Visual analytics algorithms for multidimensional astronomical data by Dany Vohl
Pulsar to Person (P2P): Data Visualization & Sonification to Experience the Universe by John Wenskovitch
Lightning talks for e-Posters
These talks are followed by an open discussion moderated by the session chair.

This Special Session was organized by:
Rachael Ainsworth (UManchester)
Mohammad Akhlaghi (Instituto De Astrofísica De Canarias)
Amruta Jaodand (ASTRON)
David Valls-Gabaud (Observatoire de Paris)
Rein Warmels (ESO)
John Wenskovitch (Virginia Tech)
Alice Allen (ASCL/UMD)