Category Archives: musings

Hand-crafted spam

Nearly every month, ASCL editors notify software authors that their code has been registered in the ASCL. Each editor sends out notifications for the code entries she worked on. I say “nearly every month” because I sometimes get behind in the prep work for the notifications, and that delays all editors who send out these notifications. Editors create their own processes for handling this correspondence.

I refer to my process as “hand-crafted spam.” I send out two types of notifications, one to authors who submitted their software to the ASCL, and another to authors for code entries I created. I have standard text into which I add, one by one, the necessary details for each email, by cutting and pasting the info from an Excel spreadsheet I create just for this purpose. I also check the emails against the ASCL entries to make sure I’ve got the right code and author. This sounds laborious, but it actually doesn’t take much time. This afternoon, I sent out 34 emails covering 39 code entries (some authors had two codes added to the ASCL) for code entries processed in February, March, and April, and it took me, working at a steady but unhurried pace, exactly one hour from the first missive to the last.

I’m putting this here to remind future me that this task goes pretty quickly, so, future me, do this more promptly!
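For anyone curious what the cut-and-paste step amounts to, here is a minimal sketch in Python. It is not the process described above, which is done by hand; it only illustrates the idea of filling standard text from spreadsheet rows and sending one email per author even when an author has more than one code. The column names (`author`, `email`, `code`), the template wording, and the function name `build_emails` are all hypothetical, and the sketch assumes the spreadsheet has been exported to CSV.

```python
import csv
import io
from collections import defaultdict
from string import Template

# Hypothetical standard text; the real notification wording is not shown in the post.
TEMPLATE = Template(
    "Dear $author,\n\n"
    "The following code(s) have been registered in the ASCL: $codes.\n"
)

def build_emails(csv_text):
    """Return {email address: message body}, one message per author.

    Assumes CSV columns named author, email, and code (hypothetical names).
    """
    codes_by_email = defaultdict(list)
    name_by_email = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        codes_by_email[row["email"]].append(row["code"])
        name_by_email[row["email"]] = row["author"]
    # Authors with two codes get a single email listing both.
    return {
        email: TEMPLATE.substitute(author=name_by_email[email], codes=", ".join(codes))
        for email, codes in codes_by_email.items()
    }
```

Grouping by email address is what makes 39 entries collapse into 34 messages in a batch like the one above. Checking each message against the ASCL entry would still be a manual step.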

Codes gone bad and how to save them

The ASCL has 779 codes in it now, some of which date back to the 1990s. With the speed at which both the web and code authors (often grad students or postdocs) move, links to some code sites are bound to go bad over time. We run a link checker regularly to make sure we’re not pointing to dead sites; when we do find a broken link (defined as one we haven’t been able to reach for at least 2 weeks), we look for a new one and, if we can’t find one, email the code author(s) to ask where the code has moved.
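The two-week rule can be written down as a small function. This is only a sketch of the definition given above, not the checker we actually use; the data layout (a list of timestamped check results) and the name `is_broken` are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Threshold taken from the definition above: unreachable for at least 2 weeks.
BROKEN_AFTER = timedelta(weeks=2)

def is_broken(checks, now):
    """Decide whether a link counts as broken.

    checks: list of (timestamp, reachable) tuples in any order (assumed layout).
    now: the datetime at which the decision is made.
    """
    successes = [t for t, ok in checks if ok]
    last_ok = max(successes) if successes else None
    # Only failures after the most recent successful check count.
    recent_failures = [
        t for t, ok in checks
        if not ok and (last_ok is None or t > last_ok)
    ]
    if not recent_failures:
        return False
    # Broken if the current run of failures started at least two weeks ago.
    return now - min(recent_failures) >= BROKEN_AFTER
```

A single successful check resets the clock, so a site that is only intermittently down is not flagged.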

We can’t always find a good link, and code authors sometimes don’t reply to our emails. Currently, eight codes (1% of our entries) have bad links. For half of these, we either cannot find the code author or the author has not replied to numerous emails.

What else can we do?

I assume that some code authors forget their codes. Having moved on, perhaps to another institution and other work, they have neither the time nor the incentive to create a new web home for a code they wrote some years ago. That’s understandable, but then the code, a unique solution to a problem, an artifact of astrophysics research, a method used in research, is lost.

We’d like to save the codes (Save the Codes! I may have to put that on glow-in-the-dark pencils); here are a few ideas for authors who no longer want to maintain a site for their codes:

  1. Send an archive file of the code to the ASCL. We can house it, as we do for CHIWEI.
  2. Post the code in an online repository such as GitHub, SourceForge, Code.Google, or Bitbucket if you would like the code to be open source and are open to others continuing its development, or on a site such as Figshare or Zenodo to simply make it available.
  3. Create a Research Compendium for your paper, data, and code on Research Compendia, or a companion website for your research on RunMyCode and load the code and data for your research there.
  4. Ask your institutional library to house it; many institutions have repositories for storing the digital artifacts of academia and research.

I don’t know about option 4, but options 1-3 should take 15 minutes or less. Surely a code is worth that little bit of extra time to make it available to others even if you don’t want to be bothered with it anymore.

Please save your code; don’t let it go bad!

New papers to read

It’s not just astrophysics; other sciences are also grappling with issues surrounding software release, transparency of research, and collaboratively sharing codes.

The challenge of software licensing came up in the AAS 223 Special Session on code sharing; ASCL advisor Bruce Berriman followed up on this issue with a post on Astronomy Computing Today, and I’ve recently run across A Quick Guide to Software Licensing for the Scientist-Programmer, which also offers some guidance on this important issue.

How long should you keep your codes? And your data?

I’m currently working on a report for the Preserving.exe: Toward a National Strategy for Preserving Software summit held at the Library of Congress in May. My head is filled with the reasons and ways (and lack of ways) to save software discussed at the meeting, and by software, I don’t necessarily mean astrophysics codes, oh no! All kinds of software: mainframe HR software and VisiCalc and Doom and old browsers and dBASE and … well, everything.

This has dovetailed nicely, or perhaps alarmingly, with recent readings, including a blog post by Kristin Briney titled How Long Should You Keep Data? and the Retraction Watch post which inspired it, JCI paper retracted for duplicated panels after authors can’t provide original data, about a 2007 paper which recently had one figure retracted because the authors could not provide the data from which it was generated.

Jon Ippolito of the University of Maine was at the summit and wrote about it in his blog post The Ex-files: how long will our software last?

How long indeed? And if you wanted to retrieve data from 2007, would you be able to even if you had the data files? Would you still have the tools available to get into them? In astronomy, probably so; with FITS, astronomy is better off than many sciences. Elsewhere, maybe not.

How long will astronomy software last? That might be unknowable; perhaps a better question, then, is how long should it last?