Astro software citation examples

One of the unconference sessions (proposed during the event) held at December’s .Astronomy was on software citation, this subject having come up in an earlier session on improving credit for software.

Discussion and comments in the session inspired me to look at astronomy’s current practices for citing software. I looked in more than a dozen journals for citations to codes used in research; the examples below are some of those I gathered, not an exhaustive list.

The most common way to cite software is to reference a paper describing the code. This is how, for example, the authors of yt would like that software cited, as shown in this citation from a recent MNRAS paper:

Other: MNRAS citation for yt
Sometimes a link to the website for a code is listed as a reference to it, as was done in a Classical and Quantum Gravity paper:

Other: URL for CAMB in Classical and Quantum Gravity
Other: link for CAMB
Conference proceedings are cited in some cases, as the citation below for WCSTools in an Astrophysical Journal paper demonstrates:

Other: citation from ApJ for conference proceedings for WCStools

ASCL entries can be cited, too, as shown in this citation for pynbody in a paper published in Physical Review D:

ASCL: pynbody citation in PhysRevD

Someone (I don’t remember who) reported that Google Scholar does not index mentions of codes, GitHub repos, etc. as citations, because they are not papers. An opinion tweeted out about this summed up the sentiment in the room pretty well! I plan to take this up with Google after the AAS meeting. Fortunately, ADS does index properly formatted software references; the only reference listed in this post that I didn’t see captured by ADS was the URL for CAMB, which is not surprising (nor would I expect it to be captured).

A subsequent post, with additional information and a list of resources about software citation, will go up before the first meeting of the Special Interest Group on software publishing at AAS225, which will be held on Tuesday, January 6, from 3:45 PM to 4:45 PM in room 615 of the Convention Center. The main topic of this meeting will be software citation, and all interested parties are welcome to attend.


The journals below were part of my hunting grounds for software citations. Ever had a citation to software you used in research refused by a publication? If so, I’m interested in knowing the details; please share here or send them to editor@ascl.net. Thanks!

American Institute of Physics Proceedings
Astronomy & Astrophysics
Astronomy and Computing
The Astronomical Journal
The Astrophysical Journal
The Astrophysical Journal Supplement
Classical and Quantum Gravity
Icarus
Monthly Notices of the Royal Astronomical Society
Nature
Physical Review D
Proceedings of the SPIE
Publications of the Astronomical Society of Australia
Publications of the Astronomical Society of Japan
Publications of the Astronomical Society of the Pacific

Additional screenshots of software citations:

ASCL: Citation to PyKE in A&A
Other: citation for astrometry.net in ApJ
Gasoline citation in PhysRevD

Formatting counts! Below are two citations for Turbospectrum: the first is formatted so that ADS can pick up and count the citation, the second is not.

Screenshots: two citations for Turbospectrum

Software Publication Special Interest Group (SPSIG)

The AAS’s Working Group on Astronomical Software (WGAS) has invited the ASCL to form a Special Interest Group (SIG) on software publication. We think this is a dandy idea and have accepted the invitation. The inaugural meeting will be held on Tuesday, January 6, from 3:45 PM to 4:45 PM in room 615 of the Convention Center. This is immediately after the special session Licensing Astrophysics Codes: What You Need to Know, which runs from 2:00 to 3:30 in that same room.

As issues around software citation came up several times at this month’s .Astronomy meeting and have been discussed further online since, it seems fitting for this to be the main topic for the first meeting of the SPSIG.

Please note that this SIG meeting does not appear in the AAS schedule. The meeting is open to anyone who is interested, and additional information will be posted here as it becomes available.

Creating and evaluating data management plans

I’m delighted to offer the following guest post by Jonathan Petters, Data Management Consultant, Johns Hopkins Data Management Services, and thank him very much for it!

In a recent discussion on preservation and sharing of research data, a few participants expressed their concern (paraphrased here) that “My research community doesn’t know how to create a quality data management plan” and “We don’t know how to evaluate data management plans.” The astronomy community explicitly requested a little guidance. We in Johns Hopkins University Data Management Services have developed a few resources, described below, that are useful for both developing and evaluating data management plans in all research disciplines, including astronomy.

Funding agencies have long encouraged and expected that data and code used in the course of funded research be made available to those in the research discipline. NSF, an important funder of astronomical research, has such expectations and is the agency I will focus on here. A few years ago NSF began requiring data management plans as part of research proposals, in part to aid in the dissemination and sharing of research data and code. Following a February 2013 Office of Science and Technology Policy memo, other US funding agencies, including the Department of Energy’s Office of Science, are expected to follow suit with similar data management plan requirements.

What does NSF say about writing and evaluating quality data management plans? A good overview of NSF data policies relevant for the AST community can be found in these slides (from Daniel Katz, NSF). In general NSF states that data management will be defined by “the communities of interest.” The NSF AST-specific policy further states “MPS Divisions will rely heavily on the merit review process in this initial phase to determine those types of plan that best serve each community and update the information accordingly.” Neither statement is especially prescriptive, and both can leave researchers unclear as to what they should do.

Creating a plan
While effective research data management certainly has community- and discipline-specific attributes, there ARE aspects of effective data management that are generalizable across research disciplines. It is around these general aspects that we in Johns Hopkins University Data Management Services (JHUDMS) devised our Data Management Planning Questionnaire. We work through this questionnaire with researchers at Johns Hopkins to help them create effective data management plans.

The Questionnaire is designed to cover comprehensively the important aspects of effective research data management (e.g. data inputs/outputs in the research, ethical/legal compliance, standards and formats used, intended sharing and preservation, PI restrictions on the use of the data). By answering the applicable questions in the document, removing the questions/front matter and connecting the answers in each section into paragraphs, a researcher would be well on their way to a quality, well thought-out data management plan.

Two relevant side-notes:
1.)   For the Questionnaire we consider code and software tools as one ‘kind’ of research data; thus analysis or simulation codes used in the course of your proposed research should be included as a Data Product. While research code and research data generated or processed by code are clearly NOT the same, there are many similarities in managing the two. In both cases effective management should include consideration of documentation, licensing, formats, associated metadata, and upon what platform(s) the data or code could be shared.

2.)   Astronomy, like other disciplines, conducts a substantial amount of research through large collaborations (e.g. surrounding HST or SDSS data). In these cases it is typical for investments in research data infrastructure to be made, and for data policies/practices to be defined for those working with the data. Citing those policies and practices in a data management plan would be appropriate.

Screenshot of Reviewer Guide and Worksheet for Data Management Plans

Evaluating a plan
To help researchers evaluate data management plans for their quality, my colleagues developed the Reviewer Guide and Worksheet for Data Management Plans (dotx). This Guide and Worksheet is a complement to our Questionnaire; it is a handy checklist by which a grant reviewer can determine whether a researcher thoroughly considered the important aspects of research data management.

For those researchers saying to themselves, “The Questionnaire and Reviewer Guide are nice, but PLEASE just tell me what to do!!!”, I found two tweets from the code sharing session at the latest (223rd) AAS meeting in January to be quite relevant (h/t August Muench and Lucianne Walkowicz):

Who enforces software/data sharing in astronomy? YOU DO! WE DO! PEER REVIEW DOES! not snf/nasa #aas223 #astroCodeShare

It's UP TO YOU to include good data management plan as part of panel reviews. The community must enforce importance. #aas223 #astroCodeShare

I wholeheartedly agree with both tweets. It is up to the research community members to police and enforce the data management and sharing practices they would like to see in their community. That’s how peer review works! So the next time you review astronomical research proposals, look over the data management plans carefully and bring up relevant thoughts and concerns to the review panel.

Summing up
I hope the Data Management Planning Questionnaire and Reviewer Guide and Worksheet for Data Management Plans help you and other researchers in the astronomy community more fully develop expectations for data management and sharing practices. It’s likely your institution also has research data management personnel (like the JHUDMS at Hopkins) who are more than happy to help!

New papers to read

It’s not just astrophysics; other sciences are also grappling with issues surrounding software release, transparency of research, and collaboratively sharing codes.

The challenge of software licensing came up in the AAS 223 Special Session on code sharing; ASCL advisor Bruce Berriman followed up on this issue with a post on Astronomy Computing Today, and I’ve recently run across A Quick Guide to Software Licensing for the Scientist-Programmer, which also offers some guidance on this important issue.

Citations redux

I’ve recently learned that some citations to ASCL (and arXiv) entries are not caught by ADS because some BibTeX styles (.bst) don’t support the eprint field, which ADS uses when generating the BibTeX for ASCL and arXiv entries. The lack of support for the eprint field results in a citation that formats the ascl ID incorrectly; for ADS to be able to find and count the citation, the ascl ID needs to be formatted just as it appears in the code entry, e.g. ascl:1010.051 for NEMO. The arXiv site has a list of BibTeX styles that have been updated to support the eprint field, and Norman Gray’s nice urlbst code can add this functionality to existing .bst files.
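If you want a quick sanity check on your own reference list, something like the sketch below may help. It scans a formatted reference for an ascl ID in the form shown above. The pattern (the prefix ascl:, four digits, a dot, three digits) is my assumption based on the NEMO example, not an official specification, and the function name and sample references are made up for illustration; they are not real bibliography output.

```python
import re

# Assumed ID shape, inferred from the example ascl:1010.051:
# the literal prefix "ascl:", four digits, a dot, three digits.
ASCL_ID = re.compile(r"\bascl:\d{4}\.\d{3}\b")


def find_ascl_ids(reference):
    """Return every ascl ID in the assumed form found in a reference string."""
    return ASCL_ID.findall(reference)


# Hypothetical formatted references (placeholder author and year).
good = "Author, A. 2010, NEMO, Astrophysics Source Code Library, ascl:1010.051"
bad = "Author, A. 2010, NEMO, Astrophysics Source Code Library, 1010.051"

for ref in (good, bad):
    ids = find_ascl_ids(ref)
    status = "ID found: " + ", ".join(ids) if ids else "no well-formed ascl ID"
    print(status)
```

A reference that comes back with no ID is worth a second look: the .bst file in use may be dropping or reshaping the eprint field. This only checks the ID's form in your own output; it says nothing about how ADS itself parses references.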

(This information has been added to the Citing ASCL code entries page.)