Category Archives: AAS

Software licensing resources

Below, a list of informative, interesting (or both!) writings about software licensing; the ASCL doesn’t necessarily agree with all positions in these articles, but we want to know what people are thinking even when we don’t agree with them.

EUDAT License Wizard
http://www.eudat.eu/news/eudat-license-wizard-guides-you-through-legal-maze
http://ufal.github.io/lindat-license-selector/

A Quick Guide to Software Licensing for the Scientist-Programmer
By Andrew Morin, Jennifer Urban, Piotr Sliz
http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002598

Relicensing yt from GPLv3 to BSD
By Matthew Turk
http://blog.yt-project.org/post/Relicensing.html

Best Practices for Scientific Computing
Greg Wilson, D. A. Aruliah, C. Titus Brown, Neil P. Chue Hong, Matt Davis, Richard T. Guy, Steven H. D. Haddock, Katy Huff, Ian M. Mitchell, Mark Plumbley, Ben Waugh, Ethan P. White, Paul Wilson
http://arxiv.org/abs/1210.0530v4

The Whys and Hows of Licensing Scientific Code
By Jake VanderPlas
http://www.astrobetter.com/the-whys-and-hows-of-licensing-scientific-code/

Licensing your code
ASCL blog post https://ascl.net/wordpress/?p=726 lists the following:

Making Sense of Software Licensing
Choose a license
Open Source Initiative also offers information on licenses
White paper from the Software Freedom Law Center
Bruce Berriman’s post on relicensing Montage

The Gentle Art of Muddying the Licensing Waters
by Glyn Moody
http://blogs.computerworlduk.com/open-enterprise/2014/08/the-gentle-art-of-muddying-the-licensing-waters/index.htm

STM open license suggestions and aftermath

Open Access Licensing
Don’t Muddy the “Open” Waters: SPARC Joins Call for STM Association to Rethink New Licenses
Global Coalition of Access to Research, Science and Education Organizations calls on STM to Withdraw New Model Licenses
STM response to ‘Global Coalition of Access to Research, Science and Education Organisations calls on STM to Withdraw New Model Licenses’
New “open” licenses aren’t so open

Interesting talk on ITAR
http://www.state.gov/e/stas/series/154211.htm
Discusses dual-use technologies, which is what codes are under ITAR. These are governed by the Wassenaar Arrangement. The countries that participate meet 3x/year to decide what restrictions to put on dual-use technologies. Dr. James Harrington was the speaker. Slides available on that page.

AAS Software Events: The Short List

A short list without the descriptions, other information, and Saturday-start bootcamp the longer list has, because short is beautiful, too! Some may require registration/charge a fee.

Astropy Tutorial, Sunday, 8:00-11:00 (Tutorial)
Location: 612 (Convention Center)

SciCoder@AAS: Intro to Databases for Astronomers, Sunday, 9:00-5:00 (Workshop)
Location: 607 (Convention Center)

Astrostatistics, Sunday, 9:30-6:00 (Workshop)
Location: 618/619 (Convention Center)

Collaborating Online with GitHub and Other Tools, Sunday, 12:00-5:00 (Workshop)
Location: 303 (Convention Center)

232. Licensing Astrophysics Codes: What You Need to Know, Tuesday, 2:00-3:30 (Special Session)
Location: 615 (Convention Center)

Software Publication Special Interest Group (SPSIG) Inaugural Meeting, Tuesday, 3:45-4:45 (Special Interest Group meeting)
Location: 615 (Convention Center)

Catalogs, Surveys, and Computation Posters, Wednesday, 9:00-5:30

315 Astroinformatics and Astrostatistics in Astronomical Research: Steps Towards Better Curricula, Wednesday, 10:00-11:30 (Special Session)
Location: 620 (Convention Center)

The SKA Telescope: Global Project, Revolutionary Science, Extreme Computing Challenges, Wednesday, 12:30-3:30 (Splinter Meeting)
Location: 4C-4 (Convention Center)

332. Catalogs/Surveys/Computation – UVOIR, Wednesday, 3:10-3:20 PM (Oral Session)
Location: 620 (Convention Center)

434. Computation, Data Handling and Other Matters Posters, Thursday, 9:00-2:00

Hack Day, Thursday, 10:00-7:00 (Workshop)
Location: 4C-2 (Convention Center)

Astro software citation examples

One of the unconference sessions (proposed during the event) held at December’s .Astronomy was on software citation, this subject having come up in an earlier session on improving credit for software.

Discussion and comments in the session inspired me to look at astronomy’s current practices for citing software. I looked in more than a dozen journals for citations for codes used in research; below are some of the examples I gathered, though the list is not exhaustive.

The most common way to cite software is to reference a paper describing the code. This is how, for example, the authors of yt would like that software cited, as shown in this citation from a recent MNRAS paper:

[Screenshot: MNRAS citation for yt]

Sometimes a link to the website for a code is listed as a reference to it, as was done in a Classical and Quantum Gravity paper:

[Screenshots: URL for CAMB in Classical and Quantum Gravity; link for CAMB]

Conference proceedings are cited in some cases, as the citation below for WCSTools in a paper in The Astrophysical Journal demonstrates:

[Screenshot: citation from ApJ for conference proceedings for WCSTools]

ASCL entries can be cited, too, as shown in this citation for pynbody in a paper published in Physical Review D:

[Screenshot: ASCL citation for pynbody in PhysRevD]

Someone (I don’t remember who) reported that Google Scholar does not index mentions of codes, GitHub repos, etc. as citations, because they are not papers. An opinion tweeted out about this summed up the sentiment in the room pretty well! I plan to take this up with Google after the AAS meeting. Fortunately, ADS does index properly formatted software references; the only reference listed in this post that I didn’t see captured by ADS was the URL for CAMB, which is not surprising (nor expected).

A subsequent post will include additional information and a list of resources about software citation. It will be posted before the first meeting of the Special Interest Group on software publishing at AAS225, which will be held on Tuesday, January 6, from 3:45 PM – 4:45 PM in 615 in the Convention Center. The main topic of this meeting will be software citation, and all interested parties are welcome to attend.


The journals below were part of my hunting grounds for software citations. Ever had a citation to software you used in research refused by a publication? If so, I’m interested in knowing the details; please share here or send them to editor@ascl.net. Thanks!

American Institute of Physics Proceedings
Astronomy & Astrophysics
Astronomy and Computing
The Astronomical Journal
The Astrophysical Journal
The Astrophysical Journal Supplement
Classical and Quantum Gravity
Icarus
Monthly Notices of the Royal Astronomical Society
Nature
Physical Review D
Proceedings of the SPIE
Publications of the Astronomical Society of Australia
Publications of the Astronomical Society of Japan
Publications of the Astronomical Society of the Pacific

Additional screenshots of software citations:

[Screenshots: ASCL citation to PyKE in A&A; citation for astrometry.net in ApJ; Gasoline citation in PhysRevD; five further untitled screenshots]

Formatting counts! Below, two citations for Turbospectrum, the first formatted in a way ADS can pick up and count the citation, the second one not.

[Screenshots: two citations for Turbospectrum]

Software Publication Special Interest Group (SPSIG)

The AAS’s Working Group on Astronomical Software (WGAS) has invited the ASCL to form a Special Interest Group (SIG) on software publication. We think this is a dandy idea and have accepted the invitation. The inaugural meeting will be held on Tuesday, January 6, from 3:45 PM – 4:45 PM in 615 in the Convention Center. This is immediately after the Licensing Astrophysics Codes: What You Need to Know special session that is from 2:00-3:30 in that same room.

As issues around software citation came up several times at this month’s .Astronomy meeting and have been discussed online since, it seems fitting for this to be the main topic for the first meeting of the SPSIG.

Please note that this SIG meeting does not appear in the AAS schedule. The meeting is open to anyone who is interested, and additional information will be posted here as it becomes available.

AAS225 software events

Software is an integral part of astronomy research and the American Astronomical Society meetings reflect this. The upcoming AAS meeting in Seattle (January 4-8) offers workshops, sessions, posters, tutorials, and discussions that focus on many aspects of astronomical software. On Tuesday, join us for a special session on code licensing; immediately after, the inaugural meeting of the Software Publishing Special Interest Group will be held. On Thursday, the wildly successful and fun Hack Day returns. Hack Day includes but is not limited to hacking software; whatever skills you have or project you want to take on, there’s a place for you at Hack Day!

Organized by day, below is a list of software-related offerings at the AAS meeting. See you there!


Saturday

Software Carpentry Bootcamp, Saturday-Sunday, 9:00-5:30, organized by August Muench
Computing is now an integral part of every aspect of science, but most scientists are never taught how to build, use, validate, and share software well. As a result, many spend hours or days doing things badly that could be done well in just a few minutes. The goal of AAS 225 Software Carpentry 2 day “bootcamp” is to change that so that astronomers can spend less time wrestling with software and more time doing useful research. Further, good quality, well tested code means science results are easier to verify, share, and update. More information on the Software Carpentry project can be found at <http://software-carpentry.org>. The AAS 225 Software Carpentry bootcamp consists of short tutorials alternating with hands-on practical exercises and will cover the core software skills needed to build, use, validate, and share software in astronomy: Saturday’s tutorials will comprise shell automation, basic python programming, and unit testing; Sunday’s sessions will shift to focus on advanced python, including numerical and astronomy oriented computing, and version control. Registration is for both days. The target audience for the bootcamp consists of graduate students and early career scientists. The Software Carpentry @ AAS 225 Bootcamp will be run by a set of three certified instructors and a team of helpers. Participants will be required to bring laptops and to install software in advance of the workshop. Some basic familiarity with shell based computing was assumed in setting the bootcamp schedule. See also a FAQ at http://software-carpentry.org/faq.html for more information.
Event Type: Workshop
Organizer: August A. Muench
Location: 609 (Convention Center)


Sunday

Software Carpentry Bootcamp, 9:00-5:00, Day 2; see description above

Astropy Tutorial, Sunday, 8:00-11:00, organized by Perry Greenfield
This tutorial will cover the features and capabilities of Astropy and affiliated packages.
Event Type: Splinter Meeting
Organizer: Perry Greenfield
Location: 612 (Convention Center)

SciCoder@AAS: Intro to Databases for Astronomers, Sunday, 9:00-5:00, organized by Demitri Muna
The volume of data available to astronomers today is enormous. The standard pattern of working with flat files doesn’t scale to what’s available now, let alone with the increasing amount of data that is coming. Every astronomer should have the skills to work with databases both for their own data sets and what is publicly available. This workshop will teach how a database is designed, how to create your own, how to populate it with data, how to query that data, how to work with other databases, and how to write scripts against a database. Exercises and examples will be geared to astronomical data but will be applicable to nearly any data. Participants should have a basic comfort level with Python and will be required to install some software on their laptops before the workshop. The workshop will be presented by Demitri Muna (Ohio State University), creator of the SciCoder workshop, and Alex Hagen (Pennsylvania State University).
Event Type: Workshop
Organizer: Demitri Muna
Location: 607 (Convention Center)

Astrostatistics, Sunday, 9:30-6:00, organized by Eric Feigelson
The fields of astronomy and statistics diverged in the 20th century so that astronomers are often not well informed about the wealth of powerful modern methodologies developed by statisticians. Statistics is needed for: characterizing astronomical images, spectra and lightcurves; inferring properties of underlying populations from limited samples; linking astronomical observations to astrophysical theories; and many other aspects of data and science analysis. An additional difficulty has been the inaccessibility of software implementing modern statistical methods for most astronomers. Fortunately, a large, integrated and user-friendly public domain software system has emerged in recent years to implement modern methods. R with its >5000 add-on CRAN packages has >100,000 statistical functionalities, extensive graphics, links to other languages, and more. Over 100 recipe books and extensive on-line support provide guidance for the sophisticated R user. The AAS astrostatistics tutorials are presented by astronomer Eric D. Feigelson and statistician G. Jogesh Babu, authors of the textbook ‘Modern Statistical Methods for Astronomy with R Applications’ that won the PROSE Award for best astronomy book of 2012. Participants should bring laptops with R installed (http://www.r-project.org). R scripts and astronomical datasets will be provided.
Schedule for Sunday January 4:
9:30-10:30 Introduction to astrostatistics (lecture)
10:30-11:30 Fundamentals of statistical inference (lecture)
11:30-12:30 Introduction to R (tutorial)
Lunch (not provided)
2:00-3:00 Density estimation or data smoothing (tutorial)
3:00-4:00 Fitting models to data (lecture)
4:00-5:00 Multivariate clustering and classification (tutorial)
Event Type: Workshop
Organizer: Eric Feigelson
Location: 618/619 (Convention Center)

Collaborating Online with GitHub and Other Tools, Sunday, 12:00-5:00, organized by August Muench
Distributed collaboration is a hallmark of modern international astronomical research. We collaborate on everything from software development to paper and grant writing to sharing new results, plots, and data files. The goal of this workshop is to provide new tools and techniques for productive, efficient collaboration online. This workshop will begin with a hands-on tutorial of GitHub. This will include reviewing distributed version control systems and learning collaboration workflows using the GitHub system. During the second part of the workshop we will explore an array of other online tools, ranging from cloud storage (DropBox, Google Drive) to collaborative document creation (Google Documents, online LaTeX editors) to feature tracking platforms (Trello, Jira) and much more. We intend to provide concrete workflows and to imbue you with tips and tricks for using these online tools in your research groups. The target audience for the workshop consists of astronomers at all points in their careers. Presenters will include Arfon Smith <https://github.com/arfon>, PhD Astronomer turned Zooniverse developer turned Github Science head, Brent Beer, a GitHub Trainer, and August Muench (Smithsonian). Participants will be required to bring laptops and to install software in advance of the workshop. Familiarity with git or other version control systems is not a prerequisite.
Event Type: Workshop
Organizer: August A. Muench
Location: 303 (Convention Center)


Tuesday

232. Licensing Astrophysics Codes: What You Need to Know, Tuesday, 2:00-3:30
Research in astronomy is increasingly dependent on software methods and astronomers are increasingly required to share their codes; those who write software need to choose a license that delineates whether, when and how others may use and extend this software. Building on comments and questions about licensing in the January 2014 AAS special session “Astrophysics Code Sharing II: The Sequel”, this session, organized by the Astrophysics Source Code Library (ASCL), the AAS’s Working Group on Astronomical Software (WGAS), and the Moore-Sloan Data Science Environment, explores why providing an explicit license for software is important, explains different common licenses, examines intellectual property concerns common to universities, and provides information on restrictions that arise from ITAR. A panel of speakers will discuss code licensing, share considerations that arise when choosing a license, and describe the benefits of the licenses they chose. Institutional and governmental concerns about intellectual property, its licensing, use, and release, will also be covered. The floor will then be open for discussion and questions.
Session Type: Special Session
Organizer: Alice Allen
Chair: Frossie Economou
Moderator: Alberto Accomazzi
Location: 615 (Convention Center)

232.01. Copy-left and Copy-right, Jacob VanderPlas
232.02. University tech transfer perspective on software licensing, Laura Dorsey
232.03. Relicensing the Montage Image Mosaic Engine, G. B. Berriman
232.04. Export Controls on Astrophysical Simulation Codes, Daniel Whalen
232.05. Why licensing is just the first step, Arfon M. Smith
232.06. Licenses in the wild, Daniel Foreman-Mackey
Open Discussion
moderated by Alberto Accomazzi

Software Publication Special Interest Group (SPSIG) Inaugural Meeting, Tuesday, 3:45-4:45
This first meeting of the to-be-formed-at-AAS225 Software Publication Special Interest Group (SPSIG) is open to all interested parties. The main topic of discussion at this meeting will be software citation.
Session Type: Special Interest Group meeting
Organizer: Alice Allen
Location: 615 (Convention Center)


Wednesday

Catalogs, Surveys, and Computation Posters, Wednesday, 9:00-5:30

336.34. A Pipeline for High Resolution Radio Images
Brianna P. Thomas; Alison B. Peck; Jacqueline Hodge; Anthony J. Beasley

336.35. ADMIT: ALMA Data Mining Toolkit
Douglas N. Friedel; Lisa Xu; Leslie Looney; Peter J. Teuben; Marc W. Pound; Kevin P. Rauch; Lee G. Mundy; Jeffrey S. Kern

336.36. Overview of the SOFIA Data Processing System: A generalized system for manual and automatic data processing at the SOFIA Science Center
Ralph Shuping; Robert Krzaczek; William D. Vacca; Miguel Charcos-Llorens; William T. Reach; Rosemary Alles; Melanie Clarke; Riccardo Melchiorri; James T. Radomski; Sachindev S. Shenoy; David Sandel; Eric Omelian

336.37. A Prototype External Event Broker for LSST
Gabriella E. Alvarez; Keivan Stassun; Dan Burger; Robert Siverd; Donald Cox

336.39. Simulating Optical Surveys with the LSST Software Stack
Scott Daniel; K. S. Krughoff; Peter Yoachim; R. Lynne Jones; Yusra AlSayyad; Bryce Kalmbach; Andrew J. Connolly; Zeljko Ivezic

336.40. The LSST Metrics Analysis Framework (MAF)
R. Lynne Jones; Peter Yoachim; Srinivasan Chandrasekharan; Andrew J. Connolly; Kem H. Cook; Zeljko Ivezic; K. S. Krughoff; Catherine E. Petry; Stephen T. Ridgway

336.41. Analyzing Simulated LSST Surveys With MAF
Peter Yoachim; R. Lynne Jones; Srinivasan Chandrasekharan; Andrew J. Connolly; Kem H. Cook; Zeljko Ivezic; K. S. Krughoff; Catherine E. Petry; Stephen T. Ridgway

336.42. Building POCS: An open source observatory control system for amateur telescopes used by the PANOPTES project for the detection of extrasolar planets
Wilfred T. Gee; Josh Walawender; Mike Butterfield; Olivier Guyon; Nemanja Jovanovic

336.43. Adaptive Optics Images of the Galactic Center: Using Empirical Noise-maps to Optimize Image Analysis
Saundra Albers; Gunther Witzel; Leo Meyer; Breann Sitarski; Anna Boehle; Andrea M. Ghez

336.44. Recovering Astrophysical Signals Lost in Noise: Light Curves of Background Objects in Kepler Data
Rebecca L. Bowers; Joshua Pepper; Michael Abdul-Masih; Andrej Prsa

336.45. An Exploration Tool for Very Large Spectrum Data Sets
Duane F. Carbon; Christopher Henze

336.46. Understanding and Using the Fermi Science Tools
Joseph Asercion

336.47. Fact Checking LIGO’s Radiometer Code with Simulated LIGO Data
Samantha E. Thrush

336.48. AstroML: “better, faster, cheaper” towards state-of-the-art data mining and machine learning
Zeljko Ivezic; Andrew J. Connolly; Jacob Vanderplas

336.49. Bayesian Identification of Emission–Line Galaxies with Photometric Equivalent Widths
Andrew S. Leung; Eric J. Gawiser; Viviana Acquaviva

336.50. Statistical Computing for Galaxy Modeling and Residual Detection
Sean McLaughlin; Robert Brunner

336.51. Separating Stars and Galaxies Probabilistically Based on Color
Victoria Strait

336.52. Visualizing SPH Cataclysmic Variable Accretion Disk Simulations with Blender
Brian R. Kent; Matthew A. Wood

336.53. Computer analysis of digital sky surveys using citizen science and manual classification
Evan Kuminski; Lior Shamir

336.55. Improved Functionality and Curation Support in the ADS
Alberto Accomazzi; Michael J. Kurtz; Edwin A. Henneken; Carolyn S. Grant; Donna Thompson; Roman Chyla; Alexandra Holachek; Vladimir Sudilovsky; Stephen S. Murray

336.56. Online Activity Around Scholarly Astronomy Literature – A Discussion of Altmetrics
Edwin A. Henneken; Alberto Accomazzi; Michael J. Kurtz; Donna Thompson; Carolyn S. Grant; Stephen S. Murray

336.57. Astrophysics Source Code Library — Now even better!
Alice Allen; Judy Schmidt; Bruce Berriman; Kimberly DuPrie; Robert J. Hanisch; Jessica D. Mink; Robert J. Nemiroff; Lior Shamir; Keith Shortridge; Mark B. Taylor; Peter J. Teuben; John F. Wallin

336.59. Beyond The Prime Directive: The MAST Discovery Portal and High Level Science Products
Scott W. Fleming; Faith Abney; Tom Donaldson; Theresa Dower; Dorothy A. Fraquelli; Anton M. Koekemoer; Karen Levay; Jacob Matuskey; Brian McLean; Lee Quick; Anthony Rogers; Bernie Shiao; Randy Thompson; Shui-Ay Tseng; Geoff Wallace; Richard L. White

315 Astroinformatics and Astrostatistics in Astronomical Research: Steps Towards Better Curricula, Wednesday, 10:00-11:30
The AAS Working Group on Astroinformatics and Astrostatistics hereby proposes a Special Session for the 225th AAS meeting in Seattle which will highlight the importance of data analytics training in astronomy, both for the sake of astronomical research and in order to make astronomy graduates more employable. Although astronomy and astrophysics are witnessing dramatic increases in data volume as detectors, telescopes, and computers become ever more powerful, the traditional training of astronomy and physics students is not providing skills to handle such voluminous and complex data sets. Equally worrisome, research funds and hiring options in astronomy are diminishing; in particular, the number of candidates for permanent (or steady) jobs significantly exceeds the job availability. As a result many astronomy graduates have transitioned out of astronomy to work in areas where their analytic skills become highly valuable. Invited talks by a recent astronomy Ph.D. graduate who transitioned to industry, and an industry representative, will critically compare academic and industrial environments. The main goals of the proposed session are to discuss ways to improve Big Data training and research in astronomy, as well as to explore the connections between data science in astronomy and in the other research or technology areas where astronomy postdocs or recent graduates could excel and compete. We will use a moderated panel to facilitate discussion of the graduate curriculum at Astronomy Departments, and invited talks to highlight connections to industry.
Session Type: Special Session
Organizer: Zeljko Ivezic
Organizer: Aneta Siemiginowska
Location: 620 (Convention Center)

315.01. Working on interesting problems, Arfon M. Smith
315.02. Astronomer to Data Scientist, Jessica Kirkpatrick
Panel Discussion

The SKA Telescope: Global Project, Revolutionary Science, Extreme Computing Challenges, Wednesday, 12:30-3:30
The Square Kilometre Array (SKA) is one of the most awe-inspiring and audacious science and engineering projects of the 21st Century. With its hundreds of thousands of antennas spread across Africa and Australia, the SKA will have unrivalled scope in observations and is designed to address fundamental questions about the earliest stages of the Universe, such as star formation, dark energy, gravity and life itself. When fully operational in the early 2020s, the SKA will produce 10 times the data of the current global internet. Processing this vast quantity of data will require very high performance central supercomputers capable of in excess of 100 petaflops of raw processing power: about three times more powerful than the most powerful supercomputer in 2013. In addition to developing this high performance computing hardware and software capability, the project must also address the incredibly complex tasks of signal processing, data transfer, storage and curation, and data manipulation. To develop these revolutionary technologies and drive tomorrow’s groundbreaking science, effective global partnerships between governments, academia, and industry are becoming essential. With their long-standing tradition of radio astronomy, the US can bring much expertise to such global partnerships, while at the same time gaining strategic access to world-class instruments. This session will be divided in 2 parts:
Science: Through the case study of the SKA precursor telescopes MWA, ASKAP and MeerKAT, and of the first-class observatories LOFAR and JVLA, we will see how major science questions are already being touched upon, paving the way for the revolutionary capabilities of the SKA. We will finally examine how a project the scale of the SKA will push the frontiers of scientific knowledge.
Computing: The sheer amount of data collected by the SKA will drive fundamental shifts in science-driven technology with daily-life applications in the areas of data transport, data storage, high-performance computing, and algorithm design. We will first present the SKA global computing and technological challenges, and then give the floor to experts from High Performance Computing industry who will provide their views on how they aim to tackle these challenges and how the SKA is driving technology development in a number of domains.
Event Type: Splinter Meeting
Organizer: Tyler L. Bourke
Location: 4C-4 (Convention Center)

332. Catalogs/Surveys/Computation – UVOIR, Wednesday, 3:10-3:20 PM
Session Type: Oral Session
Chair: Steven A. Rodney
Location: 620 (Convention Center)

332.09. Targeted-mode pipeline for the Evryscope: a minute cadence, 10,000-square-degree FoV, gigapixel-scale telescope
Octavi Fors Aldrich; Nicholas M. Law; Philip J. Wulfken; Jeffrey Ratzloff


Thursday

434. Computation, Data Handling and Other Matters Posters, Thursday, 9:00-2:00

434.01. Spherical harmonic transit analysis with PAPER
Jason Ling; Saul A. Kohn; James E. Aguirre

434.02. Time-domain Surveys and Data Shift: Case Study at the intermediate Palomar Transient Factory
Umaa Rebbapragada; Brian Bue; Przemyslaw R. Wozniak

434.03. A new ultra-fast Moving Object Discovery Engine for iPTF, ZTF, and beyond
Frank J. Masci; Adam Waszczak; Russ Laher; James M. Bauer; Thomas A. Prince; George Helou; Shrinivas R. Kulkarni

434.04. Comparing the Mass Functions of Simulated Galaxies
Nicholas Miller; Ariyeh Maller; M.K Ryan Joung; Julien Devriendt; James Bullock

434.05. A New Laboratory for MM-/Sub-MM-Wave Characterization of Cosmic Dust Analogs
Samuel Birsa; Huy Do; Frederick Williams; Lunjun Liu; Ryan Schonert; Thushara Perera

434.06. IPAC Firefly package goes open source
Xiuqin Wu; William Roby; Tatiana Goldina; Loi Ly

Hack Day, Thursday, 10:00-7:00
A day to work intensively on collaborative projects. A wide variety of projects will be undertaken, ranging from software development and coding to creative outreach projects. Projects that take advantage of the unique gathering of enthusiasm and expertise at the Winter AAS Meeting are particularly encouraged. Hack ideas and participants will be solicited before and during the meeting. Participants can either lead a project or join a project and should plan on focusing primarily on only one hack. In addition, we ask participants to commit to hacking for the majority of the day. Registration is encouraged to facilitate pre-meeting coordination, but not required.
Event Type: Workshop
Organizer: Kelle L. Cruz
Chair: David W. Hogg
Location: 4C-2 (Convention Center)

Creating and evaluating data management plans

I’m delighted to offer the following guest post by Jonathan Petters, Data Management Consultant, Johns Hopkins Data Management Services, and thank him very much for it!

In a recent discussion on preservation and sharing of research data, a few participants expressed their concern (paraphrased here) that “My research community doesn’t know how to create a quality data management plan” and “We don’t know how to evaluate data management plans.” The astronomy community explicitly requested a little guidance. We in Johns Hopkins University Data Management Services have developed a few resources, described below, of use in both developing and evaluating data management plans within all research disciplines, including astronomy.

Funding agencies have long encouraged and expected that data and code used in the course of funded research be made available to those in the research discipline. NSF is an important funder of astronomical research that has such expectations (and the agency I will focus on here). A few years ago NSF began requiring data management plans as part of research proposals, in part to aid in the dissemination and sharing of research data and code. Following a February 2013 Office of Science and Technology Policy memo, other US funding agencies, including the Department of Energy’s Office of Science, are expected to follow suit with similar data management plan requirements.

What does NSF say about writing and evaluating quality data management plans? A good overview of NSF data policies relevant for the AST community can be found in these slides from Daniel Katz (NSF). In general NSF states that data management will be defined by “the communities of interest.” The NSF AST-specific policy further states “MPS Divisions will rely heavily on the merit review process in this initial phase to determine those types of plan that best serve each community and update the information accordingly.” Neither statement is especially prescriptive, and both can leave researchers unclear as to what they should do.

Creating a plan
While effective research data management certainly has community- and discipline-specific attributes, there ARE aspects of effective data management that are generalizable across research disciplines. It is around these general aspects that we in Johns Hopkins University Data Management Services (JHUDMS) devised our Data Management Planning Questionnaire. We work through this questionnaire with researchers at Johns Hopkins to help them create effective data management plans.

The Questionnaire is designed to comprehensively hit upon the important aspects of effective research data management (e.g. data inputs/outputs in the research, ethical/legal compliance, standards and formats used, intended sharing and preservation, PI restrictions on the use of the data).  By answering the applicable questions in the document, removing the questions/front matter and connecting the answers in each section into paragraphs, a researcher would be well on their way to a quality, well thought-out data management plan.
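As a rough illustration of that last step, here is a minimal Python sketch that drops unanswered questions and joins the remaining answers in each section into a paragraph. The section names, the sample answers, and the `build_plan` helper are all hypothetical, invented for this example; they are not taken from the JHUDMS Questionnaire itself.

```python
# Hypothetical sketch: section names and answers are invented for illustration
# and do not come from the actual JHUDMS Questionnaire.

def build_plan(answers):
    """Join the non-empty answers in each section into one paragraph per section."""
    paragraphs = []
    for section, responses in answers.items():
        # Skip questions that were left blank or did not apply.
        kept = [r.strip() for r in responses if r and r.strip()]
        if kept:
            paragraphs.append(section + "\n" + " ".join(kept))
    return "\n\n".join(paragraphs)


answers = {
    "Data products": [
        "The project will produce reduced spectra in FITS format.",
        "",  # question not applicable
        "Analysis scripts written in Python are treated as a data product.",
    ],
    "Sharing and preservation": [
        "Spectra and scripts will be deposited in a public repository at publication.",
    ],
    "Ethical and legal compliance": [],  # nothing to report, so the section is omitted
}

plan = build_plan(answers)
print(plan)
```

A draft produced this way would still need editing by hand so that each section reads as connected prose rather than a string of separate answers.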

Two relevant side-notes:
1.)   For the Questionnaire we consider code and software tools as one ‘kind’ of research data; thus analysis or simulation codes used in the course of your proposed research should be included as a Data Product. While research code and research data generated or processed by code are clearly NOT the same, there are many similarities in managing the two. In both cases effective management should include consideration of documentation, licensing, formats, associated metadata, and upon what platform(s) the data or code could be shared.

2.)   Astronomy, like other disciplines, conducts a substantial amount of research through large collaborations (e.g. surrounding HST or SDSS data). In these cases it is typical for investments in research data infrastructure to be made, and data policies/practices to be defined for those working with the data. Citing those policies and practices in a data management plan would be appropriate.

Screenshot of Reviewer Guide and Worksheet for Data Management Plans

Evaluating a plan
To help researchers evaluate data management plans for their quality, my colleagues developed the Reviewer Guide and Worksheet for Data Management Plans (dotx). This Guide and Worksheet is a complement to our Questionnaire; it is a handy checklist by which a grant reviewer can determine whether a researcher thoroughly considered the important aspects of research data management.

For those researchers saying to themselves, “The Questionnaire and Reviewer Guide are nice, but PLEASE just tell me what to do!!!”, I found two tweets from the code sharing session at the latest (223rd) AAS meeting in January to be quite relevant (h/t August Muench and Lucianne Walkowicz):

Who enforces software/data sharing in astronomy? YOU DO! WE DO! PEER REVIEW DOES! not nsf/nasa #aas223 #astroCodeShare

It's UP TO YOU to include good data management plan as part of panel reviews. The community must enforce importance. #aas223 #astroCodeShare

I wholeheartedly agree with both tweets. It is up to the research community members to police and enforce the data management and sharing practices they would like to see in their community. That’s how peer review works! So the next time you review astronomical research proposals, look over the data management plans carefully and bring up relevant thoughts and concerns to the review panel.

Summing up
I hope the Data Management Planning Questionnaire and Reviewer Guide and Worksheet for Data Management Plans help you and other researchers in the astronomy community more fully develop expectations for data management and sharing practices. It’s likely your institution also has research data management personnel (like the JHUDMS at Hopkins) who are more than happy to help!

Tweets from and about the code sharing session at AAS223

The code sharing crowd took over the AAS Twitter feed, it seems, during the Special Session on code sharing at AAS 223. Bottom up is the best way to read these, as the most recent tweet is on the top, and please note they aren’t strictly in order of occurrence and I likely missed some (there were so many!). I’m happy to add those I missed if someone tells me about them. Thanks to all those who tweeted throughout the session!

  1. ASCL@asclnet 10 Jan
    @shaka_lulu I keep a list of articles of possible interest to #astroCodeShare folks here: http://asterisk.apod.com/viewtopic.php?f=35&t=21544 …. #aas223
  2. Nuria Lorente@NoTruerAlien 7 Jan
    @augustmuench @bruceberriman Absolutely, but NOT releasing code also comes at a price, which is often forgotten. #aas223 #astroCodeShare
  3. Zach Pace@zpacefromspace 7 Jan
    Just got finished with an awesome breakout session at #aas223 on code sharing. The moral: your code may be crap, but release it anyway!
  4. Nuria Lorente@NoTruerAlien 7 Jan
    Morin et al: Informative paper on Sw licensing for Scientist-Programmer. MT @augustmuench: http://bit.ly/QlZKDP #astroCodeShare #aas223
  5. Chrissy Madison@cmmadiso 7 Jan
    See. It happens! RT @bathompso: Pulling a @cmmadiso: my phone has 1% battery after the #astroCodeShare session. #AAS223
  6. Ben Thompson@bathompso 7 Jan
    Pulling a @cmmadiso: my phone has 1% battery after the #astroCodeShare session. #AAS223
  7. Adrian Price-Whelan@a_p_w 7 Jan
    RE: writing quick/dirty code to get papers out. “Weeks of programming can save you hours of planning.” #aas223
  8. Lucianne Walkowicz@shaka_lulu 7 Jan
    Hanisch: there should also be a prize for software, esp since Webber prize is for hardware only #aas223 #astroCodeShare
  9. August Muench@augustmuench 7 Jan
    .@AlexaVillaume note: that paper is for software. licensing of *data/papers* is distinct but VERY important thing. #aas223 #astroCodeShare
  10. Lucianne Walkowicz@shaka_lulu 7 Jan
    Licensing: BSD or MIT and forget about it- but we should discuss it more as community – @davidwhogg #astroCodeShare #aas223 cc @jonmccann
  11. Lucianne Walkowicz@shaka_lulu 7 Jan
    @jonmccann crap I had to answer an email and missed license discussion. Maybe check #astroCodeShare tag if someone else got it
  12. Christopher Hanley@chanley 7 Jan
    @eblur27 Projects should include a citations file in repo right next to LICENCE.txt and README. Make it easier to be cidted #astroCodeShare
  13. August Muench@augustmuench 7 Jan
    so @aaccomazzi brings us back to licensing: unlicenced code is the WORST. @davidwhogg echos this point #aas223 #astroCodeShare
  14. Lia Corrales@eblur27 7 Jan
    Hey #astroCodeShare, I still want to know how I should cite software I get through github. Could help with fluid contributor lists #aas223
  15. Lucianne Walkowicz@shaka_lulu 7 Jan
    Something to be gleaned from size of room v attendance v perceived necessary size of room & how code is valued. #aas223 #astroCodeShare
  16. Ben Cook@bacook17 7 Jan
    Reference in #aas223 code sharing session. http://wssspe.researchcomputing.org.uk
  17. Kelle Cruz@kellecruz 7 Jan
    #AAS223 lots of different ways to share code…but I really want to spend time and energy making it expected & common practice.
  18. Lucianne Walkowicz@shaka_lulu 7 Jan
    Comment in back: u make good science by making good investments- invest in quality code by encouraging code sharing #aas223 #astroCodeShare
  19. Kelle Cruz@kellecruz 7 Jan
    #aas223 I care less about how we data & code share. tech will work itself out. I want to make it a *requirement* for funds and publications.
  20. Alexa Villaume@AlexaVillaume 7 Jan
    A mortifying story of a misplaced 2 in a program causing 8 years of research going down the drain. Share your code. It’ll be ok. #aas223
  21. Meredith Rawls@merrdiff 7 Jan
    .@kellecruz If it’s easy to share code & get credit, we’ll do it. Reminds me of this: http://theoatmeal.com/comics/game_of_thrones … #aas223 #astroCodeShare
  22. Lucianne Walkowicz@shaka_lulu 7 Jan
    There are more ppl in this room than were in the Kepler session I attended yesterday. #aas223 #astroCodeShare
  23. August Muench@augustmuench 7 Jan
    @jradavenport 225: Panel: MMMMMM Q: FMMFFFMM A: all M except 1 comment by F audience #astroCodeShare
  24. August Muench@augustmuench 7 Jan
    So @eteq is gonna drop the mic: papers have fixed author lists. software authorship if fluid and grows. Et tu, ADS? #astroCodeShare #aas223
  25. Lucianne Walkowicz@shaka_lulu 7 Jan
    Reply: if it goes on arXiv you can never update contributor list, so subsequent contributors don’t get credit #aas223 #astroCodeShare
  26. Lucianne Walkowicz@shaka_lulu 7 Jan
    Prsa: would be great if announcement of code went up on arXiv (I think they often do as release papers, e.g. emcee) #aas223 #astroCodeShare
  27. August Muench@augustmuench 7 Jan
    interesting point: Montage built under contract to NASA; astropy built by cats, hosted on a cat based website #astroCodeShare #aas223
  28. Meredith Rawls@merrdiff 7 Jan
    Andrej Prsa implores everyone in #astroCodeShare session to post code on astro-ph every time you submit a paper. #aas223
  29. Meredith Rawls@merrdiff 7 Jan
    My advisor has said not to “waste time” writing generalized code; contradicting this is troubling. Mixed messages. #aas223
    #astroCodeShare
  30. Lucianne Walkowicz@shaka_lulu 7 Jan
    Cost to sharing: making code useable by anyone req more time than just making it work for you then publishing w it #aas223 #astroCodeShare
  31. Meredith Rawls@merrdiff 7 Jan
    Recurring theme of how do I maximize research productivity and make my code useable for others? Not an easy Q. #AAS223 #astroCodeShare
  32. Lucianne Walkowicz@shaka_lulu 7 Jan
    That is, proprietary data sometimes equates to leverage- there is prob some analogy in code community- @davidwhogg #aas223 #astroCodeShare
  33. Lucianne Walkowicz@shaka_lulu 7 Jan
    panelists doubt ppl are being hired for a specific code as opposed to skill, but must be analogy w proprietary data #aas223 #astroCodeShare
  34. Lucianne Walkowicz@shaka_lulu 7 Jan
    Q: how do we reward ppl in ways that don’t req keeping code proprietary? As in, ppl get hired bc they have the code #aas223 #astroCodeShare
  35. Lia Corrales@eblur27 7 Jan
    Sad I’m missing #astroCodeShare, but reports of a packed room and massive twitter coverage are letting me stay comfy in COS session #aas223
  36. Lucianne Walkowicz@shaka_lulu 7 Jan
    Besides, every little thing you think no one else needs- *someone* will prob find it useful #aas223 #astroCodeShare
  37. Lucianne Walkowicz@shaka_lulu 7 Jan
    A: do what you need, if no one else needs it then that’s fine, you haven’t made anyone’s life worse – @davidwhogg #aas223 #astroCodeShare
  38. Lucianne Walkowicz@shaka_lulu 7 Jan
    Q: what’s the balance bt needing to make code work for yourself vs making it useful for everyone always? #aas223 #astroCodeShare
  39. Lucianne Walkowicz@shaka_lulu 7 Jan
    Comment in back: u make good science by making good investments- invest in quality code by encouraging code sharing #aas223 #astroCodeShare
  40. Lucianne Walkowicz@shaka_lulu 7 Jan
    Katz: not really, unlikely beyond a few years’ horizon at a time #aas223 #astroCodeShare
  41. Lucianne Walkowicz@shaka_lulu 7 Jan
    No long term stewardship of code like there is for results (i.e. pubs)- does NSF have plans for that? – @davidwhogg #aas223 #astroCodeShare
  42. Ben Thompson@bathompso 7 Jan
    . @kellecruz starting off the #astroCodeShare question session strong. Why no AAS reps here? #AAS223
  43. Lucianne Walkowicz@shaka_lulu 7 Jan
    Do you use other people’s codes? Do you modify them or use them as is? #astroCodeShare #aas223
  44. Lucianne Walkowicz@shaka_lulu 7 Jan
    Do you share code? If not, why not? #astroCodeShare #aas223
  45. Lucianne Walkowicz@shaka_lulu 7 Jan
    Benefits: perceived priority on work, visibility & good will, citations, bug-catching, and moral high ground – DWH #astroCodeShare #aas223
  46. Alexa Villaume@AlexaVillaume 7 Jan
    Releasing code establishes priority and good will. Benefit from bug catching. Also, you get to be smug. #aas223
  47. Laura Watkins@laurawatkins_ 7 Jan
    @davidwhogg: if you’re not embarrassed by the code you released, you released it too late. #aas223
  48. Ian Paul Freeley@ianpaulfreeley 7 Jan
    If your not embarrassed by your code/website, you launched too late–Hogg #aas223
  49. Ben Thompson@bathompso 7 Jan
    If you’re not embarrassed by your code, you’re releasing it too late #AAS223 #astroCodeShare
  50. Meredith Rawls@merrdiff 7 Jan
    .@davidwhogg debunks cons to code sharing. Only real cost is email & support requests. He knows of NO example of being scooped. #aas223
  51. August Muench@augustmuench 7 Jan
    hogg: “if you’re not embarrassed by the code/website you put out there then you put it out there too late.” so good. #astroCodeShare #aas223
  52. Lucianne Walkowicz@shaka_lulu 7 Jan
    Cost: embarrassment! You know your code is crap, but if yr not embarrassed you released too late. –@davidwhogg #aas223 #astroCodeShare
  53. Lucianne Walkowicz@shaka_lulu 7 Jan
    Costs: getting scooped? @davidwhogg knows of no cases of scooping caused by *release of code* #astroCodeShare #aas223
  54. Lucianne Walkowicz@shaka_lulu 7 Jan
    All papers, grant writing, etc – not just code – are developed out in the open since 2005. – @davidwhogg #astroCodeShare #aas223
  55. Matthew Turk@powersoffour 7 Jan
    @augustmuench Not all good or new software is developed using github. Platforms should be transcended by applications. #astroCodeShare
  56. Lucianne Walkowicz@shaka_lulu 7 Jan
    And boom, @davidwhogg right on time. Also, who mic’d him? #astroCodeShare #aas223
  57. Laura Watkins@laurawatkins_ 7 Jan
    +1 MT @augustmuench “and this fact terrifies me because we have no idea collectively what sharing should look like. #astroCodeShare #aas223
  58. August Muench@augustmuench 7 Jan
    and this fact terrifies me because education — we have no idea collectively what sharing should look like. #astroCodeShare #aas223
  59. August Muench@augustmuench 7 Jan
    Who enforces software/data sharing in astronomy? YOU DO! WE DO! PEER REVIEW DOES! not nsf/nasa. #aas223 #astroCodeShare
  60. Lucianne Walkowicz@shaka_lulu 7 Jan
    Its UP TO YOU to include good data management plan as part of panel reviews. The community must enforce importance. #aas223 #astroCodeShare
  61. Lucianne Walkowicz@shaka_lulu 7 Jan
    Data management plans in NSF proposals are required to detail how results/data/software will be shared. – Katz #astroCodeShare #aas223
  62. Lucianne Walkowicz@shaka_lulu 7 Jan
    NSF policy for sharing research results: supposed to share not only the data and the results but the software #astroCodeShare #aas223
  63. Lucianne Walkowicz@shaka_lulu 7 Jan
    NSF does include “products” in addtn to pubs in bio sketches, but could be better abt following up on code release #AAS223 #astroCodeShare
  64. Lucianne Walkowicz@shaka_lulu 7 Jan
    Do we have policies that mandate code release in conjunction w publication or receipt of fed funds? #aas223 #astroCodeShare
  65. August Muench@augustmuench 7 Jan
    Software that enables all this new software: Github, Travis, Sphinx, Jenkins. #aas223 #astroCodeShare
  66. August Muench@augustmuench 7 Jan
    Agreed. RT @kellecruz: .@augustmuench we need to make data/code sharing requirements part of AAS journal policy. those two things. #aas223
  67. Ben Thompson@bathompso 7 Jan
    Testing code is an important part of code sharing. #aas224 session? #AAS223
  68. Kelle Cruz@kellecruz 7 Jan
    .@augustmuench we need to make data/code sharing requirements part of AAS journal policy. those two things. #aas223
  69. Lucianne Walkowicz@shaka_lulu 7 Jan
    If you build it, they will code – Tollerud #aas223 #astroCodeShare
  70. Lucianne Walkowicz@shaka_lulu 7 Jan
    Need infrastructure, a few software ppl to do housekeeping, let scientists do whatev & set expectations – Tollerud #astroCodeShare #aas223
  71. Lucianne Walkowicz@shaka_lulu 7 Jan
    Most ppl who have contributed code to AstroPY have never met each other – all via @github – Tollerud #astroCodeShare #aas223
  72. Ben Thompson@bathompso 7 Jan
    Almost 60 people (who have not met) have all worked together to build @astropy #astroCodeShare #AAS223
  73. Lucianne Walkowicz@shaka_lulu 7 Jan
    AstroPY: a python library for and by astronomers, developed by self-herding astronomers since 2011 – Tollerud #astroCodeShare #aas223
  74. August Muench@augustmuench 7 Jan
    I hoping that we see some cool diffs between the @astropy and montage *support* networks in the open discussion in #astroCodeShare #aas223
  75. August Muench@augustmuench 7 Jan
    . @merrdiff “research objects” is I think the new age terminology. #aas223 #astroCodeShare
  76. David Morrison@drmorr0 7 Jan
    @merrdiff Best advice I have: learn to use Git (or SVN, if you must), and use it for every single piece of code you write. #astroCodeShare
  77. Lucianne Walkowicz@shaka_lulu 7 Jan
    I have used this exact cat herding graphic in Erik Tollerud’s talk in an LSST talk hehe #aas223 #astroCodeShare
  78. August Muench@augustmuench 7 Jan
    the @astropy project — cat herding software development from @eteq at #astroCodeShare #aas223
  79. Ian Paul Freeley@ianpaulfreeley 7 Jan
    Damn it–tweets from code sharing session sounded cool, but I got here late and now crowd out the door. #aas223
  80. Alex Parker@Alex_Parker 7 Jan
    I’m nodding so vigorously at the #astroCodeShare tweets that I might need to ice my neck later.
  81. Dr Chris Tibbs@chris_tibbs 7 Jan
    Love the fact that my timeline is currently full of great tweets about code sharing and EPO #aas223
  82. Kelle Cruz@kellecruz 7 Jan
    #aas223 ok, maybe 20% women in code sharing session but still disproportionately tweeting. #interesting
  83. Alexa Villaume@AlexaVillaume 7 Jan
    “I wrote my first fortran code when Apollo 12 was on the moon.” #aas223
  84. August Muench@augustmuench 7 Jan
    Decision to code Cloudy in C++ was partly motivated to use industry grade lang & give students real world job ops! #astroCodeShare #aas223
  85. Meredith Rawls@merrdiff 7 Jan
    Learning about the CLOUDY code, but speaker has no visuals 🙁 Jokes that the code can be opaque; “C++: write once, read never.” #aas223
  86. Lucianne Walkowicz@shaka_lulu 7 Jan
    Complaining astros aren’t comp scientists is like saying they shldn’t learn math bc they aren’t mathematicians #astroCodeShare #aas223
  87. Timothy Pickering@te_pickering 7 Jan
    #preach! RT @shaka_lulu: I’ll paraphrase @mjuric here: code is to modern astronomy what calculus once was. #aas223 #astroCodeShare
  88. Lucianne Walkowicz@shaka_lulu 7 Jan
    I’ll paraphrase @mjuric here: code is to modern astronomy what calculus once was. #aas223 #astroCodeShare
  89. Kelle Cruz@kellecruz 7 Jan
    #aas223 could someone in the back of the code sharing session do a quick attendance & gender count? I’m in the front row…
  90. Jessica Lu@jlu_astro 7 Jan
    @kellecruz Tell me if you figure it out!
  91. Kelle Cruz@kellecruz 7 Jan
    #aas223 code sharing room is packed! I’m curious what brought them all here…
  92. Lucianne Walkowicz@shaka_lulu 7 Jan
    Q: how much do you think we fail to educate our young researchers to write good code? #astroCodeShare #aas223
  93. Ben Thompson@bathompso 7 Jan
    Q on why students are not educated on how to write good code (or code at all!) #AAS223
    We have all failed here.
  94. August Muench@augustmuench 7 Jan
    The code base under question is Montage http://bit.ly/1aELvEz , dev’d & now volunteerly supported by IPAC scientists #aas223 #astroCodeShare
  95. August Muench@augustmuench 7 Jan
    “Releasing your code comes with a price” — @bruceberriman Hmm, let’s see if this pivots to the positive+solutions! #aas223 #astroCodeShare
  96. Lucianne Walkowicz@shaka_lulu 7 Jan
    Lastly, resist the pundit-technician divide. – Weiner #aas223 #astroCodeShare
  97. August Muench@augustmuench 7 Jan
    I completely agree with @cloud149: a lot of our concerns about sharing code are “pseudo” or hypothetical problems. #aas223 #astroCodeShare
  98. Michelle Collins@michelle_lmc 7 Jan
    We are failing to teach students how to write GOOD code in astronomy. Need to do better. Some programs in place, but not standard #aas223
  99. Laura Watkins@laurawatkins_ 7 Jan
    “Do we do enough to teach our researchers how to write good code?” No. Fundamental skills but so many are left to learn alone. #aas223
  100. Kelle Cruz@kellecruz 7 Jan
    #aas223 really interesting that nearly 100% of the women in the code sharing session are tweeting…all 4 of us. #exaggerating
  101. Laura Watkins@laurawatkins_ 7 Jan
    Standing room only at the code sharing session. Apparently this is more popular than anticipated (this can only be a good thing)! #aas223
  102. Meredith Rawls@merrdiff 7 Jan
    Astrophysics code sharing session 225 at #aas223. Let’s stop re-inventing the wheel. Our hardware is built to last; why not software?
  103. Michelle Collins@michelle_lmc 7 Jan
    Oh, there are no women on the code sharing panel. Are we not sharing code? I’m currently not, but i’m here to learn how to #aas223
  104. Ben Thompson@bathompso 7 Jan
    Excited for the Astronomy Code Sharing session. Wondering what to do with all my research programs. #AAS223
  105. Erik Tollerud@eteq 7 Jan
    #aas223, Tues@2pm: talking about lesson’s from @astropy on how code can be shared, along side @owlice @davidwhogg @bruceberriman @cloud149
  106. Benjamin Weiner@cloud149 7 Jan
    My talk “Occupy Hard Drives” for code session Tues 2 pm #aas223 is here: http://bit.ly/1acClmg @davidwhogg @bruceberriman @owlice @eteq
  107. ADASS@astroADASS 7 Jan
    Follow discussion on astronomy code sharing at the #aas223 meeting using #astroCodeShare hashtag.
  108. Benjamin Weiner@cloud149 7 Jan
    Tues 2pm #aas223 I aim to provoke on astro code sharing and why we don’t respect software with @davidwhogg @bruceberriman @owlice @eteq
  109. Astropy@astropy 6 Jan
    At the #aas223? Don’t miss Tuesday’s 14:00-15:30 session on code sharing – including a talk by @eteq about @astropy!

  110. David W. Hogg@davidwhogg 6 Jan
    Tues at 2 see @owlice Hanisch Teuben @cloud149 @bruceberriman Ferland Katz @eteq and me get all crazy about #code sharing at #aas223 in NH5

Astrophysics Code Sharing II: The Sequel at AAS 223

On Tuesday, January 7, the AAS Working Group on Astronomical Software (WGAS) and the ASCL sponsored a special session on code sharing as a follow-up to the splinter meeting “Astrophysics Code Sharing?” held at AAS 221. We continue the dialogue on ways to improve the transparency and efficiency of research by sharing codes and to mitigate the negative aspects of releasing them.

Photograph of room session was held in, showing people in seats and standing in the back of the room

Even before the session began, it looked like there would be standing room only. Photo, Peter Teuben, used with permission

Before the session started, however, there were a few nerve-wracking moments;  weather- and Amtrak-related delays had one of the presenters arriving at AAS at 2:40 AM the day of the session rather than before lunch on Monday, and another getting to AAS after the session had started (!) but before his talk was to begin. So yes! There were minutes to spare!

The standing-room-only session was moderated by Peter Teuben of the University of Maryland, chairman of the ASCL Advisory Committee; Robert Hanisch, STScI, outgoing chair of the WGAS and also a member of the ASCL Advisory Committee, provided closing remarks. Those not in the room were not without news of what was being said in it, as there was much tweeting about the session (#aas223, #astroCodeShare).

Peter started the session by introducing the speakers (present or not) and explaining a bit how the session would work: code case studies would have 2-minute question periods for any clarifications or questions about the cases themselves, and other questions would be deferred until the open discussion period, which was approximately the latter half of the session.

Presentations
A very brief summary of some main points of the sessions, along with their titles, presenters, and links to slides where applicable, is given here.

    • Occupy Hard Drives: Making your work more valuable by giving it away, Benjamin Weiner (University of Arizona)
      Ben pointed out that time spent writing software represents an enormous sunk cost that is, unfortunately, not viewed as doing real work, though writing software is part of doing science. He stated that widely-used software has enabled at least as much science as a new instrument would. He encouraged people to document their code for their own sake, release it without worrying about bugs or other potential issues in the software, and to write software methods papers for journals.
      slides (PDF)
    • Maintaining A User Community For The Montage Image Mosaic Toolkit, Bruce Berriman (Caltech)
      In this case study of Montage, Bruce stated that releasing software comes with a cost, but that it is still worth doing. Montage was developed under contract, and was designed for ease of maintenance, modularity, and sustainability from the beginning. It is maintained primarily through volunteer effort, and in part through collaborations, e.g., with the LSST EPO team. He said the Caltech license under which Montage is licensed does not allow users to redistribute modified code, nor can Montage be included in other distributions such as Red Hat. He suggested coders consider licensing carefully.
      slides (PDF)
    • Cloudy – simulating the non-equilibrium microphysics of gas and dust, and its observed spectrum, Gary Ferland (University of Kentucky)
      Gary discussed Cloudy, which, with over three decades of use, is the most mature of the three codes covered in this session. The code is autonomous and self-aware, providing warnings about what might have gone wrong when things do go wrong. Though the user community is broad and participants in the summer schools that are held on the code have formed collaborations, a Yahoo! discussion forum for Cloudy has not been as successful as they had hoped. Cloudy was released as open access, with the most permissive license possible; Gary cited NSF as making this necessary since the code was developed with public grant funds. Students who work on the code get industry-standard programming experience, which is intended to help students gain employment after graduation.
      slides (PDF)
    • NSF Policies on Software and Data Sharing and their Implementation, Daniel Katz (National Science Foundation)
      Dan covered the NSF policies that govern software funded by the agency. Though some NSF panels are much more rigorous than others, it is expected that PIs will publish all significant findings, including data and software; he stated quite firmly that data include software according to the Government. He also said that it is up to the community via peer review panels to enforce these policies, that many core research programs don’t enforce this very well, and that the community determines what is and is not acceptable. This may be changing, however: following an Office of Science and Technology Policy memo on open data, OMB policies are pushing harder on open access.
      slides (PDF)
    • The Astropy Project’s Self-Herding Cats Development Model, Erik Tollerud (Yale University)
      The newest of the three code projects highlighted is Astropy. Erik described the grass-roots effort to self-organize the now ~60 code base contributors, and said that this arose out of a common goal: to streamline astronomy tools written in Python, as having eight different packages to do the same thing means that seven-eighths of the effort is wasted. He stated that technology now exists that provides good support for such an effort, including GitHub to manage the processes of many developers, Travis for testing code, and Sphinx for documentation, which is written as the code is written. He pointed out that agreement on the problem was the key in getting the effort to come together and that consensus, guidelines, and expectations make it work.
      slides (PDF)
    • Costs and benefits of developing out in the open, David W. Hogg (New York University)
      David started out by saying that everything his group does is open —  all papers, grant proposals, comments, and codes — and has been since 2005, and that this was a pragmatic, not an ethical decision. He stated that the negatives others give for not releasing code — getting scooped, embarrassment, time, e-mail and support requests, licensing — are overplayed, and that since the public is paying for this, we should return the products we develop to them. He doesn’t know of a single case of someone’s getting scooped because he/she shared code. Rather, the benefits that sharing openly provides, establishing priority, visibility and goodwill, re-use and citations, feedback and bug-catching, and having the moral high ground, outweigh the overplayed negatives.
      slides (PDF)

Discussion
After David’s presentation, Peter opened the floor for questions and discussion, and Kelle Cruz from Hunter College was ready! Kelle said that AAS should require code release and then asked whether anyone from the AAS journals was present. There was not.

Photo of slide with unneeded discussion questions on it

We didn’t really need to prompt discussion; there was plenty to talk about! Photo, Meredith Rawls, used with permission

Kelle then suggested to Daniel Katz that the NSF should take a stronger role in enforcement. Dan said he would see what he could do to get astronomy reviewers training on what to look for, and that he already does this for his area. David Hogg said there aren’t any mechanisms for long-term stewardship of software and asked whether the NSF was looking at this. Dan said it is not at this time, and that the NSF generally avoids long-term commitment of funds.

Someone in the back of the room pointed out that protection of code can also lead to the protection of errors, told a sad anecdote to illustrate that point, and commented that code sharing fosters improvements in coding practice. In response to a question about whether it was worthwhile to share very specific code, David answered yes, just post it, that if it’s not useful to others, so what? But it just might be! And Benjamin Weiner suggested the code be put on GitHub.

Two questions came from someone else in the back of the room, one on whether export control restrictions (ITAR) would be changing; the second question relayed that PhD students write a code for their thesis but then protect it because, in their perception, the code makes them employable, and did the panel have anything to say about that? Erik Tollerud made the point that people are hired for the skills that went into creating the code, not for a particular code. David replied that he has seen this with data, that proprietary data does sometimes give someone leverage for employment. Dan answered the ITAR question by saying that changes in ITAR were probably not going to happen soon.

Another attendee asked what the cost of making code shareable is, and felt that the panelists had swept it under the rug. Ben replied that it’s a community problem, the community needs to reward it, and there needs to be a values change. In the meantime, put it out there anyway; clean it up if you can, but put it out. David agreed there are costs, but said the benefits are more substantial: the cost is not very large and the upside is larger than the downside. Bruce thought it is worth the effort to plan upfront, as that will save time and money later on. This is harder if the code is not initially planned, but one should try to address this when possible.

Nuria Lorente, who was following the session from Australia through Twitter, tweeted that “NOT releasing code also comes at a price, which is often forgotten.”

Andrej Prsa from Villanova made a strong appeal to post code to arXiv; he stated that astro-ph should be open to other things besides preprints. Someone else pointed out that arXiv doesn’t necessarily agree. David said that he put the documentation for emcee, the MCMC Hammer, on arXiv and that it gets cited. Erik made the point that additional contributors to a software development project such as Astropy don’t get credit if they are not on the author list of the paper uploaded to arXiv. Alberto Accomazzi from ADS mentioned that updating the author list on arXiv was a way to fix that and give others credit, even though that will not be reflected on ADS.

Someone commented on the need for some sort of code sharing infrastructure to help with sharing. David commented that he wants all flowers to bloom, but some flowers are more valuable than others. Erik said that better search engines over time will help, that Astropy is more findable because of better search engines and because more people now link to it. It was mentioned that with more code sharing, finding useful codes may become more difficult as the signal-to-noise ratio goes down.

Alberto Accomazzi brought up the uncertain provenance of code, code that does not have a license, and sometimes no author, attached to it, and stated that it is hard to deal with because it cannot be shared. This was echoed by David, who pointed out that the lack of a license for a code can prevent release. Bruce suggested a licensing workshop would be a good idea, and this idea got traction among attendees. The recent re-licensing of yt was brought up. Dan Katz looks specifically for licensing information when looking at proposals, and it’s clear to him that many people don’t know what they are doing about licensing and could use guidance. David suggested that people use BSD or MIT licenses if they know nothing about licensing.

Peter Teuben then brought the discussion to an end and turned the podium over to Robert Hanisch for closing remarks.

Session wrapup
Robert Hanisch reiterated that software sharing is fundamental to the dissemination and validation of research results, and though there are carrots and sticks for software sharing, the sticks are not very strong. He also pointed out that nothing within the funding agencies offers support for software development and that there is a disconnect between national policy and implementation. Journals at best only encourage code release, too; they do not require it. A sociological change is necessary; in the meantime, he hopes those attending will just put codes out there! The benefits outweigh the costs.

He also talked about the opportunity for change: as of Sunday, January 5, the Working Group on Astronomical Software has Frossie Economou as its new chair, and over the weekend, the Council of the AAS had suggested that the WGAS be elevated from a Working Group to a Division within the AAS. He had requested that the Council have the WGAS offer a prize specifically for software, and though the Council did not accept the idea upon presentation, Bob noted that a Division can award prizes independently. Having a Division focused on software will also provide more visibility for it, and on this hopeful note, the session ended.

… though the discussion continues…

My thoughts (just a few)
This is the fourth discussion session the ASCL has arranged; previous sessions include one at AAS 221 and one at each of the previous two ADASS meetings. (Links to materials or discussion from previous sessions are below.)

I was glad to hear several of the presenters say the concerns people have about releasing their codes are overplayed. I was particularly happy when David said that if people would only go ahead and release their imperfect software, other people would see that released codes are also imperfect and thus feel more emboldened to release their own imperfect work. Yes! Lose the fear, gain the codes! It really doesn’t need to be perfect; Nick Barnes, among others, has written eloquently, or amusingly, on this subject already. Astronomical software wants to be free; please release it, let it show!

It was hard for me to stay silent when the need for a code sharing infrastructure was mentioned, not because I disagree with the need (I believe the need is very great!) but because the ASCL is trying hard to help with that. I’ve looked at similar efforts tried over the years; either they started, lived (usually briefly), in one case even flowered, and then died, or they still exist but are mostly silent, with their code sharing efforts dormant. The ASCL has been around since 1999 and is indexed by ADS, and use of it has been increasing. It’s not perfect, but it does work and is actively growing.

I believe that science should be as transparent as possible, that code release (absent ITAR and other truly compelling reasons) even if only for examination, not reuse, is part of this transparency, and that ultimately, code release is better for code authors, especially if the astronomy community works together to make it better for them. Code sharing can make astronomy more efficient, too, which is especially important in the current financial climate.

Finally, I want to thank Peter for moderating the session, Bob for offering closing remarks, and the most excellent Ben, Bruce, Gary, Erik, Dan, and David for presenting at this session and also for not protesting even one time about the innumerable emails they received from me from May on. I also have to thank our wonderful volunteer whose name I did not get, alas, for her great work and for counting the 149 (!) attendees, the AAS for accepting the proposal in the first place, and the amazing people who sent this session literally around the world through their tweets. Thank you!

AAS 221 Astronomy Code Sharing? links
Announcement
Omar Laurino joins Astronomy Code Sharing panel
Brief blog post
Astronomy Computing Today post
Slides used at meeting: Google Doc  PDF

ADASS XXIII (2013) links
Announcement
Our eight questions
The eight questions that were discussed/links to discussion notes
Pre-print of proceedings paper

ADASS XXII (2012) links
Birds of a Feather session
Resources used/linked to for ADASS
Pre-print of proceedings paper

What to do with an astronomy code? Share it!

The poster I’m presenting at AAS 223 is below. Please stop by poster 255.25 on Tuesday; I would love to know whether you share your codes and why you do or do not.

[Poster image: AAS2014JanPosterFinal-20percentsize]
Abstract: Now that you’ve written a useful astronomy code for your soon-to-be-published research, you have to figure out what you want to do with it. Our suggestion? Share it! This presentation highlights the means and benefits of sharing your code. Make your code citable — submit it to the Astrophysics Source Code Library and have it indexed by ADS! The Astrophysics Source Code Library (ASCL) is a free online registry of source codes of interest to astronomers and astrophysicists. With over 700 codes, it is continuing its rapid growth, with an average of 17 new codes a month. The editors seek out codes for inclusion; indexing by ADS improves the discoverability of codes and provides a way to cite codes as separate entries, especially codes without papers that describe them.

Authors: Alice Allen, Astrophysics Source Code Library
Alberto Accomazzi, Smithsonian Astrophysical Observatory
Bruce Berriman, California Institute of Technology
Kimberly DuPrie, Astrophysics Source Code Library
Robert Hanisch, Space Telescope Science Institute/Virtual Astronomical Observatory
Jessica Mink, Harvard-Smithsonian Center for Astrophysics
Robert Nemiroff, Michigan Technological University
Lior Shamir, Lawrence Technological University
Keith Shortridge, Australian Astronomical Observatory
Mark Taylor, University of Bristol, UK
Peter Teuben, University of Maryland
John Wallin, Middle Tennessee State University

ASCL at AAS 223

The ASCL will be at the AAS meeting in (not quite) Washington, DC next week; I’ll be handing out (non-glowing) pens like crazy at both the ASCL poster (255.25, titled You’ve Written a Cool Astronomy Code! Now What Do You Do with It?) and the Special Session (more information below) on Tuesday, too. I hope you’ll stop by the poster to say hi, talk codes, and grab a pen!

The AAS’s Working Group on Astronomical Software (WGAS) and the ASCL are holding a Special Session on code sharing that includes presentations and an open discussion. Peter Teuben and Robert Hanisch will moderate the session, which will be held on Tuesday, January 7, 2:00 to 3:30 in National Harbor 5, Gaylord National Resort and Convention Center.

The panelists and topics for the session are:

Benjamin J. Weiner, Occupy Hard Drives: Making your work more valuable by giving it away
G. Bruce Berriman, Maintaining a user community for the Montage Image Mosaic Toolkit
Gary J. Ferland, Cloudy – the non-equilibrium microphysics of gas and dust, and its observed spectrum
Daniel S. Katz, NSF policies on software and data sharing and their implementation
Erik J. Tollerud, The Astropy Project’s self-herding cats development model
David W. Hogg, Costs and benefits of developing out in the open

After the presentations, Peter will open the floor for questions and discussion; at the end of the discussion, Bob will summarize the themes and points and will close the Special Session.

We’ll be tweeting, too, especially during the Special Session on Tuesday: @asclnet or #asclnet. See you (in person or online) next week!