First look at software activities at AAS 229

Though we have a way to go before January’s AAS meeting (and ADASS and OpenCon on the ASCL’s schedule coming up sooner), a look at the schedule for the AAS meeting already shows multiple options for the computationally-inclined astronomer. I’m very excited about the Special Session we’ve organized with the Moore-Sloan DSE, called Perspectives in Research Software. Bruce Berriman (IPAC, Caltech/Astronomy Computing Today) will moderate the session. In keeping with previous sessions, the session will include a discussion period with the floor open for questions and comments; we want to hear what you have to say. We have a panel of seven speakers; the presenters and topics are:

Tracy Teal (Data Carpentry), Software not as a service
Michael Hucka (Caltech), Finding the right wheel when you don’t want to reinvent it
Lior Shamir (LTU), Reproducibility and reusability of scientific software
Ivelina Momcheva (STScI), Funding research software development
Heather Piwowar (ImpactStory), Capturing the impact of software
David W. Hogg (NYU), The relationships between software publications and software systems
And me, Update on research software citation efforts

I hope to see you there!

Other software events that have shown up so far on the AAS schedule are listed below. Good times coming!

Tuesday, 3 January 2017
Workshop: Introduction to Software Carpentry, 8:00 am ‐ 5:30 pm
Workshop: Using Python for Astronomical Data Analysis, 8:00 am ‐ 4:30 pm

Wednesday, 4 January 2017
Splinter Meeting: Flexible Multi‐dimensional Modeling of Complex Data in Astronomy, 9:30 am ‐ 11:30 am

Friday, 6 January 2017
Special Session: Perspectives in Research Software: Education, Funding, Reproducibility, Citation, and Impact, 10:00 am – 11:30 am

Saturday, 7 January 2017
Special Session: Statistical, Mathematical and Computational Methods for Astronomy (ASTRO): SAMSI 2016-17, 10:00 am – 11:30 am
Workshop: Hack Together Day, 10:00 am ‐ 7:00 pm

Also of likely interest is the Special Session on The Value of Astronomical Data and Long Term Preservation that will take place on Thursday, 5 January from 10:00 am – 11:30 am.

 

September 2016 additions to the ASCL

Twenty-five codes were added to the ASCL in September 2016:

21cmSense: Calculating the sensitivity of 21cm experiments to the EoR power spectrum
AdaptiveBin: Adaptive Binning
AIPY: Astronomical Interferometry in PYthon
Askaryan Module: Askaryan electric fields predictor
contbin: Contour binning and accumulative smoothing

CuBANz: Photometric redshift estimator
FISHPACK: Efficient FORTRAN Subprograms for the Solution of Separable Elliptic Partial Differential Equations
FISHPACK90: Efficient FORTRAN Subprograms for the Solution of Separable Elliptic Partial Differential Equations
FIT3D: Fitting optical spectra
GRASP: General-purpose Relativistic Atomic Structure Package

Kranc: Cactus modules from Mathematica equations
NSCool: Neutron star cooling code
Photutils: Photometry tools
PKDGRAV3: Parallel gravity code
PYESSENCE: Generalized Coupled Quintessence Linear Perturbation Python Code

PyPHER: Python-based PSF Homogenization kERnels
SCIMES: Spectral Clustering for Interstellar Molecular Emission Segmentation
SIP: Systematics-Insensitive Periodograms
Sky3D: Time-dependent Hartree-Fock equation solver
spectral-cube: Read and analyze astrophysical spectral data cubes

StarPy: Quenched star formation history parameters of a galaxy using MCMC
SuperBoL: Module for calculating the bolometric luminosities of supernovae
T-PHOT: PSF-matched, prior-based, multiwavelength extragalactic deconfusion photometry
TIDEV: Tidal Evolution package
Weighted EMPCA: Weighted Expectation Maximization Principal Component Analysis

August 2016 additions to the ASCL

Twenty codes were added to the ASCL in August 2016:

21CMMC: Parallelized Monte Carlo Markov Chain analysis tool for the epoch of reionization (EoR)
2DFFT: Measuring Galactic Spiral Arm Pitch Angle
appaloosa: Python-based flare finding code for Kepler light curves
AstroVis: Visualizing astronomical data cubes
BART: Bayesian Atmospheric Radiative Transfer fitting code

BASE-9: Bayesian Analysis for Stellar Evolution with nine variables
Cuba: Multidimensional numerical integration library
DOLPHOT: Stellar photometry
FilFinder: Filamentary structure in molecular clouds
Gemini IRAF: Data reduction software for the Gemini telescopes

gevolution: General Relativity Cosmological N-body code for evolution of large scale structures
LORENE: Spectral methods differential equations solver
NEBULAR: Spectrum synthesis for mixed hydrogen-helium gas in ionization equilibrium
NICIL: Non-Ideal magnetohydrodynamics Coefficients and Ionisation Library
OBERON: OBliquity and Energy balance Run on N-body systems

PROFFIT: Analysis of X-ray surface-brightness profiles
pvextractor: Position-Velocity Diagram Extractor
pyXSIM: Synthetic X-ray observations generator
SPIDERz: SuPport vector classification for IDEntifying Redshifts
Stingray: Spectral-timing software

Upcoming events and sessions, Fall-Winter 2016/7

I’ll be representing the ASCL at next month’s WSSSPE4 meeting in Manchester, and in October, the ASCL will be at ADASS XXVI and will hold an Advisory Committee (AC) meeting while there. Peter Teuben, chair of the ASCL AC, will moderate a Birds of a Feather session at ADASS on Implementing Ideas for Improving Software Citation and Credit; this is a follow-up on the discussion at last year’s BoF, Improving Software Citation and Credit, with a goal of taking action on some of the ideas generated there. Watch this space in October for more!

For January’s American Astronomical Society meeting in Texas, the Moore-Sloan Data Science Environment at NYU and the ASCL have organized another Special Session, Perspectives in Research Software. This will follow the format of previous sessions, with presentations in the first half of the session and discussion open to all for the second half. Bruce Berriman from the Infrared Processing and Analysis Center at Caltech will moderate; the presenters include Ivelina Momcheva (Space Telescope Science Institute), Tracy Teal (Data Carpentry), Lior Shamir (Lawrence Technological University), and Michael Hucka (Caltech). I’m rationally exuberant about this session!

July 2016 additions to the ASCL

Twenty codes were added to the ASCL in July 2016:

AGNfitter: SED-fitting code for AGN and galaxies from a MCMC approach
astLib: Tools for research astronomers
Atlas2bgeneral: Two-body resonance calculator
Atlas3bgeneral: Three-body resonance calculator
BLS: Box-fitting Least Squares

BoxRemap: Volume and local structure preserving mapping of periodic boxes
Cholla: 3D GPU-based hydrodynamics code for astrophysical simulation
DICE: Disk Initial Conditions Environment
HfS: Hyperfine Structure fitting tool
HIDE: HI Data Emulator

JUDE: An Ultraviolet Imaging Telescope pipeline
K2PS: K2 Planet search
Kālī: Time series data modeler
LZIFU: IDL emission line fitting pipeline for integral field spectroscopy data
PICsar: Particle in cell pulsar magnetosphere simulator

Planetary3br: Three massive body resonance calculator
RT1D: 1D code for Rayleigh-Taylor instability
SEEK: Signal Extraction and Emission Kartographer
SOPIE: Sequential Off-Pulse Interval Estimation
ZASPE: Zonal Atmospheric Stellar Parameters Estimator

Engineering Academic Software, Schloss Dagstuhl Day 0 and 1

Now that I’ve written extensively about days 2-4, I am cycling back to give day 1 its due, but first will say the sharing started on Sunday June 19 as people arrived for both the week-long Engineering Academic Software Perspectives Workshop and the three-day Information-centric Networking and Security Seminar; cake and coffee are available upon arrival, which gives folks an opportunity to meet, and conversation between participants in both workshops flowed easily. A group of us decided to walk to the old castle ruins on the hill (up many steps), and it was on this little jaunt that I learned firsthand about stinging nettles (having only read about them before) with Andrei Chiș providing most of the hands-on instruction. He and I experimented with different nettles to see which produced the greatest stinging/welts; oh, the things we do for science! Those who had been unwilling victims of the plants provided more data points, and our quick survey leads us to suspect that the more mature the plant, the greater its “don’t touch!” defenses.

On Monday morning, we started with lightning introductions. We had been asked to create two slides, one on our relevant background and another on our interests for future use; a list of workshop participants and intro slides are available online.

Tweet: “I’m a programmer that somebody made into a professor.” @jurgenvinju

Next came presentations; first up was Dan Katz to talk about WSSSPE (Working towards Sustainable Software for Science: Practice and Experience), pronounced “wispy.” He shared the history of the organization, noting that three large annual meetings had been held, with several smaller interim meetings also having taken place. WSSSPE’s progression was to first identify challenges regarding software and best practices for sustainability, then to discuss solutions and ways to enable change, and then at WSSSPE3, to take action and encourage people to work in groups to put into practice the identified solutions. Katz’s presentation included an overview of each of the WSSSPE working groups and the progress each group has made. Some of the working groups overlapped with efforts taking place elsewhere; the Software Credit Working Group, for example, shared much in common with Force11’s Software Citation Working Group, so the decision was made to work on combining the two groups (which was successful) and for members to work on the Software Citation Principles that were being developed (which was also successful).

Katz also shared lessons learned from WSSSPE3 — what had worked, what could have worked better, and what didn’t work. He outlined what is planned for WSSSPE4 (taking place this September in Manchester), listing two tracks for the event: Building a sustainable future for open-use research software, which will concentrate on defining the future of open-use scientific software and initiating plans to arrive at this future; and Practices and experiences in sustainable software, which will concentrate on improving current practices. Katz concluded his talk by sharing links to the reports for WSSSPE1, WSSSPE2, and WSSSPE3 and the social media sites. PDF

The next presentation, Supporting Research Software Engineering, was by Mike Croucher. His talk focused on his work as an Engineering and Physical Sciences Research Council (EPSRC) Research Software Engineering Fellow; he is one of only seven to be awarded this new fellowship. He helps scientists improve their software in various ways, such as making it faster, more reliable/robust/user friendly, and more sustainable. (Now that I’ve typed that, Harder, Better, Faster, Stronger is playing on the radio in my head.) This has to be done carefully, for as Croucher put it:

Tweet: "Users are afraid I’m going to “do computer science” to them. @walkingrandomly treads carefully when repairing user code."

The phrase “do computer science to them” was echoed throughout the rest of the workshop; this idea, an acknowledgement of that fear, seemed to resonate with many.

Croucher shared some of the outreach and education activities he’s been involved with, one of which was a (gentle) self-paced R tutorial held in a café. Volunteer facilitators walked around to answer questions, clarify information, and unstick people who got stuck, and the session was a rousing success, so much so that there are now requests and expectations that more will be held!

It was acknowledged at the beginning of the day that academic software faces many challenges; Croucher’s presentation covered some of them, and included this stark slide:

Software is not valued in academia

followed by:

Tweet: One of the major reasons I dropped out of my PhD was because I didn't believe academia could properly value software contributions

He also mentioned the lack of funding for software activities, that soft-money researchers are discriminated against in favor of tenure-track and tenured faculty, and other issues. Oooo, he got a great discussion going with these and other points! In the active discussion, Cecilia Aragon made the point that we need to stop calling software “infrastructure,” as software has intellectual content.

Tweet: "Comment from @craragon : stop calling software 'infrastructure' because plumbers aren't invited to coauthor science papers"

Though there are challenges, it’s not all bad; things are improving. The Software Sustainability Institute is funded by several organizations, the UK’s Engineering and Physical Sciences Research Council (EPSRC) has recognized the importance of software through funding of the Research Software Engineers, and a Horizon 2020 project is to provide “substantial funding” for open source maths research software. Croucher’s vision of the future includes core-funded research software engineers and a hope for tenure awarded on the basis of software contributions. He closed his excellent presentation with concrete steps for changing the perception of software engineering and leading a change in culture. Slides

After a coffee break, the last presentation of the morning was given by Christoph Becker on Sustainability design. He covered the challenges of sustainability, and referred to the “sustainability debt” that is mostly unknown for most systems. The effects of sustainability (or lack thereof) can be considered from several angles; one way to look at this debt is across economic, technical, social, environmental, and individual aspects, and whether it has an immediate, enabling, or structural effect. The concerns about sustainability have inspired the Karlskrona Manifesto for Sustainability Design, which seeks to address sustainability across different aspects and widening effects. The Manifesto identifies eleven “misperceptions and counterpoints”, seeks to correct or mitigate them, and aims to educate and advocate for a constructive approach to enabling a paradigm shift. Becker is particularly interested in studying how people decide on the trade-offs they make when designing software, and using the insight gleaned to develop and implement methods and tools for making better choices. PDF

Image: Sustainability debt model across realms and widening effects

Tweet: "Software projects are full of present-future trade offs, yet SE hasn't learned from behavioral economics @ChriBecker at #dagstuhleas"

After Becker’s talk, we broke for lunch, then went into four breakout sessions for a good part of the afternoon; the four selected by participants from the many that had been proposed were:

  • Academic software project typology
  • Examining sustainability for a particular project
  • Making the intellectual content of software visible
  • Empirical survey of software practices in a domain

After working in our breakout sessions, we came back together to report on our progress. Wow, was there a lot of discussion! Everyone was very engaged in listening to, commenting on, and discussing the reports from the different groups. It was a very exciting afternoon, and discussion continued right up until we were forced to break for dinner.

As previously reported, we had a discussion in the evening as well. Monday was an excellent start to an outstanding week!

June 2016 additions to the ASCL

Fifteen codes were added to the ASCL in June 2016:

COMB: Compact embedded object simulations
Companion-Finder: Planets and binary companions in time series spectra
Cygrid: Cython-powered convolution-based gridding module for Python
FDIPS: Finite Difference Iterative Potential-field Solver
FLASK: Full-sky Lognormal Astro-fields Simulation Kit

HIBAYES: Global 21-cm Bayesian Monte-Carlo Model Fitting
KMDWARFPARAM: Parameters estimator for K and M dwarf stars
Lmfit: Non-Linear Least-Square Minimization and Curve-Fitting for Python
PAL: Positional Astronomy Library
Pulse Portraiture: Pulsar timing

PyMultiNest: Python interface for MultiNest
s2: Object oriented wrapper for functions on the sphere
SimpLens: Interactive gravitational lensing simulator
SWOC: Spectral Wavelength Optimization Code
uvmcmcfit: Parametric models to interferometric data fitter

Engineering Academic Software, Schloss Dagstuhl Day 4

The morning of Day 4 of the Engineering Academic Software workshop opened with the mighty James Howison talking about the outputs expected from our work this week; these include a report of the meeting, the manifesto mentioned in my previous blog post, a draft document offering guidance for tenure committees on evaluating software contributions, a draft workplan for writing a proposal to establish an award for software contributions, a table of contents for a research software engineering handbook, and a sustainability debt use case, these last four from breakout session work.

The two talks this morning might well be billed the Battle of Cool Places to Work. First up was Cecilia Aragon from the University of Washington on eScience Institute Initiatives. This work grew out of the realization that people were drowning in data, leading to the Moore/Sloan Data Science Environment, a $37.8M initiative at UCB, NYU, and UW.

Tweet: ""Big data is two orders of magnitudes larger than you are used to dealing with now" - @craragon"

The eScience Institute has a multi-pronged approach set up around science theme areas with bridges to data science methodologies; this sets up a cycle wherein research needs generate new methodologies, which enable more science. Two new roles were established, Data Science Fellows and Data Scientists. They also set up education and training, including workshops and bootcamps for data science, such as Software Carpentry and Astro Hack Week, and a seminar series. They offer a new MS in data science that is interdisciplinary, involving six departments, and innovative, with a social science component that includes human-centered viewpoints and ethics. This MS program is designed for working professionals, provides a rigorous technical program in statistics and computer science, has evening courses, and allows full or part time attendance.

Tweet: "UW's MSc in Data Science includes training in "software hygiene" and pragmatic parts of Soft Eng"

They have set up guidelines for reproducibility and offer help to people to improve this aspect of their work; eScience Institute data scientists and others participate in a “drop-in” office hours program. They foster working relations with their working space and culture; people sit side-by-side to work on a problem. They see sharing a physical space as essential for data science and growing research software collaborations.

Aragon discussed the integration of ethnography (a qualitative field-based technique originally from anthropology that enables study of underlying patterns and themes) and evaluation into a wide range of the data science environment. Ethnography research tries to answer questions such as Who does data science?, How are they networked?, and What forms of social interaction do they use? Ethnographers at UW work with members of the community to interpret observations and to provide feedback on what works and what doesn’t. She reports they have had a lot of success with “applied ethnography”.

She also discussed their data science incubator program, which was the precursor to the Data Science for Social Good program. They looked for high-impact data-intensive science projects that would benefit from quarter-long sprints of expertise, and had projects outlive the incubator, getting advances in both the science and the software and generating publishable results for both. One project was to try to solve the problem of homelessness in Seattle. It involved bringing data about homelessness into more manageable form and analyzing it to see what worked, conducting analysis to identify predictors of permanent housing, and looking for successful outcomes. Another project, Open Sidewalks, created sidewalk maps for low-mobility citizens to show where the curb cuts are.

Tweet: "Human centered data science" initiative by @craragon impressive in its interdisciplinary diversity. Great stuff to learn from

Aragon discussed the marketing that they do; they talk to a lot of people, and this has helped with engagement. They actively look for ways to build relationships and collaboration.
Image: Factors contributing to collaborative dynamics

As with all the talks, participants in the room were very engaged, asked questions, and discussed various points. Aragon was asked about career paths and the backgrounds of those in the MS data science program; she said there were forty students in the first cohort and that it was a very heterogeneous class, with people from many disciplines. The ethnography work has been discussed in a paper by Tanweer, Fiore, and Aragon. I hope the slides for this talk are released! There was a lot in it that I have not captured here.

Tweet: When someone gives a talk about their organisation and you want to quit your job, leave the country and go join them.

The “organization envy,” as one person in the room put it, continued with Rob van Nieuwpoort’s talk on the Netherlands eScience Center. The Netherlands eScience Center receives 5.4M€/year in permanent funding.
Their responsibilities are demand-driven for all sciences; competition for funding and services is within disciplines, not between disciplines. They fund path-finding projects; this program is similar to UW’s incubator projects program. They also receive in-kind funding for eScience research engineers; these are broadly oriented scientists.

The eScience Center recognized early on that they wanted to give research engineers career paths; they offer three different paths: managerial, technical, and research. Asked by Katz whether research engineers have to have academic appointments, van Nieuwpoort stated that some research engineers do have academic appointments, but not all do. Vinju asked about educational opportunities, to which the answer was that yes, there are educational opportunities, including workshops and other training; this topic came up again a bit later.

They foster a collaborative rather than a competitive environment, with their engineers fully integrated into the scientific work, and domain scientists recognized for their contributions to software development. Research software engineers are coauthors on science papers to which they have contributed, and when software methods are published, domain scientists are recognized with coauthorship.

Returning to education, van Nieuwpoort stated that research software engineers like to learn, so the eScience Center keeps them challenged and learning with courses, hackathons, and sprints, and by switching disciplines and technologies.

Through eStep, an eScience technology platform, the eScience Center serves the 99% of scientists in the Netherlands that aren’t at the eScience Center. eSTEP goals are to prevent fragmentation and duplication; to promote exchange and reuse of best practices; to represent NLeSC’s expertise and knowledge; and to improve the science state of the art with fundamental science research. There are key areas of expertise used in many projects, and projects use a number of methodologies. NLeSC generalizes software for use in eSTEP; they find or develop state of the art and “best of breed” technologies and software matching their expertise areas that can be made generic and overarching and integrate that technology into eSTEP.

Sustainability is important to them, as is preventing duplication and fragmentation; they seek to build software that is worth sustaining and enforce software engineering best practices. They use Software Carpentry and Data Carpentry to educate their partners, maintain a knowledge base, and (be still, my heart!) offer a searchable software repository. And more! Slides (PDF)

Tweet: OH sustainability only matters for things that are worth sustaining :-)

Tweet: lots of cool projects and ideas at esciencecenter.nl :)

Of course there was discussion (and funding envy, too); Kevin Crowston pointed out that permanent funding, as Rob’s eScience center has, solves a lot of issues. After a short break, we worked together (all of us in the same Google document, which was a little wild) on the Manifesto, sometimes tweeting out comments and questions, up until lunch.

Tweet: Has anyone got any stories about researchers who shared their code and got publicly mocked because of low quality?

Tweet: Trying to "look far into future" as @dagstuhl manifestos are meant to do. What will research software be in 10, 20 years?

Tweet: "Just became aware of #CodeMeta “a Rosetta stone of software metadata” http://codemeta.github.io"

After lunch, we went into breakout sessions; these included sessions on future research questions, the research software engineering handbook, and the Force11 software citation suggestions. After working in these breakouts, we reconvened to share and discuss the progress that was made, breaking only for dinner at 6:00 PM because we had to.

A very busy, exciting, interesting, informative, and productive day!

Engineering Academic Software, Schloss Dagstuhl Day 3

The day started with a quick discussion about the afternoon; it is traditional for Schloss Dagstuhl seminars that Wednesday afternoons involve a social activity. It was determined on Tuesday that the activity was to be a hike some distance away from Dagstuhl with dinner after in another town, but several changes to these plans had to be ironed out and announced. After a few minutes spent on that, the morning session got underway and was furiously fast! This was an Open Mic, with participants having signed up while here to give short talks (ten minutes or less).

First up was Daniel Garijo on Software Metadata: Describing “dark software” in Geosciences. By “dark software,” he means that which is often hidden from view. He described the current state of the art for software description in geosciences and demonstrated Ontosoft.org, a semantic registry for scientific software, which currently includes information from several geosciences resources. As Ontosoft is not domain-specific, it has the capacity to expand into other fields as well. This is a very attractive and capable site. It uses a distributed approach to software registries and depends on crowdsourcing for metadata maintenance. The resource organizes software metadata using the OntoSoft ontology along six dimensions: identify software, understand and assess software, execute software, get support for the software, do research with the software, and update the software. Slideshare

Jurgen Vinju was next with Organising a research team around the research software around the research team in software engineering: Motivation, experiences, lessons. He talked about his experiences as the group leader of the SWAT (Software Analysis and Transformation) team at Centrum Wiskunde & Informatica (CWI), the national research institute for math and computer science in the Netherlands. SWAT is all about the source code and supporting programmers to create more efficient, maintainable software. They work to understand and control software complexity to enable more and better tools. He made the point that research teams “prioritise for academic output which is not software.” He showed UseTheSource, a resource developed by CWI with contributions from other institutes and housing open-source projects related to software language engineering and metaprogramming. This allows more efficient programming by automating tasks that are cumbersome or hard, and allows synergies between software engineers, researchers, and industry. PDF
Tweet: A research team s not a software team. We have fewer resources. We need more investment in efficiency.

Dan Katz gave an overview of work done by the Force11 Software Citation Working Group; his presentation was titled Software Citation: Principles, Discussion, and Metadata. He provided rationales for citing software, information on the WSSSPE and Force11 groups involved in developing software citation principles and the process used to develop them, and then the six principles, which focus on the importance of software, the need to credit and attribute the contributions software makes to research and to be able to uniquely identify software in a persistent and specific way, and that citations should enable access to the software and associated information about the software that informs its use. Katz brought up many of the discussions the WSSSPE and Force11 working groups had and their determinations, such as what software to cite, how to uniquely identify software, that peer-review of software is important but not required for citation, and how publishers can help.

Tweet: "Check out force 11 for progress in software citation"

Tweet: "It's more important to cite the software directly rather than a software paper"

Each of the Open Mic sessions generated immediate discussion during the sessions and while the next presenter was setting up, and this session was no exception. When Katz pointed out that a common practice is to publish and cite papers about software (“software papers”), but that the Importance principle of the Force11 Working Group calls for the citation of the software itself, “on the same basis as any other research product”, this was countered with a comment that people should cite software papers if the software authors have requested that method of citation. Katz stated that could be done in addition to citing the software, as one of his slides stated. The presentation concluded with information on the next steps for the Force11 Software Citation Working Group (to finalize the principles, and publish and circulate them for endorsement) and the likelihood of a Software Citation Implementation Group being formed to work with institutions, researchers, publishers, and other interested parties to put the principles into practice. PDF

Tweet: ""Software advisors are elected. It's a role people create when ask you questions" Katie Kuksenok"

The fourth Open Mic talk was by Katerena Kuksenok on Best Practices (by any other name). This interesting talk looked at intersections of the technical, social, and cognitive aspects of software engineering in research, and asked how the available community and skill resources could be leveraged. It brought together various elements raised during the workshop so far, including different roles that had been identified, the need for software engineers to learn from scientists just as we hope researchers learn software engineering practices, and overcoming communications barriers. She referred back to a comment Mike Croucher had made in his talk on Monday, agreeing that software engineers should “do CS/SE with people not at them!” PowerPoint

Tweet: "User resistance: “I don’t want to use version control because I don’t want the world to see my terrible code.”"

Tweet: Mike Croucher "is s/w therapist/coach, helping scientists improve code...carefully; doesn't throw computer science at them!"

After Kuksenok’s talk, I presented Restoring reproducibility: Making scientist software discoverable. This presentation was a quick overview of the ASCL, its history and a few of the changes to our infrastructure, the lessons we learned from looking at what other astro code registries and repositories had done and what we did with those lessons, and some of the impact we have on the community. As with every other session, there was intermittent discussion, questions asked and answered, and conversation on the topic as I headed back to my chair and the next speaker set up. PowerPoint PDF

Tweet: astrophysics source code library since 1999

Robert Haines was up next with A Short* History of Research Software Engineers in the UK (*and probably incomplete). Before there were Research Software Engineers (RSEs), there were RSEs going by other names, such as Post Doc and Research Assistant. These were the people in the lab who could code, and who fell “foul of publish or perish” because they were writing code rather than papers. RSEs might also have been hiding as those working in high performance computing or as a research group admin. He is an example of someone who has always done RSE work, though he was not called an RSE until fairly recently. It was at a Software Sustainability Institute Collaborations Workshop in 2012 that there was a call to arms to recognize the contributions of those who write code rather than papers and are not purely researchers. They decided they needed a name, a union, and a policy campaign. He described the current environment, both the challenges and the positives, and shared that many people want to work in this field. Yes, discussion broke out in this session, too! It was remarkable how engaged everyone at the workshop was, and how often and easily discussion took place. PDF
Tweet: "#dags Robert Haines reports on the coming to life of the job of “Research Software Engineer”, with jobs, a union, etc."

Dan Katz made a very brief presentation and instigated more discussion on career paths when Robert Haines was finished. Then, after a brief coffee break, the morning Open Mic session continued with Ralf Lämmel’s presentation, intriguingly called Making a failing project succeed?!, about the 101Companies project. He called 101Companies a software chrestomathy, from chresto, meaning “useful,” and mathein, meaning “to learn.” He shared other chrestomathies, such as the Hello World Collection and the Evolution of a Haskell programmer. (One of the previous links will lead you to a song about a popular beverage.) 101Companies is a resource for learning more about software, for comparing technologies, and for programming education, and can serve as “a playground for student projects.” He discussed some of the challenges the project is having and some of the ways in which it is succeeding. PDF
Tweet: "101 is a knowledge resource for technological space travel (between all kinds of online spaces)"

The last Open Mic talk of the morning was by Ashish Gehani giving a quick overview of his work on software, including software to make data more manageable, particularly the OCCAM: Object Culling and Concretization for Assurance Maximization project.

The last agenda item for the morning was to discuss the manifesto that is one of the required outputs for this workshop. This discussion was led by James Howison, who shared the link for the Google Doc that was to become the manifesto, and which was discussed and created in tandem (and wild abandon) by many in the room during the time remaining before lunch. The manifesto is our public declaration, our own call to action. Our work is only beginning at Schloss Dagstuhl; we must put what we have discussed here into practice. We shared other manifestos (manifesti!), determined authorship as opt-in (by adding our names to the author list), and talked about but did not determine where this might be published. I found the creation of this document interesting and inspiring, very much in line with the philosophy of “be the change you want to see in the world.”
Tweet: "we discussed the #manifesto as genre in http://dx.doi.org/10.1109/ICSE.2015.179 … section III. http://press.princeton.edu/titles/8066.html … is a great #longread"
Tweet: "I was, uh, one of the authors of the EAS manifesto. The original EAS manifesto. Not the compromised second draft."
Tweet: "According to James Howison software as communication between people should be studied."
After getting a good start on the manifesto, we broke for a longer than usual lunch period, after which some took a long hike with a lakeside stop for a refreshing beverage, and some did other things. I took a much-needed nap and then noodled around for a bit in the music room, a lovely large, long room with wonderful acoustics and a recently-tuned grand piano, two guitars, a cello, and a violin available. (I discovered later in the week that the violin case also holds a kazoo.) Scores for solo and ensemble music are stocked in a room at one end of the music room, the (small) door to which is watched over by cherubs. Most of the Schloss is modern in appearance; this is one of the few rooms that reveals the building’s history. I found plenty of music to amuse myself with, including a collection of Bach preludes and fugues from the WTC apparently edited by Bartók and in what to me was a confusing order, and Beethoven sonatas that at one time I knew how to butcher. Others reported having taken shorter walks than the one that was organized, listening to podcasts, trying out the bicycles available for guests, and also napping.
Photo: view of the music room looking toward the piano from the far end
Photo: small ornate doorway decorated with naked cherubs and a shield with 1743 on it

As you have likely surmised by now, the Twitter hashtag for this event was , and the Twitter feed offers more pictures and information about this workshop.

Engineering Academic Software, Schloss Dagstuhl Day 2

Tuesday started with Jeffrey Carver from the University of Alabama presenting What we have learned about using software engineering practices in scientific software. His team took a multi-pronged approach to studying scientific software, from conducting surveys and workshops to direct interactions and case studies. From the survey work, the team was able to group the problems scientists were having with their own software into four main areas: rework, performance, regression (testing), and forgetting bugs. From this, they could see which software engineering practices might help solve those problems.

Case studies brought numerous lessons to light; they found that the use of higher-level languages was low, performance competes with other goals, and external software use can be seen as risky. Workshops highlighted some of the differences between scientist programmers and software engineers and their domains. Scientist developers often lack formal software engineering expertise but have deep knowledge of their domains and are often the main users of their software. Quality goals are different, too; scientists would rather software not run than return an incorrect result. This project demonstrated that there is a need to eliminate the stigma associated with software engineering and that software engineers need to understand domain constraints and specific problems. PDF

Every presentation sparked lively Q&A and discussion, often throughout the presentation, and this one was no exception.
User stories beat data even for scientists

The next presentation was Engineering yt by Matthew Turk from NCSA at the University of Illinois at Urbana-Champaign. He provided context and information on this well-cited community-developed project, and discussed how the community was built and its adoption of a code of conduct. YTEPs (yt Enhancement Proposals) provide a method to manage suggestions for improving yt. Communication methods within the community are well thought out. The challenges of creating and managing the community sparked a lot of discussion; large software projects can have many things go wrong.

Discussion among the group made Matt’s presentation run long, making it necessary to break for coffee before Matt’s talk was done and then return to it after the break. The group was very engaged throughout the day; fortunately, the schedule accommodated the frequent discussions in every presentation very well.

After Matt’s talk, Caroline Jay (University of Manchester) and Robert Haines (Software Sustainability Institute) presented Software as academic output. They discussed software’s role in research, when it can be a tool that enables research or the actual research itself, and how this is different depending on the discipline and the functionality of the software within the discipline and the role of the person using the software. They made the point that “Software isn’t a separate thing — software could exist without the paper; the paper couldn’t exist without the software.”
Oh, there was much more goodness in this presentation, which was interrupted by lunch, than I have time to report, including The Horror, as it was termed (the steps necessary for someone to replicate the computational work on one of the research projects this presentation covered) and Robert’s work on making the software for this computational work available in Docker. It also touched on the FAIR principles for computational research and academic software, and like the other presentations, generated lots of discussion, including conversations in the group on ethical considerations. PDF

The last formal presentation of the day, before the breakout workgroups, was by Claude Kirchner (INRIA) on the Software Heritage Project. He covered the rationale for this project, which includes the inconsiderate or malicious loss of code and the desire to preserve “our technical and scientific knowledge.” The Software Heritage Project has set out to preserve all the software. Yes, you read that correctly: All the software. Fortunately, a version of the slides for this presentation is online so you can see them for yourself! The site is scheduled to go live next week and I look forward to seeing it.

After Claude’s presentation, we went into breakout sessions.
I joined a breakout session on establishing a standing award for scientific contributions made through software. The other breakout sessions were on creating a research software engineering handbook and on an academic software project typology. All groups reported back before the day’s session ended for dinner. Quite an informative, exciting, and productive day!