Category Archives: news

Upcoming events and sessions, Fall-Winter 2016/7

I’ll be representing the ASCL at next month’s WSSSPE4 meeting in Manchester, and in October, the ASCL will be at ADASS XXVI and will hold an Advisory Committee (AC) meeting while there. Peter Teuben, chair of the ASCL AC, will moderate a Birds of a Feather session at ADASS on Implementing Ideas for Improving Software Citation and Credit; this is a follow-up on the discussion at last year’s BoF, Improving Software Citation and Credit, with a goal of taking action on some of the ideas generated there. Watch this space in October for more!

For January’s American Astronomical Society meeting in Texas, the Moore-Sloan Data Science Environment at NYU and the ASCL have organized another Special Session, Perspectives in Research Software. This will follow the format of previous sessions, with presentations in the first half of the session and discussion open to all for the second half. Bruce Berriman from the Infrared Processing and Analysis Center at Caltech will moderate; the presenters include Ivelina Momcheva (Space Telescope Science Institute), Tracy Teal (Data Carpentry), Lior Shamir (Lawrence Technological University), and Michael Hucka (Caltech). I’m rationally exuberant about this session!

July 2016 additions to the ASCL

Twenty codes were added to the ASCL in July 2016:

AGNfitter: SED-fitting code for AGN and galaxies from a MCMC approach
astLib: Tools for research astronomers
Atlas2bgeneral: Two-body resonance calculator
Atlas3bgeneral: Three-body resonance calculator
BLS: Box-fitting Least Squares

BoxRemap: Volume and local structure preserving mapping of periodic boxes
Cholla: 3D GPU-based hydrodynamics code for astrophysical simulation
DICE: Disk Initial Conditions Environment
HfS: Hyperfine Structure fitting tool
HIDE: HI Data Emulator

JUDE: An Ultraviolet Imaging Telescope pipeline
K2PS: K2 Planet search
Kālī: Time series data modeler
LZIFU: IDL emission line fitting pipeline for integral field spectroscopy data
PICsar: Particle in cell pulsar magnetosphere simulator

Planetary3br: Three massive body resonance calculator
RT1D: 1D code for Rayleigh-Taylor instability
SEEK: Signal Extraction and Emission Kartographer
SOPIE: Sequential Off-Pulse Interval Estimation
ZASPE: Zonal Atmospheric Stellar Parameters Estimator

Engineering Academic Software, Schloss Dagstuhl Day 0 and 1

Now that I’ve written extensively about days 2-4, I am cycling back to give day 1 its due. First, though, I will say that the sharing started on Sunday, June 19, as people arrived for both the week-long Engineering Academic Software Perspectives Workshop and the three-day Information-centric Networking and Security Seminar. Cake and coffee are available upon arrival, which gives folks an opportunity to meet, and conversation between participants in both workshops flowed easily.

[Image: Workshop participants on the site of the old schloss]

A group of us decided to walk up the many steps to the old castle ruins on the hill, and it was on this little jaunt that I learned firsthand about stinging nettles (having only read about them before), with Andrei Chiș providing most of the hands-on instruction. He and I experimented with different nettles to see which produced the greatest stinging/welts; oh, the things we do for science! Those who had been unwilling victims of the plants provided more data points, and our quick survey leads us to suspect that the more mature the plant, the greater its “don’t touch!” defenses.

On Monday morning, we started with lightning introductions. We had been asked to create two slides, one on our relevant background and another on our interests for future use; a list of workshop participants and intro slides are available online.

Tweet: “I’m a programmer that somebody made into a professor.” @jurgenvinju

Next came presentations; first up was Dan Katz to talk about WSSSPE (Working towards Sustainable Software for Science: Practice and Experience), pronounced “wispy.” He shared the history of the organization, noting that three large annual meetings had been held, with several smaller interim meetings also having taken place. WSSSPE’s progression was to first identify challenges regarding software and best practices for sustainability, then to discuss solutions and ways to enable change, and then at WSSSPE3, to take action and encourage people to work in groups to put into practice the identified solutions. Katz’s presentation included an overview of each of the WSSSPE working groups and the progress each group has made. Some of the working groups overlapped with efforts taking place elsewhere; the Software Credit Working Group, for example, shared much in common with Force11’s Software Citation Working Group, so the decision was made to work on combining the two groups (which was successful) and for members to work on the Software Citation Principles that were being developed (which was also successful).

Katz also shared lessons learned from WSSSPE3 — what had worked, what could have worked better, and what didn’t work. He outlined what is planned for WSSSPE4 (taking place this September in Manchester), listing two tracks for the event: Building a sustainable future for open-use research software, which will concentrate on defining the future of open-use scientific software and initiating plans to arrive at this future; and Practices and experiences in sustainable software, which will concentrate on improving current practices. Katz concluded his talk by sharing links to the reports for WSSSPE1, WSSSPE2, and WSSSPE3 and the social media sites. PDF

The next presentation, Supporting Research Software Engineering, was by Mike Croucher. His talk focused on his work as an Engineering and Physical Sciences Research Council (EPSRC) Research Software Engineering Fellow; he is one of only seven to be awarded this new fellowship. He helps scientists improve their software in various ways, such as making it faster, more reliable/robust/user friendly, and more sustainable. (Now that I’ve typed that, Harder, Better, Faster, Stronger is playing on the radio in my head.) This has to be done carefully, for as Croucher put it:

Tweet: Users are afraid I’m going to “do computer science” to them. @walkingrandomly treads carefully when repairing user code.

The phrase “do computer science to them” was echoed throughout the rest of the workshop; this idea, an acknowledgement of that fear, seemed to resonate with many.

Croucher shared some of the outreach and education activities he’s been involved with, one of which was a (gentle) self-paced R tutorial held in a café. Volunteer facilitators walked around to answer questions, clarify information, and unstick people who got stuck. The session was a rousing success, so much so that there are now requests and expectations that more will be held!

It was acknowledged at the beginning of the day that academic software faces many challenges; Croucher’s presentation covered some of them, and included this stark slide:

“Software is not valued in academia”

followed by:

Tweet: One of the major reasons I dropped out of my PhD was because I didn't believe academia could properly value software contributions

He also mentioned the lack of funding for software activities, that soft-money researchers are discriminated against in favor of tenure-track and tenured faculty, and other issues. Oooo, he got a great discussion going with these and other points! In the active discussion, Cecilia Aragon made the point that we need to stop calling software “infrastructure,” as software has intellectual content.

Tweet: Comment from @craragon : stop calling software 'infrastructure' because plumbers aren't invited to coauthor science papers

Though there are challenges, it’s not all bad; things are improving. The Software Sustainability Institute is funded by several organizations, the UK’s Engineering and Physical Sciences Research Council (EPSRC) has recognized the importance of software through funding of the Research Software Engineers, and a Horizon 2020 project will provide “substantial funding” for open source maths research software. Croucher’s vision of the future includes core-funded research software engineers and a hope for tenure awarded on the basis of software contributions. He closed his excellent presentation with concrete steps for changing the perception of software engineering and leading a change in culture. Slides

After a coffee break, the last presentation of the morning was given by Christoph Becker on Sustainability design. He covered the challenges of sustainability, and referred to the “sustainability debt” that is largely unknown for most systems. The effects of sustainability (or lack thereof) can be considered from several angles; one way to look at this debt is across economic, technical, social, environmental, and individual aspects, and whether it has an immediate, enabling, or structural effect.

[Image: Sustainability debt model across realms and widening effects]

The concerns about sustainability have inspired the Karlskrona Manifesto for Sustainability Design, which seeks to address sustainability across different aspects and widening effects. The Manifesto identifies eleven “misperceptions and counterpoints”, seeks to correct or mitigate them, and aims to educate and advocate for a constructive approach to enabling a paradigm shift.

Tweet: Software projects are full of present-future trade offs, yet SE hasn't learned from behavioral economics @ChriBecker at #dagstuhleas

Becker is particularly interested in studying how people decide on the trade-offs they make when designing software, and using the insight gleaned to develop and implement methods and tools for making better choices. PDF

After Becker’s talk, we broke for lunch, then went into four breakout sessions for a good part of the afternoon; the four selected by participants from the many that had been proposed were:

  • Academic software project typology
  • Examining sustainability for a particular project
  • Making the intellectual content of software visible
  • Empirical survey of software practices in a domain

After working in our breakout sessions, we came back together to report on our progress. Wow, was there a lot of discussion! Everyone was very engaged in listening to, commenting on, and discussing the reports from the different groups. It was a very exciting afternoon, and discussion continued right up until we were forced to break for dinner.

As previously reported, we had a discussion in the evening as well. Monday was an excellent start to an outstanding week!

June 2016 additions to the ASCL

Fifteen codes were added to the ASCL in June 2016:

COMB: Compact embedded object simulations
Companion-Finder: Planets and binary companions in time series spectra
Cygrid: Cython-powered convolution-based gridding module for Python
FDIPS: Finite Difference Iterative Potential-field Solver
FLASK: Full-sky Lognormal Astro-fields Simulation Kit

HIBAYES: Global 21-cm Bayesian Monte-Carlo Model Fitting
KMDWARFPARAM: Parameters estimator for K and M dwarf stars
Lmfit: Non-Linear Least-Square Minimization and Curve-Fitting for Python
PAL: Positional Astronomy Library
Pulse Portraiture: Pulsar timing

PyMultiNest: Python interface for MultiNest
s2: Object oriented wrapper for functions on the sphere
SimpLens: Interactive gravitational lensing simulator
SWOC: Spectral Wavelength Optimization Code
uvmcmcfit: Parametric models to interferometric data fitter

Engineering Academic Software, Schloss Dagstuhl Day 4

The morning of Day 4 of the Engineering Academic Software workshop opened with the mighty James Howison talking about the outputs expected from our work this week; these include a report of the meeting, the manifesto mentioned in my previous blog post, a draft document offering guidance for tenure committees on evaluating software contributions, a draft workplan for writing a proposal to establish an award for software contributions, a table of contents for a research software engineering handbook, and a sustainability debt use case, these last four from breakout session work.

The two talks on this morning might well be billed the Battle of Cool Places to Work. First up was Cecilia Aragon from the University of Washington on eScience Institute Initiatives. This work grew out of the realization that people were drowning in data, leading to the Moore/Sloan Data Science Environment, a $37.8M initiative at UCB, NYU, and UW.

Tweet: "Big data is two orders of magnitudes larger than you are used to dealing with now" - @craragon

The eScience Institute has a multi-pronged approach set up around science theme areas with bridges to data science methodologies; this sets up a cycle wherein research needs generate new methodologies, which enable more science. Two new roles were established, Data Science Fellows and Data Scientists. They also set up education and training, including a seminar series and workshops and bootcamps for data science, such as Software Carpentry and Astro Hack Week. They offer a new MS in data science that is interdisciplinary, involving six departments, and innovative, with a social science component that includes human-centered viewpoints and ethics. This MS program is designed for working professionals, provides a rigorous technical program in statistics and computer science, has evening courses, and allows full- or part-time attendance.

Tweet: UW's MSc in Data Science includes training in "software hygiene" and pragmatic parts of Soft Eng

They have set up guidelines for reproducibility and offer help to people to improve this aspect of their work; eScience Institute data scientists and others participate in a “drop-in” office hours program. They foster working relations through their working space and culture; people sit side-by-side to work on a problem. They see sharing a physical space as essential for data science and growing research software collaborations.

Aragon discussed the integration of ethnography (a qualitative field-based technique originally from anthropology that enables study of underlying patterns and themes) and evaluation into a wide range of the data science environment. Ethnography research tries to answer questions such as Who does data science?, How are they networked?, and What forms of social interaction do they use? Ethnographers at UW work with members of the community to interpret observations and to provide feedback on what works and what doesn’t. She reports they have had a lot of success with “applied ethnography”.

She also discussed their data science incubator program, which was the precursor to the Data Science for Social Good program. They looked for high-impact data-intensive science projects that would benefit from quarter-long sprints of expertise, and had projects outlive the incubator, getting advances in both the science and the software and generating publishable results for both. One project was to try to solve the problem of homelessness in Seattle. It involved bringing data about homelessness into more manageable form and analyzing it to see what worked, conducting analysis to identify predictors of permanent housing, and looking for successful outcomes. Another project, Open Sidewalks, created sidewalk maps for low-mobility citizens to show where the curb cuts are.

Tweet: "Human centered data science" initiative by @craragon impressive in its interdisciplinary diversity. Great stuff to learn from

Aragon discussed the marketing that they do; they talk to a lot of people, and this has helped with engagement. They actively look for ways to build relationships and collaboration.

[Image: Factors contributing to collaborative dynamics]

As with all the talks, participants in the room were very engaged, asked questions, and discussed various points. Aragon was asked about career paths and the backgrounds of those in the MS data science program; she said there were forty students in the first cohort and that it was a very heterogeneous class, with people from many disciplines. The ethnography work has been discussed in a paper by Tanweer, Fiore, and Aragon. I hope the slides for this talk are released! There was a lot in it that I have not captured here.

“When someone gives a talk about their organisation and you want to quit your job, leave the country and go join them.”

The “organization envy,” as one person in the room put it, continued with Rob van Nieuwpoort’s talk on the Netherlands eScience Center. The Netherlands eScience Center receives 5.4M€/year in permanent funding.

Their responsibilities are demand-driven for all sciences; competition for funding and services is within disciplines, not between disciplines. They fund path-finding projects; this program is similar to UW’s incubator projects program. They also receive in-kind funding for eScience research engineers; these are broadly oriented scientists.

The eScience Center recognized early on that they wanted to give research engineers career paths; they offer three different paths: managerial, technical, and research. Asked by Katz whether research engineers have to have academic appointments, van Nieuwpoort stated that some research engineers do have academic appointments, but not all do. Vinju asked about educational opportunities, to which the answer was that yes, there are educational opportunities, including workshops and other training; this topic came up again a bit later.

They foster a collaborative rather than a competitive environment, with their engineers fully integrated into the scientific work, and domain scientists recognized for their contributions to software development. Research software engineers are coauthors on science papers to which they have contributed, and when software methods are published, domain scientists are recognized with coauthorship.

Returning to education, van Nieuwpoort stated that research software engineers like to learn, so the eScience Center keeps them challenged and learning with courses, hackathons, and sprints, and by switching disciplines and technologies.

Through eStep, an eScience technology platform, the eScience Center serves the 99% of scientists in the Netherlands that aren’t at the eScience Center. The goals of eStep are to prevent fragmentation and duplication; to promote exchange and reuse of best practices; to represent NLeSC’s expertise and knowledge; and to improve the scientific state of the art with fundamental science research. There are key areas of expertise used in many projects, and projects use a number of methodologies. NLeSC generalizes software for use in eStep; they find or develop state-of-the-art and “best of breed” technologies and software matching their expertise areas that can be made generic and overarching, and integrate that technology into eStep.

[Image: Rob action shot]

Sustainability is important to them, as is preventing duplication and fragmentation; they seek to build software that is worth sustaining and enforce software engineering best practices.

Tweet: OH sustainability only matters for things that are worth sustaining :-)

They use Software Carpentry and Data Carpentry to educate their partners, maintain a knowledge base, and (be still, my heart!) offer a searchable software repository. And more! Slides (PDF)

Tweet: lots of cool projects and ideas at esciencecenter.nl :)

Of course there was discussion (and funding envy, too); Kevin Crowston pointed out that permanent funding, as Rob’s eScience Center has, solves a lot of issues. After a short break, we worked together (all of us in the same Google document, which was a little wild) on the Manifesto, sometimes tweeting out comments and questions, up until lunch.

Tweet: Has anyone got any stories about researchers who shared their code and got publicly mocked because of low quality?

Tweet: Trying to "look far into future" as @dagstuhl manifestos are meant to do. What will research software be in 10, 20 years?

Tweet: Just became aware of #CodeMeta “a Rosetta stone of software metadata” http://codemeta.github.io

After lunch, we went into breakout sessions; these included sessions on future research questions, the research software engineering handbook, and the Force11 software citation suggestions. After working in these breakouts, we reconvened to share and discuss the progress that was made, breaking only for dinner at 6:00 PM because we had to.

A very busy, exciting, interesting, informative, and productive day!

Engineering Academic Software at Schloss Dagstuhl

I’m at Schloss Dagstuhl – Leibniz Center for Informatics for a week-long workshop on Engineering Academic Software. Some of the questions we are tackling have been discussed elsewhere, which we are taking into consideration as we talk about them here, and new questions were not only part of the seminar’s original description, but are arising throughout the general and break-out sessions. I would say we’re at the end of the first day but it continues on though it is past 10 PM, with a planned open and vibrant discussion on dogmas past and present. First up for discussion tonight was Agile project management; how do you feel about it? Is this a dogma that needs to be shot or embraced?

The hashtag to follow on Twitter is #dagstuhleas for the full-group discussions; the breakout sessions so far have been too intense for tweeting!

April and May 2016 additions to the ASCL

Twenty-eight codes were added to the ASCL in April and May 2016:

2-DUST: Dust radiative transfer code
ASTRiDE: Automated Streak Detection for Astronomical Images
BACCHUS: Brussels Automatic Stellar Parameter
CAMELOT: Cloud Archive for MEtadata, Library and Online Toolkit
CCSNMultivar: Core-Collapse Supernova Gravitational Waves

cluster-lensing: Tools for calculating properties and weak lensing profiles of galaxy clusters
DISCO: 3-D moving-mesh magnetohydrodynamics package
DNest3: Diffusive Nested Sampling
DUO: Spectra of diatomic molecules
FDPS: Framework for Developing Particle Simulators

grtrans: Polarized general relativistic radiative transfer via ray tracing
Halotools: Galaxy-Halo connection models
K2SC: K2 Systematics Correction
LAMBDAR: Lambda Adaptive Multi-Band Deblending Algorithm in R
libpolycomp: Compression/decompression library

magicaxis: Pretty scientific plotting with minor-tick and log minor-tick support
MARZ: Redshifting Program
MUSCLE: MUltiscale Spherical-ColLapse Evolution
OpenMHD: Godunov-type code for ideal/resistive magnetohydrodynamics (MHD)
PDT: Photometric DeTrending Algorithm Using Machine Learning

SAND: Automated VLBI imaging and analyzing pipeline
Shadowfax: Moving mesh hydrodynamical integration code
Surprise Calculator: Estimating relative entropy and Surprise between samples
The Tractor: Probabilistic astronomical source detection and measurement
TMBIDL: Single dish radio astronomy data reduction package

TRIPPy: Python-based Trailed Source Photometry
TTVFaster: First order eccentricity transit timing variations (TTVs)
zeldovich-PLT: Zel’dovich approximation initial conditions generator

Engineering Academic Software

I’ll be heading to Schloss Dagstuhl in June for a Perspectives Workshop on Engineering Academic Software. Questions the workshop will seek to address include:

  • How is academic software different from other software? What are its most pressing dimensions of quality?
  • Is the software we use and produce in an academic context sustainable? How can we ensure that the software continues to evolve and offer value after serving its initial purpose?
  • How can we adapt software engineering methods for the unique academic context without losing quality?
  • How can we balance domain knowledge and expertise with software engineering knowledge and expertise in an academic research team?
  • Do quality aspects of academic software apply to open data as well? How can well-engineered academic software together with open data make science more reproducible?

I look forward to tackling these and other questions with the other participants, and thank Carole Goble, James Howison, Claude Kirchner, and Oscar M. Nierstrasz for organizing the workshop.

March 2016 additions to the ASCL

Eighteen codes were added to the ASCL in March 2016:

Asfgrid: Asteroseismic parameters for a star
CORBITS: Efficient Geometric Probabilities of Multi-Transiting Exoplanetary Systems
Dedalus: Flexible framework for spectrally solving differential equations
DiskJockey: Protoplanetary disk modeling for dynamical mass derivation
ellc: Light curve model for eclipsing binary stars and transiting exoplanets

EQUIB: Atomic level populations and line emissivities calculator
ExoPriors: Accounting for observational bias of transiting exoplanets
FAST-PT: Convolution integrals in cosmological perturbation theory calculator
fibmeasure: Python/Cython module to find the center of back-illuminated optical fibers in metrology images
gPhoton: Time-tagged GALEX photon events analysis tools

HIIexplorer: Detect and extract integrated spectra of HII regions
PyGSM: Python interface to the Global Sky Model
PolRadTran: Polarized Radiative Transfer Model Distribution
ROBAST: ROOT-based ray-tracing library for cosmic-ray telescopes
SILSS: SPHERE/IRDIS Long-Slit Spectroscopy pipeline

SMARTIES: Spheroids Modelled Accurately with a Robust T-matrix Implementation for Electromagnetic Scattering
tpipe: Searching radio interferometry data for fast, dispersed transients
VIP: Vortex Image Processing pipeline for high-contrast direct imaging of exoplanets

February 2016 additions to the ASCL

Twenty-one codes were added to the ASCL in February 2016:

Automark: Automatic marking of marked Poisson process in astronomical high-dimensional datasets
Celestial: Common astronomical conversion routines and functions
CHIP: Caltech High-res IRS Pipeline
CLOC: Cluster Luminosity Order-Statistic Code
COLAcode: COmoving Lagrangian Acceleration code

DELightcurveSimulation: Light curve simulation code
DUSTYWAVE: Linear waves in gas and dust
FilTER: Filament Trait-Evaluated Reconstruction
GANDALF: Graphical Astrophysics code for N-body Dynamics And Lagrangian Fluids
IRSFRINGE: Interactive tool for fringe removal from Spitzer IRS spectra

k2photometry: Read, reduce and detrend K2 photometry
LensTools: Weak Lensing computing tools
LIRA: LInear Regression in Astronomy
LRGS: Linear Regression by Gibbs Sampling
mbb_emcee: Modified Blackbody MCMC

NuCraft: Oscillation probabilities for atmospheric neutrinos calculator
POPPY: Physical Optics Propagation in PYthon
pyraf-dbsp: Reduction pipeline for the Palomar Double Beam Spectrograph
TailZ: Redshift distributions estimator of photometric samples of galaxies
The Cannon: Data-driven method for determining stellar parameters and abundances from stellar spectra

ZAP: Zurich Atmosphere Purge