The ASCL was well represented at the AAS 231 Hack Together Day on Friday, January 12. Advisory Committee Chairman Peter Teuben worked on two hacks, one of which aims to give reviewers better guidance regarding software; dashboard developer PW Ryan also worked on two hacks, both related to the ASCL and research we’re conducting; and yours truly mostly worked on backlogged ASCL tasks, such as adding preferred citation information to ASCL entries. The ASCL currently lists preferred citation information for 25% of its entries; we will be adding this information to more records in 2018 where we can find it, though I note that many codes do not list a preferred citation on their download sites.
For one of his hacks, Ryan grabbed all the GitHub links in ASCL entries and then, using a Ruby Gem that looks for licenses in GitHub repos, reported on the licensing information available. These results are preliminary, so please don’t take them as gospel, but it appears that a whopping 34% of these codes do not have licensing information in the repo. The most popular license is MIT, which does not surprise me, as Daniel Foreman-Mackey reported in the Special Session we held at AAS 225 that MIT was the most popular license across all GitHub repos that have licensing info.
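For anyone who wants to run a similar tally on their own list of repos, here is a minimal Python sketch of the counting step only. It is not Ryan's code (his hack used a Ruby Gem), and it assumes the license names have already been collected into a list in which `None` or an empty string means no license was detected. The function name `summarize_licenses` and the sample values are made up for illustration and are not ASCL results.

```python
from collections import Counter

def summarize_licenses(licenses):
    """Tally license names; None or "" means no license was detected."""
    counts = Counter(lic if lic else "none detected" for lic in licenses)
    total = len(licenses)
    # Share of repos with no detectable license, as a percentage.
    missing_pct = 100.0 * counts["none detected"] / total if total else 0.0
    return counts, missing_pct

# Made-up sample data, NOT the ASCL results.
sample = ["MIT", "GPL-3.0", None, "MIT", "BSD-3-Clause", None, "MIT", ""]
counts, missing_pct = summarize_licenses(sample)

for name, n in counts.most_common():
    print(f"{name}: {n}")
print(f"{missing_pct:.1f}% of repos have no license info")
```

Keeping "none detected" as its own category means the missing share and the ranking of licenses come out of the same tally.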
It’s come up on Twitter, so here are CSV files listing the codes that do and do not have preferred citation information in their ASCL records.
Codes with citation info
Codes without citation info
The pie below shows what percentage of codes in the ASCL have social coding links in their entries, and the Starlink caveat still holds: all the Starlink codes are in one GitHub repo, so that repo is represented only once in the pie below. These numbers are somewhat low, as some codes offer a webpage/site to which the ASCL links, with that webpage then directing people to a repository. If someone does a better analysis, please send it over; I’d love to include it!
As before, the data are here.
Data play: Social coding sites
More data play: Common domains
Here’s a quick look at the most common domains from the ASCL site links, with the actual counts included in the data labels.
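A count like this can be reproduced from any list of site links. The sketch below uses only the Python standard library and groups links by hostname; it is an illustration rather than the method used for the chart, and the URLs are placeholders. Note that it counts full hostnames, so subdomains of the same institution are tallied separately.

```python
from collections import Counter
from urllib.parse import urlparse

def host_of(url):
    """Return the lowercased hostname of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def domain_counts(urls):
    """Count how many links point at each hostname, skipping unparseable ones."""
    hosts = (host_of(u) for u in urls)
    return Counter(h for h in hosts if h)

# Placeholder links, not real ASCL site links.
links = [
    "https://github.com/example-user/code-a",
    "https://github.com/example-user/code-b",
    "http://www.example.edu/software/code-c",
    "http://example.edu/code-d",
    "https://code.example.org/code-e",
]

for host, n in domain_counts(links).most_common():
    print(host, n)
```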
And this shows the top ten country domains among the entries, also with the actual counts shown:
I’ve posted before about where the codes are; here’s a pie that shows the relative use of GitHub, Google Code, Bitbucket, and SourceForge. Please note that because all the Starlink codes are in one GitHub repo, that repo is represented only once in the pie below. Want to do your own analysis? The site links (1080 of them at the moment, as some codes have more than one) are available here.
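If you do want to try your own analysis, one possible starting point is sketched below in Python. It sorts links by hosting platform and counts each repository once, which mirrors the Starlink caveat above. The host-to-platform table and the rule that the first two path segments identify a repository are assumptions made for this sketch, not a description of how the pie was produced, and the URLs are placeholders.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed mapping from host suffix to platform name.
PLATFORMS = {
    "github.com": "GitHub",
    "code.google.com": "Google Code",
    "bitbucket.org": "Bitbucket",
    "sourceforge.net": "SourceForge",
}

def platform_of(url):
    """Return the platform name for a URL, or None if it is hosted elsewhere."""
    host = urlparse(url).hostname or ""
    for suffix, name in PLATFORMS.items():
        if host == suffix or host.endswith("." + suffix):
            return name
    return None

def repo_key(url):
    """Assumption: host plus the first two path segments identify one repo."""
    parts = urlparse(url)
    segments = [s for s in parts.path.split("/") if s]
    return (parts.hostname or "", tuple(segments[:2]))

def platform_shares(urls):
    """Percentage of distinct repos on each platform, ignoring other hosts."""
    seen, counts = set(), Counter()
    for url in urls:
        name = platform_of(url)
        key = repo_key(url)
        if name is None or key in seen:
            continue
        seen.add(key)
        counts[name] += 1
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()} if total else {}

# Placeholder links; the first two point into the same repository.
links = [
    "https://github.com/example-org/suite/tree/master/app1",
    "https://github.com/example-org/suite/tree/master/app2",
    "https://github.com/example-user/tool",
    "https://bitbucket.org/example-user/sim",
    "https://sourceforge.net/projects/example/",
    "http://www.example.edu/code",
]
shares = platform_shares(links)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```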