Report of the

High Data-Rate Macromolecular Crystallography Meeting

ACA 2018, Toronto, CA 22 July 2018

Report Date: 1 October 2018


This is a report of the informal High Data Rate Macromolecular Crystallography (HDRMX) dinner meeting on 22 July 2018 at Quinns Steakhouse at the ACA meeting in Toronto.

Attendees:

     Herbert J. Bernstein, yayahjb at gmail dot com
     Clemens Vonrhein, vonrhein at globalphasing dot com
     Michel Fodje, Michel.Fodje at lightsource dot ca
     Nick Sauter, nksauter at lbl dot gov
     Gerard Bricogne, gb10 at globalphasing dot com
     Ana Gonzalez, ana.gonzales at maxiv dot lu dot se
     Alexei Soares, soares at bnl dot gov
     Pascal Hofer, pascal.hofer at dectris dot com
     Claudio Klein, claudio.klein at marXperts dot com
     James Gorin, james.gorin at lightsource dot ca
     John Rose, jprose at uga dot edu
     Frances C. Bernstein, fcb at bernstein-plus-sons dot com

Discussion:

After some preliminary discussion of the agenda, the group chose to focus on the issue of heterogeneity and incompleteness of metadata in diffraction images from Eiger detectors, which can prevent processing of datasets at institutions other than the one at which they were created. After considerable and sometimes heated discussion, a unanimous consensus was reached on the following:

Formation of a new e-mail discussion group to unite data processing software developers and the beamline scientists responsible for the content of output files from data collection with MX detectors. The new group was charged with the following:

  1. Make data and metadata reprocessible at places other than where they were collected.
  2. Gather samples of data with metadata on the HDRMX web site as a uniform front end, making use of repositories such as the Integrated Resource for Reproducibility in Macromolecular Crystallography (IRRMC, https://www.proteindiffraction.org/), the SBGrid Data Bank (https://data.sbgrid.org/), and other facilities that provide dataset DOIs as the backends.
  3. Software developers should provide a command line script to process each of the listed datasets, with particular attention to the special cases that arise and with the goal of eliminating the need for special exceptions.
  4. Beamline scientists will be asked to provide sample dataset DOIs for the HDRMX web site that are representative of their beamline output, and to cooperate with software developers on the removal of exceptions.

It was clear from this discussion that there will be a need for further consideration of how best to store raw data, but it was strongly felt that the current storage capabilities are sufficient to allow an immediate start on this project.

Plans for metadata validation using cnxvalidate were also mentioned, as were plans to pursue possible future funding, but it was felt that the above proposal was urgent and had to be the priority for this group at this time.