WARNING: this post contains sarcasm and some swearing.
(But only where absolutely necessary.)
COBIDAcq, pronounced "Koby-dack," is the Committee on Best Practice in Data Acquisition. It is based on the similarly dodgy acronym COBIDAS: the Committee on Best Practices in Data Analysis and Sharing. I suppose COBPIDAAS sounds like a medical procedure and CBPDAS is unpronounceable, so COBIDAS it is.
Unlike COBIDAS, however, the COBIDAcq doesn't yet exist. Do we need it? The purpose of this post is to wheel out the idea and invite debate on the way we do business.
Why a new committee?
You know the old aphorism, "Act in haste, repent at leisure"? It's not just for US tax reform. A lot of errors in fMRI are made in haste. You may have noticed. Some of these errors are directly under an experimenter's local control, but many are propagated by influential people or arise from limitations in commercial products. Whatever the root causes, unless you think fMRI has already attained methods nirvana, there is ample reason to believe we could do a lot better than the status quo.
The COBIDAS approach is intended to "raise the standards of practice and reporting in neuroimaging using MRI," according to its abstract. So far, though, I see only weak evidence of wholesale adoption of the COBIDAS suggestions for reporting. (Feel free to pick up your latest favorite paper and score it against the COBIDAS report.) Thus, I'm not wholly convinced practice in neuroimaging will benefit as much as intended, except perhaps by helping people disentangle, at some much later stage, what others did and avoid their mistakes. What I'm after is intervention a lot earlier in the process.
Risks and systematic errors in an era of Big Data
A long time ago I wrote a post about taking risks in your experiment only if you can demonstrate to yourself that they are essential to your primary goals. New rarely means improved without some harmful - often unknown - consequences. Rather, what new usually gets you is new failure modes, new bugs, a need to change your approach, and so on. So if you have a really good idea for a neuroscience experiment and you can achieve it with established methods, why insist on using new methods when they may not help but may cause massive damage? That is tantamount to placing your secondary goals - impressing a reviewer with yer fancy kit - ahead of your primary goals. Crazy!
There is a lot of energy presently going into data sharing and statistical power. This is great 'n' all, but what if the vast data sets being cobbled together have systematic flaws, potentially many different systematic flaws? How would you know? There are some QA metrics that attempt to capture the obvious problems - head motion or gradient spiking - but do you trust that these same metrics are going to catch more subtle problems, like an inappropriate parameter setting or a buggy reconstruction pipeline?
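To make the distinction concrete, here is a minimal sketch of the kind of volume-level check I have in mind, assuming a 4D fMRI data set in NIfTI format and the nibabel and numpy libraries. The file name, the z-score threshold and the choice of metrics are illustrative assumptions, not recommendations. Note what a check like this catches - spikes and gross signal jumps - and what it cannot: a wrong parameter setting or a buggy reconstruction would sail straight through.

```python
import nibabel as nib
import numpy as np

def basic_qa(nifti_path, spike_z=3.0):
    """Report the temporal SNR of the global signal and flag volumes whose
    volume-to-volume signal change is an outlier (a crude spike/motion check)."""
    data = nib.load(nifti_path).get_fdata()           # shape (x, y, z, t)
    global_signal = data.reshape(-1, data.shape[-1]).mean(axis=0)

    # Temporal SNR of the global signal: mean over standard deviation.
    tsnr = global_signal.mean() / global_signal.std()

    # Z-score the volume-to-volume differences. Big jumps suggest spikes or
    # gross head motion, but say nothing about subtler acquisition flaws.
    diffs = np.diff(global_signal)
    z = (diffs - diffs.mean()) / diffs.std()
    flagged = np.where(np.abs(z) > spike_z)[0] + 1    # volume following each jump

    return tsnr, flagged

# Hypothetical usage on a single run:
# tsnr, flagged = basic_qa("sub-01_task-rest_bold.nii.gz")
```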
I'd like to propose that we redirect some of this enthusiasm towards improving our methods before massively increasing our sample size to the population of a small market town. Otherwise we are in danger of measuring faster-than-light neutrinos with a busted clock. No amount of repeat measures will tell you your clock is busted. Rather, you need a sanity check and a second clock.
Here are some examples of common problems I see in neuroimaging:
- Taking a tool that worked at one field strength and for one sort of acquisition, and assuming it will work just as well at a different field strength or with a different acquisition, with little or no explicit testing under the new circumstances.
- New and improved hardware, sequences or code that are still in their honeymoon phase, foisted into general use before rigorous testing. Only years later do people find a coding bug, or realize that widely used parameters cause problems that can be avoided with relative ease.
- Following others blindly. Following others is a great idea, as I shall suggest below, but you shouldn't assume they were paying full attention or were sufficiently expert to avoid a problem unless there is documented evidence to refer to. Maybe they got lucky and skirted an issue that you will run into.
And here's my final motivation. It's difficult enough for experienced people to determine when, and how, to use certain methods. Imagine if you were new to neuroimaging this year. Where on earth would you start? You might be easily beguiled by the shiny objects dangled in front of you. More teslas, more channels, shorter repetition times, higher spatial resolution.... If only we could use such simple measures to assess the likelihood of experimental catastrophe.
Ways to improve acquisition performance
I think there are three areas to focus on. Together they should identify, and permit resolution of, all but the most recalcitrant flaws in an acquisition.
1. Is it documented?
Without good documentation, most scientific software and devices are nigh on unusable. Good documentation educates experts as well as the inexperienced. But there's another role to consider: documenting for public consumption is one of the best ways yet devised for developers to catch their own errors. If you don't believe this to be true, you've never given a lecture or written a research paper! So, documentation should help developers catch problems very early, before they ever see the light of day.
While we're talking documentation, black boxes are a bad idea in science. If it's a commercial product and the vendor doesn't tell us how it works, we need to open it up and figure it out. Otherwise we're taking leaps of faith, not doing science.
2. How was it tested at the development stage?
Understandably, when scientists release a new method they want to present a good face to the world. It's their baby after all. When passing your creation to others to use, however, you need to inject a note of realism into your judgment and recognize that there is no perfectly beautiful creation. Test it a bit, see how it falls down. Assess how badly it hurts itself or things around it when it crashes through the metaphorical coffee table. Having done these tests, add a few explanatory notes and some test data to the documentation so that others can see where there might be holes still gaping, and so they can double-check the initial tests.
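As an illustration, here is a minimal sketch of the sort of check that could ship alongside that documentation, assuming the developer bundles a small test input and the reference output they generated during development. The function, file paths and tolerance are hypothetical placeholders, not an existing tool.

```python
import numpy as np

def check_against_reference(method, input_path, reference_path, tolerance=1e-3):
    """Re-run `method` on the bundled test data and compare the result with
    the reference output produced at development time."""
    test_input = np.load(input_path)
    reference = np.load(reference_path)

    result = method(test_input)

    # Fail loudly if the result drifts beyond tolerance, so end users can tell
    # whether their installation reproduces the developer's own tests.
    max_err = np.max(np.abs(result - reference))
    if max_err >= tolerance:
        raise AssertionError(
            f"Max deviation {max_err:.2e} exceeds tolerance {tolerance:.2e}")
    return max_err

# Hypothetical usage, where my_recon is the new method under scrutiny:
# check_against_reference(my_recon, "test_data/phantom_input.npy",
#                         "test_data/reference_output.npy")
```

The mechanism matters less than the principle: the tests the developer ran should travel with the method, so that anyone can re-run them and see where the holes are.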
3. Has it been validated independently and thoroughly?
Today, the standard new methods pipeline can be represented in this highly detailed flow chart:
Developer → End user
Not so much a pipeline as quantum entanglement. This is a reeeeeally bad idea. It makes the end user the beta tester, independent tester and customer all in one. Like, when you're trying to complete a form online and you get one of those shibboleth messages, and you're like "What. The. Fuck! Did nobody bother to test this fucking form with Firefox on a Mac? Whaddayamean this site works best with Internet Explorer? I don't even own a PC, morons! How about you pay a high school student to test your fucking site before releasing it on the world?"
Okay, so when this happens I might not be quite that angry... Errr. Let's move on. After better documentation, independent validation is the single biggest area in which I think we need to see improvement. And no, publishing a study with the new method does not count as validation. Generally, in science you are trying your hardest to get things to work properly, whereas during validation you are looking to see where and how things fail. There is a difference.
What to do now?
This is where you come in. Unless a significant fraction of end users take this stuff seriously, nothing will change. Maybe you're okay with that. If you're not okay with it, and you'd like more refined tools with which to explore the brain, let your suggestions flow. Do we need a committee? If so, should it be run through the OHBM as the COBIDAS has been? Or, can we form a coalition of the willing, a virtual committee that agrees on a basic structure and divides the work over the Internet? We have a lot of online tools at our disposal today.
I envisage some sort of merit badges for methods that have been properly documented, tested and then validated independently. There will necessarily be some subjectivity in determining when to assign a merit badge, but we're after better methods, not perfect methods.
How might COBIDAcq work in practice? I think we would need some sort of formal procedure to initiate a COBIDAcq review. Also, it's harder to review a method without at least partial input from the developer, given that we might expect some of the documentation to come from them. In an ideal world, new method developers would eagerly seek COBIDAcq review, forwarding mountains of documentation and test data to expedite the next phase. Yeah, okay. Unrealistic. In the meantime, maybe we do things with as much democracy as we can muster: select for review the methods that are getting the most play in the literature.
One criticism I can envision runs along the lines of "this will stifle innovation or prevent me from taking on a new method while I wait for you bozos to test it!" Not so. I'll draw a parallel with what the folks behind registered reports did. Not all analyses have to be preregistered. If you don't know a priori what you'll do, you are still free to explore your data for interesting effects. So, if you choose to adopt a method outside of the scope of COBIDAcq, good luck with it! (Please still report your methods according to the COBIDAS report.) Maybe you will inadvertently provide some of the validation that we seek, in addition to discovering something fab about brains!
Nothing about this framework is designed to stop anyone, anywhere from doing precisely what they do now. The point of COBIDAcq is to create peer review of methods as early in their lifetime as possible, and to provide clear signaling that a method has been looked at in earnest by experts. Neuroscientists would then have another way to make decisions when selecting between methods.
Okay, that will do. I think the gist is clear. What say you, fMRI world?