Wednesday, November 14, 2012

Review: Using a bias field map to improve motion correction of EPI time series

In a new paper entitled "Effects of image contrast on functional MRI image registration," Gonzalez-Castillo et al. evaluate the performance of motion correction (a.k.a. registration) following a pre-processing step that aims to remove the contrast imparted across images due to receive (and/or transmit) field heterogeneity. A bias field map is estimated from a target EPI, and this map is then used to normalize the other images in the time series. There are other aims in the paper, too: specifically, to evaluate the performance of image registration (EPI to EPI, or EPI to MP-RAGE anatomical) when the T1 contrast of time series EPIs is altered via the excitation RF flip angle. But in this post I am going to focus on the normalization part because it involves the RF receive field heterogeneity, and this instrumentally-induced contrast is of particular concern for exacerbating motion sensitivity in fMRI (as explained here).

Although others have compared prescan normalization between different array coils (see the references in this paper), this is the first paper I've seen that compares motion correction performance for EPI time series acquired with an array coil (a 16-channel array) to a single channel birdcage coil. Now, this isn't quite the straightforward comparison I might like - with the receive fields being the only difference - because in this instance the birdcage is also used to transmit the excitation RF pulses, making the transmission (Tx) field for the birdcage experiment more spatially heterogeneous than will be produced from the body RF coil that's used when acquiring with a receive-only array coil. Following? In other words, for the 16-channel array the receive (Rx) field heterogeneity is likely to dominate whereas for the birdcage coil the heterogeneities of both the transmit and receive fields are salient. Still, it's worth a look since the coil comparison highlights the issue of the scanner hardware's influence on EPI contrast, and on subsequent motion correction.

The experimental stuff

Images were acquired on a 3 T GE HDx scanner. For the time series EPI measurements the study used a gradient echo EPI sequence with TR/TE of 2000/30 ms, 3.75x3.75x4.00 mm voxels, 33 axial slices and 64 volumes per time series. Scans were acquired using excitation flip angles from 10 to 90 degrees (10 degree steps) in order to assess the effects of T1 contrast on image registration accuracy. Subjects (6 male, 2 female; 24-36 years, mean 29 years) were scanned over two sessions, once using a Tx/Rx birdcage coil and once with a 16-channel Rx-only coil (and the scanner's built-in whole body RF coil for transmission). The entire protocol was repeated for each session (i.e. for each coil) on each subject.

Although it isn't specified in the paper, my assumption is that the authors used the GE default excitation paradigm for EPI: that is, a water- and slice-selective composite RF pulse (known colloquially as "water excite") rather than a separate fat saturation pre-pulse (a.k.a. "fatsat") with a separate slice-selective excitation pulse. (Siemens scanners use fatsat by default.) The use of water excitation rather than fat suppression - if that's indeed what was used - does have some implications for those using fatsat, but I'll get to that point later on.

Prior to further processing, all images in each EPI time series were masked using 3dAutomask in AFNI, to restrict analysis to intra-cranial voxels. For each EPI time series, an intensity bias correction map was derived from the 6th volume in the series (chosen to ensure steady state T1 contrast) using the segmentation function available in SPM8. (No name was given for this function; I assume those familiar with SPM8 will recognize it.) This map was then used to normalize all the other images of the series, a process the authors refer to as "intensity uniformization." Time series images were then registered - or motion corrected if you prefer - to the 6th EPI volume of that series using AFNI's 3dvolreg algorithm, which fits a rigid-body (six degrees of freedom) transformation by minimizing least-squares differences.
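To make the "intensity uniformization" step concrete, here is a minimal sketch of the division itself. It assumes the bias map and brain mask already exist as arrays; it does not attempt to reproduce the SPM8 segmentation that estimates the map, and the function name `uniformize` and the `eps` threshold are my own inventions for illustration.

```python
import numpy as np

def uniformize(series, bias_map, mask, eps=1e-6):
    """Divide every volume of a 4D series (x, y, z, t) by a 3D bias map.

    Hypothetical helper: the bias map is assumed to be given (e.g. exported
    from a segmentation tool); voxels outside the mask are set to zero.
    """
    series = np.asarray(series, dtype=float)
    bias = np.asarray(bias_map, dtype=float)
    mask = np.asarray(mask, dtype=bool)

    # Only divide where we are inside the brain and the bias estimate is usable.
    safe = mask & (bias > eps)

    out = np.zeros_like(series)
    # series[safe] has shape (n_voxels, n_volumes); broadcast the bias per voxel.
    out[safe] = series[safe] / bias[safe][:, None]
    return out
```

Note that the same map is applied to every volume, which is exactly why head movement relative to a fixed receive field is a concern: the map is only strictly valid for the head position of the volume it was derived from.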

A parameter called Mean Voxel Distance (MVD) was used to reduce registration performance to a single value, where MVD (in mm) is defined as the average Euclidean distance between the positions of all intra-cranial voxels arising out of a reference registration and a trial registration. The reference is obtained from registering the 6th EPI to the first EPI in each time series, the first EPI being acquired fully relaxed - what the paper calls the Infinite TR Volume. (Siemens users, see Note 1.) MVD then quantifies the difference between the trial registration - 6th EPI to nth EPI - and the reference - 6th EPI to 1st EPI. Hence:

"The interpretation of the MVD is as follows: the larger the MVD, the larger the inconsistency between the reference alignment (the one considered the gold standard) and the alignment under consideration."
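My reading of that definition can be sketched as follows: take the center of every intra-cranial voxel, push it through both the reference and the trial transformation, and average the distance between the two resulting positions. This is an illustration of the definition as I understand it, not the authors' code. The helper names, the rotation order and the choice to rotate about the array origin are all assumptions on my part; the default voxel size is the one quoted for the EPI acquisition.

```python
import numpy as np

def rigid_matrix(rot_deg=(0.0, 0.0, 0.0), trans_mm=(0.0, 0.0, 0.0)):
    """4x4 rigid-body matrix from rotations (degrees) and translations (mm).

    Assumption: rotations are applied about the array origin in x, y, z order.
    """
    rx, ry, rz = np.deg2rad(rot_deg)
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(rx), -np.sin(rx)],
                   [0, np.sin(rx), np.cos(rx)]])
    Ry = np.array([[np.cos(ry), 0, np.sin(ry)],
                   [0, 1, 0],
                   [-np.sin(ry), 0, np.cos(ry)]])
    Rz = np.array([[np.cos(rz), -np.sin(rz), 0],
                   [np.sin(rz), np.cos(rz), 0],
                   [0, 0, 1]])
    M = np.eye(4)
    M[:3, :3] = Rz @ Ry @ Rx
    M[:3, 3] = trans_mm
    return M

def mean_voxel_distance(mask, ref_xfm, trial_xfm, voxel_size=(3.75, 3.75, 4.0)):
    """Average distance (mm) between where two transforms place each masked voxel."""
    idx = np.argwhere(np.asarray(mask, dtype=bool))   # (n, 3) voxel indices
    xyz = idx * np.asarray(voxel_size, dtype=float)   # voxel centers in mm
    homog = np.c_[xyz, np.ones(len(xyz))]             # homogeneous coordinates
    ref = homog @ ref_xfm.T
    trial = homog @ trial_xfm.T
    return np.linalg.norm(ref[:, :3] - trial[:, :3], axis=1).mean()
```

Because the distance is computed between two continuous transformations of the same voxel centers, the result is not quantized to the voxel grid. Whether sub-voxel differences in MVD are meaningful in practice is a separate question about the accuracy of the registrations themselves.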

As the authors point out, any errors in the reference alignment propagate into the measurement of interest. It's not an ideal quantification in light of this reference dependence, but for the purposes of this review I shall assume that it produced no systematic errors. (The authors include some simulations to characterize the behavior of MVD.) Instead, I want to get to the results so that we can assess any implications arising from this paper.

Bias field normalization improves motion correction

The particular comparisons of interest to me in this post were made between motion correction of time series EPIs with and without the bias field normalization, for both the Tx/Rx birdcage coil and the Rx-only 16-channel array coil.

The use of bias field normalization improved the subsequent registration (motion correction) results, as measured by a reduction in the Mean Voxel Distance, with the improvement being larger (bigger reduction in MVD) for the 16-channel array coil than for the birdcage coil:

Figure 6 from Gonzalez-Castillo et al.

For both RF coils the bias field correction improved the registration accuracy across the range of flip angles, indicating that this normalization process is somewhat robust to the underlying image contrast. In these experiments a lower flip angle corresponded to greater image (T1-based) contrast, showing that as the amount of brain contrast increases, relative to the (fixed) bias field contrasts, the ability of the registration algorithm to correct for head motion improves. Thus, whether brain contrast is increased (with lower flip angle) or the extent of bias field contrast is reduced (via the bias field correction), it all leads to improved motion correction.

It's interesting to note that the array coil produces slightly better registration (that is, lower MVD) than the birdcage coil in the absence of bias field correction. This would suggest that the bias fields arising out of the combination of Tx and Rx heterogeneity for the birdcage coil are greater than the bias field being dominated by the Rx heterogeneity of the array coil; the Tx heterogeneity of the body RF coil would probably be quite small compared to that of the Rx field of the array.

Once the bias field correction is applied, however, the array coil seems to out-perform the birdcage coil by a significant amount. Put another way, the bias field correction appears to be better able to mitigate the receive field bias from the array than it is able to mitigate the transmit and receive field biases of the birdcage, even though the birdcage receive field heterogeneity is much lower than that of the array. But because the bias field is derived from an EPI template that includes Tx as well as Rx effects for each case, there's no easy way to estimate how much of the improvement for the birdcage coil data can be attributed to the Rx field heterogeneity alone. Even so, the improvement is still far greater for the 16-ch data which naturally suggests that the Rx bias field of the 16-channel array has greater heterogeneity than the combination of Tx and Rx field heterogeneities for the birdcage coil. (See Note 2.) One can only wonder what a body coil transmit, receive-only birdcage combination would have yielded.

Still, it is a tad surprising that the 16-channel array performs comparably (as measured by MVD) to the Tx/Rx birdcage when bias field correction isn't used. Could the array coil's strongly heterogeneous receive field be "anchoring" the registration algorithm? Is that why there is so much improvement when the bias field correction is applied to the array coil data?

Limitations and considerations of the current study

The bias field approach used here would include Rx field heterogeneity as well as Tx field heterogeneity, and also has an inherent bias towards signal having longish T2* because regions of signal dropout on gradient echo EPI at 30 ms will not provide information for the correction map, a limitation that may make a difference if subject movement is appreciable. (See Note 3.)

The use of a fully relaxed scan as a reference target, and EPI acquired with water excitation rather than fatsat, means that the quantitative results presented here are likely to differ from a study that used a Siemens scanner with fatsat. Still, my suspicion is that the benefit of using the bias field normalization could remain, but until such a test is actually performed it's all speculation.

Although I didn't report the variable flip angle results except in passing, there was an effect of changing flip angle on the registration efficacy. If fatsat were used instead of water excitation (where I am continuing to assume that water excitation was indeed used in this study) I wouldn't expect the flip angle dependency to be as strong because there is already more contrast within the image with fatsat. This is because fatsat generates some magnetization transfer (MT) contrast, especially in white matter and less so in gray matter, which tends to enhance the contrast between CSF, GM and WM by making the latter even darker. (CSF is brightest, GM intermediate, then WM darkest. See Note 4 in this post for more information on MT contrast.) Some MT contrast is also generated with the water excite scheme but it's less than when using fatsat, a fact that is readily appreciated if one compares typical EPI data from a GE scanner to those of a Siemens scanner: EPIs from a GE scanner generally appear a lot flatter. Still, it will be interesting to see if the flip angle dependency persists once fatsat is used instead of water excitation.

As a receiver coil, the single channel birdcage doesn't have the complicating factor of requiring some sort of element combination because all the signal contributions are summed in analog, leaving one voltage to be detected (in quadrature). The sixteen individual signals obtained from the 16-channel array, on the other hand, must be combined to produce each final image; in this case the method is the standard root-sum-of-squares approach. Other element combination methods are available, often at the click of a button on the scanner, so be careful if you're using something other than root-SOS because image contrast (as modulated by the receive field heterogeneity) could appear slightly different. I wouldn't expect there to be a major departure from the root-SOS, but it often pays to be circumspect when dealing with the complexities of fMRI!
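For anyone who hasn't seen it written down, the root-sum-of-squares combination is simple enough to sketch in a few lines. This assumes the per-element images are stacked along the first axis of an array; the function name is mine, and real scanner reconstructions may add noise weighting or other refinements that I am ignoring here.

```python
import numpy as np

def root_sum_of_squares(coil_images):
    """Combine per-element images (n_coils, ...) into one magnitude image.

    Each voxel is the square root of the summed squared magnitudes across
    coil elements; complex inputs are handled via np.abs.
    """
    coil_images = np.asarray(coil_images)
    return np.sqrt(np.sum(np.abs(coil_images) ** 2, axis=0))
```

Since each element's contribution is weighted by its own local sensitivity, the combined image inherits the spatial heterogeneity of the array, which is the receive field contrast that the bias field correction is trying to remove.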

Should we all be using bias field normalization before motion correction?

Is a bias field correction a useful pre-processing step in fMRI? Based on these results alone I can't say for sure. But I do think that some sort of intermediate correction step could be useful when using an array receiver coil. The perennial "more work is needed" is true here, and that's why I'm reviewing the paper. I think it's a piece of a complex puzzle that we all need to be looking at, if not actively working on.

For example, is a bias field derived from the actual EPI data sufficient, or best, for mitigating receive field contrast interaction with the motion correction algorithm? A major practical benefit of the bias field approach as used in the paper is that it can be obtained from any existing data; no new acquisition step is required. But there are other ways to produce a correction map, e.g. using the "prescan normalization" option that is available on most scanners. The prescan normalization routines I'm familiar with (on a Siemens TIM/Trio) use a standard gradient echo imaging acquisition with lowish resolution (circa 8 mm), which immediately raises further questions: Does the resolution of the prescan need to be better to map the bias field gradients, or should it just match the EPI resolution? And, given the mismatch between the distorted EPI and the undistorted gradient echo image (which uses conventional "spin warp" phase encoding), shouldn't there be a benefit to using an EPI-based prescan instead? (See Note 3 again.)

A major limitation of the bias field approach as suggested in the paper is likely to be the lack of support in regions containing no signal - those regions of signal dropout, for example. Our target template has limitations. Thus, a further benefit of a separate prescan of some sort could be to improve signal coverage, possibly leading to more robust registration. Another concern of the bias field derived from individual EPIs, rather than scans that aim to map the receive (or transmit) field itself, is that the algorithm used to generate the bias field estimate may well interpret some real brain features as parts of the bias field rather than anatomy. It's a fit to an EPI, not a derivative map of a field per se. Thus, in this bias field normalization approach it's possible that some real image contrast will be removed in the normalization step. That could reduce the efficacy of the subsequent realignment by some amount, or it could be so subtle as to be inconsequential for real data.

My final concern is the use of the sixth volume of EPI for the normalization. Here I am going to invoke the general concern that applies to reference scans of any sort: they have limits! Selecting the sixth volume is a perfectly principled thing to do. Is it best? It depends on how the head moves in the time series! For example, if it transpires that the subject moves once near the start of the time series and then stays stationary at the new position for the bulk of it, the prediction would be that a template selected from the end of the run - the very last volume, perhaps - would be a better target than the sixth volume. Conversely, a subject may be more compliant in the early part of a run and become more fidgety later on, making an early target volume a better bet, for fear of getting a motion-contaminated target late on. Of course, there's no way to know ahead of time which target is "best."

Whether a gradient echo or EPI-based normalization map is best (e.g. a separate prescan normalization), whether a single map acquired before (or after) a time series is sufficient and appropriate for correction of an entire time series when the subject is moving, and what other unintended consequences might arise out of this latest correction step, well, that's what we need to figure out. Thus, I'll close with a warning not to take anything that you read here, in the Gonzalez-Castillo paper or elsewhere, as gospel and to test out prescan normalization and/or bias field correction for yourself. There is a tendency, when faced with a problem in fMRI (such as motion and its correction), to feel that something must be done. It then follows that since this is something, it must be done. Not necessarily. Luckily for you, however, you're in the driving seat because whether you use a bias field map derived from the data itself or you opt to acquire a separate prescan normalization, unless you do something silly you aren't committed (Siemens users see Note 4, added post publication) to a particular pipeline at the point of acquisition. You can take the time to evaluate your data as twin streams, with and without the correction du jour. And that, my friends, is an option you don't get very often in this game.



1.  Siemens users should note that EPI acquired with product pulse sequences commences only after a few dummy scans, depending on the TR, and thus the first volume in any EPI time series is not fully relaxed but in an approximate steady state. If you wanted a fully relaxed EPI then you'd need to acquire a separate EPI acquisition with the TR set very long - 15 or more seconds, to allow full T1 relaxation of CSF. A single volume acquisition would suffice.

2.  The sensitivity of a tuned MRI coil when it acts as a transmitter is the same as its sensitivity when used as a receiver, a property that is encapsulated in something called the reciprocity principle. However, there is one complicating factor for our purposes. During transmission, the actual field heterogeneity depends upon the power being deposited into the coil, i.e. the heterogeneity of the transmit field depends on the RF flip angle. Still, the overall heterogeneity of a Tx/Rx coil will be closely related for transmission and reception; it's just the amount of current (driving or induced) that is changing. This is in contrast to the situation when separate coils (with active decoupling between them) are used to transmit and receive RF. In the case of a large body transmit coil, which is typically of birdcage design, and a receive-only head-sized array, the Tx field will appear relatively homogeneous (a few percent) across the head whereas the Rx field could change considerably (tens to hundreds of percent).

3.  Using a short TE, non-EPI scan such as a conventional (spin warp) GRE to generate the bias field estimate might offer improved performance near to regions of dropout on the EPIs. However, that approach suffers from its own lack of support restriction: there will be differences in distortion characteristics. A short TE or spin echo EPI with matched distortion might be the best compromise here. Using an EPI-based prescan brings its own complexities, of course. One would presumably like to use a very short TE or even a spin echo EPI as the prescan image, in order to reduce as far as possible any regions of signal dropout whilst retaining the desired distortion properties. If one uses an EPI with the same TE as used for BOLD imaging then one never has any information on the bias fields residing in the signal voids (or outside of the head for that matter!), and so head movement that takes the time series data into these voids would necessarily be corrected poorly. This isn't a trivial problem!

4.  Under certain circumstances it is possible to acquire twin data streams in the database; one "raw" and one to which prescan normalization has been applied. A restriction concerns the use of online (i.e. on the scanner) motion correction - the Siemens "MoCo" option - as well. In that case the first stream would be prescan normalized but not motion-corrected, the second would be prescan normalized and motion-corrected. There is a little more detail in my user training guide/FAQ under the section entitled "What is the practical difference between the 12-channel and 32-channel head coils? Which one is best for fMRI?" The Prescan Normalize option may be enabled for any receive-only coil.
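As a rough check on the 15-second figure in Note 1, here is a back-of-envelope calculation of longitudinal recovery. It assumes simple mono-exponential recovery after a 90 degree excitation, and it assumes a CSF T1 of about 4 seconds at 3 T, which is an approximate value I am supplying for illustration rather than anything taken from the paper.

```python
import math

def recovered_fraction(tr_s, t1_s):
    """Fraction of equilibrium magnetization recovered after tr_s seconds.

    Assumes mono-exponential T1 recovery from full saturation.
    """
    return 1.0 - math.exp(-tr_s / t1_s)

CSF_T1_S = 4.0  # assumed approximate CSF T1 at 3 T, in seconds

for tr in (2.0, 15.0):
    print(f"TR = {tr:4.1f} s: {recovered_fraction(tr, CSF_T1_S):.3f} of M0 recovered")
```

Under those assumptions a 2 second TR leaves CSF far from fully relaxed, whereas 15 seconds gets it most of the way there.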


  1. Does the definition of MVD make any sense at all? It functionally depends upon the "coordinates in three dimensional space of each intracranial voxel". But a voxel does not move. It is a fixed position finite element of volume within the FOV. So what do the authors mean? Are they referring to a fixed volume element of the brain instead? If so then how do they measure this with error less than half the length of a voxel in the x,y,z directions? And if the error is half a voxel then of what significance are their reported differences in MVD (see figure 6) of 0.05 mm?


  2. The authors gave no description of the normalization method. That is troubling. That is what the methods section of papers is for! Telling us which button you pressed in the SPM GUI is not acceptable ... and I am not even sure they told us that much.
