Data Pre-Processing
Warning
The documentation is under active development. Statistical and machine learning models will be made available once fully validated.
MUSE/RAVENS Pipeline
The MUSE/RAVENS pipeline performs anatomical segmentation and parcellation. The following steps set up the MUSE/RAVENS processing pipeline using a Singularity container.
Setup
Running the scripts inside the container requires Git and Singularity. Follow the installation instructions for Singularity (https://sylabs.io/guides/3.8/admin-guide/installation.html) and Git (https://github.com/git-guides/install-git).
Make sure that the singularity and git commands are available in the terminal, for instance by adding their install directories to the $PATH environment variable.
Additionally, make sure that an environment variable $TMPDIR points to a temporary scratch space that can be used to store intermediate results. Otherwise, it will be set to $PWD (i.e. the current working directory).
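The prerequisites above can be checked with a few lines of shell before cloning anything. This is a minimal sketch: missing_tools is a hypothetical helper name, not part of the pipeline, and the TMPDIR fallback simply mirrors the $PWD default described above.

```shell
# Print the names of the given commands that are not found on $PATH.
# missing_tools is a hypothetical helper used only in this sketch.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space before printing.
    echo "${missing# }"
}

# Fall back to the current working directory when TMPDIR is unset.
export TMPDIR="${TMPDIR:-$PWD}"

not_found="$(missing_tools singularity git)"
if [ -n "$not_found" ]; then
    echo "Missing from PATH: $not_found" >&2
else
    echo "singularity and git found; scratch space: $TMPDIR"
fi
```

The check only reports problems and does not exit, so it can be pasted into an interactive session.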
Complete example
Clone the istaging git repository
GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/CBICA/NiBAx/
cd NiBAx/Image_Processing/sMRI
git lfs pull --include example
git lfs pull --include sMRI_ProcessingPipeline
Download the container image from the Singularity cloud and save the .sif file in the Container/ folder.
cd Container
singularity pull library://jimitdoshi/cbica/cbica-muse-pipeline:1.0.0
cd ..
Follow the example provided in the example/ directory
bash example/run_example.sh
This step creates a Protocols directory inside example, which contains all the result files:
# Re-orientation to LPS
example/Protocols/ReOrientedLPS/SUB-01/SUB-01_T1_LPS.nii.gz
# Intensity inhomogeneity correction
example/Protocols/BiasCorrected/SUB-01/SUB-01_T1_LPS_N4.nii.gz
# Brain extraction
example/Protocols/Skull-Stripped/SUB-01/SUB-01_T1_LPS_N4_brainmask_muse-ss.nii.gz
example/Protocols/Skull-Stripped/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss.nii.gz
example/Protocols/Skull-Stripped/SUB-01/SUB-01_T1_LPS_N4_ROI_1_SimRank.nii.gz
# Another round of inhomogeneity correction using fast
example/Protocols/fastbc/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_seg.nii.gz
example/Protocols/fastbc/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc.nii.gz
# MUSE ROI labeling
example/Protocols/MUSE/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse.nii.gz
example/Protocols/MUSE/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_DerivedVolumes.csv
# Tissue segmentation using MUSE ROIs
example/Protocols/Segmented/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg.nii.gz
# RAVENS
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_rTemplate_ants-0.5_JacDet.nii.gz
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_rTemplate_ants-0.5.nii.gz
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_10.nii.gz
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_50.nii.gz
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_150.nii.gz
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_250.nii.gz
# Post-processed RAVENS
### Smoothed by 2mm
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_10_s2.nii.gz
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_50_s2.nii.gz
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_150_s2.nii.gz
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_250_s2.nii.gz
### Downsampled to 2mmx2mmx2mm
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_10_s2_DS.nii.gz
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_50_s2_DS.nii.gz
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_150_s2_DS.nii.gz
example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_250_s2_DS.nii.gz
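After the example run, it can be useful to confirm that the main result files exist. The sketch below checks a subset of the paths listed above (one file per stage plus the four downsampled RAVENS maps). The function names expected_outputs and count_missing are hypothetical, and the file list is copied from this page rather than generated by the pipeline.

```shell
# Print a subset of the expected result files, relative to Protocols/.
# expected_outputs and count_missing are hypothetical helper names.
expected_outputs() {
    sub="$1"
    pre="${sub}_T1_LPS_N4_brain_muse-ss_fastbc"
    echo "ReOrientedLPS/${sub}/${sub}_T1_LPS.nii.gz"
    echo "BiasCorrected/${sub}/${sub}_T1_LPS_N4.nii.gz"
    echo "Skull-Stripped/${sub}/${sub}_T1_LPS_N4_brain_muse-ss.nii.gz"
    echo "MUSE/${sub}/${pre}_muse.nii.gz"
    echo "MUSE/${sub}/${pre}_muse_DerivedVolumes.csv"
    echo "Segmented/${sub}/${pre}_muse_seg.nii.gz"
    for label in 10 50 150 250; do
        echo "RAVENS/${sub}/${pre}_muse_seg_ants-0.5_RAVENS_${label}_s2_DS.nii.gz"
    done
}

# Print how many of the expected files are missing under the given root;
# each missing path is reported on stderr.
count_missing() {
    root="$1"
    n=0
    for rel in $(expected_outputs "$2"); do
        if [ ! -e "${root}/${rel}" ]; then
            echo "missing: ${root}/${rel}" >&2
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Usage, from the sMRI directory after running the example:
#   count_missing example/Protocols SUB-01
```

A result of 0 means every file in the checked subset is present.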
fMRI Processing
This pipeline pre-processes fMRI time series using an incrementally modified version of the [UK_biobank_pipeline](https://git.fmrib.ox.ac.uk/falmagro/UK_biobank_pipeline_v_1). The pipeline removes structured artifacts using ICA+FIX [2], resamples the filtered functional data to standard space, and applies GIGICA [3] to the functional data to extract features. Higher-level functionality includes:
Generating filtered functional data and resampling it to standard space (MNI152_2mm).
Getting subject-specific IC time courses using GIGICA.
Getting correlation matrices at two dimensionalities: 25 (21 useful components) and 100 (55 useful components).
Packages required
UKBiobank pipeline (https://github.com/CBICA/UK_biobank_pipeline_v_1.git)
GIGICA - Group Information Guided ICA (https://www.nitrc.org/projects/gig-ica/)
FSL and AFNI
Outputs
Dimensions: n = 25 or 100 components (group ICs from UKBiobank)
The lists of good components (only the useful components are extracted and saved) are at:
n25 : https://www.fmrib.ox.ac.uk/ukbiobank/group_means/rfMRI_GoodComponents_d25_v1.txt
n100 : https://www.fmrib.ox.ac.uk/ukbiobank/group_means/rfMRI_GoodComponents_d100_v1.txt
The components, including the bad nodes, can be viewed at:
https://www.fmrib.ox.ac.uk/ukbiobank/group_means/rfMRI_ICA_d25.html
https://www.fmrib.ox.ac.uk/ukbiobank/group_means/rfMRI_ICA_d100.html
For the partial and full correlations, the n*(n-1)/2 elements of the upper triangle are vectorized and saved, where n is the number of useful components.
i) Nodal amplitudes (21 useful of 25, and 55 useful of 100)
ii) Partial correlation matrix (upper triangle vectorized and saved: 210 or 1485 elements)
iii) Full correlation matrix (upper triangle vectorized and saved: 210 or 1485 elements)
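The element counts above follow directly from the n*(n-1)/2 formula applied to the number of useful components. The short sketch below reproduces them. The helper names useful_nodes and n_pairs are hypothetical, and the 21 and 55 values are taken from this page.

```shell
# Map a group-ICA dimensionality to its number of useful components,
# using the values quoted above. Both helper names are hypothetical.
useful_nodes() {
    case "$1" in
        25)  echo 21 ;;
        100) echo 55 ;;
        *)   return 1 ;;
    esac
}

# Number of upper-triangle (off-diagonal) elements for n nodes: n*(n-1)/2.
n_pairs() {
    echo $(( $1 * ($1 - 1) / 2 ))
}

for d in 25 100; do
    n="$(useful_nodes "$d")"
    echo "d${d}: ${n} nodes, $(n_pairs "$n") correlation elements"
done
# prints:
#   d25: 21 nodes, 210 correlation elements
#   d100: 55 nodes, 1485 correlation elements
```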
Download and Setup
UKB_Pipeline
GIT_LFS_SKIP_SMUDGE=1 git clone https://git.upd.unibe.ch/p400pm_191026/istaging_data_consolidation.git IDC_TEMP
cd IDC_TEMP/Image_Processing/fMRI
git lfs pull --include GIGICAR.tar.gz
git submodule update --init UK_biobank_pipeline_v_1
Note: The command git submodule update --init works with git 2.23.0 and above. Otherwise, use git init and then git submodule add UK_biobank_pipeline_v_1.
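To find out which of the two routes in the note applies, the installed git version can be compared against 2.23.0. This is a sketch under two assumptions: sort supports the -V (version sort) option, as GNU coreutils does, and git --version prints the version as its third field. version_ge is a hypothetical helper name.

```shell
# Succeed when dotted version $1 is greater than or equal to $2.
# Assumes a sort with -V (version sort), e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v git >/dev/null 2>&1; then
    # "git --version" prints e.g. "git version 2.39.2"; take the third field.
    git_version="$(git --version | awk '{print $3}')"
    if version_ge "$git_version" 2.23.0; then
        echo "git $git_version: git submodule update --init should work"
    else
        echo "git $git_version is older than 2.23.0; use the fallback in the note" >&2
    fi
fi
```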
GIGICA
tar xvfz GIGICAR.tar.gz
rm -rf GIGICAR.tar.gz
FSLNets
FSLNets is used for calculating networks (dependencies: MATLAB and L1precision). For more information: https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FSLNets
cd IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1
wget http://www.fmrib.ox.ac.uk/~steve/ftp/fslnets.tar.gz
tar xvfz fslnets.tar.gz
rm -rf fslnets.tar.gz
L1precision
L1precision estimates L1-norm regularized partial correlations. The correlations are not regularized or normalized for now, but it is good to have this on the path.
cd IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/FSLNets
wget http://www.cs.ubc.ca/~schmidtm/Software/L1precision.zip
unzip L1precision.zip
rm -rf L1precision.zip
With the setup complete, navigate to ~/IDC_TEMP/Image_Processing/fMRI/scripts to run the pipeline.
Input Data
Step 1, which runs both the structural and functional pipelines and generates the T1 brain mask for registration, expects the data in a partial BIDS format. For each subject, the folder structure should contain ${sub}/fMRI_nosmooth/rfMRI.nii.gz and ${sub}/T1/T1.nii.gz, which the following command creates:
sh convert_to_BIDS.sh -f ${path_to_resting_data} -s ${path_to_t1} -d ${destination} -smooth 0 # or 1
This creates the corresponding directory, copies files, and reorients images to LAS.
Submitting cluster job
This is an example of how to get the pipeline up and running locally. It assumes that all wrapper scripts, the UKBiobank pipeline, and GIGICA are properly cloned:
Step 1 Filter functional data in MNI152_2mm template space
Result to look for: ${dest}/${sub}/fMRI_nosmooth/rfMRI.ica/reg_standard/filtered_func_data_clean.nii.gz
Example command for preprocessing the data:
jid=$(qsub \
-terse \
-j y \
-l h_vmem=12G \
-o ${dest}/${sub}/sge/\$JOB_NAME-\$JOB_ID.log \
${path_to_script}/ukbb_fix.sh \
-s ${sub} \
-i ${inpath} \
-tr ${TR} \
-te ${TE} \
-fwhm 100 \
-p ${UKBB_Pipeline_Dir} \
-smooth 0 );
where
sub - subject ID
inpath - path to the input directory where the subject directory exists (output will be saved in ${inpath}/${sub})
TR - repetition time (sec)
TE - echo time (ms)
FWHM - smoothing parameter, full width at half maximum
p - location of the UKBB pipeline directory
smooth - 0 or 1: 0 - no smoothing (uses WHII training data for FIX denoising); 1 - smoothing (uses Standard data for FIX denoising)
Step 2
Running GIGICA on filtered functional data separately for 25 and 100 components.
- Result to look for: ${dest}/${sub}/${sub}_gigica.mat, which holds the subject-specific time courses and ICs.
- gigica.mat - ic : nVoxels x nComponents
tc : nTimecourses x nComponents
Other results: ${sub}_timecourses.nii.gz and ${sub}_componets.nii.gz
Example command for obtaining gigica matrix for an individual:
jid=$(qsub \
-terse \
-b y \
-j y \
-l h_vmem=10G \
-o ${dest}/${sub}/sge/\$JOB_NAME-\$JOB_ID.log \
${path_to_script}/run_GIGICA.sh \
-in ${filtered_img} \
-ref ${ref_ics} \
-mask ${mask_img} \
-dest ${out_base} \
-p ${gigica_dir} \
-a 0.5 );
where
in - full path to the filtered functional data registered to standard space from the previous step (4D file)
ref - absolute path to the reference group ICs (4D file)
mask - absolute path to the MNI152_2mm binarized mask
dest - output directory along with base name
p - path to the GIGICA scripts directory
a - similarity parameter, 0.5 by default (optional)
Step 3 Extract features from GIGICA matrix
Run separately for 25 and 100 components.
Result to look for, within ${dest}/${sub}/rfMRI_d100/:
${sub}_NodeAmplitudes_v1.txt - nodal amplitudes of size n=21 or n=55
${sub}_partialcorr_v1.txt - partial correlations of size 21x20/2 = 210 elements or 55x54/2 = 1485 elements
${sub}_fullcorr_v1.txt - full correlations of size 21x20/2 = 210 elements or 55x54/2 = 1485 elements
Example command for obtaining final features for an individual:
jid=$(qsub \
-terse \
-b y \
-j y \
-l h_vmem=8G \
-o ${dest}/${sub}/sge/\$JOB_NAME-\$JOB_ID.log \
${script}/processing_gigica.sh \
-s ${sub} \
-tr ${TR} \
-iDir ${protoDir}/GIGICA/gigica_d100 \
-nets ${FSLNets} \
-p ${ukb} \
-n ${nc} \
-gDir ${ukb}/templates/group/ \
-tp ${ntp} \
-o ${dest})
where
s - subject ID
tr - repetition time (sec)
iDir - path to the GIGICA input result directory
nets - path to FSLNets; this must be within UK_biobank_pipeline_v_1 when the repository is cloned
p - path to the UKBiobank scripts directory
n - number of components (25 or 100)
gDir - path to the group directory where the templates for melodic_IC_d25 and melodic_IC_d100 exist
tp - number of timepoints
o - destination directory for saving the final results
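Each of the three steps consumes the output of the previous one, and every example captures the job ID with qsub -terse. One way to use those IDs is to pass them to the SGE option -hold_jid so that a job waits for its predecessor. The sketch below illustrates the idea and is not part of the pipeline scripts: submit and DRY_RUN are hypothetical names, and the echo commands are placeholders for the full Step 1 to Step 3 calls shown above.

```shell
# Submit through qsub, or only print the command when DRY_RUN=1.
# submit and DRY_RUN are hypothetical names for this sketch; the default
# of 1 keeps the example safe to run on a machine without SGE.
DRY_RUN="${DRY_RUN:-1}"

submit() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "qsub -terse $*" >&2   # show what would be submitted
        echo "dryrun"               # placeholder job ID
    else
        qsub -terse "$@"            # -terse prints only the job ID
    fi
}

# Placeholder commands; substitute the full calls from Steps 1 to 3.
jid_fix=$(submit -b y -j y -l h_vmem=12G echo "step 1: ukbb_fix.sh")
jid_gigica=$(submit -b y -j y -l h_vmem=10G -hold_jid "$jid_fix" echo "step 2: run_GIGICA.sh")
jid_feat=$(submit -b y -j y -l h_vmem=8G -hold_jid "$jid_gigica" echo "step 3: processing_gigica.sh")
echo "last job: $jid_feat"
```

Setting DRY_RUN=0 on a cluster submits the three placeholder jobs for real, each held until the previous one finishes.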
Working Example
i) Copy data from project:
mkdir -pv ${HOME}/Data/
cp -r /cbica/projects/BLSA/Pipelines/rsfMRI/rsfMRI_2020/Data/Nifti/BLSA_7996_06-0_10/ ${HOME}/Data/
ii) Set all environment variables and paths in settings.sh within the scripts directory.
iii) Create the destination directory and run the conversion:
mkdir ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/sge -pv
sh convert_to_BIDS.sh \
-f ${HOME}/Data/BLSA_7996_06-0_10/BLSA_7996_06-0_10_REST.nii.gz \
-s ${HOME}/Data/BLSA_7996_06-0_10/BLSA_7996_06-0_10_T1.nii.gz \
-d ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10 \
-smooth 0
Expected output:
${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/fMRI_nosmooth/rfMRI.nii.gz
${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/T1/T1.nii.gz
To submit this script to the cluster:
jid=$(qsub \
-terse \
-b y \
-j y \
-l short \
-o ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/sge/\$JOB_NAME-\$JOB_ID.log \
$HOME/IDC_TEMP/Image_Processing/fMRI/scripts/convert_to_BIDS.sh \
-f ${HOME}/Data/BLSA_7996_06-0_10/BLSA_7996_06-0_10_REST.nii.gz \
-s ${HOME}/Data/BLSA_7996_06-0_10/BLSA_7996_06-0_10_T1.nii.gz \
-d ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10 \
-smooth 0)
iv) Check the orientation and verify that it is LAS:
fslhd ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/fMRI_nosmooth/rfMRI.nii.gz
fslhd ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/T1/T1.nii.gz
Expected output:
qform_xorient Right-to-Left
qform_yorient Posterior-to-Anterior
qform_zorient Inferior-to-Superior
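The orientation check can be automated by scanning the fslhd output for the three expected qform lines. The sketch below assumes the header is printed as whitespace-separated name and value pairs, as in the expected output above. is_las is a hypothetical helper name.

```shell
# Succeed when the fslhd header text on stdin reports the three qform
# orientations listed above. is_las is a hypothetical helper name.
is_las() {
    awk '
        $1 == "qform_xorient" && $2 == "Right-to-Left"         { x = 1 }
        $1 == "qform_yorient" && $2 == "Posterior-to-Anterior" { y = 1 }
        $1 == "qform_zorient" && $2 == "Inferior-to-Superior"  { z = 1 }
        END { exit !(x && y && z) }
    '
}

# Usage (requires FSL):
#   fslhd ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/T1/T1.nii.gz | is_las \
#       && echo "LAS" || echo "not LAS"
```

Only the qform fields are inspected here; the sform fields would need a separate check if they matter for a given dataset.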
v) Next, run the fMRI pipeline:
sh ukbb_fix.sh \
-s BLSA_7996_06-0_10 \
-tr 2 \
-te 25 \
-fwhm 100 \
-p ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/ \
-i ${HOME}/Out/UKB_Pipeline/ \
-smooth 0
Expected: ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/fMRI_nosmooth/rfMRI.ica/reg_standard/filtered_func_data_clean.nii.gz (in standard space)
To submit this script to the cluster:
jid=$(qsub \
-terse \
-b y \
-j y \
-l h_vmem=12G \
-o ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/sge/\$JOB_NAME-\$JOB_ID.log \
$HOME/IDC_TEMP/Image_Processing/fMRI/scripts/ukbb_fix.sh \
-s BLSA_7996_06-0_10 \
-tr 2 \
-te 25 \
-fwhm 100 \
-p ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/ \
-i ${HOME}/Out/UKB_Pipeline/ \
-smooth 0)
For GIGICA,
mkdir ${HOME}/Out/GIGICA/gigica_d100/BLSA_7996_06-0_10/sge -pv
sh run_GIGICA.sh \
-in ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/fMRI_nosmooth/rfMRI.ica/reg_standard/filtered_func_data_clean.nii.gz \
-ref ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/templates/group/melodic_IC_100.nii.gz \
-mask ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/templates/MNI152_T1_2mm_brain_mask_bin.nii.gz \
-dest ${HOME}/Out/GIGICA/gigica_d100/BLSA_7996_06-0_10/BLSA_7996_06-0_10 \
-p ${HOME}/IDC_TEMP/Image_Processing/fMRI/GIGICAR/ \
-a 0.5
Prerequisite: this script takes filtered_func_data_clean in standard space as input, which is the output of the previous step.
Expected output: ${HOME}/Out/GIGICA/gigica_d100/BLSA_7996_06-0_10/BLSA_7996_06-0_10_gigica.mat
The above script runs in MATLAB and exceeds the interactive CPU/run limit. It may also use many CPUs. To avoid this, submit it as a batch job as shown below.
jid=$(qsub \
-terse \
-b y \
-j y \
-l h_vmem=10G \
-o ${HOME}/Out/GIGICA/gigica_d100/BLSA_7996_06-0_10/sge/\$JOB_NAME-\$JOB_ID.log \
$HOME/IDC_TEMP/Image_Processing/fMRI/scripts/run_GIGICA.sh \
-in ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/fMRI_nosmooth/rfMRI.ica/reg_standard/filtered_func_data_clean.nii.gz \
-ref ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/templates/group/melodic_IC_100.nii.gz \
-mask ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/templates/MNI152_T1_2mm_brain_mask_bin.nii.gz \
-dest ${HOME}/Out/GIGICA/gigica_d100/BLSA_7996_06-0_10/BLSA_7996_06-0_10 \
-p ${HOME}/IDC_TEMP/Image_Processing/fMRI/GIGICAR/ \
-a 0.5 )
For feature extraction, the expected output files are:
${HOME}/Out/Features/BLSA_7996_06-0_10/rfMRI_d100/BLSA_7996_06-0_10_NodeAmplitudes_v1.txt
${HOME}/Out/Features/BLSA_7996_06-0_10/rfMRI_d100/BLSA_7996_06-0_10_partialcorr_v1.txt
${HOME}/Out/Features/BLSA_7996_06-0_10/rfMRI_d100/BLSA_7996_06-0_10_fullcorr_v1.txt
To submit this script to the cluster:
jid=$(qsub \
-terse \
-b y \
-j y \
-l h_vmem=8G \
-o ${HOME}/Out/Features/BLSA_7996_06-0_10/sge/\$JOB_NAME-\$JOB_ID.log \
$HOME/IDC_TEMP/Image_Processing/fMRI/scripts/processing_gigica.sh \
-s BLSA_7996_06-0_10 \
-tr 2 \
-iDir ${HOME}/Out/GIGICA/gigica_d100/ \
-nets ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/FSLNets \
-p ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/ \
-n 100 \
-gDir ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/templates/group/ \
-tp 180 \
-o ${HOME}/Out/Features/ )
References
[1] Miller KL, Alfaro-Almagro F, Bangerter NK, Thomas DL, Yacoub E, Xu J, Bartsch AJ, Jbabdi S, Sotiropoulos SN, Andersson JL, Griffanti L, Douaud G, Okell TW, Weale P, Dragonu I, Garratt S, Hudson S, Collins R, Jenkinson M, Matthews PM, Smith SM. Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nat Neurosci. 2016 Nov;19(11):1523-1536. doi: 10.1038/nn.4393. Epub 2016 Sep 19. PMID: 27643430; PMCID: PMC5086094.
[2] L. Griffanti, G. Salimi-Khorshidi, C.F. Beckmann, E.J. Auerbach, G. Douaud, C.E. Sexton, E. Zsoldos, K. Ebmeier, N. Filippini, C.E. Mackay, S. Moeller, J.G. Xu, E. Yacoub, G. Baselli, K. Ugurbil, K.L. Miller, and S.M. Smith. ICA-based artefact removal and accelerated fMRI acquisition for improved resting state network imaging. NeuroImage, 95:232-47, 2014.
[3] Du Y, Fan Y. Group information guided ICA for fMRI data analysis. Neuroimage. 2013 Apr 1;69:157-97. doi: 10.1016/j.neuroimage.2012.11.008. Epub 2012 Nov 27. PMID: 23194820.