Data Pre-Processing

Advertencia

The documentation is under active development. Statistical and machine learning models will be made available once fully validated.

## MUSE/RAVENS Pipeline The MUSE/RAVENS pipeline performs the anatomical segmentation and parcellation. Following these steps will setup the MUSE/RAVENS processing pipeline using a singularity container.

### Setup Running the scripts inside the container requires Git and Singularity. Follow directions of the respective tools to install.

Singularity (https://sylabs.io/guides/3.8/admin-guide/installation.html) Git (https://github.com/git-guides/install-git)

Make sure that singularity and git are available in the terminal, for instance by adding them to the $PATH environment variable.

Additionally, make sure that an environment variable $TMPDIR points to a temporary scratch space that can be used to store intermediate results. Otherwise, it will be set to $PWD (i.e. the current working directory).

### Complete example

1. Clone the istaging git repository ` GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/CBICA/NiBAx/ cd NiBAx/Image_Processing/sMRI git lfs pull --include example git lfs pull --include sMRI_ProcessingPipeline `

2. Download from the singularity cloud and save the .sif file in the Container/ folder. ` cd Container singularity pull library://jimitdoshi/cbica/cbica-muse-pipeline:1.0.0 cd .. `

3. Follow the example provided in the example/ directory ` bash example/run_example.sh `

This step will create a Protocols directory inside example which contains all the results files.
``` # Re-orientation to LPS example/Protocols/ReOrientedLPS/SUB-01/SUB-01_T1_LPS.nii.gz

# Intensity inhomogeneity correction example/Protocols/BiasCorrected/SUB-01/SUB-01_T1_LPS_N4.nii.gz

# Brain extraction example/Protocols/Skull-Stripped/SUB-01/SUB-01_T1_LPS_N4_brainmask_muse-ss.nii.gz example/Protocols/Skull-Stripped/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss.nii.gz example/Protocols/Skull-Stripped/SUB-01/SUB-01_T1_LPS_N4_ROI_1_SimRank.nii.gz

# Another round of inhomogeneity correction using fast example/Protocols/fastbc/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_seg.nii.gz example/Protocols/fastbc/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc.nii.gz

# MUSE ROI labeling example/Protocols/MUSE/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse.nii.gz example/Protocols/MUSE/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_DerivedVolumes.csv

# Tissue segmentation using MUSE ROIs example/Protocols/Segmented/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg.nii.gz

# RAVENS example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_rTemplate_ants-0.5_JacDet.nii.gz example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_rTemplate_ants-0.5.nii.gz example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_10.nii.gz example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_50.nii.gz example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_150.nii.gz example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_250.nii.gz

# Post-processed RAVENS ### Smoothed by 2mm example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_10_s2.nii.gz example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_50_s2.nii.gz example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_150_s2.nii.gz example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_250_s2.nii.gz ### Downsampled to 2mmx2mmx2mm example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_10_s2_DS.nii.gz example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_50_s2_DS.nii.gz example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_150_s2_DS.nii.gz example/Protocols/RAVENS/SUB-01/SUB-01_T1_LPS_N4_brain_muse-ss_fastbc_muse_seg_ants-0.5_RAVENS_250_s2_DS.nii.gz ```

## fMRI Processing This pipeline is for pre-processing fMRI time-series using an incrementally modified version of the [UK_biobank_pipeline](https://git.fmrib.ox.ac.uk/falmagro/UK_biobank_pipeline_v_1). The pipeline removes structured artifacts using ICA+FIX [[2]](#2), resamples filtered functional data to standard space, applies GIGICA[[3]](#3) on functional data to extract features. Higher level functionalities include:

Generating filtered functional data and resampling to standard space(MNI152_2mm)

Getting subject specific IC time courses using GIGICA.

Getting Correlation Matrices at two different dimensionalities 25(21 useful components) and 100(55 useful)

### Built With

Python & bash (UK_biobank_pipeline) Wrapper Scripts(bash) & mostly FSL and AFNI commands as base

### Packages required

UKBiobank pipeline (https://github.com/CBICA/UK_biobank_pipeline_v_1.git)

GIGICA - Group Information Guided ICA (https://www.nitrc.org/projects/gig-ica/)

### Outputs

Dimensions : n = 25 or 100 components (Group ICs from UKBiobank) Good Components list are in : (only useful components are extracted and saved) n25 : https://www.fmrib.ox.ac.uk/ukbiobank/group_means/rfMRI_GoodComponents_d25_v1.txt n100 : https://www.fmrib.ox.ac.uk/ukbiobank/group_means/rfMRI_GoodComponents_d100_v1.txt

and can be viewed : (this includes viewing bad nodes also)

https://www.fmrib.ox.ac.uk/ukbiobank/group_means/rfMRI_ICA_d25.html https://www.fmrib.ox.ac.uk/ukbiobank/group_means/rfMRI_ICA_d100.html

For Partial and Full Correlations saved n*(n-1)/2 vectorized elements. i)Nodal amplitudes (21 useful/25 and 55 useful/100) ii)Partial Correlation Matrix (vectorized and saved upper triangle 210 and 1485) iii) Full Correlation Matrix (vectorized and saved upper triangle - 210 elements and 1485)

### Getting Started

### Download and Setup

UKB_Pipeline:

```bash: GIT_LFS_SKIP_SMUDGE=1 git clone https://git.upd.unibe.ch/p400pm_191026/istaging_data_consolidation.git IDC_TEMP cd IDC_TEMP/Image_Processing/fMRI git lfs pull –include GIGICAR.tar.gz git submodule update –init UK_biobank_pipeline_v_1

``` Note : git submodule update –init will work with git/2.23.0 & above. Otherwise use git init and then git submodule add UK_biobank_pipeline_v_1

GIGICA:

~~wget https://www.nitrc.org/frs/download.php/5267/GIGICAv1.1-CentOS6.3-CMD20130207.zip~~
~~unzip GIGICAv1.1-CentOS6.3-CMD20130207.zip~~
~~mv GIGICAv1.1-CentOS6.3-CMD20130207/ GIGICAR~~
~~rm -rf GIGICAv1.1-CentOS6.3-CMD20130207.zip~~

```bash

tar xvfz GIGICAR.tar.gz rm -rf GIGICAR.tar.gz ```

FSLNets:

This is for calculating networks (dependency: MATLAB and L1 precision) For more information: https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FSLNets

`bash cd IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1 wget http://www.fmrib.ox.ac.uk/~steve/ftp/fslnets.tar.gz tar xvfz fslnets.tar.gz rm -rf fslnets.tar.gz `

L1precision:

To estimate L1-norm regularized partial correlation. Here, we are not regularizing/normalizing correlations for now. But its good to get this on path. `bash cd IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/FSLNets wget http://www.cs.ubc.ca/~schmidtm/Software/L1precision.zip unzip L1precision.zip rm -rf L1precision.zip `

With the setup being complete, now navigate to ~/IDC_TEMP/Image_Processing/fMRI/scripts for running the pipeline.

### Input Data:

Step 1 expects data to be in partial BIDS format. And for each subject, folder structure would be for example (runs both structural and functional pipelines and generate T1 brain mask for registration) :

${sub}/fMRI_nosmooth/rfMRI.nii.gz ${sub}/T1/T1.nii.gz

### To create Input data: ```

sh convert_to_BIDS.sh -f ${path_to_resting_data} -s ${path_to_t1} -d ${destination} -smooth 0 # or 1

``` This creates the corresponding directory, copies files, and reorients images to LAS.

### Submitting cluster job

This is an example of how to get the pipeline up and running locally. Assuming all wrapper scripts,UKBiobank pipeline and GIGICA are properly cloned:

### Step 1 Filter functional data in MNI152_2mm template space

Result to look for : ${dest}/${sub}/fMRI_nosmooth/rfMRI.ica/reg_standard/filtered_func_data_clean.nii.gz

Example command for preprocessing the data:

```bash

jid=$(qsub: -terse -j y -l h_vmem=12G -o ${dest}/${sub}/sge/$JOB_NAME-$JOB_ID.log ${path_to_script}/ukbb_fix.sh -s ${sub} -i ${inpath} -tr ${TR} -te ${TE} -fwhm 100 -p ${UKBB_Pipeline_Dir} -smooth 0 );

```

where sub - subject ID: inpath - Path for input directory where subject directory exists(output will be saved in ${inpath}/${sub}) TR - Repetition Time(sec) TE - Echo Time(ms) FWHM - Smoothing parameter - Full Width at Half Max p - location of UKBB pipeline directory -smooth - 0/1 0-no smoothing(uses WHII training data for FIX denoising)

1 - smoothing (uses Standard data for FIX denoising)

### Step 2:

Running GIGICA on filtered functional data

Run separately for 25 and 100 components.

Result to look for: ${dest}/${sub}/${sub}_gigica.mat which has subject specific time courses and ICs.

gigica.mat - icnVoxels x nComponents

tc : nTimecourses x nComponents

Other results: ${sub}_timecourses.nii.gz and ${sub}_componets.nii.gz

Example command for obtaining gigica matrix for an individual:

```bash

jid=$(qsub: -terse -b y -j y -l h_vmem=10G -o ${dest}/${sub}/sge/$JOB_NAME-$JOB_ID.log ${path_to_script}/run_GIGICA.sh -in ${filtered_img} -ref ${ref_ics} -mask ${mask_img} -dest ${out_base} -p ${gigica_dir} -a 0.5 );

```

where in - full path to filtered functional data registered to standard space from previous step (4D file)
ref - absolute path to reference group ICs(4D file) mask - absolute path to MNI152_2mm binarized mask dest - output directory along with base name p - path to GIGICA scripts directory a - similarity parameter by default 0.5 (optional)

### Step 3 Extract features from GIGICA matrix

Run separately for 25 and 100 components.

Result to look for: Within ${dest}/${sub}/rfMRI_d100/:

${sub}_NodeAmplitudes_v1.txt - which has nodal amplitudes of size n=21 or n=55 ${sub}_partialcorr_v1.txt - partial correlations of size (21x20/2 = 210 elements or 55x54/2 = 1485 elements) ${sub}_fullcorr_v1.txt - Full correlations of size (21x20/2 = 210 elements or 55x54/2 = 1485 elements)

Example command for obtaining final features for an individual:

```bash

jid=$(qsub: -terse -b y -j y -l h_vmem=8G -o ${dest}/${sub}/sge/$JOB_NAME-$JOB_ID.log ${script}/processing_gigica.sh -s ${sub} -tr ${TR} -iDir ${protoDir}/GIGICA/gigica_d100 -nets ${FSLNets} -p ${ukb} -n ${nc} -gDir ${ukb}/templates/group/ -tp ${ntp} -o ${dest})

```

where s - subject ID: tr - Repetition Time(sec) iDir - path for GIGICA input result directory nets - path to FSLNets. This must be within UK_biobank_pipeline_v_1 when we clone repository p - path to UKBiobank scripts directory n - number of components(25 or 100) gDir - path to group directory where template for melodic_IC_d25 and melodic_IC_d100 exists. tp - number of timepoints o - destination directory for saving final results.

### Working Example:

Copy data from project:

mkdir ${HOME}/Data/ -pv cp -r /cbica/projects/BLSA/Pipelines/rsfMRI/rsfMRI_2020/Data/Nifti/BLSA_7996_06-0_10/ ${HOME}/Data/
Set all environment variables and paths in settings.sh within scripts directory.
Create destination directory
mkdir ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/sge -pv

```bash: sh convert_to_BIDS.sh -f ${HOME}/Data/BLSA_7996_06-0_10/BLSA_7996_06-0_10_REST.nii.gz -s ${HOME}/Data/BLSA_7996_06-0_10/BLSA_7996_06-0_10_T1.nii.gz -d ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10 -smooth 0

```

Expected output is :
${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/fMRI_nosmooth/rfMRI.nii.gz ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/T1/T1.nii.gz

For submitting this script to cluster: ```bash

jid=$(qsub
-terse -b y -j y -l short -o ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/sge/$JOB_NAME-$JOB_ID.log $HOME/IDC_TEMP/Image_Processing/fMRI/scripts/convert_to_BIDS.sh -f ${HOME}/Data/BLSA_7996_06-0_10/BLSA_7996_06-0_10_REST.nii.gz -s ${HOME}/Data/BLSA_7996_06-0_10/BLSA_7996_06-0_10_T1.nii.gz -d ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10 -smooth 0)

```

Check Orientation and see if it is LAS:

fslhd ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/fMRI_nosmooth/rfMRI.nii.gz fslhd ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/T1/T1.nii.gz

Expectedqform_xorient Right-to-Left
qform_yorient Posterior-to-Anterior qform_zorient Inferior-to-Superior

Next run fmri pipeline by:

```bash

sh ukbb_fix.sh: -s BLSA_7996_06-0_10 -tr 2 -te 25 -fwhm 100 -p ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/ -i ${HOME}/Out/UKB_Pipeline/ -smooth 0

``` Expected: ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/fMRI_nosmooth/rfMRI.ica/reg_standard/filtered_func_data_clean.nii.gz (in standard space)

For submitting this script to cluster: ```bash

jid=$(qsub
-terse -b y -j y -l h_vmem=12G -o ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/sge/$JOB_NAME-$JOB_ID.log $HOME/IDC_TEMP/Image_Processing/fMRI/scripts/ukbb_fix.sh -s BLSA_7996_06-0_10 -tr 2 -te 25 -fwhm 100 -p ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/ -i ${HOME}/Out/UKB_Pipeline/ -smooth 0)

```

vi) For GIGICA, mkdir ${HOME}/Out/GIGICA/gigica_d100/BLSA_7996_06-0_10/sge -pv

```bash sh run_GIGICA.sh

-in ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/fMRI_nosmooth/rfMRI.ica/reg_standard/filtered_func_data_clean.nii.gz -ref ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/templates/group/melodic_IC_100.nii.gz -mask ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/templates/MNI152_T1_2mm_brain_mask_bin.nii.gz -dest ${HOME}/Out/GIGICA/gigica_d100/BLSA_7996_06-0_10/BLSA_7996_06-0_10 -p ${HOME}/GIGICAR/ -a 0.5

```

Pre-requisite: This script takes filtered_func_data_clean in standard space as input which is the output from previous step. Expected: ${HOME}/Out/GIGICA/gigica_d100/BLSA_7996_06-0_10/BLSA_7996_06-0_10_gigica.mat

The above script runs on MATLAB and exceeds interactive CPU/run limit. It may also use lot of CPUs. To avoid this, it can be submitted as batch job as below .

```bash

jid=$(qsub: -terse -b y -j y -l h_vmem=10G -o ${HOME}/Out/GIGICA/gigica_d100/BLSA_7996_06-0_10/sge/$JOB_NAME-$JOB_ID.log $HOME/IDC_TEMP/Image_Processing/fMRI/scripts/run_GIGICA.sh -in ${HOME}/Out/UKB_Pipeline/BLSA_7996_06-0_10/fMRI_nosmooth/rfMRI.ica/reg_standard/filtered_func_data_clean.nii.gz -ref ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/templates/group/melodic_IC_100.nii.gz -mask ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/templates/MNI152_T1_2mm_brain_mask_bin.nii.gz -dest ${HOME}/Out/GIGICA/gigica_d100/BLSA_7996_06-0_10/BLSA_7996_06-0_10 -p ${HOME}/IDC_TEMP/Image_Processing/fMRI/GIGICAR/ -a 0.5 )

```

vii) For Feature Extraction, mkdir ${HOME}/Out/Features/BLSA_7996_06-0_10/sge -pv

```bash

sh processing_gigica.sh: -s BLSA_7996_06-0_10 -tr 2 -iDir ${HOME}/Out/GIGICA/gigica_d100/ -nets ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/FSLNets -p ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/ -n 100 -gDir ${HOME}/IDC_TEMP/Representation/fMRI/UK_biobank_pipeline_v_1/templates/group/ -tp 180 -o ${HOME}/Out/Features/

```

Expected: ${HOME}/Out/Features/BLSA_7996_06-0_10/rfMRI_d100/BLSA_7996_06-0_10_NodeAmplitudes_v1.txt ${HOME}/Out/Features/BLSA_7996_06-0_10/rfMRI_d100/BLSA_7996_06-0_10_partialcorr_v1.txt ${HOME}/Out/Features/BLSA_7996_06-0_10/rfMRI_d100/BLSA_7996_06-0_10_fullcorr_v1.txt

For submitting this script to cluster:

```bash

jid=$(qsub: -terse -b y -j y -l h_vmem=8G -o ${HOME}/Out/Features/BLSA_7996_06-0_10/sge/$JOB_NAME-$JOB_ID.log $HOME/IDC_TEMP/Image_Processing/fMRI/scripts/processing_gigica.sh -s BLSA_7996_06-0_10 -tr 2 -iDir ${HOME}/Out/GIGICA/gigica_d100/ -nets ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/FSLNets -p ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/ -n 100 -gDir ${HOME}/IDC_TEMP/Image_Processing/fMRI/UK_biobank_pipeline_v_1/templates/group/ -tp 180 -o ${HOME}/Out/Features/ )

```

## References <a id=»1»>[1]</a> Miller KL, Alfaro-Almagro F, Bangerter NK, Thomas DL, Yacoub E, Xu J, Bartsch AJ, Jbabdi S, Sotiropoulos SN, Andersson JL, Griffanti L, Douaud G, Okell TW, Weale P, Dragonu I, Garratt S, Hudson S, Collins R, Jenkinson M, Matthews PM, Smith SM. Multimodal population brain imaging in the UK Biobank prospective epidemiological study. Nat Neurosci. 2016 Nov;19(11):1523-1536. doi: 10.1038/nn.4393 . Epub 2016 Sep 19. PMID: 27643430 ; PMCID: PMC5086094.

<a id=»2»>[2]</a> L. Griffanti, G. Salimi-Khorshidi, C.F. Beckmann, E.J. Auerbach, G. Douaud, C.E. Sexton, E. Zsoldos, K. Ebmeier, N. Filippini, C.E. Mackay, S. Moeller, J.G. Xu, E. Yacoub, G. Baselli, K. Ugurbil, K.L. Miller, and S.M. Smith. ICA-based artefact removal and accelerated fMRI acquisition for improved resting state network imaging. NeuroImage, 95:232-47, 2014

<a id=»3»>[3]</a> Du Y, Fan Y. Group information guided ICA for fMRI data analysis. Neuroimage. 2013 Apr 1;69:157-97. doi: 10.1016/j.neuroimage.2012.11.008 . Epub 2012 Nov 27. PMID: 23194820 .