{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# How pyAFQ uses BIDS\n\nThe pyAFQ API relies heavily on the\n[Brain Imaging Data Standard (BIDS)](https://bids-specification.readthedocs.io/en/stable/),\na widely used standard for organizing and describing neuroimaging data. This\nmeans that the software assumes that its inputs are organized according to the\nBIDS specification and its outputs conform where possible with BIDS.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>Derivatives of processing diffusion MRI are not currently fully\n    described in the existing BIDS specification, but describing these\n    is part of an ongoing effort. Wherever possible, we conform with\n    the draft implementation of the BIDS DWI derivatives available\n    [here](https://bids-specification.readthedocs.io/en/wip-derivatives/05-derivatives/05-diffusion-derivatives.html)</p></div>\n\nIn this example, we will explore the use of BIDS in pyAFQ and see\nhow BIDS allows us to extend and provide flexibility to the users\nof the software.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import os\nimport os.path as op\n\nimport AFQ.api.bundle_dict as abd\nfrom AFQ.api.group import GroupAFQ\nimport AFQ.data.fetch as afd\nimport AFQ.definitions.image as afm"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To interact with and query BIDS datasets, we use\n [pyBIDS](https://bids-standard.github.io/pybids/), which we import here:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import bids\nfrom bids.layout import BIDSLayout"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We start with some example data. The data we will use here is\ngenerated from the\n[Stanford HARDI dataset](https://purl.stanford.edu/ng782rw8378).\nThe call below fetches\nthis dataset and organized it within the `~/AFQ_data` folder in the BIDS\nformat.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "afd.organize_stanford_data(clear_previous_afq=\"all\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "After doing that, we should have a folder that looks like this:\n\n| stanford_hardi\n| \u251c\u2500\u2500 dataset_description.json\n| \u2514\u2500\u2500 derivatives\n|     \u251c\u2500\u2500 freesurfer\n|     \u2502\u00a0\u00a0 \u251c\u2500\u2500 dataset_description.json\n|     \u2502\u00a0\u00a0 \u2514\u2500\u2500 sub-01\n|     \u2502\u00a0\u00a0     \u2514\u2500\u2500 ses-01\n|     \u2502\u00a0\u00a0         \u2514\u2500\u2500 anat\n|     \u2502\u00a0\u00a0             \u251c\u2500\u2500 sub-01_ses-01_T1w.nii.gz\n|     \u2502\u00a0\u00a0             \u2514\u2500\u2500 sub-01_ses-01_seg.nii.gz\n|     \u2514\u2500\u2500 vistasoft\n|         \u251c\u2500\u2500 dataset_description.json\n|         \u2514\u2500\u2500 sub-01\n|             \u2514\u2500\u2500 ses-01\n|                 \u2514\u2500\u2500 dwi\n|                     \u251c\u2500\u2500 sub-01_ses-01_dwi.bvals\n|                     \u251c\u2500\u2500 sub-01_ses-01_dwi.bvecs\n|                     \u2514\u2500\u2500 sub-01_ses-01_dwi.nii.gz\n\nThe top level directory (`stanford_hardi`) is our overall BIDS dataset folder.\nIn many cases, this folder will include folders with raw data for each subject\nin the dataset. In this case, we do not include the raw data folders and only\nhave the outputs of pipelines that were used to preprocess the data (e.g.,\ncorrect the data for subject motion, eddy currents, and so forth).\nIn general, only the preprocessed diffusion data is required for pyAFQ to run.\nSee the :doc:`\"Organizing your data\" </howto/data>` section of the\ndocumentation for more details.\nIn this case, one folder contains derivative of the Freesurfer software and\nanother folder contains the DWI data that has been preprocessed with the\nVistasoft software.\npyAFQ provides facilities to segment tractography results obtained\nusing other software as well. For example, we often use\n[qsiprep](https://qsiprep.readthedocs.io/en/latest/) to preprocess\nour data and reconstruct tractographies with software such as\n[MRTRIX](https://www.mrtrix.org/). Here, we will demonstrate how to use\nthese reconstructions in the pyAFQ segmentation and tractometry pipeline\nWe fetch this data and add it as a separate pipeline\nThe following code will download a previously-created tractography and\norganize it by adding it to the BIDS dataset folder and renaming them to be\nBIDS-compliant (e.g., `sub-01_ses_01_dwi_tractography.trk`).\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "afd.fetch_stanford_hardi_tractography()\n\nbids_path = op.join(op.expanduser('~'), 'AFQ_data', 'stanford_hardi')\ntractography_path = op.join(bids_path, 'derivatives', 'my_tractography')\nsub_path = op.join(tractography_path, 'sub-01', 'ses-01', 'dwi')\n\nseg_file = op.join(afd.afq_home, \"stanford_hardi\", \"derivatives\",\n                   \"freesurfer\", \"sub-01\", \"ses-01\", \"anat\",\n                   \"sub-01_ses-01_seg.nii.gz\")\npve = afm.PVEImages(\n    afm.LabelledImageFile(\n        path=seg_file,\n        inclusive_labels=[0]),\n    afm.LabelledImageFile(\n        path=seg_file,\n        exclusive_labels=[0, 1, 2], combine=\"and\"),\n    afm.LabelledImageFile(\n        path=seg_file,\n        inclusive_labels=[1, 2]))\n\nos.makedirs(sub_path, exist_ok=True)\nos.rename(\n    op.join(\n        op.expanduser('~'),\n        'AFQ_data',\n        'stanford_hardi_tractography',\n        'full_segmented_cleaned_tractography.trk'),\n    op.join(\n        sub_path,\n        'sub-01_ses-01-dwi_tractography.trk'))\n\nafd.to_bids_description(\n    tractography_path,\n    **{\"Name\": \"my_tractography\",\n       \"PipelineDescription\": {\"Name\": \"my_tractography\"},\n       \"GeneratedBy\": [{\"Name\": \"my_tractography\"}]})"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "After we do that, our dataset folder should look like this:\n\n| stanford_hardi\n| \u251c\u2500\u2500 dataset_description.json\n| \u2514\u2500\u2500 derivatives\n|     \u251c\u2500\u2500 freesurfer\n|     \u2502\u00a0\u00a0 \u251c\u2500\u2500 dataset_description.json\n|     \u2502\u00a0\u00a0 \u2514\u2500\u2500 sub-01\n|     \u2502\u00a0\u00a0     \u2514\u2500\u2500 ses-01\n|     \u2502\u00a0\u00a0         \u2514\u2500\u2500 anat\n|     \u2502\u00a0\u00a0             \u251c\u2500\u2500 sub-01_ses-01_T1w.nii.gz\n|     \u2502\u00a0\u00a0             \u2514\u2500\u2500 sub-01_ses-01_seg.nii.gz\n|     \u251c\u2500\u2500 my_tractography\n|     |   \u251c\u2500\u2500 dataset_description.json\n|     \u2502\u00a0\u00a0 \u2514\u2500\u2500 sub-01\n|     \u2502\u00a0\u00a0     \u2514\u2500\u2500 ses-01\n|     \u2502\u00a0\u00a0         \u2514\u2500\u2500 dwi\n|     \u2502\u00a0\u00a0             \u2514\u2500\u2500 sub-01_ses-01-dwi_tractography.trk\n|     \u2514\u2500\u2500 vistasoft\n|         \u251c\u2500\u2500 dataset_description.json\n|         \u2514\u2500\u2500 sub-01\n|             \u2514\u2500\u2500 ses-01\n|                 \u2514\u2500\u2500 dwi\n|                     \u251c\u2500\u2500 sub-01_ses-01_dwi.bvals\n|                     \u251c\u2500\u2500 sub-01_ses-01_dwi.bvecs\n|                     \u2514\u2500\u2500 sub-01_ses-01_dwi.nii.gz\n\nTo explore the layout of these derivatives, we will initialize a\n:class:`BIDSLayout` class instance to help us see what is in this dataset\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "layout = bids.BIDSLayout(bids_path, derivatives=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Because there is no raw data in this BIDS layout (only derivatives),\npybids will report that there are no subjects and sessions:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(layout)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "But a query on the derivatives will reveal the different derivatives that\nare stored here:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(layout.derivatives)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We can use a :class:`bids.BIDSValidator` object to make sure that the\nfiles within our data set are BIDS-compliant. For example, we can\nextract the tractography derivatives part of our layout using:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "my_tractography = layout.derivatives[\"my_tractography\"]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "This variable is also a BIDS layout object. This object has a ``get``\nmethod, which allows us to query and find specific items within the\nlayout. For example, we can ask for files that have a suffix consistent\nwith tractography results:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "tractography_files = my_tractography.get(suffix='tractography')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Or ask for files that have a ``.trk`` extension:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "tractography_files = my_tractography.get(extension='.trk')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "In this case, both of these would produce the same result.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "tractography_file = tractography_files[0]\nprint(tractography_file)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We can also get some more structured information about this file:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(tractography_file.get_entities())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We can use a :class:`bids.BIDSValidator` class instance to validate that\nthis file is compliant with the specification. Note that the validator\nrequires that the filename be provided relative to the root of the BIDS\ndataset, so we have to split the string that contains the full path\nof the tractography to extract only the part that is relative to the\nroot of the entire BIDS ``layout`` object:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "tractography_full_path = tractography_file.path\ntractography_relative_path = tractography_full_path.split(layout.root)[-1]\n\nvalidator = bids.BIDSValidator()\nprint(validator.is_bids(tractography_relative_path))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Next, we specify the information we need to define the bundles that we are\ninterested in segmenting. In this case, we are going to use a list of\nbundle names for the bundle info. These names refer to bundles for\nwhich we already have clear definitions of the information\nneeded to segment them (e.g., waypoint ROIs and probability maps).\nFor an example that includes custom definition of bundle info, see the\n[plot_callosal_tract_profile example](http://tractometry.org/pyAFQ/auto_examples/plot_callosal_tract_profile.html).\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "bundle_info = abd.default_bd()[\n    \"Left Inferior Longitudinal\",\n    \"Right Inferior Longitudinal\",\n    \"Left Arcuate\",\n    \"Right Arcuate\",\n    \"Left Corticospinal\",\n    \"Right Corticospinal\"]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Now, we can define our GroupAFQ object, pointing to the derivatives of the\n`'my_tractography'` pipeline as inputs. This is done by setting the\n`import_tract` key-word argument. We pass the\n`bundle_info` defined above. We also point to the preprocessed\ndata that is in a `'dmriprep'` pipeline. Note that the pipeline name\nis not necessarily the name of the folder it is in; the pipeline name is\ndefined in each pipeline's `dataset_description.json`. These data were\npreprocessed with 'vistasoft', so this is the pipeline we'll point to\nIf we were using `'qsiprep'`, this is where we would pass that\nstring instead. If we did that, AFQ would look for a derivatives\nfolder called `'stanford_hardi/derivatives/qsiprep'` and find the\npreprocessed DWI data within it. Finally, to speed things up\na bit, we also sub-sample the provided tractography. This is\ndone by defining the segmentation_params dictionary input.\nTo sub-sample to 10,000 streamlines, we define\n`'nb_streamlines' = 10000`.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "my_afq = GroupAFQ(\n    bids_path,\n    dwi_preproc_pipeline='vistasoft',\n    t1_preproc_pipeline='freesurfer',\n    bundle_info=bundle_info,\n    import_tract={\n        \"suffix\": \"tractography\",\n        \"scope\": \"my_tractography\"\n    },\n    pve=pve,\n    segmentation_params={'nb_streamlines': 10000})"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Finally, to run the segmentation and extract tract profiles, we call\nThe `export_all` method. This creates all of the derivative outputs of\nAFQ within the 'stanford_hardi/derivatives/afq' folder.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "my_afq.export_all()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "A few common issues that can hinder BIDS from working properly are:\n\n1. Faulty `dataset_description.json` file. You need to make sure that the\n   file contains the right names for the pipeline. See above for an example\n   of that.\n2. File naming convention doesn't uniquely identify file with bids filters.\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The outputs of AFQ are also BIDS compatible. Here we demonstrate how\nto load the afq entities and show all files with the key-value pair\ndesc-bundles\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "layout = BIDSLayout(bids_path)\nlayout.add_derivatives(\n    f'{bids_path}/derivatives/afq',\n    config=['bids', 'derivatives'])\nprint(layout.get(desc=\"bundles\", return_type=\"filename\"))"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.13.13"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}