Omni Path Specification
Universal address for multimodal brain data
This document specifies the structure, semantics, and usage of the omni path.
What is an omni path?
An omni path is a canonical address for accessing multimodal brain data. The goal is an addressing schema that enables consistent, dynamic referencing across diverse datasets and modalities, and that ultimately supports a data lakehouse of brain data.
An omni path starts with a namespace prefix indicating whether the path is raw (native data) or omni (derived data). The raw namespace preserves the original dataset structure, while the omni namespace provides a standardized schema for cross-dataset access and dynamic querying.
For example:
- Raw (native):
  /raw/{dataset}/{**original_path}
- Omni (derived):
  /omni/{subject}/:modality/:space/:dtype/{:qualifiers}/@coords
Raw namespace
The raw namespace is organized by dataset and follows the structure:
/raw/{dataset}/{**original_path}
where:
- {dataset}: dataset identifier (e.g., hcp, abide, openneuro-ds00001)
- {**original_path}: dataset-specific path defined by the dataset provider (e.g., files and folders as in BIDS)
Raw paths intentionally preserve the data provider layout. They do not enforce a vocabulary or specific structure.
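A minimal sketch of how a client might split a raw path into its two parts. The names `RawPath` and `split_raw_path` are hypothetical and not part of this specification; the remainder after the dataset identifier is kept verbatim, since raw paths enforce no vocabulary or structure.

```python
from typing import NamedTuple


class RawPath(NamedTuple):
    dataset: str        # dataset identifier, e.g. "hcp"
    original_path: str  # provider-defined remainder, kept verbatim


def split_raw_path(path: str) -> RawPath:
    """Split /raw/{dataset}/{**original_path} without interpreting the remainder."""
    prefix = "/raw/"
    if not path.startswith(prefix):
        raise ValueError(f"not a raw path: {path!r}")
    # Everything up to the first "/" is the dataset; the rest is untouched.
    dataset, _, original_path = path[len(prefix):].partition("/")
    if not dataset:
        raise ValueError(f"missing dataset identifier: {path!r}")
    return RawPath(dataset, original_path)


print(split_raw_path("/raw/hcp/100307/**"))
```

For `/raw/hcp/100307/**` this yields `dataset="hcp"` and `original_path="100307/**"`.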
Omni namespace
Omni paths are organized by universal subject identifiers and follow the structure:
/omni/{subject}/:modality/:space/:dtype/{:qualifiers}/@coords
where:
- {subject}: universal subject identifier. See Subject identifiers.
- :modality: modality term
- :space: reference/registration space term
- :dtype: data type
- {:qualifiers}: optional list of qualifiers
- @coords: coordinate selector (use @* to request all)
| Symbol | Meaning | Notes |
|---|---|---|
| : | Canonical vocabulary term | Segment must be valid in the canonical vocabulary (or resolvable to it). |
| ? | Unresolved term | Placeholder for unknowns; queryable and later resolvable during ingestion. |
| @ | Indexing | Explicit coordinate selector for spatial/time/stream access. See Coordinates. |
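The symbol table above can be read as a simple dispatch on the first character of each segment. The following is a sketch only; `classify_segment` and the role labels it returns are hypothetical names chosen for illustration.

```python
def classify_segment(segment: str) -> str:
    """Return the role implied by a segment's leading symbol."""
    if segment.startswith(":"):
        return "canonical"   # must be in (or resolvable to) the vocabulary
    if segment.startswith("?"):
        return "unresolved"  # placeholder, to be resolved during ingestion
    if segment.startswith("@"):
        return "coords"      # coordinate selector
    return "literal"         # e.g. the namespace prefix or {subject}


path = "/omni/hcp-100307/?weirdmodality/:MNI152/:bold/@*"
roles = [classify_segment(s) for s in path.strip("/").split("/")]
print(roles)
# ['literal', 'literal', 'unresolved', 'canonical', 'canonical', 'coords']
```

Splitting on `/` is only safe here because the selector is `@*`; selectors with a time component (such as `@x,y,z/t`) contain a `/` of their own and need to be split off first.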
Path segments
| Segment | Required | Type | Description |
|---|---|---|---|
| {subject} | ✅ | string | Canonical subject ID (e.g., hcp-100307) |
| :modality | ✅ | vocab | Modality (e.g., :fmri, :t1w, :eeg) |
| :space | ✅ | vocab | Space (e.g., :mni152, :native) |
| :dtype | ✅ | vocab | Representation (e.g., :bold, :intensity, :voltage) |
| {:qualifiers} | optional | vocab list | Additional canonical qualifiers (task, processing, etc.) |
| @coords | ✅ | selector | Coordinate/stream selector (@* allowed) |
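To make the segment order concrete, here is one way a parser for the omni schema could look. It is a sketch under stated assumptions: `OmniPath` and `parse_omni_path` are hypothetical names, terms are stored with their `:` or `?` symbol so that unresolved terms stay distinguishable, and the selector is split off at the first `/@` because it may itself contain `/`.

```python
from dataclasses import dataclass, field


@dataclass
class OmniPath:
    subject: str
    modality: str
    space: str
    dtype: str
    qualifiers: list = field(default_factory=list)
    coords: str = "@*"


def parse_omni_path(path: str) -> OmniPath:
    """Parse /omni/{subject}/:modality/:space/:dtype/{:qualifiers}/@coords."""
    # Split the selector off first: "@x,y,z/t" contains a "/" of its own.
    head, sep, coords = path.partition("/@")
    if not sep or not coords:
        raise ValueError(f"missing @coords selector: {path!r}")
    namespace, *rest = head.strip("/").split("/")
    if namespace not in ("omni", "derived"):
        raise ValueError(f"not an omni path: {path!r}")
    if len(rest) < 4:
        raise ValueError(f"expected subject, modality, space and dtype: {path!r}")
    subject, modality, space, dtype, *qualifiers = rest
    for term in (modality, space, dtype, *qualifiers):
        # Every term after the subject carries a ':' or '?' symbol.
        if term[:1] not in (":", "?") or len(term) < 2:
            raise ValueError(f"term without ':' or '?' prefix: {term!r}")
    return OmniPath(subject, modality, space, dtype, qualifiers, "@" + coords)


parsed = parse_omni_path(
    "/derived/hcp-100307/:fmri/:MNI152/:bold/:rest/:denoised/@32,45,12/0:1200"
)
print(parsed)
```

The sketch checks structure only; it does not check terms against a vocabulary or interpret the selector.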
Subject identifiers
Omni uses a deterministic canonical subject ID to avoid collisions across datasets:
"{dataset_prefix}-{clean_id}"
- dataset_prefix: canonical dataset code (e.g., hcp)
- clean_id: dataset subject identifier normalized into a stable form
Example:
hcp-100307
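The specification does not define how `clean_id` is normalized, so the rules below are purely an assumption made for illustration: trim whitespace, lowercase, drop a leading BIDS-style `sub-`, and remove any remaining non-alphanumeric characters. `canonical_subject_id` is a hypothetical helper name.

```python
import re


def canonical_subject_id(dataset_prefix: str, raw_id: str) -> str:
    """Build "{dataset_prefix}-{clean_id}" deterministically."""
    # Assumed normalization (NOT defined by the specification).
    clean_id = raw_id.strip().lower()
    clean_id = re.sub(r"^sub-", "", clean_id)       # drop a BIDS-style prefix
    clean_id = re.sub(r"[^a-z0-9]", "", clean_id)   # keep letters and digits
    if not clean_id:
        raise ValueError(f"subject id is empty after normalization: {raw_id!r}")
    return f"{dataset_prefix}-{clean_id}"


print(canonical_subject_id("hcp", "100307"))        # hcp-100307
print(canonical_subject_id("hcp", " sub-100307 "))  # hcp-100307
```

Whatever rules are actually adopted, the important property is that the same dataset code and subject identifier always produce the same ID.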
Qualifiers
Qualifiers are optional additional segments appended after :dtype:
/omni/{subject}/:modality/:space/:dtype/:qual1/:qual2/.../@coords
Typical qualifier families:
- acquisition/condition: :rest, :task, :eyes-open, :eyes-closed
- processing: :denoised, :filtered, :source-localized
- feature forms: :parcellated, :roi-mean, :embedding
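As an illustration of where qualifiers sit in the path, the following sketch assembles an omni path from its parts. `build_omni_path` is a hypothetical helper; it takes bare terms and adds the `:` and `@` symbols itself, and it performs no vocabulary check.

```python
def build_omni_path(subject, modality, space, dtype, qualifiers=(), coords="*"):
    """Assemble /omni/{subject}/:modality/:space/:dtype/{:qualifiers}/@coords."""
    # Qualifiers are appended after :dtype, in the order given.
    terms = [modality, space, dtype, *qualifiers]
    segments = ["omni", subject] + [f":{t}" for t in terms] + [f"@{coords}"]
    return "/" + "/".join(segments)


print(build_omni_path("hcp-100307", "t1w", "MNI152", "intensity"))
# /omni/hcp-100307/:t1w/:MNI152/:intensity/@*
print(build_omni_path("hcp-100307", "fmri", "MNI152", "bold",
                      qualifiers=["rest", "denoised"], coords="32,45,12/0:1200"))
# /omni/hcp-100307/:fmri/:MNI152/:bold/:rest/:denoised/@32,45,12/0:1200
```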
Coordinates
Coordinates represent spatial, temporal, and stream indexing. They are expressed using @...:
| Form | Meaning |
|---|---|
| @* | Entire data. |
| @x,y,z | Spatial point. Interpretation of x, y, z is defined by :space (e.g., :MNI152 implies standard space coordinates). |
| @x,y,z/t | Spatial point + timepoint. t indicates temporal indexing (e.g., fMRI volume index, EEG timepoint). |
| @x,y,z0:z1/t0:t1 | Spatial bounding box + time range. |
| @Cz | Named stream selector (i.e., channel or variable). Named selectors are modality-dependent and should map to named axes (channels, variables, parcels). |
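The selector forms in the table can be told apart mechanically. The sketch below is one possible reading, not a definitive grammar: `parse_coords` and the dictionary layout are hypothetical, indices are assumed to be integers, and a range `a:b` is stored as a Python `slice` without deciding whether the end is inclusive (the specification does not say).

```python
def _parse_index(token: str):
    """Turn "12" into 12 and "0:1200" into slice(0, 1200)."""
    if ":" in token:
        start, stop = token.split(":")
        return slice(int(start), int(stop))
    return int(token)


def parse_coords(selector: str) -> dict:
    """Classify an @coords selector and extract its indices."""
    if not selector.startswith("@") or len(selector) == 1:
        raise ValueError(f"not a coordinate selector: {selector!r}")
    body = selector[1:]
    if body == "*":
        return {"kind": "all"}
    # An optional time part follows the first "/".
    spatial, _, temporal = body.partition("/")
    tokens = spatial.split(",")
    if len(tokens) == 3:
        result = {"kind": "spatial", "xyz": tuple(_parse_index(t) for t in tokens)}
        if temporal:
            result["t"] = _parse_index(temporal)
        return result
    if len(tokens) == 1 and not temporal:
        # Anything else without commas is treated as a named stream, e.g. "Cz".
        return {"kind": "stream", "name": body}
    raise ValueError(f"unrecognized selector: {selector!r}")


print(parse_coords("@-42,38,12/0:1200"))
print(parse_coords("@Cz"))
```

Whether a given selector is valid for a particular :modality, :space, and :dtype is a separate question that this sketch does not address.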
Raw vs Omni
| Aspect | Raw | Omni (derived) |
|---|---|---|
| Prefix | /raw/ | /omni/ (or alias /derived/) |
| Structure | dataset-defined | fixed schema |
| Symbols | none required | :, ?, @ required |
| Subject ID | dataset convention | deterministic universal ID |
| Coordinates | none (native implied) | explicit @... selector |
| Vocabulary | none | enforced vocabulary |
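Because /derived/ is listed as an alias of /omni/, a client that compares or caches paths may want to fold the alias into one prefix first. A minimal sketch, assuming /omni/ is the preferred form; `normalize_namespace` is a hypothetical helper name.

```python
def normalize_namespace(path: str) -> str:
    """Rewrite the /derived/ alias to /omni/; leave other paths untouched."""
    alias, canonical = "/derived/", "/omni/"
    if path.startswith(alias):
        return canonical + path[len(alias):]
    return path


print(normalize_namespace("/derived/hcp-100307/:t1w/:MNI152/:intensity/@*"))
# /omni/hcp-100307/:t1w/:MNI152/:intensity/@*
print(normalize_namespace("/raw/hcp/100307/**"))
# /raw/hcp/100307/**
```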
Examples
Canonical derived paths:
/derived/hcp-100307/:fmri/:MNI152/:bold/:rest/@*
/derived/hcp-100307/:fmri/:MNI152/:bold/:rest/:denoised/@32,45,12/0:1200
/derived/hcp-100307/:t1w/:MNI152/:intensity/@*
/derived/hcp-100307/:eeg/:MNI152/:voltage/:rest/:source-localized/@*
/derived/hcp-100307/:multimodal/:MNI152/:embedding/:rest/@*
Corresponding raw paths:
/raw/hcp/100307/**
Query patterns (API examples)
Assume an API with:
- dataset.query(pattern: str) -> list[path]
- dataset.get(path: str) -> object
- object.raw
Raw: what did the dataset provide for a subject?
dataset.query("/raw/hcp/100307/**")
Omni: all resting fMRI in standard space
dataset.query("/derived/*/:fmri/:MNI152/:bold/:rest/@*")
Omni: specific voxel time series
dataset.get("/omni/hcp-100307/:fmri/:MNI152/:bold/:rest/@-42,38,12/0:1200")
Trace back provenance to raw inputs
dataset.get("/omni/hcp-100307/:fmri/:MNI152/:bold/:rest/@*").raw
Unresolved terms (?)
Unknown or not-yet-mapped terms are prefixed with ?.
Query: all paths containing unknown terms
dataset.query("/*/*/?*")
Query: a specific unknown term across datasets
dataset.query("/*/?weirdmodality")
Query: fully resolved paths only
dataset.query("/omni/*/:*/:*/:*/@*")
Validation
- Canonical vocabulary (:): segments prefixed with : must be members of (or resolvable to) the canonical vocabulary.
- Coordinates (@): coordinate selectors must be valid with respect to:
  - :modality (voxels, surface, streams)
  - :space (units, bounds)
  - :dtype (temporal indexing)
- Use @* when requesting the entire object/stream.
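The vocabulary rule above can be sketched as a small checker. Everything here is an assumption made for illustration: `VOCAB` is a hypothetical stand-in seeded only with terms that appear in this document, case-insensitive matching is used as a guess at what "resolvable to" could mean, and `validate_omni_path` is a hypothetical name. Coordinate validation against :modality, :space, and :dtype is left out entirely.

```python
# Hypothetical vocabulary, seeded only with terms used in this document.
VOCAB = {
    "fmri", "t1w", "eeg", "mni152", "native",
    "bold", "intensity", "voltage", "rest", "denoised",
}


def validate_omni_path(path: str) -> list:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    # The selector may contain "/", so split it off before the other segments.
    head, sep, coords = path.partition("/@")
    if not sep or not coords:
        problems.append("missing @coords selector")
    segments = head.strip("/").split("/")
    if len(segments) < 5:
        problems.append("expected :modality/:space/:dtype after the subject")
    for segment in segments[2:]:  # skip the namespace and {subject}
        if segment.startswith("?"):
            continue  # unresolved terms are allowed, just not yet canonical
        if not segment.startswith(":"):
            problems.append(f"segment {segment!r} lacks a ':' or '?' prefix")
        elif segment[1:].lower() not in VOCAB:
            problems.append(f"term {segment!r} is not in the vocabulary")
    return problems


print(validate_omni_path("/omni/hcp-100307/:fmri/:MNI152/:bold/:rest/@*"))
print(validate_omni_path("/omni/hcp-100307/:fmrii/:MNI152/:bold/@*"))
```

Returning a list of problems rather than raising on the first one lets an ingestion step report every issue with a path at once.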
Summary
- /raw/... preserves dataset-native layout and provenance.
- /omni/... (or /derived/...) provides a fixed, enforced schema for cross-dataset querying.
- : marks canonical terms, ? marks unresolved terms, and @ provides coordinate-aware indexing.