Omni Path Specification
Universal address for multimodal brain data
This document specifies the structure, semantics, and usage of the omni path.
What is an omni path?
An omni path is a canonical address for accessing brain data. The goal is an addressing schema that enables consistent, dynamic referencing across diverse modalities and datasets and ultimately supports implementing a data lakehouse for human brain data.
An omni path is a string that starts with a namespace prefix indicating whether the path is raw (original data) or omni (derived, processed data). The raw namespace preserves the original dataset structure, while the omni namespace provides a standardized schema for cross-dataset access and dynamic querying.
For example:
- Raw (native): /raw/ds12/sub-102/{**original_path}
- Omni (derived): /omni/ds12-102/:eeg/:native/:voltage/:rest/Cz/@*
Raw namespace
The raw namespace is organized by dataset and follows the structure:
/raw/{dataset}/{**original_path}
where:
- {dataset}: dataset identifier (e.g., hcp, abide, openneuro-ds00001)
- {**original_path}: dataset-specific path defined by the dataset provider (e.g., BIDS structure, HCP structure, etc.)
Raw paths intentionally preserve the data provider layout. They do not enforce a vocabulary or specific structure.
Omni namespace
Omni namespace is organized by universal subject identifiers and follows the structure:
/omni/{subjects}/:modality/:space/:dtype/{:qualifiers}/@coords
where:
- {subjects}: universal subject identifier (may include dataset identifier). See Subject identifiers. This can be one or more subjects separated by commas (e.g., hcp-100307 or hcp-100307,hcp-100408).
The segments following {subjects} are controlled vocabulary terms that describe the data. They are separated by / and must be prefixed with : to indicate that they come from the canonical vocabulary. The required segments are:
- :modality: modality term, e.g., :fmri, :eeg
- :space: reference/registration space term, e.g., :mni152, :native
- :dtype: data type, e.g., :bold, :voltage
And the optional segments are:
- {:qualifiers}: optional list of qualifiers, e.g., :denoised, :rest, :task
- @coords: coordinate selector (use @* to request all), e.g., @32,45,12/0:1200 for a voxel time series or @Cz for an EEG channel
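The structure above can be illustrated with a small parser. This is a minimal sketch, not a reference implementation: the names `OmniPath` and `parse_omni_path` are hypothetical, and the sketch assumes that the coordinate selector begins at the first `/@` and that a missing selector means `@*`.

```python
from dataclasses import dataclass, field


@dataclass
class OmniPath:
    subjects: list[str]
    modality: str
    space: str
    dtype: str
    qualifiers: list[str] = field(default_factory=list)
    coords: str = "*"  # "@*" is the documented default selector


def parse_omni_path(path: str) -> OmniPath:
    """Split an /omni/ path into its documented parts (hypothetical helper)."""
    prefix = "/omni/"
    if not path.startswith(prefix):
        raise ValueError(f"not an omni path: {path!r}")
    body = path[len(prefix):]
    # The selector may itself contain "/" (e.g. "@32,45,12/0:1200"),
    # so split it off before splitting the remaining segments.
    head, sep, coords = body.partition("/@")
    segments = head.split("/")
    if len(segments) < 4:
        raise ValueError("expected {subjects}/:modality/:space/:dtype")
    subjects, terms = segments[0].split(","), segments[1:]
    for term in terms:
        # ":" marks a canonical term, "?" an unresolved one.
        if not term.startswith((":", "?")):
            raise ValueError(f"segment {term!r} lacks a ':' or '?' prefix")
    modality, space, dtype, *qualifiers = terms
    return OmniPath(subjects, modality, space, dtype, qualifiers,
                    coords if sep else "*")
```

For instance, `parse_omni_path("/omni/hcp-100307/:fmri/:mni152/:bold/:rest/@*")` yields one subject, the three required terms, a single `:rest` qualifier, and the selector `*`.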
As seen in the structure, there are three special symbols used in the omni namespace to indicate specific semantics:
| Symbol | Meaning | Notes |
|---|---|---|
| : | Controlled vocabulary term | Segment must be valid in the canonical vocabulary (or resolvable to it). |
| ? | Unresolved term | Placeholder for unknowns; queryable and later resolvable during ingestion. |
| @ | Indexing | Explicit selector for spatial/time/stream coordinates. See Coordinates. |
Path segments
| Segment | Required | Type | Description |
|---|---|---|---|
| {subject} | ✅ | string | Canonical subject ID (e.g., hcp-100307) |
| :modality | ✅ | vocab | Modality (e.g., :fmri, :t1w, :eeg) |
| :space | ✅ | vocab | Space (e.g., :mni152, :native) |
| :dtype | ✅ | vocab | Representation (e.g., :bold, :intensity, :voltage) |
| {:qualifiers} | optional | vocab list | Additional canonical qualifiers (task, processing, etc.) |
| @coords | optional | selector | Coordinate/stream selector (defaults to @*) |
Subject identifiers
A deterministic canonical subject id is preferred to enable consistent referencing across datasets:
"{dataset_prefix}-{clean_id}"
- {dataset_prefix}: canonical dataset code (e.g., hcp)
- {clean_id}: dataset subject identifier normalized into a stable form
Example: hcp-100307
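One way to derive such an ID is sketched below. The specification does not define the normalization rules, so everything in this example is an assumption: the function name `canonical_subject_id` is hypothetical, and lowercasing, dropping a BIDS-style `sub-` prefix, and removing non-alphanumeric characters are illustrative choices only.

```python
import re


def canonical_subject_id(dataset_prefix: str, raw_id: str) -> str:
    """Build "{dataset_prefix}-{clean_id}" (normalization rules are assumed)."""
    clean = raw_id.strip().lower()
    # Assumption: drop a BIDS-style "sub-" prefix, e.g. sub-102 -> 102.
    if clean.startswith("sub-"):
        clean = clean[len("sub-"):]
    # Assumption: remove anything outside [a-z0-9] so the ID cannot contain
    # "/", ",", or "-" characters that would clash with the path syntax.
    clean = re.sub(r"[^a-z0-9]", "", clean)
    if not clean:
        raise ValueError(f"empty subject id after normalization: {raw_id!r}")
    return f"{dataset_prefix.lower()}-{clean}"
```

Because the function is pure and deterministic, the same dataset code and subject label always map to the same ID, which is the property the format above relies on.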
Qualifiers
Qualifiers are optional additional segments that provide more specific information about the data.
/omni/{subject}/:modality/:space/:dtype/:qual1/:qual2/.../@coords
Typical qualifier families:
- acquisition/condition: :rest, :task, :eyes-open, :eyes-closed
- processing: :denoised, :filtered, :source-localized
- feature forms: :parcellated, :roi-mean, :embedding
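To show how qualifiers slot into a path, here is a minimal sketch of a path builder. The helper `build_omni_path` is hypothetical; it assumes callers pass bare terms (without `:`) and that qualifiers are appended in the order given, directly after the dtype.

```python
def build_omni_path(subject, modality, space, dtype, qualifiers=(), coords="*"):
    """Assemble an omni path from bare terms (hypothetical helper)."""
    terms = [modality, space, dtype, *qualifiers]
    for term in terms:
        # Reject empty terms, embedded separators, and pre-prefixed terms.
        if not term or "/" in term or term[0] in ":?@":
            raise ValueError(f"pass bare terms without prefixes: {term!r}")
    segments = [subject] + [f":{t}" for t in terms] + [f"@{coords}"]
    return "/omni/" + "/".join(segments)
```

Called as `build_omni_path("hcp-100307", "fmri", "MNI152", "bold", ["rest", "denoised"])`, it produces a path with `:rest` and `:denoised` between the dtype and the `@*` selector.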
Coordinates
Coordinates represent spatial, temporal, and stream indexing. They are expressed using @... syntax at the end of the path. The interpretation of the coordinates depends on the modality, space, and dtype.
| Form | Meaning |
|---|---|
| @* | entire data |
| @x,y,z | spatial point. Interpretation of the xyz is defined by :space (e.g., :MNI152 implies standard space coordinates). |
| @x,y,z/t | spatial point + timepoint. t indicates temporal indexing (e.g., fMRI volume index, EEG timepoint). |
| @x,y,z0:z1/t0:t1 | spatial bounding box + time range |
| @Cz | Named stream selector (i.e., channel or variable). They are modality-dependent and should map to named axes (channels, variables, parcels). |
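The selector forms in the table can be told apart with a short parser. This is a sketch under stated assumptions rather than the specified grammar: `parse_coords` and its dictionary layout are hypothetical, any non-numeric spatial part is treated as a named stream, and a time part after a stream name is accepted even though the table does not list that combination.

```python
def parse_coords(selector: str) -> dict:
    """Parse an "@..." selector into a plain dict (illustrative only)."""
    if not selector.startswith("@") or len(selector) == 1:
        raise ValueError(f"selector must look like '@...': {selector!r}")
    body = selector[1:]
    if body == "*":
        return {"kind": "all"}
    spatial, _, temporal = body.partition("/")

    def axis(text):
        # "a:b" is a range; a bare number is a single index or coordinate.
        lo, sep, hi = text.partition(":")
        return (float(lo), float(hi)) if sep else float(lo)

    time = axis(temporal) if temporal else None
    try:
        space = [axis(part) for part in spatial.split(",")]
    except ValueError:
        # Not numeric: assume a named stream such as "Cz".
        return {"kind": "stream", "name": spatial, "time": time}
    if len(space) != 3:
        raise ValueError(f"expected x,y,z but got {spatial!r}")
    return {"kind": "spatial", "xyz": space, "time": time}
```

The sketch only classifies the syntax. Whether a given point or range is meaningful still depends on the modality, space, and dtype of the path it is attached to.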
Raw vs Omni
| Aspect | Raw | Omni (derived) |
|---|---|---|
| Prefix | /raw/ | /omni/ (or alias /derived/) |
| Structure | dataset-defined | fixed schema |
| Symbols | none required | :, ?, @ required |
| Subject ID | dataset convention | deterministic universal ID |
| Coordinates | none (native implied) | explicit @... selector |
| Vocabulary | none | enforced vocabulary |
Examples
Canonical derived paths:
/derived/hcp-100307/:fmri/:MNI152/:bold/:rest/@*
/derived/hcp-100307/:fmri/:MNI152/:bold/:rest/:denoised/@32,45,12/0:1200
/derived/hcp-100307/:t1w/:MNI152/:intensity/@*
/derived/hcp-100307/:eeg/:MNI152/:voltage/:rest/:source-localized/@*
/derived/hcp-100307/:multimodal/:MNI152/:embedding/:rest/@*
Corresponding raw paths:
/raw/hcp/100307/**
Query patterns (API examples)
Assume an API with:
- dataset.query(pattern: str) -> list[path]
- dataset.get(path: str) -> object
- object.raw
Raw: what did the dataset provide for a subject?
dataset.query("/raw/hcp/100307/**")
Omni: all resting fMRI in standard space
dataset.query("/derived/*/:fmri/:MNI152/:bold/:rest/@*")
Omni: specific voxel time series
dataset.get("/omni/hcp-100307/:fmri/:MNI152/:bold/:rest/@-42,38,12/0:1200")
Trace back provenance to raw inputs
dataset.get("/omni/hcp-100307/:fmri/:MNI152/:bold/:rest/@*").raw
Unresolved terms (?)
Unknown or not-yet-mapped terms are prefixed with ?.
Query: all paths containing unknown terms
dataset.query("/*/*/?*")
Query: a specific unknown term across datasets
dataset.query("/*/?weirdmodality")
Query: fully resolved paths only
dataset.query("/omni/*/:*/:*/:*/@*")
Validation
- Canonical vocabulary (:): segments prefixed with : must be members of (or resolvable to) the canonical vocabulary.
- Coordinates (@): coordinate selectors must be valid with respect to:
  - :modality (voxels, surface, streams)
  - :space (units, bounds)
  - :dtype (temporal indexing)
- Use @* when requesting the entire object/stream.
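The vocabulary rule can be sketched as a simple membership check. The `VOCAB` table below is a hypothetical stand-in for the canonical vocabulary, populated only with terms that appear in this document, and `validate_terms` is an illustrative name. Matching is case-insensitive here as an assumption, since both :mni152 and :MNI152 appear in the examples. Coordinate validation is not covered, because it depends on modality-specific bounds that this document does not define.

```python
VOCAB = {  # hypothetical stand-in for the canonical vocabulary
    "modality": {"fmri", "eeg", "t1w"},
    "space": {"mni152", "native"},
    "dtype": {"bold", "voltage", "intensity"},
    "qualifier": {"rest", "task", "denoised", "filtered"},
}


def validate_terms(terms: list[str]) -> list[str]:
    """Return a list of problems for the segments after {subject}."""
    if len(terms) < 3:
        return ["expected at least :modality/:space/:dtype"]
    problems = []
    # The first three positions are fixed; everything after is a qualifier.
    slots = ["modality", "space", "dtype"] + ["qualifier"] * (len(terms) - 3)
    for slot, term in zip(slots, terms):
        if term.startswith("?"):
            continue  # unresolved terms are allowed and resolved later
        if not term.startswith(":"):
            problems.append(f"{slot} {term!r} must start with ':' or '?'")
        elif term[1:].lower() not in VOCAB[slot]:
            problems.append(f"{slot} {term!r} is not in the vocabulary")
    return problems
```

An empty result means every segment is either canonical or explicitly marked as unresolved. Returning a list of messages rather than raising on the first failure lets an ingestion step report all problems for a path at once.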
Summary
- /raw/... preserves dataset-native layout and provenance.
- /omni/... (or /derived/...) provides a fixed, enforced schema for cross-dataset querying.
- : marks canonical terms, ? marks unresolved terms, and @ provides coordinate-aware indexing.