Using Metadata¶
Metadata allows users to annotate a QIIME 2 Result with study-specific values: age, elevation, body site, pH, etc. QIIME 2 offers a consistent API for developers to expose their Methods and Visualizers to user-defined metadata. For more details about how users might create and utilize metadata in their studies, check out the Metadata In QIIME 2 tutorial.
Metadata¶
Actions may request an entire Metadata object to work on. At its core, Metadata is just a pandas pd.DataFrame, but the Metadata object provides many convenience methods and properties, and unifies the code necessary for handling these data (or metadata). Examples of Actions that consume and operate on Metadata include:
Plugins may work with metadata directly, or they may choose to filter, regroup, partition, or pivot it; it all depends on the intended outcome of the method or visualizer in question.
Metadata is subject to framework-level validations, normalization, and verification. We recommend familiarizing yourself with this behavior before utilizing Metadata in your Action. We think having this kind of behavior available via a centralized API helps ensure consistency for all users of Metadata.
import qiime2

def my_viz(output_dir: str, md: qiime2.Metadata) -> None:
    # View the metadata as a pandas DataFrame for downstream work.
    df = md.to_dataframe()
    ...
Metadata Columns¶
Plugin Actions may also request one or more MetadataColumn to operate on. A good example of this is identifying which column of metadata contains barcodes when using demux emp-single or cutadapt demux-paired. The exciting aspect of this is that there are no longer hard-coded column-naming requirements, allowing the user to select a naming convention appropriate to their study.
Instances of MetadataColumn exist as one of two concrete classes: NumericMetadataColumn and CategoricalMetadataColumn.
By default, QIIME 2 will attempt to infer the type of each metadata column: if the column consists only of numbers or missing data, the column is inferred to be numeric. Otherwise, if the column contains any non-numeric values, the column is inferred to be categorical. Missing data (i.e. empty cells) are supported in categorical columns as well as numeric columns.
...
numeric_md_cols = metadata.filter(column_type='numeric')
categorical_md_cols = metadata.filter(column_type='categorical')
...
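The inference rule described above can be sketched in plain Python. This is only an illustration of the stated rule, not QIIME 2's implementation: the helper names (is_missing, is_number, infer_column_type) are hypothetical, and treating whitespace-only cells as missing and rejecting non-finite values such as "nan" are assumptions of this sketch.

```python
import math
from typing import Iterable, Optional


def is_missing(cell: Optional[str]) -> bool:
    """Assumption: None and empty or whitespace-only cells count as missing."""
    return cell is None or cell.strip() == ""


def is_number(cell: str) -> bool:
    """Return True if the cell parses as a finite floating-point number."""
    try:
        value = float(cell)
    except ValueError:
        return False
    # Assumption: "nan" and "inf" are not treated as numeric values.
    return math.isfinite(value)


def infer_column_type(cells: Iterable[Optional[str]]) -> str:
    """Numeric if every cell is a number or missing, otherwise categorical."""
    for cell in cells:
        if is_missing(cell):
            continue  # missing data is allowed in both column types
        if not is_number(cell):
            return "categorical"  # a single non-numeric value decides it
    return "numeric"
```

For example, a pH column with one empty cell would still be inferred as numeric under this rule, while a column containing a value such as "n/a" would be inferred as categorical.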
If your Action always needs one type of column or another, you can simply register that type in your plugin registration:
from qiime2.plugin import MetadataColumn, Numeric

plugin.methods.register_function(
    ...
    parameters={'metadata': MetadataColumn[Numeric]},
    parameter_descriptions={
        'metadata': 'Numeric metadata column to compute pairwise '
                    'Euclidean distances from'},
    ...
)
This will ensure that all the necessary type-checking is performed by the framework before these data are passed into the Action utilizing it.
Numeric Metadata Columns¶
Columns that consist only of numeric (or missing) values are eligible for being instantiated as NumericMetadataColumn (although these values can be loaded as CategoricalMetadataColumn, too).
Categorical Metadata Columns¶
All types of data columns can be instantiated as CategoricalMetadataColumn; values will be cast to strings.
How can the Metadata API Help Me?¶
The Metadata API has many interesting features. Here are some of the elements most commonly utilized among the core plugins.
Merging Metadata¶
Interfaces can allow users to specify more than one metadata file at a time; the framework will handle merging the files or objects prior to handing the final merged set to your Action.
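To give a feel for what a merge involves, here is a minimal pure-Python sketch. It assumes inner-join semantics (only IDs present in every input are kept) and that overlapping column names are treated as an error; both are assumptions of this illustration, so check the framework's merge documentation for the exact behavior. The function name merge_metadata and the dict-based table layout are hypothetical.

```python
from typing import Dict

# Hypothetical layout: sample ID -> {column name -> value}
Table = Dict[str, Dict[str, str]]


def merge_metadata(*tables: Table) -> Table:
    """Merge tables on their IDs, keeping only IDs shared by all of them."""
    if not tables:
        raise ValueError("At least one table is required.")

    # Assumption: every column name must be unique across the inputs.
    seen_columns = set()
    for table in tables:
        columns = {name for row in table.values() for name in row}
        overlap = seen_columns & columns
        if overlap:
            raise ValueError(f"Overlapping column names: {sorted(overlap)}")
        seen_columns |= columns

    # Assumption: inner join, so only IDs found in every table survive.
    shared_ids = set(tables[0])
    for table in tables[1:]:
        shared_ids &= set(table)

    merged: Table = {}
    for sample_id in tables[0]:  # preserve the ID order of the first table
        if sample_id in shared_ids:
            row: Dict[str, str] = {}
            for table in tables:
                row.update(table[sample_id])
            merged[sample_id] = row
    return merged
```

Because the framework performs the merge before your Action runs, plugin code normally never needs anything like this; the sketch is only meant to show why IDs or columns may appear to go missing, or why an error may be raised, when several files are supplied.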
Dropping Empty Columns¶
When working with a single metadata column, plugin code can determine if there are missing values, and then subsequently drop those IDs from the column.
Normalizing TSV Files¶
By saving a materialized Metadata instance, visualizations that want to provide data exports can do so in a consistent manner (e.g. longitudinal volatility, and the relevant code).
Advanced Filtering¶
The filter method can be used to restrict column types, drop empty columns, or remove columns made entirely of unique values.
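The two column-dropping rules mentioned above can be sketched in plain Python. This is an illustration of the idea rather than the framework's code: the function name drop_uninformative_columns, its keyword arguments, and the dict-of-lists layout are hypothetical, and ignoring missing values when checking for uniqueness is an assumption.

```python
from typing import Dict, List, Optional

# Hypothetical layout: column name -> list of values (None marks a missing cell)
Columns = Dict[str, List[Optional[str]]]


def drop_uninformative_columns(columns: Columns,
                               drop_all_missing: bool = True,
                               drop_all_unique: bool = True) -> Columns:
    """Return only the columns that pass the requested filters."""
    kept: Columns = {}
    for name, values in columns.items():
        present = [value for value in values if value is not None]
        if drop_all_missing and not present:
            continue  # the column holds no data at all
        if drop_all_unique and present and len(set(present)) == len(present):
            continue  # every value differs, e.g. a barcode column
        kept[name] = values
    return kept
```

A column of barcodes is a typical all-unique column: it identifies samples but cannot group them, so a visualizer that colors or groups by metadata may want to exclude it.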
SQL Filtering¶
Advanced metadata querying is enabled by SQL-based filtering.
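The general idea behind SQL-based filtering can be sketched with Python's built-in sqlite3 module: load the metadata into an in-memory table and let a WHERE clause select the matching IDs. This is an illustrative sketch, not the framework's implementation; the function name ids_matching, the table name metadata, and the dict-based input layout are hypothetical, and the use of SQLite syntax for the clause is an assumption here.

```python
import sqlite3
from typing import Dict, List


def ids_matching(rows: Dict[str, Dict[str, object]], where: str) -> List[str]:
    """Return the sorted IDs of rows satisfying a SQLite WHERE clause.

    The clause is interpolated into the query, so this sketch should only
    be used with trusted input. Column names are assumed not to be "id"
    and not to contain double quotes.
    """
    column_names = sorted({name for row in rows.values() for name in row})
    definitions = ['"id" TEXT PRIMARY KEY'] + [f'"{name}"' for name in column_names]
    placeholders = ", ".join("?" for _ in range(len(column_names) + 1))

    connection = sqlite3.connect(":memory:")
    try:
        connection.execute(f"CREATE TABLE metadata ({', '.join(definitions)})")
        for sample_id, row in rows.items():
            # Cells absent from a row are stored as NULL (missing data).
            values = [sample_id] + [row.get(name) for name in column_names]
            connection.execute(
                f"INSERT INTO metadata VALUES ({placeholders})", values)
        cursor = connection.execute(
            f'SELECT "id" FROM metadata WHERE {where} ORDER BY "id"')
        return [record[0] for record in cursor]
    finally:
        connection.close()
```

With rows keyed by sample ID, a clause such as "body_site" = 'gut' AND "ph" < 7 would select only the gut samples with an acidic pH.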
Making Artifacts Viewable as Metadata¶
Once a transformer from a particular format to qiime2.Metadata is registered, the framework will allow the type represented by that format to be viewed as Metadata. This can open up all kinds of exciting opportunities for plugins!
import pandas as pd
import qiime2

@plugin.register_transformer
def _1(data: cool_project.InterestingDataFormat) -> qiime2.Metadata:
    # Build a DataFrame from the format, then wrap it as Metadata.
    df = pd.DataFrame(data)
    return qiime2.Metadata(df)
A visualizer for free!¶
If your type is viewable as Metadata (as in, the necessary transformers are registered), there is a general-purpose metadata visualization called metadata tabulate, which renders an interactive table of the metadata in question. Cool!