How to merge metadata

Metadata can come from many different sources, and some QIIME 2 artifacts also look and behave a lot like metadata. QIIME 2 therefore has a few different ways to handle metadata merging.

Implicit merging

Implicit merging supports metadata with overlapping ids, but not overlapping column names. Passing a metadata parameter (here, --m-input-file) multiple times combines the metadata columns from the specified files:

qiime metadata tabulate \
    --m-input-file sample-metadata-1.tsv \
    --m-input-file sample-metadata-2.tsv \
    --o-visualization tabulated-combined-metadata.qzv

The resulting metadata after the merge will contain the intersection of the identifiers across all of the specified files (i.e., an inner join). In other words, the merged metadata will only contain identifiers that are shared across all provided metadata files.
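The inner-join behavior described above can be sketched in plain Python. This is only an illustration of the semantics, not QIIME 2's implementation: the function name `implicit_merge` and the dict-of-dicts table layout are assumptions made for the example.

```python
def implicit_merge(*tables):
    """Inner-join metadata tables keyed by sample id (illustrative only).

    Each table is a dict mapping sample id -> {column name: value}.
    """
    # Implicit merging does not allow a column name to appear in two tables.
    seen = set()
    for table in tables:
        cols = {col for row in table.values() for col in row}
        clash = seen & cols
        if clash:
            raise ValueError(f"overlapping column names: {sorted(clash)}")
        seen |= cols

    # Keep only the ids that are present in every table.
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)

    # Combine the columns of each table for every retained id.
    merged = {}
    for sample_id in sorted(shared):
        row = {}
        for table in tables:
            row.update(table[sample_id])
        merged[sample_id] = row
    return merged


# Hypothetical contents of two metadata files.
meta1 = {
    "s1": {"body-site": "gut"},
    "s2": {"body-site": "tongue"},
    "s3": {"body-site": "palm"},
}
meta2 = {
    "s2": {"age": 31},
    "s3": {"age": 45},
    "s4": {"age": 27},
}

combined = implicit_merge(meta1, meta2)
print(sorted(combined))  # only s2 and s3 appear in both tables
```

Samples s1 and s4 are dropped because each appears in only one of the two tables, which is why it is worth checking the id overlap before relying on an implicit merge.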

Implicit metadata merging is supported anywhere that metadata is accepted in QIIME 2.

Explicit merging

Explicit merging supports metadata that contains overlapping ids or overlapping column names, but not both at once. This can be achieved with the merge action provided by the q2-metadata plugin. The result will be the union (i.e., outer join) of the ids and columns from the two metadata inputs. Merging metadata with neither overlapping ids nor overlapping column names is also possible with this action.

Call qiime metadata merge --help for detailed information on how to use this command.

Attempting to merge metadata with both overlapping ids and overlapping columns will currently fail because conflicting column values for a sample are not resolved. See Merging metadata with potentially conflicting values for more discussion of this topic.

To explicitly merge more than two metadata objects, run this command multiple times, iteratively, using the output of the previous run as one of the metadata inputs.
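The explicit-merge rules and the iterative pattern for more than two inputs can be sketched in plain Python. This models only the behavior described in this section; it is not the q2-metadata implementation. The function name `explicit_merge`, the dict-of-dicts layout, and the use of `None` for missing cells are assumptions made for the example.

```python
from functools import reduce


def explicit_merge(md1, md2):
    """Outer-join two metadata tables (illustrative only).

    Each table is a dict mapping sample id -> {column name: value}.
    Fails when both ids and column names overlap, as described above.
    """
    ids1, ids2 = set(md1), set(md2)
    cols1 = {col for row in md1.values() for col in row}
    cols2 = {col for row in md2.values() for col in row}

    if ids1 & ids2 and cols1 & cols2:
        raise ValueError("both ids and column names overlap; cannot merge")

    columns = sorted(cols1 | cols2)
    merged = {}
    for sample_id in sorted(ids1 | ids2):
        row = {}
        row.update(md1.get(sample_id, {}))
        row.update(md2.get(sample_id, {}))
        # Cells with no value in either input are filled with None.
        merged[sample_id] = {col: row.get(col) for col in columns}
    return merged


# Hypothetical inputs: two batches share a column but not ids,
# and a third table shares an id but not a column.
batch1 = {"s1": {"age": 30}, "s2": {"age": 41}}
batch2 = {"s3": {"age": 25}}
extra = {"s2": {"ph": 6.5}, "s4": {"ph": 7.1}}

# Merge more than two inputs by feeding each result into the next merge.
merged_all = reduce(explicit_merge, [batch1, batch2, extra])
for sample_id, row in merged_all.items():
    print(sample_id, row)
```

The order of the inputs can matter when chaining: an intermediate result accumulates ids and columns, so a later input may overlap with it on both even if it did not overlap with any single original input.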

The output of qiime metadata merge is an ImmutableMetadata artifact (because QIIME 2 methods only ever produce artifacts). This artifact can be used anywhere that a metadata file can be used, or it can be exported to a metadata .tsv file in the typical format.

Merging Artifacts with Metadata

Both implicit and explicit merging of metadata also work with artifacts that can be viewed as metadata. (See How to use Artifacts as Metadata for details on this concept.) For example, it might be interesting to have the option to color points in an Emperor plot based on the sample alpha diversity, in addition to the typical sample metadata. This can be accomplished by providing both the sample metadata file and the SampleData[AlphaDiversity] artifact as metadata files in an implicit merge:

curl -sL \
  "https://data.qiime2.org/2021.4/tutorials/metadata/unweighted_unifrac_pcoa_results.qza" > \
  "unweighted_unifrac_pcoa_results.qza"

qiime emperor plot \
    --i-pcoa unweighted_unifrac_pcoa_results.qza \
    --m-metadata-file sample-metadata.tsv \
    --m-metadata-file faith_pd_vector.qza \
    --o-visualization unweighted-unifrac-emperor-with-alpha.qzv

Merging metadata with potentially conflicting values

QIIME 2 does not support merging metadata with potentially conflicting values. Conflicts can arise when the metadata that you want to merge have both overlapping identifiers and overlapping column names. For example, if both metadata files being merged have an age column, each could provide a different age value for the same sample. QIIME 2 doesn’t attempt to resolve that; it’s up to you to do so.
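Before resolving conflicts by hand, it can help to list exactly which cells disagree. The following is a minimal sketch in plain Python, assuming each metadata table has been loaded into a dict mapping sample id to a dict of column values; the function name `find_conflicts` and the example data are hypothetical.

```python
def find_conflicts(md1, md2):
    """List cells where two metadata tables disagree (illustrative only).

    Returns (sample id, column name, value in md1, value in md2) tuples
    for every shared id and shared column whose values differ.
    """
    conflicts = []
    for sample_id in sorted(set(md1) & set(md2)):
        row1, row2 = md1[sample_id], md2[sample_id]
        for col in sorted(set(row1) & set(row2)):
            if row1[col] != row2[col]:
                conflicts.append((sample_id, col, row1[col], row2[col]))
    return conflicts


# Hypothetical tables that both provide an age column for s1 and s2.
site_a = {"s1": {"age": 30, "sex": "F"}, "s2": {"age": 41, "sex": "M"}}
site_b = {"s1": {"age": 30}, "s2": {"age": 44}}

conflicts = find_conflicts(site_a, site_b)
for sample_id, col, value_a, value_b in conflicts:
    print(f"{sample_id}: {col} is {value_a!r} in one file and {value_b!r} in the other")
```

Here only s2 is reported, because s1 has the same age in both tables. An empty list means the shared cells agree, although the shared column would still need to be dropped from one input before an explicit merge.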

Our current recommendations for how to handle a case like this are: