| Internet-Draft | MKA Stem | February 2026 |
| Whited | Expires 27 August 2026 | [Page] |
This document defines a multi-track profile of the Matroska container format for storing stems that is also backwards compatible with existing media players.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 27 August 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Stems are recordings of individual instruments, or groups of instruments, used by DJs and music producers for live mixing of music. Historically, stem files have been stored as individual audio files or in patent-encumbered or vendor-specific proprietary container formats. The Matroska container format, formally specified in [RFC9559], is well suited as a container for stems. This specification documents a profile of the Matroska container format that allows it to store lossless or lossy stems, as well as metadata about the stems, for use in DJ applications or Digital Audio Workstations.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Stem files have a few basic requirements:¶
Each stem file may contain an arbitrary number of audio tracks but MUST include at least three (the mixed audio and at least two stems). Every track MUST be encoded using the same codec with the same parameters, including bitrate, channel count, channel layout, and sample rate.¶
The first track containing audio data MUST be the final post-mix audio and MUST have the Matroska default flag set. This preserves backwards compatibility with media players that do not support this format, since such players typically play the first audio stream found.¶
The remaining tracks are individual stems and MUST have the same audio length as the first track, such that playing all stem tracks together from the beginning results in the same audio (excluding mastering) as the final mix present in the first track. For example, if the original track is 3 minutes long and the stem file includes a percussion track, but the percussion does not start until one minute into the track, the percussion stem is still 3 minutes long and begins with one minute of silence.¶
Each stem track MUST NOT have the Matroska default flag set.¶
The stem tracks SHOULD NOT have any gain normalization applied. Instead, they should retain the same levels as they have in the final mix present in the first track so that, if all stems are played at unity gain, the levels are equivalent to the final mix.¶
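The track requirements above can be checked mechanically once a file has been parsed. The following is a minimal sketch, not a reference implementation: `AudioTrack` and `check_stem_layout` are hypothetical names, the tracks are assumed to have already been extracted from the Matroska file by some other means, and bitrate and channel layout are left out of the comparison for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioTrack:
    """Hypothetical, already-parsed view of one Matroska audio TrackEntry."""
    codec_id: str        # for example "A_FLAC"
    sample_rate: float   # in Hz
    channels: int
    duration_ns: int     # audio length in nanoseconds
    default: bool        # value of the default flag


def check_stem_layout(tracks: list[AudioTrack]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if len(tracks) < 3:
        problems.append("need at least 3 audio tracks (mix plus two stems)")
    if not tracks:
        return problems
    mix, stems = tracks[0], tracks[1:]
    if not mix.default:
        problems.append("first track (the mix) must have the default flag set")
    for number, stem in enumerate(stems, start=2):
        if stem.default:
            problems.append(f"track {number}: stem must not have the default flag set")
        # Assumption: only codec, sample rate, and channel count are compared here.
        if (stem.codec_id, stem.sample_rate, stem.channels) != (
            mix.codec_id, mix.sample_rate, mix.channels
        ):
            problems.append(f"track {number}: codec parameters differ from the mix")
        if stem.duration_ns != mix.duration_ns:
            problems.append(f"track {number}: length differs from the mix")
    return problems
```

A caller would pass the audio tracks in file order, so that the first element is the post-mix track, and treat a non-empty result as a non-conforming file.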
Each stem track (i.e., every track other than the first) MUST set the \Segment\Tracks\TrackEntry\Name field to a human-readable name for the stem, for example "Percussion" or "Vocals".¶
For each stem track, a \Segment\Tags\Tag element MUST also be present with its target set to the stem track. The tag MUST contain a SimpleTag element with the TagName field set to "StemColor" and the TagString field set to a color representing the track in RGB hex format (e.g., "#145374").¶
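As an illustration of consuming the "StemColor" tag, the sketch below parses a TagString into red, green, and blue components. The function name `parse_stem_color` is hypothetical. The strict "#RRGGBB" form with six hexadecimal digits in either case is an assumption drawn from the example value, since this document does not say whether other forms are permitted.

```python
import re

# Assumption: exactly "#" followed by six hex digits, upper or lower case.
_STEM_COLOR = re.compile(r"#([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})([0-9A-Fa-f]{2})")


def parse_stem_color(tag_string: str) -> tuple[int, int, int]:
    """Parse a StemColor TagString such as "#145374" into (red, green, blue)."""
    match = _STEM_COLOR.fullmatch(tag_string)
    if match is None:
        raise ValueError(f"not an RGB hex color: {tag_string!r}")
    red, green, blue = (int(part, 16) for part in match.groups())
    return red, green, blue
```

With the example value from the text, `parse_stem_color("#145374")` returns `(20, 83, 116)`.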
Because mastering happens post-mix and the stems are pre-mix audio, the stem tracks SHOULD NOT have any mastering steps applied. Instead, metadata for configuring a compressor and limiter SHOULD be included in the file's global metadata as simple tags. After mixing, playback applications MAY choose to feed the mix through a Digital Signal Processor configured with the limiter and compressor settings read from the metadata.¶
| Tag | Requirement Level | Values |
|---|---|---|
| STEM:COMPRESSOR:ENABLED | REQUIRED | "TRUE" or "FALSE" |
| STEM:COMPRESSOR:RATIO | OPTIONAL | TODO |
| STEM:COMPRESSOR:OUTPUT_GAIN | OPTIONAL | TODO |
| STEM:COMPRESSOR:THRESHOLD | OPTIONAL | TODO |
| STEM:COMPRESSOR:ATTACK | OPTIONAL | TODO |
| STEM:COMPRESSOR:INPUT_GAIN | OPTIONAL | TODO |
| STEM:COMPRESSOR:RELEASE | OPTIONAL | TODO |
| STEM:COMPRESSOR:HP_CUTOFF | OPTIONAL | TODO |
| STEM:COMPRESSOR:HP_DRY_WET | OPTIONAL | TODO |
| Tag | Requirement Level | Values |
|---|---|---|
| STEM:LIMITER:ENABLED | REQUIRED | "TRUE" or "FALSE" |
| STEM:LIMITER:RELEASE | OPTIONAL | TODO |
| STEM:LIMITER:THRESHOLD | OPTIONAL | TODO |
| STEM:LIMITER:CEILING | OPTIONAL | TODO |
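The playback flow described above (mix the stems at unity gain, then optionally apply mastering) might be sketched as follows. The names `mix_at_unity_gain` and `dsp_enabled` are hypothetical. The global tags are assumed to have already been read into a plain dictionary of tag name to string value, and a missing ENABLED tag is treated as disabled, which is an assumption and not something this document specifies. No compressor or limiter is implemented, because the value formats of the remaining tags are not yet defined.

```python
def mix_at_unity_gain(stems: list[list[float]]) -> list[float]:
    """Sum equal-length stems sample by sample, with no gain applied."""
    if not stems:
        return []
    length = len(stems[0])
    if any(len(stem) != length for stem in stems):
        raise ValueError("all stems must have the same length")
    return [sum(samples) for samples in zip(*stems)]


def dsp_enabled(global_tags: dict[str, str], unit: str) -> bool:
    """Report whether STEM:<unit>:ENABLED is "TRUE" in the global tags."""
    # Assumption: an absent tag is treated the same as "FALSE".
    return global_tags.get(f"STEM:{unit}:ENABLED") == "TRUE"
```

A player could call `dsp_enabled(tags, "COMPRESSOR")` and `dsp_enabled(tags, "LIMITER")` after mixing to decide whether to route the mixed samples through its own processing chain.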
This memo includes no request to IANA.¶
This document should not affect the security of the Internet.¶