THE CASE FOR MXF-EMBEDDED EBUCORE METADATA IN ARCHIVING APPLICATIONS | Dieter VAN RIJSSELBERGEN, Jean-Pierre EVAIN, Marco DOS SANTOS OLIVEIRA and Maarten VERWAEST, Limecraft; European Broadcasting Union
A solution to current descriptive metadata delivery problems is the use of metadata embedded in the audio-visual essence containers themselves. This way, metadata can no longer get lost and needs no separate out-of-band delivery mechanism. Embedding EBUCore metadata in essence files with a freely available reference SDK can ease the adoption of embedded metadata significantly and can help archive systems support such standards-compliant embedded descriptive metadata. In this paper we describe the proceedings and lessons learnt from a development project of EBU and Limecraft, in which we investigated the use of MXF-embedded EBUCore metadata as a way to support feeding metadata-enriched MXF files to a variety of media production and archiving systems.
THE CASE FOR EMBEDDED METADATA
A strong case can be made for the use of essence-embedded descriptive metadata for
archiving purposes because it can simplify the delivery and archiving processes in a
number of ways.
The delivery process of essence and metadata is made much simpler as the need for an
out-of-band delivery of the metadata is averted; when the essence arrives, so does the
metadata. There’s no need for separate files to be delivered along with the essence, no
hassle with file naming conventions, and no need to deal with missing metadata in cases where transfers
were aborted or never performed. This makes the delivery considerably simpler for
both archives and producers, as illustrated in Figure 1. Out-of-band delivery of metadata
(Figure 1a) requires three layers of processing components. One layer is for the
processing of essence and its associated metadata, e.g., MXF parsers and metadata-specific XML processors that can handle specific instances of metadata documents for a
given metadata specification. On the other end, a delivery technology is required for
actually transferring files or bit streams to an archive. Examples of the latter include FTP
and HTTP (or more secure variants that employ encryption for all bytes transferred). In
between these two layers, a (preferably automated) delivery interpretation layer must be
set up that will relate the individual files delivered via the delivery technology to one
another to form the input for the top layer which interprets the actual metadata and
essence. This interpretation layer can become quite complex as exotic use cases and
error handling must be supported. E.g., atomic delivery must be supported such that no
processing is done on essence for which the metadata has not arrived yet. Also, in cases
that multiple metadata files must be delivered, the correct file must be identified for each
required set of metadata elements.
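The atomic-delivery requirement of such an interpretation layer can be illustrated with a minimal sketch. It assumes the simple naming convention shown in Figure 1a (file01.mxf plus file01.xml); the function name and the pairing rule are illustrative assumptions, not part of any delivery standard.

```python
from pathlib import PurePosixPath


def pair_deliveries(delivered_names):
    """Group delivered files by base name; return (ready, waiting).

    ready   -> (mxf, xml) pairs for which both halves have arrived
    waiting -> files whose counterpart is still missing
    """
    by_stem = {}
    for name in delivered_names:
        path = PurePosixPath(name)
        by_stem.setdefault(path.stem, {})[path.suffix.lower()] = name

    ready, waiting = [], []
    for stem in sorted(by_stem):
        files = by_stem[stem]
        if ".mxf" in files and ".xml" in files:
            # Both essence and metadata are present: safe to hand over.
            ready.append((files[".mxf"], files[".xml"]))
        else:
            # Atomic delivery: hold back anything that is incomplete.
            waiting.extend(sorted(files.values()))
    return ready, waiting


ready, waiting = pair_deliveries(["file01.mxf", "file01.xml", "file02.mxf"])
print(ready)
print(waiting)
```

Even this toy version has to encode a naming convention and a hold-back policy, which is the kind of ad hoc logic that grows quickly once multiple metadata files per essence file or partial transfers must be handled.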
Figure 1
The required processing layers in (a) delivery with out-of-band metadata and (b) delivery with
embedded metadata. Layers shown in (a): Essence + Metadata (MXF + Metadata XML), Delivery
Interpretation (file01.mxf + file01.xml), Delivery Technology (FTP). Layers shown in (b):
Essence + Metadata (MXF ← Metadata), Delivery Technology (FTP).
When considering the delivery of essence with embedded metadata, there is no longer
the need for a delivery interpretation layer. When transfers are completed and validated
at the transport level, control can be handed directly to the top essence and metadata
parsing layer as all information required to relate essence and metadata are contained in
that one single file (cf. Figure 1b). Such a setup can be beneficial for archiving
environments. The delivery and ingesting process is significantly simplified and no
complex delivery chains based on various – possibly ad-hoc – conventions need to be
set up at either side of the content exchange. Additionally, when the essence container
format in question is chosen well, many instances of various metadata kinds can be
embedded simultaneously (or appended one after the other in the same container),
resolving the issues at hand for many ingesting organisations and archives in a single
effort.
Naturally, if the use of embedded metadata is meant to replace the arbitrary parts of
existing side-channel metadata delivery, it is crucial that the mechanism for embedding
such metadata be properly defined, and based on best practices and standards used in
the video production and archiving ecosystem. This is where the combination of MXF and
EBUCore comes into the picture, as described in the next section.
FIAT/IFTA World Conference 2013 in Dubai
THE CASE FOR MXF-EMBEDDED EBUCORE METADATA
Over the past decade, the Material Exchange Format (MXF) (SMPTE-377M, 2004)
(SMPTE, 2009) has become the dominant container format for file-based audio-visual
essence storage for professional video production. Its design incorporates many essence
encoding standards and extensive support for various types of metadata embedded into
an MXF container. The MXF format and its use have been defined extensively in various
SMPTE and EBU standards and recommendations and form a solid foundation for
proposing a mechanism for embedded metadata exchanges.
Figure 2 shows the overall structure of an MXF file container. The file consists of a
header, body and footer, spread over a number of partitions (one header partition, one
footer partition and any number of body partitions). Metadata can be stored in various
places in the MXF container, first and foremost in the file header, but it can also be
repeated and updated in subsequent partitions, for reasons of redundancy or to support
growing files in which finalized metadata (e.g., duration) can only be appended at the end
of the file. In Figure 2, this metadata is contained within the blocks labelled “Header
Metadata”.
Figure 2
Structure of an MXF file container. The file contains a number of partitions, of which the header
and footer partition can contain header metadata sets.
Two kinds of metadata can be stored in an MXF file: structural and descriptive metadata.
The structural metadata defines the structure of the essence and its timeline in the MXF
file. It defines the various tracks of essence contained in the file, declares the encoding
parameters defined for each such track and specifies how the tracks relate to one
another in the overall timeline represented by the MXF file. As such, the structural
metadata is required for the correct interpretation of the MXF file. Descriptive metadata
can be hooked onto the structural metadata’s elements (e.g., it can be used to describe a
track) to describe the semantics of the essence involved, or to provide references to the
specifics of the production process that produced the essence in question.
Embedded MXF metadata is fully supported by the MXF specifications. In fact, a number
of targeted specifications have been ratified that define instances of descriptive metadata
for MXF, of which the Descriptive Metadata Scheme-1 (DMS-1) (SMPTE-380M, 2004) is
the most prominent. DMS-1 defines three frameworks in which production metadata can
be associated with the contents of the MXF file. These frameworks can describe titles,
publication events, rights information, contacts associated with the creation of the
essence, etc. However, the use of DMS-1 has remained very limited, not due to the
qualities of the standard, but rather because the standard has been defined only within
the isolation of the MXF ecosystem. No official outside-of-MXF serialization of DMS-1
exists, which makes its adoption harder, because each inclusion or extraction of DMS-1
metadata always requires a conversion between DMS-1 and another metadata format.
Such conversions are typically performed by software that is complex, as it deals with the
equally complex MXF standards, and possibly hard to integrate, as it must process MXF
files efficiently and is hence often written in low-level languages such as C/C++.
The use of MXF-embedded metadata becomes much more likely if we can incorporate
existing metadata standards used outside of MXF containers and embed those into MXF
containers in a standards-conforming fashion. Because the adoption threshold of such a
standard is much lower, it is likely to have been used more extensively, and will have
been revised and optimized more often because of actual use. This way, the existing
user base and deployments and all tools already available for creating, updating and
processing the metadata can be reused as-is. Embedding logic only has to deal with
translating external metadata to a lossless MXF representation, while any conversion to
and from other metadata standards is performed with other MXF-agnostic tools.
One such metadata standard is employed in the AS-11 application specification of MXF
files for program contribution (AMWA, 2012). AS-11 defines a limited set of flat custom
metadata fields for embedding into MXF files. At the same time, these same metadata
fields can easily be written in simple text files (e.g., as is the case in the reference code
implementation of the AS-11 specification), which eases its adoption significantly, e.g., by
members of the UK’s Digital Production Partnership (DPP).
Even though the AS-11 case shows an attractive approach to embedded metadata, a
more interesting metadata standard to consider is EBU’s EBUCore (EBU, 2013).
EBUCore extends greatly upon the Dublin Core standard (DCMI, 2004) and provides the
framework for descriptive and technical metadata for use in archiving, service-oriented
architectures and also in ontologies for semantic web and linked data developments
concerning audio-visual media. EBUCore has seen a number of revisions since
its inception and is currently adopted by various broadcasters worldwide, supported by a
toolset that has been increasing and maturing along with the specification.
In the remainder of this paper, we discuss the efforts and results of a development track
executed by Limecraft and EBU to investigate how EBUCore metadata can be embedded
into MXF file containers in a compliant and optimized fashion, as a means of facilitating
easier metadata exchanges to and from production facilities and archives.
EMBEDDING EBUCORE IN MXF CONTAINERS
The regular representation of EBUCore is in the form of XML documents, for which the
structure is defined by an XML Schema document (W3C, 2004). EBUCore documents
are typically edited using tools and frameworks optimized for dealing with XML
documents, e.g., the MINT tool for mapping between EBUCore and other metadata
standards (cf. the related publication also in this session). The challenge in our case
has been to define a translation and serialization method for placing this XML metadata
in MXF files in such a way that the philosophy and best practices in MXF container
processing were adhered to. In particular, the requirements were as follows:
1. EBUCore metadata should be embedded into MXF files using best practices and
compliant with the MXF specification;
2. EBUCore metadata should be embedded into MXF files without loss of
information;
3. MXF files with EBUCore embedded should remain compatible with EBUCore-agnostic MXF processors.
Referring back to the previous section, metadata to be included in MXF files should be
contained in the MXF header metadata. The MXF file format specifies a flexible
mechanism for serializing metadata elements into MXF file bytes, which we will briefly
summarize in order to provide the necessary context for this paper. An MXF file is
actually a sequence of many so-called Key-Length-Value (KLV) packets. Each such KLV
packet can contain many types of data, incl. metadata, essence, and index table entries.
The key of the packet identifies the packet’s intention and the length field enables MXF
parsers to tell when one packet ends and the next one begins, allowing traversal of the
entire file from front to back. A subset of all KLV packets in a file, typically at the
beginning of the header partition, forms the header metadata. These packets are
formatted in such a way that their value part contains the information for a single
metadata object, the format of which can be determined by the unique key assigned to
the packet. E.g., Figure 3 shows a part of the structure of an ‘Identification’ metadata
object class identified by the key 06.0e.2b.34.02.53.01.01.0d.01.01.01.01.01.30.00.
Figure 3 illustrates that the Identification class contains a variety of fields, each of which
is strongly typed (incl., numeric types, character string types, dates, and arrays of each of
these types). Additionally, thanks to one specific metadata field type, the reference,
relations between KLV metadata packets can be established, which allows for the
construction of complex metadata structures. As far as standard MXF metadata is
concerned, these structures are normatively defined by SMPTE specifications and are
typically conveyed in a structured fashion in software using dictionaries, of which Figure 3
depicts a small excerpt.
<Identification base="InterchangeObject" detail="Identification set" type="localSet" baseline="yes"
key="06 0e 2b 34 02 53 01 01 0d 01 01 01 01 01 30 00">
<ThisGenerationUID use="required" type="UUID" key="3c 09" globalKey="06 0e 2b 34 01 01 01 02 05 20 07 01 01 00 00 00"/>
<CompanyName use="required" type="UTF16String" key="3c 01" globalKey="06 0e 2b 34 01 01 01 02 05 20 07 01 02 01 00 00"/>
<ProductName use="required" type="UTF16String" key="3c 02" globalKey="06 0e 2b 34 01 01 01 02 05 20 07 01 03 01 00 00"/>
<ProductVersion use="optional" type="ProductVersionType" key="3c 03" globalKey="06 0e 2b 34 01 01 01 02 05 20 07 01 04 00 00 00"/>
<VersionString use="required" type="UTF16String" key="3c 04" globalKey="06 0e 2b 34 01 01 01 02 05 20 07 01 05 01 00 00"/>
<ProductUID use="required" type="AUID" key="3c 05" globalKey="06 0e 2b 34 01 01 01 02 05 20 07 01 07 00 00 00"/>
<ModificationDate use="required" type="Timestamp" key="3c 06" globalKey="06 0e 2b 34 01 01 01 02 07 02 01 10 02 03 00 00"/>
<ToolkitVersion use="optional" type="ProductVersionType" key="3c 07" globalKey="06 0e 2b 34 01 01 01 02 05 20 07 01 0a 00 00 00"/>
<Platform use="optional" type="UTF16String" key="3c 08" globalKey="06 0e 2b 34 01 01 01 02 05 20 07 01 06 01 00 00"/>
</Identification>
Figure 3
Excerpt of the MXF metadata dictionary. Displayed is the definition of the Identification metadata
set, with its member fields, data types and representative Universal Labels.
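The KLV traversal and the dictionary-driven interpretation of a metadata set can be sketched as follows. This is a simplified illustration and not the SDK's parser: it assumes a 16-byte key followed by a BER-coded length, and it assumes that a local set (as declared by type="localSet" in Figure 3) codes each field with a 2-byte local tag and a 2-byte length. The sample packet and its field values are made up for the example; the key and the tags 3c 01 and 3c 02 are taken from Figure 3.

```python
import struct

IDENTIFICATION_KEY = bytes.fromhex("060e2b34025301010d01010101013000")
TAG_NAMES = {0x3C01: "CompanyName", 0x3C02: "ProductName"}  # from Figure 3


def read_ber_length(buf, pos):
    """Decode a BER length starting at buf[pos]; return (length, next_pos)."""
    first = buf[pos]
    if first < 0x80:                     # short form: the byte is the length
        return first, pos + 1
    n = first & 0x7F                     # long form: n length bytes follow
    return int.from_bytes(buf[pos + 1:pos + 1 + n], "big"), pos + 1 + n


def iter_klv(buf):
    """Walk a byte buffer front to back, yielding (key, value) per packet."""
    pos = 0
    while pos + 17 <= len(buf):
        key = buf[pos:pos + 16]
        length, value_start = read_ber_length(buf, pos + 16)
        yield key, buf[value_start:value_start + length]
        pos = value_start + length


def iter_local_set(value):
    """Yield (local_tag, raw_bytes) for each field of a local set value."""
    pos = 0
    while pos + 4 <= len(value):
        tag, length = struct.unpack_from(">HH", value, pos)
        yield tag, value[pos + 4:pos + 4 + length]
        pos += 4 + length


def make_item(tag, text):
    """Build one local-set field holding a UTF-16 (big-endian) string."""
    data = text.encode("utf-16-be")
    return struct.pack(">HH", tag, len(data)) + data


# Made-up sample packet; the value is short enough for a one-byte length.
value = make_item(0x3C01, "EBU") + make_item(0x3C02, "MXF SDK")
packet = IDENTIFICATION_KEY + bytes([len(value)]) + value

decoded = {}
for key, val in iter_klv(packet):
    if key == IDENTIFICATION_KEY:
        for tag, raw in iter_local_set(val):
            decoded[TAG_NAMES.get(tag, hex(tag))] = raw.decode("utf-16-be")
print(decoded)
```

A parser that does not recognise a key can still skip the packet by using only the length field, which is what keeps extended files readable by processors unaware of the extension.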
Even though standard MXF metadata has been meticulously defined, it can also be
easily extended, provided that the same mechanism is used to describe metadata
extensions. Custom fields can be appended to pre-defined metadata classes and new
classes can be defined such that a seamlessly integrated metadata extension of the
original metadata is obtained.
SERIALIZATION STRATEGIES
Even given the requirements and MXF file structure listed in the previous section, a
number of serialization strategies can be pursued to interpret the KLV structure of MXF
files for metadata serialization. We list them next, and provide a summary of advantages
and disadvantages of each strategy.
The first strategy of metadata serialization involves embedding a single KLV packet in
which the EBUCore XML metadata document is written as-is, and we refer to this
strategy as ‘dark’ (cf. Figure 4a). The KLV packet is inserted as the last packet at the end
of the regular header metadata and is identified by a specific EBUCore dark metadata
key. No further modifications are done to the MXF file metadata. This method is
described as dark because, even if the KLV element’s key is known, the metadata
doesn’t participate in the overall structure of metadata and remains a ‘blob’ of abstract
data with respect to the MXF file.
The second strategy encompasses the creation of an exhaustive representation of the
metadata entity structure (e.g., object classes, fields and references) such that each
logical entity (i.e., metadata object) becomes a single KLV packet (cf. Figure 4b). The
structure of each metadata entity is described using a dictionary as described in the
previous section. In this case references between objects are implemented using special
MXF-designated reference data types. Additionally, this mechanism can be coupled to
existing MXF structural metadata using the same mechanism. We refer to this strategy as
the full ‘KLV’ serialization strategy.
The third strategy of serialization writes only a minimal set of KLV packets to the MXF
file, just sufficient to ensure a reference can be placed in the MXF metadata to an
external file in which the actual metadata is stored and that acts as a ‘side-car’ to the
MXF container (cf. Figure 4c). As with the KLV serialization strategy, this side-car
metadata reference can also be linked to existing MXF structural metadata.
Figure 4
Embedded metadata serialization strategies for serializing metadata into the header metadata of
an MXF file container; (a) dark, (b) KLV, and (c) side-car.
The dark serialization strategy is the simplest one, and involves writing the metadata file
into the MXF file directly and in a single place. However, while this does conform to the
MXF specification (namely, the packet is written as a conformant KLV packet and should
be ignored by MXF parsers if unknown), it is not considered best-practice, as the
metadata inserted requires further processing once read from the file, and it is in no
intrinsic way related to the MXF header metadata, except through custom interpretation
not associated with any of the MXF specifications. For example, any relationship between
the metadata and a limited part of the MXF timeline can only be determined by fully
processing the metadata. On the other hand, this technique requires only little effort to
implement and requires only simple modifications for embedding metadata into existing
and legacy MXF files. Additionally, because the metadata is serialized exactly
as is, bit for bit, we can be assured that the metadata is stored in a lossless way.
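The dark strategy can be sketched in a few lines, which also shows why it is lossless by construction. The key below is a zero-filled placeholder and not the registered EBUCore dark metadata key; the function names are illustrative and do not come from the SDK.

```python
# Placeholder only: NOT the registered EBUCore dark metadata key.
HYPOTHETICAL_DARK_KEY = bytes.fromhex("060e2b34") + bytes(12)


def encode_ber_length(length):
    """Encode a length in BER short form (< 128) or long form otherwise."""
    if length < 0x80:
        return bytes([length])
    n = (length.bit_length() + 7) // 8
    return bytes([0x80 | n]) + length.to_bytes(n, "big")


def wrap_dark(xml_bytes, key=HYPOTHETICAL_DARK_KEY):
    """Wrap an XML document, untouched, in a single KLV packet."""
    return key + encode_ber_length(len(xml_bytes)) + xml_bytes


def unwrap_dark(packet, key=HYPOTHETICAL_DARK_KEY):
    """Return the original document bytes from a dark KLV packet."""
    if packet[:16] != key:
        raise ValueError("not a dark packet with the expected key")
    first = packet[16]
    if first < 0x80:
        length, start = first, 17
    else:
        n = first & 0x7F
        length, start = int.from_bytes(packet[17:17 + n], "big"), 17 + n
    return packet[start:start + length]


document = b"<ebuCoreMain><title>Evening News</title></ebuCoreMain>"
assert unwrap_dark(wrap_dark(document)) == document
```

Nothing in the packet tells an MXF processor what the payload describes or which part of the timeline it applies to; that knowledge lives entirely in the XML, which is the drawback discussed above.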
The KLV serialization strategy, on the other hand, follows a reverse approach, and
models each and every metadata object of the original metadata scheme as an individual
KLV packet, with each of its fields described exhaustively using MXF metadata data
types, and fully integrated with the MXF structural metadata. Clearly, this technique
requires the most preparation in advance, as a mapping must be provided between the original
representation of the metadata (e.g., XML described by a schema or other document
type) and its KLV form, and the most effort to implement, as each object must be translated to its counterpart. However,
this strategy provides the best approach in terms of best practices in the MXF ecosystem
and seamlessly fuses existing structural metadata with newly embedded descriptive
metadata, in the same way as is done for other normative MXF descriptive metadata
standards such as DMS-1.
The side-car method can be used in scenarios where metadata updates are likely to
occur frequently and the recurring modification of MXF files is not feasible. The downside
of this approach is that the metadata file must be transferred and kept together with the MXF
file throughout the production and distribution process. Unfortunately, this would again
require a delivery interpretation layer to be part of the ingesting process. Its complexity
would be reduced however, as the MXF file’s metadata would refer unambiguously to the
correct external metadata files.
In an archiving context, the side-car strategy is less convenient, because metadata is not
likely to change often, if at all. Additionally, it requires a delivery interpretation layer at the
ingest point of the archive, in addition to a parsing effort of the MXF file itself, and as such
is not the best candidate for use in archival situations. Concerning the KLV approach or
dark approach, the trade-off must be made between convenience of implementation and
richness of metadata. In any case, in our research, we have investigated both
approaches and have built tools to support all three strategies such that implementers
can pick the one best suited for their workflows.
MAPPING EBUCORE METADATA TYPES
Whenever metadata is written using the KLV or side-car strategy, a translation must be
made between the original metadata representation format and the metadata provisions
defined in the MXF file format. In order to meet requirement #2, this translation must be
done meticulously and using the native data types and structures available, as an answer
to requirement #1. We constructed such a translation between the EBUCore XML
schema and a KLV representation, in which the types of EBUCore were mapped to their
equivalent counterparts in terms of native MXF data types and classes, each of which is
identified by a SMPTE Universal Label (UL) as illustrated in Figure 5 (cf. the key and
globalKey attributes). The result of this translation effort has been submitted to SMPTE
for inclusion in the Class 13 section of the SMPTE Metadata Dictionary and will be
publicly registered such that it is available for interested parties to study and employ in
individual implementations.
As we tried to ensure that MXF-compliant data types and structures were used in the
serialization of EBUCore metadata, an important goal was to ensure that information
could be translated between both representations in a lossless way. Note however, that
the result of a serialization back and forth between XML and KLV will not be bit-wise
identical, but will be semantically equivalent (e.g., spaces, redundant namespace
declarations, etc. will not be needlessly included in the KLV serialization).
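The difference between bit-wise and semantic equality can be made concrete with a small check. The sketch below is one possible definition of "semantically equivalent" and not the criterion used by the SDK: it compares element names (with namespaces resolved), attributes, and text with surrounding whitespace stripped. The namespace urn:example:ebucore and the sample documents are made up for the example.

```python
import xml.etree.ElementTree as ET


def normalized(elem):
    """Reduce an element to a comparable tuple, ignoring layout details."""
    text = (elem.text or "").strip()
    return (
        elem.tag,                                # namespace already resolved
        tuple(sorted(elem.attrib.items())),
        text,
        tuple(normalized(child) for child in elem),
    )


def semantically_equal(xml_a, xml_b):
    """True if both documents carry the same elements, attributes and text."""
    return normalized(ET.fromstring(xml_a)) == normalized(ET.fromstring(xml_b))


# Original: indentation, padded text and a redundant namespace declaration.
original = """<core xmlns="urn:example:ebucore">
  <title xmlns="urn:example:ebucore">  Evening News  </title>
</core>"""

# After a hypothetical XML -> KLV -> XML round trip: compact, prefixed names.
round_tripped = (
    '<e:core xmlns:e="urn:example:ebucore">'
    "<e:title>Evening News</e:title></e:core>"
)

print(original == round_tripped)                   # byte-wise comparison
print(semantically_equal(original, round_tripped))  # content comparison
```

Treating leading and trailing whitespace as insignificant is an assumption that suits most descriptive fields; a stricter check would be needed for fields where whitespace carries meaning.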
<ebucoreCoreMetadata key="06 0E 2B 34 02 7F 01 0B 0D 02 01 01 01 02 00 00" base="InterchangeObject" type="localSet">
<identifier globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 01 00" type="StrongRefBatch" use="optional"/>
<title globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 02 00" type="StrongRefBatch" use="optional"/>
<alternativeTitle globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 03 00" type="StrongRefBatch" use="optional"/>
<creator globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 04 00" type="StrongRefBatch" use="optional"/>
<subject globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 05 00" type="StrongRefBatch" use="optional"/>
<description globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 06 00" type="StrongRefBatch" use="optional"/>
<publisher globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 07 00" type="StrongRefBatch" use="optional"/>
<contributor globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 08 00" type="StrongRefBatch" use="optional"/>
<date globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 09 00" type="StrongRefBatch" use="optional"/>
<type globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 0A 00" type="StrongRefBatch" use="optional"/>
<language globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 0B 00" type="StrongRefBatch" use="optional"/>
<coverage globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 0C 00" type="StrongRefBatch" use="optional"/>
<rights globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 0D 00" type="StrongRefBatch" use="optional"/>
<rating globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 0E 00" type="StrongRefBatch" use="optional"/>
<version globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 0F 00" type="StrongRef" use="optional"/>
<publicationHistory globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 10 00" type="StrongRef" use="optional"/>
<customRelation globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 11 00" type="StrongRefBatch" use="optional"/>
<basicRelation globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 12 00" type="StrongRefBatch" use="optional"/>
<format globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 13 00" type="StrongRefBatch" use="optional"/>
<part globalKey="06 0E 2B 34 01 01 01 0E 0D 02 01 01 02 A0 14 00" type="StrongRefBatch" use="optional"/>
</ebucoreCoreMetadata>
Figure 5
Excerpt of the EBUCore XML-to-KLV mapping defined for the full KLV serialization strategy.
Displayed is the mapping for the CoreMetadata metadata set.
In translating between both representations of metadata we have learnt a number of
lessons that we list here.
As a general rule, types in EBUCore could be mapped to a KLV version mostly
unchanged and with an identical purpose (e.g., a PublicationEventType is mapped to a
similar KLV object type). Relationships between objects (incl., one-to-one, one-to-many,
…) can be translated except for a number of differences in possible cardinalities. E.g.,
while XML Schema defines a 1..n cardinality, there is no such equivalent for MXF.
However, such issues are easily resolved by including simple work-arounds, e.g., by
using a required field and then an optional list of additional entries to deal with the previous
case.
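The work-around for the 1..n cardinality can be sketched as a small data structure. The class and method names are hypothetical and chosen for illustration only; they are not taken from the registered mapping.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OneOrMore:
    """A 1..n relation stored as one required value plus an optional list."""

    first: str                                            # required field
    additional: List[str] = field(default_factory=list)   # optional entries

    @classmethod
    def from_values(cls, values):
        """Split an XML-style list of values into required + optional parts."""
        if not values:
            raise ValueError("a 1..n relation needs at least one value")
        return cls(first=values[0], additional=list(values[1:]))

    def to_values(self):
        """Rebuild the original ordered list for the XML representation."""
        return [self.first, *self.additional]


titles = OneOrMore.from_values(["Evening News", "Het Journaal"])
print(titles.to_values())
```

Because the split and the merge are exact inverses, the work-around preserves both the values and their order, so the lossless requirement is not affected.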
Furthermore, while MXF supports many common numeric and character data types, a
number of more advanced types have not been included. E.g., there is a single
timestamp type in MXF but this type does not support time zones. Additionally, best
practices in MXF dictate a preferred way of handling other types of time-related data
using edit-units (i.e., frames) whenever possible, while EBUCore supports a variety of
other time measuring types (e.g., seconds, frame numbers). In these cases, the tools
used for the translation and embedding of the metadata must ideally perform the required
conversions such that native MXF data types and constructs can be employed. Mapping
constructs in which opaque data types are stored as character strings, and then reinterpreted when retrieved from the MXF file should be avoided whenever possible.
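Two of the conversions mentioned above can be sketched as follows. Both functions reflect assumptions made for this example rather than rules from the MXF or EBUCore specifications: durations that do not fall on an edit-unit boundary are rejected instead of rounded, and zone-aware timestamps are normalized to UTC before the zone is dropped.

```python
from datetime import datetime, timedelta, timezone
from fractions import Fraction


def seconds_to_edit_units(seconds, edit_rate):
    """Convert a duration in seconds to whole edit units (e.g., frames).

    `seconds` should be a str, int or Fraction so that the value is exact;
    `edit_rate` is a Fraction such as Fraction(25) or Fraction(30000, 1001).
    """
    units = Fraction(seconds) * edit_rate
    if units.denominator != 1:
        raise ValueError("duration does not fall on an edit-unit boundary")
    return units.numerator


def to_utc_naive(moment):
    """Normalize a zone-aware datetime to UTC and drop the zone information."""
    return moment.astimezone(timezone.utc).replace(tzinfo=None)


print(seconds_to_edit_units("12.48", Fraction(25)))
local = datetime(2013, 10, 26, 14, 30, tzinfo=timezone(timedelta(hours=4)))
print(to_utc_naive(local))
```

Exact rational arithmetic avoids the rounding drift that floating-point seconds would introduce at fractional rates such as 30000/1001.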
Right from the start of the mapping effort, special attention was given to the backward
compatibility and support for future versions of EBUCore. It is crucial, especially in
archival contexts, that older versions of embedded EBUCore metadata remain readable
by newer EBUCore-aware MXF parsers. In order to realize this, versioning of the KLV-translated EBUCore metadata was incorporated at each level and is included
explicitly within the SMPTE registration of the KLV representation of EBUCore. In fact, the
current SMPTE registration applies to version 1.4 of EBUCore, and every subsequent
version will also be fully registered and standardized as such. This is done by assigning
each unique element UL that identifies a KLV metadata class or field a version byte that
is used to identify the EBUCore standard it conforms to. Additionally, a unique label is
attached to the beginning of the MXF metadata, identifying the version of EBUCore
metadata stored in that container. This allows MXF parsers to load the correct version
and vocabulary of an EBUCore processor early, before any EBUCore metadata elements
are encountered.
UNION OF EBUCORE AND MXF METADATA
When a full KLV serialization strategy is employed, the added descriptive metadata can
be set up to fully interact with the structural metadata. In particular, EBUCore and
standard MXF metadata are united by means of the MXF essence timeline. The MXF file
container defines the essence contained in the file as a package of essence and data
tracks, each of which forms part of the timeline. Some tracks are plain picture or sound
tracks that represent actual video or audio streams, while others define custom time
codes or ancillary data related to the imagery. Finally, tracks can be Descriptive Metadata
tracks that reference a set of descriptive metadata elements also contained in the header
metadata. EBUCore metadata is inserted into the MXF metadata in this way, such that it
properly interacts with the timeline model (SMPTE-EG42, 2004), as illustrated in Figure
6. EBUCore metadata that describes the entire essence file is referenced from a Static
Track that covers the entire length of the essence uniformly. In particular, an
ebucoreMainFramework bridging object is used to link the static track to the actual
EBUCore CoreMetadata object. On the other hand, EBUCore Parts that describe only a
limited section of the essence are also properly modelled on the MXF timeline by using a
Descriptive Metadata Event Track on which temporal segments are assigned for each of
the Part objects in the EBUCore metadata.
Figure 6
Interaction between the embedded EBUCore metadata and the MXF timeline concept.
This kind of interaction enables a very powerful expression of the relationship between
the descriptive metadata and the essence in the MXF container. Temporal segmentation
and description of the essence can be done using native MXF data structures, and if
needed, multiple instances of descriptive metadata (possibly originating from different
metadata standards) can be combined on a single timeline.
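The timeline association can be modelled with a minimal sketch. The classes below are hypothetical simplifications and not the MXF or SDK object model: the static track is reduced to a single whole-file description, and the event track to a list of segments measured in edit units.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Part:
    title: str


@dataclass
class Segment:
    start: int       # position on the timeline, in edit units
    duration: int    # length of the segment, in edit units
    part: Part


@dataclass
class Timeline:
    core_title: str                                   # static track: whole file
    event_segments: List[Segment] = field(default_factory=list)

    def describe(self, position):
        """Return all descriptions that apply at a given edit-unit position."""
        active = [
            seg.part.title
            for seg in self.event_segments
            if seg.start <= position < seg.start + seg.duration
        ]
        return [self.core_title, *active]


timeline = Timeline(
    core_title="Evening News",
    event_segments=[
        Segment(start=0, duration=250, part=Part("Headlines")),
        Segment(start=250, duration=500, part=Part("Weather")),
    ],
)
print(timeline.describe(300))
```

Since the segments are ordinary timeline positions, a lookup such as this needs no knowledge of the descriptive metadata itself, in contrast to the dark strategy, where the XML must be parsed first.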
Note finally that the presented association of metadata is also relevant with respect to the
side-car serialization strategy. In this case, only the timeline elements (the Static Track
and its segments) and the ebucoreMainFramework are written. However, the
ebucoreMainFramework only contains a field into which the location of the side-car file is
written and no other EBUCore metadata (i.e., CoreMetadata object) is present.
TOOLS FOR EMBEDDED METADATA:
THE EBU MXF SDK REFERENCE IMPLEMENTATION
While we have argued the case for MXF-embedded EBUCore metadata and have
highlighted its potential advantages, an important factor in the adoption of such a
technology is the ready availability of processing tools. For this reason, we have
developed a freely available reference software implementation such that interested
users can get started right away integrating embedded metadata. The software is
provided as a freely available open source Software Development Kit (SDK) which
provides both ease of use through a number of pre-packaged tools and flexibility in the
form of a software library (Van Rijsselbergen, 2013). Figure 7 shows how the SDK can be
used. A number of command line tools are available that use the functionalities of the
SDK to provide end user functions such as embedding EBUCore metadata in an existing
MXF file (ebu2mxf.exe), extracting EBUCore metadata from an MXF file (mxf2ebu.exe)
and embedding EBUCore metadata in a newly created MXF file (raw2bmx.exe).
Additionally, applications can use the SDK as a software library to incorporate its features
by calling a variety of functions. For those cases, the SDK is provided with documentation
for the public function API that the SDK exposes to external programs.
Figure 7
Uses of the EBU MXF SDK: using command-line tools or as a function library for custom tools.
Labels in the figure: ebu2mxf.exe, raw2bmx.exe, mxf2ebu.exe; Custom Tools and System
Integrations; EBU MXF SDK.
FIAT/IFTA World Conference 2013 in Dubai
HOW THE SDK WORKS
This section describes the EBUCore processing functionality of the SDK (illustrated in
Part 1 of Figure 8). The SDK can read and write two representations of EBUCore: the
XML variant is read from and written to XML documents that conform to the EBUCore
XML schema, while the MXF variant is read from and written to KLV (Key-Length-Value)
packets, the native encoding of information units in MXF files. For both
representations, the EBUCore metadata is first read into an in-memory representation
(i.e., an instantiated object model) and then translated to the other representation
through the bi-directional mapping discussed in the previous section of this paper.
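As an illustration of the KLV framing mentioned above, the following minimal Python sketch encodes and decodes top-level KLV packets with a 16-byte key and a BER-encoded length. It is not taken from the SDK: the function names are our own, and the sketch does not cover the contents of the value field (the sets and properties carried inside a packet).

```python
from typing import Iterator, Tuple

KEY_LENGTH = 16  # MXF keys are 16-byte SMPTE Universal Labels


def encode_ber_length(n: int) -> bytes:
    """Encode a length in BER: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(payload)]) + payload


def encode_klv(key: bytes, value: bytes) -> bytes:
    """Frame a value as key + BER length + value."""
    if len(key) != KEY_LENGTH:
        raise ValueError("key must be 16 bytes")
    return key + encode_ber_length(len(value)) + value


def decode_klv(data: bytes) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (key, value) pairs from a buffer of consecutive KLV packets."""
    pos = 0
    while pos < len(data):
        if pos + KEY_LENGTH + 1 > len(data):
            raise ValueError("truncated KLV packet")
        key = data[pos:pos + KEY_LENGTH]
        pos += KEY_LENGTH
        first = data[pos]
        pos += 1
        if first < 0x80:
            length = first                      # short form
        else:
            count = first & 0x7F                # long form: next `count` bytes
            length = int.from_bytes(data[pos:pos + count], "big")
            pos += count
        if pos + length > len(data):
            raise ValueError("truncated KLV value")
        yield key, data[pos:pos + length]
        pos += length
```

Because the length is carried explicitly, a reader that does not recognize a key can skip the packet without interpreting its value, which is what allows additional metadata to be carried in a file without disturbing existing readers.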
Figure 8
Features of the EBU MXF SDK: writing and reading EBUCore metadata (1), processing audiovisual essence (2), reading and extending existing MXF files (3) and offering base functionality for
the incorporation of other embedded metadata standards besides EBUCore (4).
In the second mode of operation, the SDK writes EBUCore metadata into an existing
MXF file, the path depicted in Part 3 of Figure 8. This mode requires more complex
application logic, as the existing file must be modified as efficiently as possible, and the
existing metadata must be modified in such a way as to remain fully compliant with the
MXF file format specification.
MXF files may carry multiple instances of the file’s metadata (i.e., each new partition can
contain an updated set of metadata). This way, streaming and growing file scenarios can
be supported in which increasingly accurate metadata is continuously inserted as the file
is being extended, resulting in an MXF file that contains the most complete metadata in
its footer partition. Partitions marked as open and incomplete can instruct MXF
interpreters to ignore early sets of metadata and only consider a final closed and
complete metadata set as the definitive MXF file structure description.
Unless explicitly instructed otherwise, the SDK uses this mechanism to append the
updated metadata in the footer partition of the MXF file. This involves a rewrite of only the
footer partition, which requires only limited writing operations since footer partitions
contain no essence. Most of the header and (bulky) body partitions remain unchanged,
except for an update of the small partition header KLV pack to signal an – as of now –
open and incomplete metadata set. Note that, when selecting the metadata to extend, the
SDK also interprets partition flags to select only the finalized metadata for extension with
EBUCore elements.
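The selection rule described above can be sketched as follows. This is a simplified model written for illustration, not the SDK's code: the Partition class and its field names are assumptions, and real partition status is read from the partition pack rather than from boolean fields.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Partition:
    """Hypothetical, simplified view of an MXF partition."""
    kind: str           # "header", "body" or "footer"
    closed: bool        # partition flagged as closed
    complete: bool      # metadata flagged as complete
    has_metadata: bool  # partition carries a metadata set


def select_definitive_metadata(partitions: Sequence[Partition]) -> Optional[int]:
    """Return the index of the last partition whose metadata set is closed
    and complete, or None if the file contains no such set."""
    for index in range(len(partitions) - 1, -1, -1):
        p = partitions[index]
        if p.has_metadata and p.closed and p.complete:
            return index
    return None
```

Searching from the end of the file reflects the growing-file scenario, in which later partitions carry the more accurate metadata and open or incomplete sets are passed over.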
Considering the complexity of the MXF file format specification, it is not unlikely that
certain implementations of MXF interpreters will lack support for selection of metadata
beyond the header partition, and will expect this partition to contain only a single
complete metadata set. To support these systems, the SDK can be explicitly instructed to
write the EBUCore metadata to the header partition, at the expense of a byte shift
operation across the remainder of the MXF file.
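To give a feeling for the difference in cost between the two strategies, the following back-of-the-envelope sketch estimates the number of bytes written in each case. The cost model and all sizes are assumptions chosen for illustration only; they are not measurements of the SDK.

```python
def footer_strategy_bytes(metadata_size: int, footer_size: int,
                          other_partitions: int, pack_size: int) -> int:
    """Bytes written when metadata is appended in a rewritten footer partition:
    the new metadata, the footer itself, and one small partition pack update
    for each header or body partition."""
    return metadata_size + footer_size + other_partitions * pack_size


def header_strategy_bytes(file_size: int, insert_offset: int,
                          metadata_size: int) -> int:
    """Bytes written when metadata is inserted in the header partition:
    the new metadata plus every byte after the insertion point, which
    has to be shifted."""
    return metadata_size + (file_size - insert_offset)


# Hypothetical example: a 10 GiB file, 64 KiB of EBUCore metadata,
# a 16 KiB footer, three other partitions with 128-byte partition packs,
# and an insertion point 32 KiB into the file.
footer_cost = footer_strategy_bytes(64 * 1024, 16 * 1024, 3, 128)
header_cost = header_strategy_bytes(10 * 1024**3, 32 * 1024, 64 * 1024)
```

Under these assumptions the footer strategy writes tens of kilobytes, whereas the header strategy touches nearly the whole file, which is why the latter is offered only as an explicit option.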
Finally, the SDK has been constructed such that the code that embeds metadata using
the various strategies discussed above is separated from the code that performs the
mapping between representations of EBUCore. The serialization code can therefore be
reused for embedding non-EBUCore metadata; only a new mapping needs to be
implemented for each additionally supported metadata standard, as illustrated in
Part 4 of Figure 8.
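This separation can be sketched as follows. The classes, function names and keys below are hypothetical and only illustrate the design idea; they do not reproduce the SDK's actual API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

Packet = Tuple[bytes, bytes]  # (16-byte key, value)


class MetadataMapper(ABC):
    """Standard-specific part: turns a metadata document into packets."""

    @abstractmethod
    def to_packets(self, document: Dict[str, str]) -> List[Packet]:
        ...


class TitleMapper(MetadataMapper):
    """Toy mapper for an imaginary standard that only knows a title."""
    KEY = b"hypothetical-key"  # placeholder, not a registered label

    def to_packets(self, document: Dict[str, str]) -> List[Packet]:
        return [(self.KEY, document["title"].encode("utf-8"))]


def embed(existing: List[Packet], mapper: MetadataMapper,
          document: Dict[str, str]) -> List[Packet]:
    """Standard-agnostic part: appends whatever packets the mapper produces.
    It never inspects the document, so it works for any mapper."""
    return list(existing) + mapper.to_packets(document)
```

Supporting a further metadata standard then amounts to writing another MetadataMapper subclass; the embed function stays untouched.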
EMBEDDED EBUCORE METADATA: PROOF-OF-CONCEPT DEMONSTRATOR
To illustrate the use of the SDK and embedded EBUCore metadata, we have built a
proof-of-concept demonstrator for the ingest of EBUCore metadata in Limecraft Flow, an
on-line collaboration and media production environment built for the production and
archiving of various media production formats such as drama and factual television
programs (Limecraft, 2013).
The SDK tools were integrated in such a way that they naturally extend the existing
ingest process with a metadata extraction step that analyses incoming MXF files and
reads the EBUCore metadata (if any). The EBUCore metadata extracted is then made
available to users of the application as searchable metadata to aid them in retrieving
media assets.
[Figure 9 diagram. Components shown: User, Search Application, Retrieval Application, Media Probing, Feature Detection, mxf2ebu.exe (EBU MXF SDK), Metadata Index, Media Repository.]
Figure 9
Functional overview of the proof-of-concept demonstrator in Limecraft Flow. Depicted is the
ingest process that was extended with an incorporation of the EBU MXF SDK for extraction and
indexing of EBUCore metadata.
Figure 9 shows a breakdown of the components involved in the demonstrator. MXF files
are ingested and delivered into a folder from which new files are analyzed (i.e., the type
of file is determined, and a number of feature detection procedures, incl. shot cut
detection, are executed). As an extension, we have added mxf2ebu.exe, one of the tools
powered by the reference SDK, which reads the embedded EBUCore metadata and
extracts it in the form of an XML document. This document is then added to a metadata
index which can be queried by a Search Application accessible to end users. This way,
all information present in the EBUCore metadata embedded in the MXF file can be
searched for immediately and without the need for complicated software layers for
delivery interpretation. Based on queries and their search results, users can instruct the
system to retrieve found assets, which in turn still contain the embedded metadata, so
that systems downstream in the production chain can benefit from the presence of
embedded metadata without requiring out-of-band delivery mechanisms.
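The indexing step can be sketched as follows, assuming the XML document produced by the extraction tool is already available as a string. The sample document is a simplified stand-in with a placeholder namespace, not a schema-valid EBUCore instance, and the inverted index is a deliberately naive substitute for the metadata index used in the demonstrator.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import DefaultDict, Set

Index = DefaultDict[str, Set[str]]

# Simplified, illustrative document (placeholder namespace and elements).
SAMPLE = """<ebuCoreMain xmlns="urn:example:placeholder">
  <coreMetadata>
    <title>Evening News</title>
    <description>Harbour report from Antwerp</description>
  </coreMetadata>
</ebuCoreMain>"""


def index_document(index: Index, asset_id: str, xml_text: str) -> None:
    """Add every word found in any element's text to the inverted index."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        text = (element.text or "").lower()
        for token in re.findall(r"\w+", text):
            index[token].add(asset_id)


def search(index: Index, query: str) -> Set[str]:
    """Return the assets whose metadata contains all words of the query."""
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


index: Index = defaultdict(set)
index_document(index, "asset-001", SAMPLE)
```

Because every text node is indexed regardless of element name, the sketch needs no knowledge of the EBUCore schema; a production index would typically keep the element names as searchable fields.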
A possible further extension to this demonstrator could include the serialization of
metadata in MXF files for which metadata is known, but not yet embedded in the essence
files. Upon retrieval after a user’s research, the Retrieval Application could invoke
another tool in the SDK, ebu2mxf.exe, to embed the metadata as EBUCore when the
asset files are retrieved from the media repository. Because the SDK has been optimized
to perform serialization with a minimum of file processing operations, this could be
performed on the fly when files are being delivered to the systems of end users.
CONCLUSIONS
In this paper, we have discussed the advantages of using embedded metadata in essence
containers for the transportation of descriptive metadata. In particular, we have shown
that EBUCore metadata, a standard published by EBU and endorsed by many adopters,
embedded in MXF container files can be a powerful mechanism for metadata delivery in
the ingest process of archival systems. We have illustrated various strategies that can be
employed to embed the metadata such that the serialization is done in a
standards-compliant and semantically lossless fashion. To aid the adoption of the techniques
discussed, we have built a freely available open-source SDK that can be used to build
applications that support embedded metadata, and finally, we have described a
proof-of-concept example use of the SDK in a real-world scenario.
REFERENCES
SMPTE-377M, 2004. Standard for Television – Material Exchange Format (MXF) – File
Format Specification. SMPTE 377M-2004.
SMPTE, 2009. Standard for Television – Material Exchange Format (MXF) – File Format
Specification. SMPTE S377-1-2009.
SMPTE-380M, 2004. Standard for Television – Material Exchange Format (MXF) –
Descriptive Metadata Scheme-1. SMPTE 380M-2004.
Advanced Media Workflow Association, AMWA, 2012. AMWA Application Specification – AS-11
MXF Program Contribution. AS-11. Available from http://www.amwa.tv.
EBU, 2013. EBU CORE METADATA SET (EBUCore) Version 1.4. EBU Tech 3293.
Dublin Core Metadata Initiative, DCMI, 2004. Dublin Core Metadata Element Set, version
1.1: Reference Description.
W3C, 2004. World Wide Web Consortium – XML Schema, Second Edition.
Available from http://www.w3.org/standards/techs/xmlschema.
SMPTE-EG42, 2004. Engineering Guideline for Television – Material Exchange Format
(MXF) – MXF Descriptive Metadata. SMPTE EG42-2004.
Van Rijsselbergen, D., Dos Santos Oliveira, M., Evain, J-P. (2013, 28 May). EBU MXF
SDK – An SDK for MXF embedded EBUCore metadata processing and analysis.
Retrieved 20 September, 2013, from https://github.com/Limecraft/ebu-mxfsdk/.
Limecraft (2013, 1 July). Limecraft Flow – Your online media production office. Retrieved
20 September, 2013, from http://www.limecraft.com.