Media element state machine explained

22 Mar 2013 by David Corvoysier

The HTML5 specification gives a detailed description of the algorithm to be applied when rendering media content.

However, the text is sometimes a bit cryptic, with many details relevant only to implementers. In this article, I will try to give a simplified description of the expected behaviour of a Media element from a web application developer's perspective.

One can typically distinguish three phases:

  • initialization: the Media element selects a media source and retrieves its properties (duration, size)
  • buffering: the Media element retrieves and stores as much data as required to start rendering the content
  • playback: the Media element decodes and renders the content

Initialization

A newly inserted HTML Media element will be initialized only:

  • when its src attribute is set,
  • when a child source element is inserted.

Note: both situations can occur either declaratively (through markup) or programmatically (through JavaScript).

An initialized HTML Media element will be reset only:

  • when its src attribute is modified,
  • when its load() method is invoked.

Note: An HTML Media element will not be reset when a source child is inserted or modified.
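A consequence of the reset rules above is that a source child added to an already initialized element is not taken into account until load() is called. The following is a minimal sketch of that pattern, assuming the element has no src attribute. The helper name addSourceAndReload and its doc parameter are hypothetical, introduced only for illustration.

```javascript
// Hypothetical helper: append a <source> child, then force a reset.
// `doc` is the page's document; passing it in keeps the function testable.
function addSourceAndReload(doc, media, url, type) {
  const source = doc.createElement("source");
  source.src = url;
  source.type = type; // MIME type, e.g. "video/webm"
  media.appendChild(source);
  // Inserting a source child does not reset an initialized element,
  // so load() is invoked explicitly to restart resource selection.
  media.load();
  return source;
}
```

In a page, this could be called as addSourceAndReload(document, video, "movie.webm", "video/webm"), where video is an existing Media element.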

During the initialization phase, the user-agent will apply the Media resource selection algorithm to select the most appropriate media resource.

At the end of the initialization phase, the Media element should have:

  • its networkState set to NETWORK_LOADING,
  • its readyState set to HAVE_NOTHING,
  • its currentSrc set to the actual Media source URL.
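Since networkState and readyState are exposed as numbers, a small lookup can make them easier to read while debugging. The sketch below relies on the numeric values of the constants defined on HTMLMediaElement; the helper name describeMediaState is hypothetical.

```javascript
// Names of the networkState and readyState constants defined on
// HTMLMediaElement, indexed by their numeric values.
const NETWORK_STATES = [
  "NETWORK_EMPTY", "NETWORK_IDLE", "NETWORK_LOADING", "NETWORK_NO_SOURCE"
];
const READY_STATES = [
  "HAVE_NOTHING", "HAVE_METADATA", "HAVE_CURRENT_DATA",
  "HAVE_FUTURE_DATA", "HAVE_ENOUGH_DATA"
];

// Hypothetical helper: summarize the state of a media element as strings.
function describeMediaState(media) {
  return {
    networkState: NETWORK_STATES[media.networkState],
    readyState: READY_STATES[media.readyState],
    currentSrc: media.currentSrc
  };
}
```

Called right after initialization, it would be expected to report NETWORK_LOADING and HAVE_NOTHING, together with the selected URL.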

The algorithm may be blocked at this stage until an explicit user request to play the content. This happens in particular:

  • if the Media element preload attribute has been set to none,
  • on some user-agents (typically Apple mobile devices) that want to prevent the user from being charged for useless data transfer (please refer to this article for details).

Otherwise, the browser will immediately transition to the buffering phase.
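When the algorithm is blocked until an explicit user request, a common approach is to call play() from an input event handler. The sketch below assumes a button element is available on the page; the helper name playOnClick is hypothetical, and the Promise handling is there because recent browsers return a Promise from play() while older ones return nothing.

```javascript
// Hypothetical helper: start playback from a click handler, so that
// play() runs in response to an explicit user request.
function playOnClick(button, media) {
  button.addEventListener("click", function () {
    const result = media.play();
    // Recent browsers return a Promise from play(); older ones return undefined.
    if (result && typeof result.catch === "function") {
      result.catch(function (error) {
        console.warn("Playback could not start:", error);
      });
    }
  });
}
```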

Buffering

During the buffering phase, the user-agent will fetch the selected Media resource as described in the Media resource fetch algorithm.

If the autoplay attribute of the media element is set to true, or if the play() method has been called explicitly, the user-agent will immediately try to download as much data as needed to play the content through. Otherwise, the amount of data loaded at this stage is mostly implementation dependent.

The application developer may however influence the browser preloading behaviour by setting the preload attribute to:

  • metadata to download only what is needed to determine the duration and dimensions of the content,
  • auto to download proactively as much data as needed to be able to start playback immediately.
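As an illustration, the preload hint can be applied from script before the source is assigned, so that it is already in place when resource selection starts. This is only a sketch: the helper name loadWithPreload is hypothetical, and the list of accepted values also includes none, which is mentioned in the initialization section.

```javascript
const PRELOAD_HINTS = ["none", "metadata", "auto"];

// Hypothetical helper: apply a preload hint before assigning the source,
// so the hint is already in place when resource selection starts.
function loadWithPreload(media, url, hint) {
  if (PRELOAD_HINTS.indexOf(hint) === -1) {
    throw new RangeError("Unknown preload hint: " + hint);
  }
  media.preload = hint;
  media.src = url;
  return media;
}
```

Keep in mind that the value is only a hint: the user-agent remains free to load more or less data than requested.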

At the end of the buffering phase, the Media element could have its readyState set to:

  • HAVE_METADATA: only the duration and the dimensions of the content are then available. At this stage, the browser will have sent a loadedmetadata event.
  • HAVE_CURRENT_DATA: a single frame of content is available (and can be drawn to a canvas, for instance). At this stage, the browser will also have sent a loadeddata event.
  • HAVE_FUTURE_DATA: enough frames are available to start playback. At this stage, the browser will also have sent a canplay event.
  • HAVE_ENOUGH_DATA: enough frames are available to play the content through. At this stage, the browser will also have sent a canplaythrough event.
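The correspondence between events and states listed above can be observed with a few listeners. The following sketch reports each buffering milestone to a callback; the helper name watchBuffering and the shape of the callback are hypothetical.

```javascript
// Event sent when each readyState is first reached, as listed above.
const READINESS_EVENTS = {
  loadedmetadata: "HAVE_METADATA",
  loadeddata: "HAVE_CURRENT_DATA",
  canplay: "HAVE_FUTURE_DATA",
  canplaythrough: "HAVE_ENOUGH_DATA"
};

// Hypothetical helper: report each buffering milestone to a callback,
// along with the numeric readyState observed when the event is handled.
function watchBuffering(media, onMilestone) {
  Object.keys(READINESS_EVENTS).forEach(function (eventName) {
    media.addEventListener(eventName, function () {
      onMilestone(eventName, READINESS_EVENTS[eventName], media.readyState);
    });
  });
}
```

For instance, watchBuffering(video, console.log) would print one line per milestone as the element buffers.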

If the autoplay attribute of the Media element is set to true, the browser will wait until it reaches the HAVE_ENOUGH_DATA state to transition to the next step and start rendering content.

However, if the play method has been called explicitly, the playback will start as soon as the HAVE_FUTURE_DATA state is reached.
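The difference between the two cases can be summarized as a small decision function. This is a model of the rule described above, not a browser API: the function name wouldStartPlayback and its options object are hypothetical.

```javascript
// Numeric values of the two readyState constants involved.
const HAVE_FUTURE_DATA = 3;
const HAVE_ENOUGH_DATA = 4;

// Model of the rule described above, not a browser API:
// returns true when playback would be expected to start.
function wouldStartPlayback(readyState, options) {
  if (options.playRequested) {
    // An explicit play() only needs enough data to start.
    return readyState >= HAVE_FUTURE_DATA;
  }
  if (options.autoplay) {
    // Autoplay waits until the content can be played through.
    return readyState >= HAVE_ENOUGH_DATA;
  }
  return false;
}
```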

Playback

As mentioned in the previous paragraph, the Media element cannot start playing content before having reached at least the HAVE_FUTURE_DATA state that corresponds to the canplay event.

The user-agent constructs a media timeline based on the metadata retrieved from the stream. In most cases, it will be the timeline as described in the original stream, with the following exceptions:

  • if the Media resource specifies an explicit start date, the user-agent will store it in the startDate attribute, but will define its timeline relative to it, starting from zero.
  • if the Media resource specifies a discontinuous timeline, the user-agent will extend the timeline used at the start of the resource across the entire stream.

The application developer can specify an initial playback position by either:

  • setting the Media element currentTime attribute to that position,
  • specifying the position using a Media fragment URI.

If no specific playback position has been specified, the user-agent will start the playback at the initial playback position defined in the stream.
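To illustrate the Media fragment option, the sketch below builds a URI with a temporal fragment of the form #t=start or #t=start,end, with positions expressed in seconds. The helper name withTemporalFragment is hypothetical, and the sketch only covers this simple numeric form of the syntax.

```javascript
// Hypothetical helper: build a Media Fragments URI selecting a time range.
// `start` and `end` are in seconds; `end` is optional.
function withTemporalFragment(url, start, end) {
  const base = url.split("#")[0]; // drop any existing fragment
  const range = end === undefined ? String(start) : start + "," + end;
  return base + "#t=" + range;
}
```

A Media element could then be pointed at a given position with video.src = withTemporalFragment("movie.webm", 30), the alternative being to assign 30 to video.currentTime.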

During playback, the browser exposes the ‘official’ playback position in the currentTime attribute, which doesn’t necessarily reflect the real playback position accurately.

The speed at which the content is being played is exposed by the user-agent in the playbackRate attribute of the Media element.

Unless specified otherwise in the defaultPlaybackRate attribute, content is initially played at a rate of 1.0.
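Both attributes are writable, which allows a simple speed control to be built on top of them. A minimal sketch, with hypothetical helper names setSpeed and resetSpeed:

```javascript
// Hypothetical helper: change the speed of the current playback only.
function setSpeed(media, rate) {
  media.playbackRate = rate;
}

// Hypothetical helper: return to the element's default speed.
function resetSpeed(media) {
  media.playbackRate = media.defaultPlaybackRate;
}
```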

If at any time during playback the Media element runs out of data, it will generate a waiting event and switch back to the buffering phase.
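An application can use the waiting event to display a buffering indicator. The sketch below also listens to the playing event, which is not discussed above: it is the standard event sent when playback starts, or resumes after having been delayed by a lack of data. The helper name trackStalls is hypothetical.

```javascript
// Hypothetical helper: tell the caller when playback stalls and resumes.
function trackStalls(media, onChange) {
  media.addEventListener("waiting", function () {
    onChange(true); // out of data: back to buffering
  });
  media.addEventListener("playing", function () {
    onChange(false); // playback has started or resumed
  });
}
```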

When the end of the stream is reached:

  • if the loop attribute is set, the user-agent seeks back to the earliest playback position and playback continues,
  • otherwise, playback stops, the ended attribute is set to true, and an ended event is emitted.