Content model – the fiendishly complex details

The HTML5 specification splits the detailed content model into two parts:

  1. Seven content categories – broad (but not perfect) similarities between ranges of elements
  2. Consideration on an element-by-element basis of where they should appear, and what should appear within them

Seven content categories

This section quotes and paraphrases liberally from the specification section 3.2.5 Content models.

Metadata content
Comments: Metadata content is content that sets up the presentation or behavior of the rest of the content, or that sets up the relationship of the document with other documents, or that conveys other “out of band” information.
Elements in the category: base, command, link, meta, noscript, script, style, title

Flow content
Comments: Most elements that are used in the body of documents and applications are categorized as flow content.
Elements in the category: a, abbr, address, area (if it is a descendant of a map element), article, aside, audio, b, bdi, bdo, blockquote, br, button, canvas, cite, code, command, datalist, del, details, dfn, dialog, div, dl, em, embed, fieldset, figure, footer, form, h1, h2, h3, h4, h5, h6, header, hgroup, hr, i, iframe, img, input, ins, kbd, keygen, label, map, mark, math, menu, meter, nav, noscript, object, ol, output, p, pre, progress, q, ruby, s, samp, script, section, select, small, span, strong, style (if the scoped attribute is present), sub, sup, svg, table, textarea, time, u, ul, var, video, wbr, Text

Sectioning content
Comments: Sectioning content is content that defines the scope of headings and footers.
Elements in the category: article, aside, nav, section

Heading content
Comments: Heading content defines the header of a section (whether explicitly marked up using sectioning content elements, or implied by the heading content itself).
Elements in the category: h1, h2, h3, h4, h5, h6, hgroup

Phrasing content
Comments: Phrasing content is the text of the document, as well as elements that mark up that text at the intra-paragraph level. Runs of phrasing content form paragraphs.
Elements in the category: a, abbr, area (if it is a descendant of a map element), audio, b, bdi, bdo, br, button, canvas, cite, code, command, datalist, del, dfn, em, embed, i, iframe, img, input, ins, kbd, keygen, label, map, mark, math, meter, noscript, object, output, progress, q, ruby, s, samp, script, select, small, span, strong, sub, sup, svg, textarea, time, u, var, video, wbr, Text

Embedded content
Comments: Embedded content is content that imports another resource into the document, or content from another vocabulary that is inserted into the document.
Elements in the category: audio, canvas, embed, iframe, img, math, object, svg, video

Interactive content
Comments: Interactive content is content that is specifically intended for user interaction.
Elements in the category: a, audio (if the controls attribute is present), button, details, embed, iframe, img (if the usemap attribute is present), input (if the type attribute is not in the Hidden state), keygen, label, menu (if the type attribute is in the toolbar state), object (if the usemap attribute is present), select, textarea, video (if the controls attribute is present)

An element can belong to zero or more categories: h1 appears under both flow content and heading content above, while li doesn’t appear at all.
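As a rough illustration of "zero or more categories", here is a minimal Python sketch that records the memberships of a handful of elements from the category listing above. The name ELEMENT_CATEGORIES and the function categories_of are made up for this example, the excerpt is deliberately tiny, and conditional memberships (such as img with usemap) are left out.

```python
# Hypothetical excerpt of the category listing above, keyed by element.
# Conditional memberships (e.g. img with usemap being interactive) are ignored.
ELEMENT_CATEGORIES = {
    "h1": {"flow", "heading"},
    "a": {"flow", "phrasing", "interactive"},
    "img": {"flow", "phrasing", "embedded"},
    "script": {"metadata", "flow", "phrasing"},
    "section": {"flow", "sectioning"},
    "p": {"flow"},
    "li": set(),  # li is in none of the seven categories
}


def categories_of(name):
    """Return the content categories of an element in the excerpt.

    Raises KeyError for elements the excerpt does not cover, so that
    "not listed here" is never confused with "belongs to no category".
    """
    return ELEMENT_CATEGORIES[name.lower()]


print(sorted(categories_of("h1")))
print(sorted(categories_of("li")))
```

Raising an error for unknown names is a design choice for the sketch: an empty result should only ever mean what it means for li.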

There are also additional specialised categories that don’t appear in this taxonomy, e.g. for some form elements.

How the categories overlap

The specification includes a Venn diagram of how the categories overlap, with accompanying text:

Venn diagram of content categories

Sectioning content, heading content, phrasing content, embedded content, and interactive content are all types of flow content. Metadata is sometimes flow content. Metadata and interactive content are sometimes phrasing content. Embedded content is also a type of phrasing content, and sometimes is interactive content.

Looking at h1 again, we can see that all heading elements are also flow elements.
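The overlaps described above can be checked mechanically against the element lists. The following Python sketch copies the lists from the category listing earlier in this post and compares them as sets. It is only an approximation: Text nodes are omitted and every conditional membership ("if the controls attribute is present" and so on) is treated as unconditional. The helper name names is invented for the example.

```python
def names(listing):
    """Turn a whitespace-separated listing into a set of element names."""
    return set(listing.split())


# Lists copied from the category listing above; Text is omitted and
# conditions such as "if the usemap attribute is present" are ignored.
METADATA = names("base command link meta noscript script style title")
FLOW = names("""
    a abbr address area article aside audio b bdi bdo blockquote br button
    canvas cite code command datalist del details dfn dialog div dl em embed
    fieldset figure footer form h1 h2 h3 h4 h5 h6 header hgroup hr i iframe
    img input ins kbd keygen label map mark math menu meter nav noscript
    object ol output p pre progress q ruby s samp script section select small
    span strong style sub sup svg table textarea time u ul var video wbr
""")
SECTIONING = names("article aside nav section")
HEADING = names("h1 h2 h3 h4 h5 h6 hgroup")
PHRASING = names("""
    a abbr area audio b bdi bdo br button canvas cite code command datalist
    del dfn em embed i iframe img input ins kbd keygen label map mark math
    meter noscript object output progress q ruby s samp script select small
    span strong sub sup svg textarea time u var video wbr
""")
EMBEDDED = names("audio canvas embed iframe img math object svg video")
INTERACTIVE = names("""
    a audio button details embed iframe img input keygen label menu object
    select textarea video
""")

# "... are all types of flow content"
for category in (SECTIONING, HEADING, PHRASING, EMBEDDED, INTERACTIVE):
    assert category <= FLOW

# "Embedded content is also a type of phrasing content"
assert EMBEDDED <= PHRASING

# "Metadata is sometimes flow content"
print(sorted(METADATA & FLOW))         # ['command', 'noscript', 'script', 'style']

# Interactive content is only sometimes phrasing content
print(sorted(INTERACTIVE - PHRASING))  # ['details', 'menu']
```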

Palpable content

Another categorisation cuts across the taxonomy: palpable content. It is less important than the content categories. The palpable elements are:

a, abbr, address, article, aside, audio (if the controls attribute is present), b, bdi, bdo, blockquote, button, canvas, cite, code, details, dfn, div, dl (if the element’s children include at least one name-value group), em, embed, fieldset, figure, footer, form, h1, h2, h3, h4, h5, h6, header, hgroup, i, iframe, img, input (if the type attribute is not in the Hidden state), ins, kbd, keygen, label, map, mark, math, menu (if the type attribute is in the toolbar state or the list state), meter, nav, object, ol (if the element’s children include at least one li element), output, p, pre, progress, q, ruby, s, samp, section, select, small, span, strong, sub, sup, svg, table, textarea, time, u, ul (if the element’s children include at least one li element), var, video, Text that is not inter-element whitespace

Element by element

For each element, the specification lists its categories, the contexts in which it can be used, and its content model. For example:

The p element
Categories (what content categories is p a member of?): flow content, palpable content
Contexts in which this element can be used: where flow content is expected
Content model (what content types can a p element contain?): phrasing content
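One way to read a definition like this is as a small record that can be checked against other records. The Python sketch below is an assumption-laden illustration, not how the specification or any validator expresses it: ElementSpec and can_contain are hypothetical names, each content model is reduced to a single category, and the em and div entries are filled in from the category lists above plus their usual content models (phrasing and flow respectively).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementSpec:
    """Hypothetical, simplified record of one element definition."""
    name: str
    categories: frozenset  # categories the element is a member of
    content_model: str     # the one category its children must belong to


P = ElementSpec("p", frozenset({"flow", "palpable"}), "phrasing")
EM = ElementSpec("em", frozenset({"flow", "phrasing", "palpable"}), "phrasing")
DIV = ElementSpec("div", frozenset({"flow", "palpable"}), "flow")


def can_contain(parent, child):
    """True if the child belongs to the category the parent's content model names."""
    return parent.content_model in child.categories


print(can_contain(P, EM))   # True: em is phrasing content
print(can_contain(P, DIV))  # False: div is flow content but not phrasing content
print(can_contain(DIV, P))  # True: p is flow content
```

Real definitions carry more than one category name can express, which is where the sketch stops being useful.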

Some of the element specifications are more complex, for example naming specific other elements that must or must not appear in context with them.

You can find these references across a range of pages:

Palpable content again

We would generally expect text inside a p element, but it’s easy to imagine legitimate cases in which the paragraph is empty.

This reflects a general rule that is not a hard requirement: elements whose content model allows any flow or phrasing content should have at least one child node that is palpable, and that does not have the hidden attribute specified.
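Because this is a recommendation rather than a hard requirement, a checker would report it as a warning at most. Here is a minimal Python sketch of such a check. Node, PALPABLE_EXCERPT, is_palpable and has_palpable_child are all names invented for this example, the palpable list is only an excerpt of unconditional entries, and whitespace-only text is used as a stand-in for inter-element whitespace.

```python
from dataclasses import dataclass, field

# Hypothetical excerpt of the palpable list above (unconditional entries only).
PALPABLE_EXCERPT = {"a", "em", "img", "p", "span", "strong"}


@dataclass
class Node:
    """Tiny stand-in for a DOM node; name is "#text" for text nodes."""
    name: str
    text: str = ""
    hidden: bool = False
    children: list = field(default_factory=list)


def is_palpable(node):
    if node.name == "#text":
        # Approximation: whitespace-only text counts as inter-element whitespace.
        return node.text.strip() != ""
    return node.name in PALPABLE_EXCERPT


def has_palpable_child(node):
    """True if at least one child is palpable and does not carry hidden."""
    return any(is_palpable(child) and not child.hidden for child in node.children)


empty_p = Node("p")
text_p = Node("p", children=[Node("#text", text="Hello")])
hidden_p = Node("p", children=[Node("span", hidden=True)])

print(has_palpable_child(empty_p))   # False: legitimate, but worth a warning
print(has_palpable_child(text_p))    # True
print(has_palpable_child(hidden_p))  # False: the only child is hidden
```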

Transparency

Some elements have a transparent content model: they inherit their content model from their parent. For example, the content model for a elements is:

Transparent, but there must be no interactive content descendant.

If the element has no parent then it accepts flow content. This case is hard to imagine, since body accepts flow content and head accepts metadata content.
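The inheritance rule can be sketched as a walk up the ancestor chain until a non-transparent element is found. This Python sketch makes several assumptions: Element, CONTENT_MODEL and effective_content_model are hypothetical names, only a few elements are covered, ins is included as a second transparent element, and the extra restriction on a (no interactive content descendant) is not modelled.

```python
from dataclasses import dataclass
from typing import Optional

# Content models for a few elements; "transparent" means "inherit from parent".
CONTENT_MODEL = {
    "body": "flow",
    "div": "flow",
    "p": "phrasing",
    "a": "transparent",
    "ins": "transparent",
}


@dataclass
class Element:
    name: str
    parent: Optional["Element"] = None


def effective_content_model(element):
    """Resolve a possibly transparent content model by walking up the tree."""
    node = element
    while node is not None:
        model = CONTENT_MODEL[node.name]
        if model != "transparent":
            return model
        node = node.parent
    # Ran out of ancestors: a parentless transparent element accepts flow content.
    return "flow"


p = Element("p", parent=Element("body"))
link_in_p = Element("a", parent=p)
link_in_div = Element("a", parent=Element("div"))

print(effective_content_model(link_in_p))     # phrasing
print(effective_content_model(link_in_div))   # flow
print(effective_content_model(Element("a")))  # flow
```

The loop rather than a single parent lookup matters when transparent elements are nested, such as an ins inside an a inside a p.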
