Information and Services by Zac Hester

Document Structure

If you've learned how to write and edit XHTML/HTML documents by hand and have spent any time learning the more current methods, the reasons to use certain tags in certain situations may seem arbitrary. For instance, there's a big push to get people to stop using tables for columnar layouts. The unfortunate result of this and other shifts in document structure is that designers re-learn their layout markup without learning the reasoning behind it. This results in document structures that are no better than if they had just used the less "acceptable" constructs.

In this write-up I hope to point out some of the reasons behind the new ideas for updating your layout's markup. In addition, I hope to pass on some of the techniques I've used that can make your documents more accessible, easier to style and script, and better suited to anything from text-terminal browsers to modern GUI browsers.

One of my many annoyances with markup purists is that they just shout buzzwords like "accessibility," "alternative media," "graceful degradation," and a bunch of other ideas about what we "should" be doing. What no one seems to spend any time doing is providing simple, straightforward guidelines and examples about how to accomplish these goals. The following is a quick list of ideas and approaches for solving these problems while keeping the work simple enough to realize these goals in a tight-budgeted, deadline-oriented development environment.

Before we begin, it's important to understand that your time will be best spent if we use a markup language that is well suited to these concepts of document structure. To this end, I've chosen to practically ignore HTML and now favor XHTML in all of my web documents. The reason is that XHTML is much simpler at representing true structure. HTML constructs get so bogged down in support for legacy syntax that it's hard to push forward and make our documents accessible to the next generation of browsing applications. If you're still tied to HTML, these examples may still be useful, but you should be aware that the XHTML syntax used here is simpler than what you're used to, and you might need to downgrade some of the tags.

The Document as an Object

If you've ever coded in an object-oriented language such as C++ or Java, you may be familiar with the concept of an object. If not, this might be a bumpy ride. In a nutshell, a data object is a bundle of information and information handling procedures that provide a consistent and expandable interface to that information. A good example is how you use your web browser to access information on the web. The way you navigate to a web site never changes (typing an address or clicking on a link), but the information displayed will change based on your environment.

The reason it's important to understand the concept of an object is that every web document that is read into a web browser has to be represented as an object in the browser software. To make the document expandable and adaptable to practically every possible use, web documents are organized into what's called a tree-structured object. In simple terms, a tree data structure is a way to represent and organize information in a hierarchical fashion. This means that each element of the structure is a descendant or "branch" of a parent element. This idea of branching is why it is called a tree structure. If you've ever read any introductory information about XHTML/HTML, I'm certain you've seen the basic document tree. A quick way to see this tree for yourself is to open a page in the DOM Inspector in Mozilla Firefox.

Figure 1: Mozilla Firefox's DOM Inspector

As shown, each of the document's elements is a descendant (or child) of the root element called "HTML."
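To make the hierarchy concrete, here is a minimal sketch of an XHTML document whose indentation mirrors the tree. The title, heading, and paragraph text are placeholders of my own, not taken from any real page.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- html is the root element; head and body are its two children -->
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Placeholder Title</title>
  </head>
  <body>
    <!-- h1 and p are siblings: both are children of body -->
    <h1>A Heading</h1>
    <!-- em is a child of p, and so a descendant of body and html -->
    <p>A paragraph with <em>emphasized</em> text.</p>
  </body>
</html>
```

Every level of indentation corresponds to one step further down a branch of the tree.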

By representing a document as an organized data structure, we have the ability to operate on this document using all kinds of programming and pathing constructs. The two important uses of this structure in everyday web applications are scripting and styling.

When we need to create scripts to automate or enhance features of our document, we need to be able to operate on the various elements of the document. To meet these needs, it becomes important that we're able to access these elements in a simple, organized programming interface. If you use JavaScript for more than just forms, you're already familiar with the DOM interface to the document's structure.
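As a rough illustration of that programming interface, the sketch below finds one element by its id and then reaches its descendants through the tree. The "news" id and the sample text are hypothetical; the script is placed after the markup so the elements exist by the time it runs.

```html
<div id="news">
  <h2>Headlines</h2>
  <p>First item.</p>
  <p>Second item.</p>
</div>

<script type="text/javascript">
// Find one element by its id, then look only at its descendants.
var box = document.getElementById("news");
var paragraphs = box.getElementsByTagName("p");

// Append the paragraph count to the heading, giving "Headlines (2)".
var heading = box.getElementsByTagName("h2")[0];
heading.appendChild(document.createTextNode(" (" + paragraphs.length + ")"));
</script>
```

Because the script only asks for "the paragraphs inside this div," it keeps working if the div is moved elsewhere in the document.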

In the area of styles, there is still a gulf between designers who use style sheets as replacements for traditional HTML methods and designers who use style sheets to simplify their document structure. This article focuses on the latter use of style sheets.

Proper Structure

As it turns out, proper document structure not only simplifies scripting and styling aspects, but it is also easier to create and update.

The Document From the Top Down

First of all, it's important to think about the document from the top (or root) down. For every web document there is a root element that sits at the top of the hierarchy. Below the root, there are two children: "head" and "body." From these two elements, there are any number of additional children. When we're creating a web document, though, it seems like structure takes a back seat to how the page looks. If we think only about layout and design and not structure, our document will only be useful to visual, GUI-based web browsers. Furthermore, if you plan your structure around sizes of elements and graphics, you limit that page's usefulness on anything but a GUI-based web browser on a certain minimum-size monitor. Now, take that page and display it on a WAP device that might give you a 3" diagonal display, and your layout element that's a mere 200 pixels in width forces your visitors to scroll all over the place.

This is why it's imperative to create a document structure that stands on its own without a lot of visual elements. When you use a table to create a two-column layout, you're forcing your page to take on certain dimensions. When you use properly nested div tags, you create a basic structure that does nothing but logically separate the portions of the document. There's no information about where things should be placed or how wide or tall each element should be. These things should be left up to a style sheet. Additionally, graphical elements can also be added through style sheets.
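A structure-only version of a two-region page might look like the following sketch. The ids "page," "content," and "sidebar" are names I've picked for illustration; nothing in the markup says where either region goes or how wide it is.

```html
<div id="page">
  <div id="content">
    <h1>Article Title</h1>
    <p>The main text of the page.</p>
  </div>
  <div id="sidebar">
    <h2>Related Links</h2>
    <ul>
      <li><a href="#">A placeholder link</a></li>
    </ul>
  </div>
</div>
```

Without a style sheet, this simply reads from top to bottom: the content first, then the sidebar.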

Graphics

This brings us to our first specific area of document structure: graphics. It's important to divide the graphics we use on a web site into two categories: layout and visual enhancement, and illustrative or informational graphics. The latter category is usually simple since it's made up entirely of pictures that we use within the content of our site. It might be the picture of an article's author, a PNG capture of a bar graph, a screenshot, or a schematic. Just think of it as an inseparable component of your information.

Graphics that are used for layout and visual enhancement include those that are used purely for show. This includes things like the neat graphic you use along the top of each web page or the borders of a box that contains little bits of information. A great example of "showy" graphics is the set of little corner images that people use to create the effect of a rounded-rectangular box. For better or for worse, these graphical "goodies" are sometimes considered critical to the presentation of the information. But that doesn't mean that these graphics need to be a part of the document's structure. If you're creating those rounded-corner boxes out of a nine-cell table and four little pictures of quarter circles, it's time to step up to a much more accessible method of creating those effects.

Once you learn some of the little "tricks" to using CSS to eliminate useless document structure, you can plan out your document to expect to use these techniques. Following up with the example of using a rounded rectangle, it's pretty simple to just use common document structure to achieve this effect.

Rounding the Corners with CSS

Let's consider a chunk of a document that we might want to wrap in a neat little box:

<div class="box">
  <h2>The Title of the Box</h2>
  <p>A paragraph inside the box.</p>
  <p>Another paragraph inside the box.</p>
  <div><div>Something along the bottom of the box.</div></div>
</div>

It's important to see that instead of creating any layout, we've only gone out of our way to add structure to the document. In a text-only or very old web browser that doesn't support style sheets, this piece of the document will display perfectly, without the strange side effects that table columns and similar constructs cause in anything but a large, GUI-based browser. Now, in an ideal world, we wouldn't have to go out of our way to double-nest the footer in two divs. But the double divs are necessary to create a rounded box that stretches with its parent container. If you don't want your box to be "stretchy," you don't have to double-nest the divs for the footer.

If you're used to using tags like "table" and "img" to pull off this effect, it might surprise you that our box's markup is so sparse. But that's the whole point: the document structure represents your information without making it depend on layout and graphical elements, when all we should be worried about is displaying information.

So, how do we make this structure turn into a rounded-corner box? We do it with CSS, of course. Once you realize that you're free to set the margins, borders, padding, and background images of each of these structural elements, it only takes a few simple style rules:

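/* The box itself carries the gray fill and the top-left corner. */
div.box {background:url(topleft.png) top left no-repeat #DDDDDD;}
/* The heading sits flush against the top and carries the top-right corner. */
div.box h2 {margin:0;
  background:url(topright.png) top right no-repeat transparent;}
/* Outer footer div: bottom-left corner. */
div.box div {background:url(bottomleft.png) bottom left no-repeat transparent;}
/* Inner footer div: this more specific selector overrides the rule above. */
div.box div div {background:url(bottomright.png) bottom right no-repeat transparent;}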

These rules will place four graphics at the four corners of the box and set the box's background to a light gray (which should be the background color of your corner images). You could also fancy up the borders and apply repeating images to the backgrounds along the sides of the box. If you needed to accomplish this effect on both sides of the box, there would need to be supporting document structure, so it would probably be easiest to add an extra div around the content, and possibly a second if you want it to be independent of the child tags (p). At this point, you would also need more precise selectors, so adding a class to the outside footer div would allow you to select the appropriate elements in your CSS.
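One way that extended version might look is sketched below. The class names "sides" and "footer" and the image files left.png and right.png are my own placeholders, and in a real page the style element belongs in the document's head rather than next to the markup.

```html
<style type="text/css">
div.box {background:url(topleft.png) top left no-repeat #DDDDDD;}
div.box h2 {margin:0;
  background:url(topright.png) top right no-repeat transparent;}
/* Side images tile vertically down each edge (hypothetical files). */
div.box div.sides {background:url(left.png) top left repeat-y transparent;}
div.box div.sides div {background:url(right.png) top right repeat-y transparent;}
/* The footer class keeps the corner rules off the side wrappers. */
div.box div.footer {background:url(bottomleft.png) bottom left no-repeat transparent;}
div.box div.footer div {background:url(bottomright.png) bottom right no-repeat transparent;}
</style>

<div class="box">
  <h2>The Title of the Box</h2>
  <div class="sides"><div>
    <p>A paragraph inside the box.</p>
    <p>Another paragraph inside the box.</p>
  </div></div>
  <div class="footer"><div>Something along the bottom of the box.</div></div>
</div>
```

The plain "div.box div" selector would now match the side wrappers too, which is why the corner rules are tied to the footer class. Treat the margins and exact image alignment as something you would still need to tune for your own graphics.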

The point of taking you through this example is to show how document structure can carry the extra layout overhead, like images that don't pertain to the information that needs to be displayed.

Table Layouts

A large area of controversy right now is whether or not to stop using tables to position sections of the document. The biggest problem with tables is that they are intended to display information that needs to be related through a table. An example is a list of names and addresses. But, since tables provide a reasonable amount of control over each of the cells, it has become overly common to use a table to lay out sections of a document.
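For contrast, here is a sketch of a table doing the job it was designed for: relating names to addresses. The people and addresses are invented placeholders.

```html
<table summary="Names and addresses of contacts">
  <thead>
    <tr><th scope="col">Name</th><th scope="col">Address</th></tr>
  </thead>
  <tbody>
    <tr><td>Jane Example</td><td>123 Placeholder St.</td></tr>
    <tr><td>John Sample</td><td>456 Demo Ave.</td></tr>
  </tbody>
</table>
```

Here the rows and columns mean something: each cell belongs to both a person and a kind of information, which is not true of a sidebar that merely sits next to an article.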

Currently, there exist two major types of solutions to the problem of positioning a document's layout: correct but less universal, and incorrect but universal. The big problem is that the most widely used browser to date (Microsoft's Internet Explorer) does not support the full CSS specification. If you use the proper document structure and change the way it lays out using the appropriate CSS, Internet Explorer doesn't even begin to support the basic display rules. This means you either have to make the document less available to other devices and more available to Internet Explorer, or choose another, less correct method of solving the layout problem.

At this point, there are many ways to "hack" the table-less layout. If you do a few Google searches, you'll find several without looking too hard. No matter what solution you decide to use, keep in mind that your solution should involve using structure within the document that is appropriate for the information being displayed and altering the layout using style sheets.
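As one example of what such a solution can look like, the sketch below uses floats to place two structural divs side by side. It is only one of several common approaches, the ids and widths are assumptions of mine, and it may still need browser-specific adjustments that are not shown here. In a real page the style element belongs in the document's head.

```html
<style type="text/css">
/* Widths are percentages so the columns scale with the window. */
#content {float:left; width:70%;}
#sidebar {float:right; width:28%;}
/* The footer drops below whichever column is taller. */
#footer {clear:both;}
</style>

<div id="content"><p>Main content comes first in the source.</p></div>
<div id="sidebar"><p>Secondary content.</p></div>
<div id="footer"><p>Footer text.</p></div>
```

The markup still describes only what each section is. Remove the style rules and the page falls back to a single readable column.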

Conclusion

This article was intended to give you a better feel for the reasons behind using proper structure to represent your information and relying on supplementary methods to affect how that information is displayed in the final browser.

This level of understanding is necessary so that we don't replace our old ways with new, equally wrong ways of representing our documents. It's not enough to simply replace all the tags that are named "table" with tags that are named "div" and use a lot of CSS tricks. The underlying structure has to change. It's time to start thinking about your document in terms of the information you need to convey, not the way that information has to be arranged and visually enhanced.

Last Modified: December 19, 2005