===================================
 Cascading Style Sheets (Overview)
===================================

Sebastian Geerken

Apr 2002: first version, posted to the developer's list
Oct 2003: revised and extended, while implementing phase 1,
          adopted as developer's documentation
Nov 2003: small corrections


About
=====

This is an overview of the implementation of CSS in dillo. It does not
cover the internals of the individual modules, which are described in
separate documents. Nor does it cover how particular attributes are
rendered, since this is the task of the Dw module. Most attributes can
already be rendered; some of them (e.g. floats, fixed positions) still
have to be implemented, which can be done bit by bit.


Modus Operandi
==============

The implementation of CSS is split into two phases. After phase 2,
dillo will hopefully support full CSS, and it should be simple to add
interesting features like XML/CSS parsing. Phase 1 concentrates on the
general structure, and will have the following limitations:

1. There will be a distinction between simple and complex CSS
   properties. In this context, a complex attribute is one whose
   change (after it has already been rendered) makes structural
   changes in the widget tree necessary. E.g., when the attribute
   "display" changes from "inline" to "block", some words of the
   parent DwPage have to be replaced by a newly created DwPage, which
   in turn contains the replaced words. Changing simple attributes
   like fonts, colors etc., on the other hand, will at most involve
   recalculating sizes. [1]

   In phase 1, the Doctree module will support the construction of
   elements with complex properties, but not changes to them. Since
   HTML rendering and CSS processing are done asynchronously (see
   below), this means that complex properties are only allowed in the
   user agent and the user style sheet.

2. The CSS module will only handle style sheets with very simple
   selectors; the first focus is on a fast implementation. Supported
   selectors only regard the element in question, and do not support
   attributes, only (HTML) classes.

In phase 2, dillo will get a complete CSS engine, either written from
scratch, or an already existing one (e.g. RCSS by Raph Levien).


The User's View
===============

An important goal is asynchronous HTML/CSS parsing: when the HTML
parser reads a tag referring to an external style sheet, it continues
to render the document _without_ the style sheet, while the style
sheet (if it is not already in the cache) is retrieved and parsed in
parallel, and then applied to the part of the document rendered so
far. If the time difference is large enough, the user will notice a
sudden change of colors, fonts, etc., but will be able to read the
content with less delay (which may be several seconds).


Overview
========

The following diagram shows the associations between the data
structures, and their multiplicities, for parsing HTML documents. Note
that for every document (represented by SgmlDoc), there is one
document tree (Document) and one CSS context.

        +------------+       +-------------+
        | CssContext |< - - -| Css_doctree |
        +------------+       +-------------+
                 1 ^                ^
                   |                |
  , - - - - - - - -'                `- - - - - - - - - - - .
1 |                                                        |
                                                           V
+---------+ 1                             1 +-----------------+
| SgmlDoc | ------------------------------> | DoctreeDocument |
+---------+                                 +-----------------+
     ^ 1                                             | 1
     |                                               |
     | 0..1                                          | *
+------------+ 1    * +-----------+ 0..1   1 +----------------+ 0..1
| SgmlParser | -----> | SgmlState | -------> | DocTreeElement |---.
+------------+        +-----------+          +----------------+   |
      .                     .                           * |       |
     /_\                   /_\                            `-------'
      |                     |
      |                     |
+------------+        +-----------+
| HtmlParser |        | HtmlState |
+------------+        +-----------+
      | 0..1
      |
      V 1
  +--------+
  | HtmlLB |
  +--------+

For more information about the SGML and the HTML parser, see
"SGML.txt".
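
As a rough illustration of the multiplicities in the diagram, the
following C sketch models the left-hand side: one SgmlDoc refers to
exactly one CssContext and one document tree, a parser refers to
exactly one SgmlDoc, and HtmlParser extends SgmlParser. All field and
function names (css_context, doctree, sgml_doc_new, html_parser_new
and so on) are assumptions made for this example and are not taken
from the dillo sources; the real structures carry far more data.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for the boxes in the
 * diagram. Field and function names are illustrative assumptions,
 * not the actual dillo definitions. */
typedef struct { int n_style_sheets; } CssContext;
typedef struct { int n_elements; } DoctreeDocument;

/* One SgmlDoc refers to exactly one CssContext and one document tree. */
typedef struct {
   CssContext *css_context;
   DoctreeDocument *doctree;
} SgmlDoc;

/* A parser refers to exactly one document, while a document has 0..1
 * parsers, so the link is stored on the parser side only. */
typedef struct {
   SgmlDoc *doc;
} SgmlParser;

/* "Inheritance" in plain C: the base struct is the first member, so a
 * pointer to HtmlParser may be converted to a pointer to SgmlParser. */
typedef struct {
   SgmlParser sgml;
   int html_specific_data;
} HtmlParser;

SgmlDoc *sgml_doc_new(void)
{
   SgmlDoc *doc = malloc(sizeof *doc);
   if (doc == NULL)
      abort();
   doc->css_context = calloc(1, sizeof *doc->css_context);
   doc->doctree = calloc(1, sizeof *doc->doctree);
   if (doc->css_context == NULL || doc->doctree == NULL)
      abort();
   return doc;
}

void sgml_doc_free(SgmlDoc *doc)
{
   free(doc->css_context);
   free(doc->doctree);
   free(doc);
}

HtmlParser *html_parser_new(SgmlDoc *doc)
{
   HtmlParser *parser;

   assert(doc != NULL);   /* a parser always belongs to a document */
   parser = calloc(1, sizeof *parser);
   if (parser == NULL)
      abort();
   parser->sgml.doc = doc;
   return parser;
}

/* Destroying the parser leaves the document, its tree and its CSS
 * context untouched. */
void html_parser_free(HtmlParser *parser)
{
   free(parser);
}
```

Keeping the link on the parser side means the document needs no
clean-up when parsing ends: the parser is simply freed, and the CSS
context and document tree stay reachable through SgmlDoc.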
In short, the purposes are:

- HtmlParser exists as long as the HTML document is being parsed. It
  inherits from SgmlParser, which contains all data for the general
  SGML/XML parsing, while HtmlParser adds some HTML-specific data.

- SgmlDoc exists as long as a document is shown (partly, while the
  SgmlParser still exists, or fully, after the SgmlParser has been
  destroyed). CSS operations are related to this structure, since CSS
  documents may be applied even after the document has been retrieved
  fully, when the SgmlParser does not exist anymore.

- HtmlLB exists as long as SgmlDoc exists. It adds some more data like
  links and forms.

How CssContext works in detail is described in "CSS.txt"; its purpose,
and details on the other structures, are described below. A further
role is played by the module "Css_doctree", which provides no data
structures, only some functions to prepare CSS values for the document
tree.


CssValue and DwStyle
====================

Style attributes are handled in two ways: CssProperty/CssValue and
DwStyle. The first is used by the CSS module: CssProperty is an
enumeration type, and CssValue a union, which represents values
exactly as they have been parsed (i.e. the value may only depend on
the property itself, not e.g. on the context, see below).

Both the document tree and the dillo widget use the structure DwStyle,
which is created by the module "Css_doctree" when the attributes for
an element are evaluated. The representation of values generally
differs from CssValue; a particular property falls into one of the
following categories (for the exact terminology, see [CSS2] chapter
6.1):

1. Absolute values are represented directly. Examples are absolute
   lengths. The value "auto" is handled the same way.

2. Some relative values are computed immediately; this may depend on
   attributes of the parent element. Examples are relative line
   heights, i.e. "line-height: 150%" will be computed into an absolute
   (pixel) value.

3. Other relative sizes are represented as such in DwStyle; examples
   are relative widths and heights.

Whether a specific attribute falls into category 2 or 3 is determined
by two factors:

1. If the attribute value is independent of values whose changes
   affect only the level of Dw (important: window size), it can, for
   simplicity, be put into category 2. Otherwise, it must belong to
   category 3. The latter may not be inherited; for the reason, see
   the next point.

2. Since only *computed* values may be inherited, attributes whose
   values are inherited may not be part of category 3, since Dw will
   not be able to handle them correctly.


The Document Tree
=================

The document tree has two purposes:

1. representation of the document structure, needed (in phase 2) for
   the evaluation of CSS selectors, and

2. near-complete encapsulation of the dillo widget. The SGML parser
   accesses only the document tree and no longer Dw; only the HTML
   parser must, in some cases, refer to Dw directly.

The interface is similar to a small subset of the Document Object
Model (see [DOM2]) [2], and provides methods for the following
purposes:

1. construction of nodes (mainly elements and text), and adding them
   to other nodes,
2. examining the structure (e.g. for evaluating CSS selectors),
3. assigning style attributes,
4. drawing, and
5. changing the pseudo class.

Some notes about the latter three points:

The document tree is in most cases able to construct and access the Dw
structures simply by style attributes. E.g., if the attribute
"display" has the value "table", it "knows" that it must create a
DwTable and add it to the DwPage associated with the parent node. This
way, the SGML/HTML parser can be simplified: much functionality can be
replaced by a user-agent-defined style sheet, as in [CSS2] appendix A.
Since this is not possible in all cases, two back doors are kept open:

1. It is possible to add a special type of element to the tree, with
   a specified DwWidget.
   The tag is processed this way.

2. Both DwStyle and CssProperty/CssValue are extended by non-standard
   attributes when necessary. For better distinction, they are
   prefixed with "x_". Examples are "x_link" and "x_colspan".

An element may have a pseudo class, which is used in the style
evaluation (see below). It is changed, e.g., when the user clicks on a
not yet visited link: the pseudo class then switches from "link" to
"visited".


Changes in Dw
-------------

(This is only relevant for phase 2.)

Dw will certainly change for CSS, but no attempt will be made to
overcome some restrictions that are inherent to the basic design,
since this would make Dw over-complicated. Instead, the complexity is
divided between both modules, Dw and the document tree. Some
restrictions to consider:

- The allocation (this is the space a widget allocates at a time) of a
  dillo widget is always rectangular, so that e.g. an inline section
  cannot be represented as a widget, but only as a part of a widget.
  (Furthermore, a dillo widget is rather complex, so that the number
  of widgets should be limited.)

- The allocation of a child widget must be within the allocation of
  the parent widget. In some cases, this leads to a widget tree whose
  order differs from the document tree; e.g. since the HTML document
  snippet may be rendered like this:

    - - - - - - - - - - - - - - - - - - - - - - - - - .
   | * Some text.
     * Some more and        - - - - - - - - - - - - -.|
   |   longer text.        |Some longer text, so that
     * Final text.          the effect described     ||
   ` - - - - - - - - - - - |above in this passage can '
                            be demonstrated.         |
                           ` - - - - - - - - - - - -

  Dw will render this as a DwPage (for the