Tag soup is HTML written without regard for the rules of HTML structure and semantics (HTML being the markup language in which Web pages are written). It typically results when an author uses HTML to control presentation rather than to describe the meaning of the content.
Tag soup is characterized by common authoring mistakes such as malformed HTML tags, improperly nested HTML elements, unescaped special characters (especially ampersands (&) and less-than signs (<), which should be written as the character entities &amp; and &lt;), and the use of presentational HTML elements and attributes to create visual effects without respect for their implied meaning (that is, against their semantic purpose; see divitis).
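Two of these mistakes can be illustrated with a minimal Python sketch. It assumes the standard library's html.parser module as a stand-in for a real HTML checker; the names NestingChecker and find_nesting_problems are hypothetical and exist only for this illustration. The checker reports end tags that do not match the innermost open element, and html.escape shows how ampersands and less-than signs are written as character entities.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take an end tag, so they are not tracked as open.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class NestingChecker(HTMLParser):
    """Hypothetical helper: records end tags that do not match the
    innermost open element. Unclosed elements are not reported."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # stack of currently open element names
        self.problems = []       # end tags that arrived out of order

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_elements.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/> opens and closes in one step.
        pass

    def handle_endtag(self, tag):
        if self.open_elements and self.open_elements[-1] == tag:
            self.open_elements.pop()
        else:
            self.problems.append(tag)


def find_nesting_problems(markup):
    """Return the list of mismatched end tags found in the markup."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


# </b> arrives while <i> is still the innermost open element.
print(find_nesting_problems("<b><i>bold italic</b></i>"))
# Properly nested markup produces no problems.
print(find_nesting_problems("<b><i>bold italic</i></b>"))
# Escaping turns special characters into character entities.
print(escape("Fish & Chips < 5", quote=False))
```

A real validator checks far more than nesting order; this sketch only makes the notion of "improperly nested" concrete.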
Although often thought of as typical of private, semi-professional, or hobbyist Web sites, tag soup is also generated by many professional Web page layout programs and written by hand by many professional Web developers, including for some of the highest-profile sites.
Tag Soup is also the name of a Java library for converting HTML, including tag soup, into well-formed XHTML. See the Tag Soup project home page.
One possible cause of the proliferation of tag soup may be that until the release of Macromedia Dreamweaver MX 2003, no WYSIWYG editor produced valid and well-formed code.
Another factor in the popularity of tag soup is that most mainstream Web browsers currently in use tolerate code that is invalid or not well-formed without raising any errors. Thus, testing Web pages in current mainstream browsers does not reveal whether they are valid or well-formed.
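This tolerance can be imitated with a lenient parser. The sketch below assumes Python's html.parser module, which is not a browser engine but is similarly forgiving; EventRecorder and parse_leniently are hypothetical names. The input has an unclosed <b>, a second <p> opened inside the first, and a stray </i>, yet parsing finishes without any error being raised.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: records every tag and text run the parser sees."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("text", data))


def parse_leniently(markup):
    """Feed the markup to the lenient parser and return what it saw."""
    recorder = EventRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.events


# Unclosed <b>, a <p> opened inside another <p>, and a stray </i>.
tag_soup = "<p><b>unclosed bold<p>second paragraph</i>"

# No exception is raised; the parser simply reports what it encountered.
for event in parse_leniently(tag_soup):
    print(event)
```

Because nothing fails, an author who checks a page only in this way gets no signal that the markup is broken.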
To cope with this, most current mainstream Web browsers can render Web pages in more than one mode, including a "quirks mode". A browser generally selects quirks mode when a page has no document type declaration, or an old or incomplete one, as is common for tag soup. Quirks mode lets the browser render the page the way older browsers would have rendered it. The problem of tag soup is thus carried forward, because each newly released browser must still be able to render the existing Web.
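In practice, browsers pick the rendering mode by inspecting the document type declaration at the top of the page. The following is a deliberately simplified sketch of that idea, not the rule any browser actually implements: it only checks whether the page begins with a doctype at all, whereas real browsers also examine the doctype's public and system identifiers. The function name guess_rendering_mode is hypothetical.

```python
import re

# Matches a document type declaration at the very start of the page,
# allowing leading whitespace and any letter case.
DOCTYPE_PATTERN = re.compile(r"\s*<!DOCTYPE\s+html\b", re.IGNORECASE)


def guess_rendering_mode(document):
    """Simplified assumption: a page with a doctype gets standards mode,
    a page without one gets quirks mode. Real browsers apply more
    detailed rules based on the doctype's identifiers."""
    if DOCTYPE_PATTERN.match(document):
        return "standards"
    return "quirks"


print(guess_rendering_mode("<html><body>No doctype here</body></html>"))
print(guess_rendering_mode("<!DOCTYPE html>\n<html><body>Hello</body></html>"))
```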
While most mainstream Web browsers can render tag soup more or less as the author intended, many other user agents cannot. For example, assistive Web browsers used by people with disabilities may have problems rendering the page. Other user agents that may have problems with malformed code, or with code not used for its intended purpose, include search engine spiders and Web browsers on hand-held devices.
XHTML, being an application of XML, requires documents to be well-formed, and a browser that parses a page as XML reports an error instead of guessing at the author's intent. However, XHTML 1.0 states that XHTML may be interpreted by current Web browsers as HTML if it follows the compatibility guidelines in Appendix C of the XHTML 1.0 Recommendation. At this time, the popular Web browser Internet Explorer is unable to interpret XHTML documents as XML, so most current XHTML pages are served to browsers as HTML, using the MIME type "text/html".
Because XHTML 1.0 served to browsers as HTML is parsed as if it were badly formed HTML, it is affected by tag soup in the same way as HTML.
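The effect of the MIME type can be sketched as follows. This is an illustration under stated assumptions, not how any browser is implemented: Python's xml.etree.ElementTree stands in for a strict XML parser, html.parser stands in for a lenient HTML parser, and is_accepted is a hypothetical function. The type "application/xhtml+xml" is the MIME type registered for XHTML served as XML.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def is_accepted(document, mime_type):
    """Return True if a parser chosen by MIME type accepts the document."""
    if mime_type == "application/xhtml+xml":
        # Strict path: an XML parser rejects anything not well-formed.
        try:
            ET.fromstring(document)
        except ET.ParseError:
            return False
        return True
    if mime_type == "text/html":
        # Lenient path: the HTML parser tolerates malformed markup.
        parser = HTMLParser()
        parser.feed(document)
        parser.close()
        return True
    raise ValueError("unsupported MIME type: " + mime_type)


# <br> is never closed, so this document is not well-formed XML.
sloppy = ('<html xmlns="http://www.w3.org/1999/xhtml">'
          "<body><p>one<br>two</p></body></html>")
# The same document with <br /> is well-formed.
tidy = sloppy.replace("<br>", "<br />")

print(is_accepted(sloppy, "text/html"))
print(is_accepted(sloppy, "application/xhtml+xml"))
print(is_accepted(tidy, "application/xhtml+xml"))
```

The same bytes are accepted or rejected depending only on the declared type, which is why well-formedness errors in XHTML go unnoticed as long as pages are served as "text/html".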
Versions of XHTML after 1.0 are not intended to be served to browsers as HTML. If implemented according to the recommendations, this should prevent the problem of tag soup once XHTML served as XHTML is supported by all major browsers.
This article is licensed under the GNU Free Documentation License.
It uses material from the
"Tag soup".