
About this book

What is the difference between a URL and a URI? How does HTTP fulfill its task? Why do we need XML? What is it, and will it eventually replace HTML? This book answers these questions and a host of others that attentive inhabitants of cyberspace may ask. The book is, of course, not just a glossary of abbreviations and frequently used terms. Rather, it is a comprehensive yet succinct presentation of the technology used in the World Wide Web. It is surprising to note that, even though hundreds of books discussing the Web have been published, none so far has thoroughly explained the inner workings of this popular Internet application, which is so simple to use and yet so complex when it comes to really understanding what is going on inside. The target audience of this book is perhaps best described by how it was first used by the author himself: a draft version was chosen as the supporting text for a class of practitioners who attended a continuing education course on WWW technology. These were people who knew what the Web is and how it may be used for business, but needed to know how the technology works. While planning this course, the author found that no suitable book was on the market and decided to write one himself.

Table of Contents

Frontmatter

Introduction

Introduction

Abstract
The World Wide Web (WWW), in this book simply called “the web”, is a set of technologies implementing a distributed hypermedia document model based on the Internet. The web made the Internet as popular as it is today. This book describes all basic concepts of web technology as well as many additional and new concepts.
Erik Wilde

Fundamentals

1. Fundamentals

Abstract
Although this book is not intended as an introduction for readers without some background as web users or without some knowledge of the basics of computer networks and the Internet, we give a very short introduction to the concepts which form the foundation of web technology. We also take a short look at the web’s history: where it came from and how it started. For a non-technical and very informative overview of the Internet and its history, the excellent book by Comer [48] is a good starting point.
Erik Wilde

Basics

Frontmatter

2. Universal Resource Identifier (URI)

Abstract
The basic design of the web, as introduced in chapter 1, is that of a distributed hypermedia system. The two main architectural constructs at this level are the individual pieces of information or information resources (which in most cases are web pages), and the links between them, connecting these individual pieces of information to form an interconnected web of information resources. While we discuss the actual pieces of information in later chapters of the book, we first take a closer look at what these links are and how they work.
Erik Wilde

3. Hypertext Transfer Protocol (HTTP)

Abstract
Basically, the World Wide Web is a distributed hypermedia system, with information stored in the form of web pages, which are linked to each other using web links (better known by their official names URI or URL). This property of the web makes it necessary to have a means of accessing remote information from any system retrieving information from the web’s database (which is formed by all web pages available worldwide). This method of accessing remote information is the Hypertext Transfer Protocol (HTTP), which is one of the key components of the web. The underlying model of the protocol is that of a client/server architecture, where the client wants to retrieve some information from the web and contacts a web server to do so.
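The client/server exchange described in this abstract can be illustrated with a minimal sketch. It assumes the HTTP/1.0 message format and only builds and parses messages, without opening a network connection. The function names `build_get_request` and `parse_status_line` and the host `example.org` are hypothetical and do not come from the book.

```python
from typing import Tuple


def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the bytes a client would send for a minimal HTTP/1.0 GET request."""
    lines = [
        f"GET {path} HTTP/1.0",  # request line: method, resource, protocol version
        f"Host: {host}",         # header naming the server being contacted
        "",                      # empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> Tuple[str, int, str]:
    """Split the first line of a server response into version, status code, reason."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    print(build_get_request("example.org", "/index.html"))
    print(parse_status_line(b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"))
```

In a real client, the request bytes would be written to a TCP connection to the server, and the response would be read back from the same connection.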
Erik Wilde

4. Standard Generalized Markup Language (SGML)

Abstract
In 1986, some years before the web was invented, a language called Standard Generalized Markup Language (SGML) was defined and standardized in ISO international standard 8879 [110]. In order to better understand HTML and XML, it is advantageous to know SGML, and to know a little bit about its concepts. DeRose [62] answers many of the questions which may arise when taking a closer look at SGML.
Erik Wilde

5. Hypertext Markup Language (HTML)

Abstract
The most visible part of web technology for users is the Hypertext Markup Language (HTML), the language which is used to design web pages. Normally, a web page is displayed in a formatted style, which means that the browser interprets the HTML page to generate a formatted presentation. However, since HTML has a limited number of constructs, pages may look similar because they use the same HTML constructs. Since its invention in 1990, HTML has undergone many revisions and extensions, and the current version (4.0) is far more powerful than the first one. The design goals of HTML, as stated in early publications about the web, can be summarized as follows:
  • Richness: HTML should be rich (i.e., powerful) enough to support a large number of possible applications. In order to achieve this goal, HTML must be general enough to be used for many different application areas.
  • Simplicity: On the other hand, HTML should be simple to use, so that applying it is easy and many authors are encouraged to use it. For an average user (i.e., someone who is not a computer scientist), it should be easy to understand the concepts of HTML and to create HTML pages.
Erik Wilde

Advanced

Frontmatter

6. Cascading Style Sheets (CSS)

Abstract
In the previous two chapters about SGML and HTML, a lot has been said about the separation of content and presentation, about HTML as a language intended to carry content and not presentation aspects. For this idea to be realistic, there has to be a mechanism for specifying the presentation aspects of an HTML document. This mechanism is called Cascading Style Sheets (CSS) and is discussed in this section.
Erik Wilde

7. Extensible Markup Language (XML)

Abstract
Seeing that HTML implements only one particular document model, the Extensible Markup Language (XML) has been defined, making it possible to use documents of application-specific document types, which can be created, distributed, and interpreted in an XML environment.
Erik Wilde

8. Scripting and Programming

Abstract
The web is primarily focused on delivering content of various types using standardized transport mechanisms and data types. As a generalization of this architecture, there are also a number of technologies for defining active components, such as scripts and programs. The most widespread technology in this area is Dynamic HTML (DHTML), which uses a scripting language to add active functionality to otherwise passive HTML pages. In most cases, this functionality consists of script portions that are triggered by user actions such as mouse movements and button clicks. Scripting languages are discussed in section 8.1.
Erik Wilde

9. HTTP Servers

Abstract
Although HTTP as a protocol describes the way an HTTP server should interpret and service HTTP requests, there are many possible ways to implement this in a program. A very simple and intuitive way is described by Hethmon [96], who uses the implementation of a simple HTTP server to explain HTTP as a protocol. In fact, the first HTTP server was a very small program that mapped the name of the requested resource onto a file name and sent the contents of the file as the reply (the very first version of HTTP, used at that time, was HTTP/0.9, as described in section 3.1.1). However, the success of the web and the appearance of web servers hosting a large number of documents, as well as the increasing complexity of HTTP (adding new methods and a multitude of header fields), made the proper configuration and efficient management of an HTTP server a rather complex task.
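The idea of a server that maps a requested resource name onto a file name and returns the file contents can be sketched as follows. This is a minimal illustration, not the code of the first HTTP server or of Hethmon's implementation. It assumes an HTTP/0.9-style request line of the form `GET /path`; the function name `handle_request`, the plain-text error replies, and the check against paths outside the document root are assumptions added for the example.

```python
from pathlib import Path


def handle_request(request_line: str, doc_root: Path) -> bytes:
    """Map an HTTP/0.9-style request ("GET /path") onto a file below doc_root."""
    parts = request_line.strip().split()
    if len(parts) != 2 or parts[0] != "GET":
        return b"Bad request\n"  # assumed error text, not defined by the protocol

    root = doc_root.resolve()
    # Resource name -> file name: strip the leading slash and join with the root.
    target = (root / parts[1].lstrip("/")).resolve()

    # Refuse anything that resolves outside the document root (e.g. "/../x").
    if root not in target.parents:
        return b"Not found\n"
    if not target.is_file():
        return b"Not found\n"

    # The reply is simply the file contents, with no status line or headers.
    return target.read_bytes()
```

A complete server would wrap this function in a loop that accepts TCP connections, reads one request line per connection, writes the returned bytes, and closes the connection.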
Erik Wilde

10. Miscellaneous

Abstract
In addition to the basic architectural concepts described in the main chapters of this book, many additional technologies and concepts for the web infrastructure have been defined and implemented. Due to the speed of development, many new technologies and concepts appear every month. This chapter describes some of the more important components and concepts.
Erik Wilde

11. Related Technology

Abstract
The technologies described in this book are the most important ones for the web. However, they are only a subset of the technologies which are used for the web or in some web applications today. It is possible to divide these remaining technologies (which are not the main topic of this book) into two categories:
  • Underlying technologies: Although the web technologies described in this book are sufficient to describe how the web works on an abstract level, many other technologies are necessary to apply web technologies to build a working system. For example, computer and network technologies, ranging from CPUs to optical fiber networks, have to be used to provide the infrastructure on which web technologies are implemented.
    Because these technologies are on a lower level of abstraction than web technologies (which simply assume that, for example, data can be sent reliably from one computer to another), they are most often referred to as underlying technologies.
  • Complementary technologies: Although the web technologies described in this book are sufficient to implement a complete web browser, the trend is towards integrated solutions. The leading browsers contain much more than just web technologies; they also incorporate a number of other access protocols for information resources (such as FTP and gopher), as well as support for electronic mail and Usenet news.
Erik Wilde

Backmatter
