
About this book

Although the World Wide Web is enjoying enormous growth rates, many Web publishers have discovered that HTML is not up to the requirements of modern corporate communication. For them, Adobe Acrobat offers a wealth of design possibilities. The close integration of Acrobat in the World Wide Web unites the structural advantages of HTML with the comprehensive layout possibilities of Portable Document Format (PDF). On the basis of practical examples and numerous tricks, this book describes how to produce PDF documents efficiently. Numerous tips on integrating Acrobat into CGI, JavaScript, VBScript, Active Server Pages, search engines, and so on make the book a mine of information for all designers and administrators of Web sites.

Table of contents

Frontmatter

The Websurfer’s Point of View

Frontmatter

1. HTML and PDF

Abstract
Hypertext Markup Language (HTML) and Portable Document Format (PDF) are presentation formats: they describe documents which usually only need to be read and not changed or edited. This might seem rather strange for a file format, but it corresponds to the age-old publishing model: a newspaper reader does not usually need to be able to alter the contents of his favorite paper. Publishing, the distribution of information to a small or large circle of readers, is for the most part a one-way street. This does not exclude the possibility of interactive applications and the collaborative creation or editing of documents, but the operator of a Web site, not its readers or users, is the one responsible for its contents.
Thomas Merz

2. PDF in the Browser

Abstract
Every Web browser can handle several file formats. Most browsers designed for Graphical User Interfaces (GUIs) can display HTML, text, GIF, and JPEG files in the program window. File formats which the browser does not “understand” itself can still be handled if the user configures an external program. The browser starts these programs as required and feeds them the data from the server. The browser and the server negotiate the type of data using the MIME protocol. MIME stands for “Multipurpose Internet Mail Extensions” and is a classification scheme for a multitude of file formats. When the server sends a particular MIME classification along with a requested file, the browser can do one of several things:
  • Use its own program code to process the data and display the file in its program window.
  • Start an external program configured for that MIME type, which processes the data and displays the result.
  • If a browser plugin is registered for that MIME type, the plugin is used to process the data.
  • If the browser does not recognize the MIME type at all, the user can save the file to disk for later use.
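The four cases above can be sketched as a small dispatch function. This is only an illustration: the tables of built-in types, plugins, and helper programs are hypothetical, and the order in which a plugin is preferred over an external program is an assumption rather than something the abstract states.

```python
from typing import Dict, Set

# Hypothetical browser configuration, not taken from the book.
BUILTIN_TYPES: Set[str] = {"text/html", "text/plain", "image/gif", "image/jpeg"}
PLUGINS: Dict[str, str] = {"application/pdf": "Acrobat plugin"}
HELPERS: Dict[str, str] = {"application/postscript": "external PostScript viewer"}


def choose_handler(content_type: str) -> str:
    """Describe how a browser might treat data with the given MIME type."""
    # Drop parameters such as "; charset=..." and normalize the case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in BUILTIN_TYPES:
        return "display with built-in code"
    if mime in PLUGINS:  # assumed to take precedence over external helpers
        return f"hand over to {PLUGINS[mime]}"
    if mime in HELPERS:
        return f"start {HELPERS[mime]}"
    # Unknown type: the only remaining option is to save the data.
    return "offer to save the file to disk"
```

For example, `choose_handler("application/pdf")` selects the plugin entry, while an unregistered type falls through to the save-to-disk case.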
Thomas Merz

The Publisher’s Point of View

Frontmatter

3. Planning PDF Documents

Abstract
This book does not advocate the replacement of HTML with PDF, but a reasonable combination of both formats. A Web designer ought to consider which format will best fulfill the requirements for each individual job. Each problem has to be solved within the framework of prevailing logistical, financial, and technical conditions. Usually, it is necessary to consider not only the final goal, but also the origins of the documents: does the data to be made available on the Web originate from a database, from a conventional publishing process as a by-product, or even from archived paper documents? Which format will meet the expectations of the target group better and allow the data to be presented more easily? The main role of the document should also be clear: are the files just a method for transferring data which the user will probably print out anyway, or are they purely for online use? It can make sense to offer two versions in parallel, which of course increases the resources needed.
Thomas Merz

4. Creating PDF Files

Abstract
There are several ways of creating PDFs. They differ in the software required, functionality, and simplicity. In the following section I would like to introduce the most important variants.
Thomas Merz

5. PDF Support in Applications

Abstract
In this chapter we will examine creating PDF files with various text, DTP and graphics programs. Of course, almost every program is “PDF capable” if you create PostScript output and run it through Acrobat Distiller. However, adding the hypertext features to the converted PDF by hand in Exchange requires a lot of effort. If there are a large number of documents, or if the files need to be updated often, it is a good idea to specify as much PDF-relevant information as possible in the original document file. Ideally, these features (for example URLs for links to Web sites) are converted automatically to the corresponding PDF features. Adobe has defined the “pdfmark” PostScript operator which allows hypertext elements to be described in the PostScript code. Using pdfmark instructions, the PostScript data can be prepared so that Distiller will generate PDF files including links, bookmarks, and other hypertext elements, without the need for any post-processing in Exchange. A PDF-savvy program has two characteristics:
  • It offers access in the user interface to functions for generating those hypertext features which are supported. These include internal links or URLs included in the document. Many DTP programs support such functions by default.
  • When it outputs PostScript code, suitable pdfmark instructions to define these properties are included automatically. Programs which do this are still uncommon; several applications can be enabled with plugins or other extensions.
Thomas Merz

6. pdfmark Primer

Abstract
This chapter is devoted to pdfmark programming. The pdfmark operator is a PostScript extension which is only implemented in Acrobat Distiller (as opposed to PostScript printers). Using this operator, many non-layout-related features of a PDF file can be defined in the original document or in the corresponding PostScript code. Why bother with pdfmarks since you can implement these features in Acrobat Exchange? Compared with adding hypertext features manually in Exchange, the pdfmark method has a big advantage in that you don’t have to redo all links and other special effects when document changes require generating a new PDF version. Instead, the hypertext features are automatically generated when distilling the PostScript file. It is very important to know that pdfmark instructions are processed only by Acrobat Distiller, not by PDF Writer.
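To make the idea concrete, the following Python sketch generates two pdfmark instructions as text that could be placed in a PostScript file before distilling. The helper names are hypothetical, and the key names (/DOCINFO, /ANN, /Rect, /Action) follow the conventions of Adobe's pdfmark reference as commonly documented; they should be checked against that reference rather than taken from this sketch.

```python
from typing import Tuple


def ps_string(text: str) -> str:
    """Quote text as a PostScript string literal, escaping \\, ( and )."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def docinfo_pdfmark(title: str, author: str) -> str:
    """Build a pdfmark line that sets the document information fields."""
    return f"[ /Title {ps_string(title)} /Author {ps_string(author)} /DOCINFO pdfmark"


def uri_link_pdfmark(rect: Tuple[int, int, int, int], url: str) -> str:
    """Build a pdfmark line for a Web link covering the rectangle `rect`."""
    x1, y1, x2, y2 = rect
    return (
        f"[ /Rect [{x1} {y1} {x2} {y2}] "
        f"/Action << /Subtype /URI /URI {ps_string(url)} >> "
        "/Subtype /Link /ANN pdfmark"
    )


if __name__ == "__main__":
    # Example values are placeholders for illustration only.
    print(docinfo_pdfmark("Sample document", "A. N. Author"))
    print(uri_link_pdfmark((70, 550, 210, 575), "http://www.example.com/"))
```

Because the instructions are regenerated with the PostScript output each time, a changed document only needs to be distilled again; nothing has to be redone by hand.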
Thomas Merz

7. PDF Forms

Abstract
Forms are an important new feature introduced with Acrobat 3.0 and PDF 1.2. They offer huge potential for interactive applications. First, Acrobat forms emulate the well-known (and not very popular) paper forms containing descriptive text and input fields where the user may enter data. The filled-out form can be printed afterwards. This simplifies using conventional forms but doesn’t yet add substantial value. Electronic forms are more interesting if you no longer need to print them out. Instead, the form contents are stored in a file on disk or transferred over the intranet or Internet. Web forms gained much popularity by making use of the relevant HTML features. Similarly, Acrobat documents may contain form fields, the contents of which are sent from the Web browser to the Web server. As you would expect from PDF, Acrobat forms offer many more design and layout features than HTML forms.
Thomas Merz

8. PDF in HTML Pages

Abstract
When PDF files are displayed in a Web browser’s window, there are two distinct variants:
  • The Web server sends a PDF file after a specific URL is typed in or called by a link. The Acrobat plugin processes the PDF data, takes control of the browser window, and displays the document in the window. Acrobat’s toolbar appears under the browser’s toolbar.
  • Part of an HTML page contains a PDF document (or several PDF documents). The PDF data is not directly embedded in the document, but is called by reference to a separate file (see Figure 8.1). This variant requires the use of special HTML tags.
Thomas Merz

The Webmaster’s Point of View

Frontmatter

9. PDF on the Web Server

Abstract
By its design, the Web isn’t restricted to a single file format. Instead, it is extensible and includes mechanisms for integrating new file types. So integrating PDF files into a Web server’s data stock seems hardly worth mentioning. However, there are a couple of configuration options which make for smooth PDF integration. First, there’s the MIME file classification scheme (“Multipurpose Internet Mail Extensions”) along with suitable icons for PDF files. To allow for page-at-a-time download of PDF files in the browser, the server must support the byterange protocol which is covered in the next section.
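As a rough illustration of what byte-range support involves on the server side, the sketch below parses a single range of the form used by the HTTP Range request header (for example "bytes=0-499") and returns the offsets to serve. The function name is hypothetical, multi-range requests are deliberately left out, and a real server would also have to send the matching status code and response headers.

```python
from typing import Optional, Tuple


def parse_byte_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Parse a single-range header value such as "bytes=0-499".

    Returns inclusive (start, end) offsets, or None if the value is
    malformed or cannot be satisfied for a file of `size` bytes.
    """
    unit, _, spec = header.partition("=")
    if unit.strip().lower() != "bytes" or "," in spec:
        return None  # other units and multiple ranges are out of scope here
    first, sep, last = spec.strip().partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form "bytes=-N": the last N bytes of the file.
            length = int(last)
            if length <= 0 or size == 0:
                return None
            return max(size - length, 0), size - 1
        start = int(first)
        end = int(last) if last else size - 1  # "bytes=N-" means up to the end
    except ValueError:
        return None
    if start < 0 or start >= size or end < start:
        return None
    return start, min(end, size - 1)
```

With the offsets in hand, the server would send `data[start:end + 1]` instead of the whole file, which is what lets a viewer fetch individual pages of a large PDF.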
Thomas Merz

10. Form Data Processing

Abstract
This chapter builds on the discussion of PDF forms in Chapter 7. In particular, I’d like to point the reader to the treatment of the Forms Data Format (FDF) in Section 7.4. This chapter covers form processing from the server’s point of view, or more appropriately, the Webmaster’s. Again, we will use the Guagua sample form which can be used for ordering travel information. The form is shown in Figure 10.1 with the field names visible. Filling the fields and exporting the field contents (via “File”, “Export”, “Form Data…”) or sending the form to a Web server (with an appropriate PDF push button) yields FDF data similar to the following:
Thomas Merz

11. Full Text Retrieval and Search Engines

Abstract
Every Web surfer knows that you can only tackle the unstructured information flooding the World Wide Web with the help of search engines. There are millions of Web pages on thousands of servers. Some page somewhere surely contains the needed information, but which one is it? Web search engines read a high percentage of all available pages, and use the contents to build a database of words, a so-called full text index. Using this index, particular terms can be located quickly. Depending on the indexing and query software, complex queries, such as queries involving Boolean operators (and, or, not) for combining multiple terms, may be used.
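A full text index of this kind can be sketched as a mapping from each word to the set of pages containing it, with the Boolean operators expressed as set operations. This is a minimal illustration with hypothetical function names and a deliberately naive tokenizer; it is not a description of how any particular search engine works.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map every lower-cased word to the set of page names containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def query_and(index: Dict[str, Set[str]], *terms: str) -> Set[str]:
    """Pages containing all of the terms."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def query_or(index: Dict[str, Set[str]], *terms: str) -> Set[str]:
    """Pages containing at least one of the terms."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))


def query_not(index: Dict[str, Set[str]], all_pages: Iterable[str], term: str) -> Set[str]:
    """Pages that do not contain the term."""
    return set(all_pages) - index.get(term.lower(), set())
```

Looking up a term is a single dictionary access, which is why the index makes queries fast even though building it requires reading every page once.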
Thomas Merz

12. Dynamic PDF

Abstract
The single most important trend in Web server development is dynamic content generation. Prefabricated Web pages on the server can’t satisfy all requirements for Internet and intranet applications. The restrictions imposed by static Web pages become evident in several situations:
  • Constantly changing information should be presented to the user in the most current version.
  • The contents of large databases would use too much disk space if stored as HTML pages.
  • Answers to queries defined by user input (e.g., in a form) can’t be stored ahead of time.
Thomas Merz

Backmatter
