Amazing paper + Document Taxonomy

[ipdf/documents.git] / LitReviewDavid.tex
diff --git a/LitReviewDavid.tex b/LitReviewDavid.tex

index a8947e8..5605cef 100644 (file)
--- a/LitReviewDavid.tex
+++ b/LitReviewDavid.tex
@@ -34,12 +34,33 @@ the content of the document to be explored in ways that perhaps the author had n
  However, these data-driven formats typically do not support fixed layouts, and the display differs from renderer to
  renderer.
  
-Ultimately, there are two fundamental stages by which all documents --- digital or otherwise --- are produced and displayed:
-\emph{layout} and \emph{display}. The \emph{layout} stage is where the positions and sizes of text and other graphics are
-determined, while the \emph{display} stage actually produces the final output, whether as ink on paper or pixels on a computer monitor.
+\subsection{A Taxonomy of Document formats}
  
-Different document formats approach these stages in different ways. Some treat the document as a program, written in
-a turing complete document language with instructions which emit shapes to be displayed. These shapes are either displayed
+The process of creating and displaying a document is a rather universal one (\ref{documenttimeline}), though
+different document formats approach it slightly differently. A document often begins as raw content: text and images
+(be they raster or vector) and it must end up as a set of photons flying towards the reader's eyes.
+
+\begin{figure}
+       \label{documenttimeline}
+       \centering \includegraphics[width=0.8\linewidth]{figures/documenttimeline}
+       \caption{The lifecycle of a document}
+\end{figure}
+
+There are two fundamental stages by which all documents --- digital or otherwise --- are produced and displayed:
+\emph{layout} and \emph{rendering}. The \emph{layout} stage is where the positions and sizes of text and other graphics are
+determined. The text will be \emph{flowed} around graphics, the positions of individual glyphs will be placed, ensuring
+that there is no undesired overlap and that everything will fit on the page or screen.
+
+The \emph{display} stage actually produces the final output, whether as ink on paper or pixels on a computer monitor.
+Each graphical element is rasterized and composited into a single image of the target resolution.
+
+
+Different document formats cover documents in different stages of this project. Bitmapped images,
+for example, would represent the output of the final stage of the process, whereas markup languages typically specify
+a document which has not yet been processed, ready for the layout stage. 
+
+Furthermore, some document formats treat the document as a program, written in
+a (usually turing complete) document language with instructions which emit shapes to be displayed. These shapes are either displayed
  immediately, as in PostScript, or stored in another file, such as with \TeX or \LaTeX, which emit a \texttt{DVI} file. Most other
  forms of document use a \emph{Document Object Model}, being a list or tree of objects to be rendered. \texttt{DVI}, \texttt{PDF},
  \texttt{HTML}\footnote{Some of these formats --- most notably \texttt{HTML} --- implement a scripting lanugage such as JavaScript,
@@ -47,12 +68,76 @@ which permit the DOM to be modified while the document is being viewed.} and SVG
  store documents in pre-layout stages, whereas even turing complete document formats such as PostScript typically encode documents
  which already have their elements placed.
  
-Existing document formats, due to being designed to model paper,
-have limited precision (8 decimal digits for PostScript\cite{plrm}, 5 decimal digits for PDF\cite{pdfref17}).
-This matches the limited resolution of printers and ink, but is limited when compared to what aught to be possible
-with ``zoom'' functionality, which is prevent from working beyond a limited scale factor, lest artefacts appear due
+\begin{description}
+       \item[\TeX \, and \LaTeX]
+       Donald Knuth's typesetting language \TeX \, is one of the older computer typesetting systems, originally conceived in 1977\cite{texdraft}.
+       It implements a turing-complete language and is human-readable and writable, and is still popular
+       due to its excellent support for typesetting mathematics.
+       \TeX only implements the ``layout'' stage of document display, and produces a typeset file,
+       traditionally in \texttt{DVI} format, though modern implementations will often target \texttt{PDF} instead.
+       
+       This document was prepared in \LaTeXe.
+       
+       \item[DVI]
+       \TeX \, traditionally outputs to the \texttt{DVI} (``DeVice Independent'') format: a binary format which consists of a
+       simple stack machine with instructions for drawing glyphs and curves\cite{fuchs1982theformat}.
+       
+       A \texttt{DVI} file is a representation of a document which has been typeset, and \texttt{DVI}
+       viewers will rasterize this for display or printing, or convert it to another similar format like PostScript
+       to be rasterized.
+       
+       \item[HTML]
+       The Hypertext Markup Language (HTML)\cite{html2rfc} is the widely used document format which underpins the
+       world wide web. In order for web pages to adapt appropriately to different devices, the HTML format simply
+       defined semantic parts of a document, such as headings, phrases requiring emphasis, references to images or links
+       to other pages, leaving the \emph{layout} up to the browser, which would also rasterize the final document.
+       
+       The HTML format has changed significantly since its introduction, and most of the layout and styling is now controlled
+       by a set of style sheets in the CSS\cite{css2spec} format.
+       
+       \item[PostScript]
+       Much like DVI, PostScript\cite{plrm} is a stack-based format for drawing vector graphics, though unlike DVI (but like \TeX), PostScript is
+       text-based and turing complete. PostScript was traditionally run on a control board in laser printers, rasterizing pages at high resolution
+       to be printed, though PostScript interpreters for desktop systems also exist, and are often used with printers which do not support PostScript natively.\cite{ghostscript}
+       
+       PostScript programs typically embody documents which have been typeset, though as a turing-complete language, some layout can be performed by the document.
+       
+       \item[PDF]
+       Adobe's Portable Document Format (PDF)\cite{pdfref17} takes the PostScript rendering model, but does not implement a turing-complete language.
+       Later versions of PDF also extend the PostScript rendering model to support translucent regions via Porter-Duff compositing\cite{porter1984compositing}.
+       
+       PDF documents represent a particular layout, and must be rasterized before display.
+\end{description}
+
+\subsection{Precision in Document Formats}
+
+Existing document formats --- typically due to having been designed for documents printed on paper, which of course has
+limited size and resolution --- use numeric types which can only represent a fixed range and precision.
+While this works fine with printed pages, users reading documents on computer screens using programs
+with ``zoom'' functionality are prevented from working beyond a limited scale factor, lest artefacts appear due
  to issues with numeric precision.
  
+\TeX uses a $14.16$ bit fixed point type (implemented as a 32-bit integer type, with one sign bit and one bit used to detect overflow)\cite{beebe2007extending}.
+This can represent values in the range $[-(2^14), 2^14 - 1]$ with 16 binary digits of fractional precision.
+
+The DVI files \TeX \, produces may use ``up to'' 32-bit signed integers\cite{fuchs1982theformat} to specify the document, but there is no requirement that
+implementations support the full 32-bit type. It would be permissible, for example, to have a DVI viewer support only 24-bit signed integers, though many
+files which require greater range may fail to render correctly.
+
+PostScript\cite{plrm} supports two different numeric types: \emph{integers} and \emph{reals}, both of which are specified as strings. The interpreter's representation of numbers
+is not exposed, though the representation of integers can be divined by a program by the use of bitwise operations. The PostScript specification lists some ``typical limits''
+of numeric types, though the exact limits may differ from implementation to implementation. Integers typically must fall in the range $[-2^{31}, 2^{31} - 1]$,
+and reals are listed to have largest and smallest values of $\pm10^{38}$, values closest to $0$ of $\pm10^{-38}$ and approximately $8$ decimal digits of precision,
+derived from the IEEE 754 single-precision floating-point specification.
+
+Similarly, the PDF specification\cite{pdfref17} stores \emph{integers} and \emph{reals} as strings, though in a more restricted format than PostScript.
+The PDF specification gives limits for the internal representation of values. Integer limits have not changed from the PostScript specification, but numbers
+representable with the \emph{real} type have been specified differently: the largest representable values are $\pm 3.403\times 10^{38}$, the smallest non-zero representable values are
+\footnote{The PDF specification mistakenly leaves out the negative in the exponent here.}
+$\pm1.175 \times 10^{-38}$ with approximately $5$ decimal digits of precision \emph{in the fractional part}.
+Adobe's implementation of PDF uses both IEEE 754 single precision floating-point numbers and (for some calculations, and in previous versions) 16.16 bit fixed-point values.
+
+
  \section{Rendering}
  
  Computer graphics comes in two forms: bit-mapped (or raster) graphics, which is defined by an array of pixel colours; 
@@ -70,7 +155,7 @@ never entirely successful, and sharp edges, such as those found in text and diag
  
  Vector graphics lack many of these problems: the representation is independent of the output resolution, and rather
  an abstract description of what it is being rendered, typically as a combination of simple geometric shapes like lines,
-arcs and ``B\'ezier curves''. 
+arcs and ``B\'ezier curves''\cite{catmull1974asubdivision}. 
  As existing displays (and printers) are bit-mapped devices, vector documents must be \emph{rasterized} into a bitmap at
  a given resolution. This bitmap is then displayed or printed. The resulting bitmap is then an approximation of the vector image
  at that resolution.
@@ -102,9 +187,9 @@ renderer by nVidia\cite{kilgard2012gpu} as an OpenGL extension\cite{kilgard300pr
  On modern computer architectures, there are two basic number formats supported:
  fixed-width integers and \emph{floating-point} numbers. Typically, computers
  natively support integers of up to 64 bits, capable of representing all integers
-between $0$ and $2^{64} - 1$\footnote{Most machines also support \emph{signed} integers,
+between $0$ and $2^{64} - 1$, inclusive\footnote{Most machines also support \emph{signed} integers,
  which have the same cardinality as their \emph{unsigned} counterparts, but which
-represent integers between $-(2^{63})$ and $2^{63} - 1$}.
+represent integers in the range $[-(2^{63}), 2^{63} - 1]$}.
  
  By introducing a fractional component (analogous to a decimal point), we can convert
  integers to \emph{fixed-point} numbers, which have a more limited range, but a fixed, greater
@@ -153,14 +238,22 @@ These types are typically built from several native data types such as integers
  paired with custom routines implementing arithmetic primitives.\cite{priest1991algorithms}
  These, therefore, are likely slower than the native types they are built on.
  
-
  While traditionally, GPUs have supported some approximation of IEEE 754's 32-bit floats,
  modern graphics processors also support 16-bit\cite{nv_half_float} and 64-bit\cite{arb_gpu_shader_fp64}
-IEEE floats. 
+IEEE floats.  Note, however, that some parts of the GPU are only able to use some formats,
+so precision will likely be truncated at some point before display.
  Higher precision numeric types can be implemented or used on the GPU, but are
-slow.
-\cite{emmart2010high}
+slow.\cite{emmart2010high}
  
+Pairs of integers $(a \in \mathbb{Z},b \in \mathbb{Z}\setminus 0)$ can be used to represent rationals. This allows
+values such as $\frac{1}{3}$ to be represented exactly, whereas in fixed or floating-point formats,
+this would have a recurring representation:
+\begin{equation}
+       \underbrace{0}_\text{integer part} . \underbrace{01}_\text{recurring part} 01 \; \; 01 \; \; 01 \dots
+\end{equation}
+Whereas with a rational type, this is simply $\frac{1}{3}$.
+Rationals do not have a unique representation for each value, typically the reduced fraction is used
+as a characteristic element.
  
  
  \section{Quadtrees}