\usepackage[utf8]{inputenc}
\usepackage{hyperref}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{amssymb}
%opening
\title{Literature Review}
could be passed on from person to person without them ever meeting.
And thus the document was born.
Traditionally, documents have been static: just marks on rock, parchment or paper, but with the advent of computers many more possibilities open up.
\section{Rendering}
Computer graphics comes in two forms: bitmapped (or raster) graphics, defined by an array of pixel colours,
and \emph{vector} graphics, defined by mathematical descriptions of objects. Bitmapped graphics are well suited to photographs,
and match how cameras, printers and monitors work.
\begin{figure}[h]
	\centering \includegraphics[width=0.8\linewidth]{figures/vectorraster_example}
	\caption{A circle as a vector image and a $32 \times 32$ pixel raster image}
\end{figure}
However, bitmap devices do not handle zooming beyond their ``native'' resolution (the resolution where one document pixel maps
to one display pixel), exhibiting an artefact called pixelation, where the pixel structure becomes evident. Attempts to use
interpolation to hide this effect are never entirely successful, and sharp edges, such as those found in text and diagrams, are particularly affected.
Vector graphics avoid many of these problems: the representation is independent of the output resolution, being instead
an abstract description of what is being rendered, typically as a combination of simple geometric shapes like lines,
arcs and glyphs.

As existing displays (and printers) are bitmapped devices, vector documents must be \emph{rasterized} into a bitmap at
a given resolution. This bitmap is then displayed or printed, and is an approximation of the vector image
at that resolution.
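As an illustration of this approximation, the following minimal Python sketch (not drawn from any of the works cited here) samples the same abstract description of a disc into a bitmap of a chosen resolution; the helper name is our own.

```python
# A minimal sketch (illustrative only): rasterizing a vector description of a
# disc into a bitmap. Each resolution yields a different approximation of the
# same abstract shape.

def rasterize_disc(size):
    """Sample a unit disc centred in a size x size bitmap of '#' and '.'."""
    bitmap = []
    for row in range(size):
        line = ""
        for col in range(size):
            # Map the pixel centre into [-1, 1] x [-1, 1] "document" space.
            x = (col + 0.5) / size * 2 - 1
            y = (row + 0.5) / size * 2 - 1
            line += "#" if x * x + y * y <= 1 else "."
        bitmap.append(line)
    return bitmap

for line in rasterize_disc(8):
    print(line)
```

Rasterizing at a higher `size` produces a visibly smoother approximation of the same disc.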
% Project specific line
This project will be based around vector graphics, as these properties make it more suited to experimenting with zoom
quality.


\subsection{Rasterizing Vector Graphics}

Before a vector document can be rasterized, the coordinates of any shapes must
be transformed into \emph{screen space} or \emph{viewport space}\cite{blinn1992trip}.
On a typical display, many of these screen-space coordinates require very little precision or range;
however, care must be taken to ensure that precision is preserved during this transformation.

After this transformation, the image is decomposed into its separate shapes, which are rasterized
and then composited together.
Most graphics formats support Porter-Duff compositing\cite{porter1984compositing}.
Porter-Duff compositing gives each element (typically a pixel) a ``coverage'' value,
denoted $\alpha$, which represents the contribution of that element to the final scene.
Completely transparent elements have an $\alpha$ value of $0$, and completely opaque
elements have an $\alpha$ of $1$. This permits arbitrary shapes to be layered over one another
in the raster domain, while retaining soft edges.

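The ``over'' operation at the heart of this scheme can be sketched for a single pixel as follows. This is an illustrative Python sketch using non-premultiplied colours, not any particular library's implementation.

```python
# Illustrative sketch of the Porter-Duff "over" operator for one pixel.
# Colours are (r, g, b) tuples in [0, 1]; alpha is the coverage value.

def over(src, src_alpha, dst, dst_alpha):
    """Composite a source element over a destination element."""
    out_alpha = src_alpha + dst_alpha * (1 - src_alpha)
    if out_alpha == 0:
        return (0.0, 0.0, 0.0), 0.0
    out = tuple(
        (s * src_alpha + d * dst_alpha * (1 - src_alpha)) / out_alpha
        for s, d in zip(src, dst)
    )
    return out, out_alpha

# A half-covered red element over an opaque white background gives a soft pink:
colour, alpha = over((1, 0, 0), 0.5, (1, 1, 1), 1.0)
```

Compositing shape by shape in this way avoids solving the constructive-geometry problems mentioned above.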
The rasterization process may then proceed on one object (or shape) at a time. There are special algorithms
for rasterizing different shapes.

\begin{description}
	\item[Line Segment]
	Straight lines between two points are easily rasterized using Bresenham's algorithm\cite{bresenham1965algorithm}.
	Bresenham's algorithm draws a pixel for every point along the \emph{long} axis of the line, moving along the short
	axis when the accumulated error exceeds half a pixel.

	Bresenham's algorithm only operates on lines whose endpoints lie on integer pixel coordinates. Because of this, line ``clipping''
	may be performed to find endpoints of the line segment such that the entire line will be on-screen. However, if line clipping is
	performed na\"ively, without also setting the error accumulator correctly, the line's slope will be altered slightly, becoming dependent
	on the viewport.

	\item[B\'ezier Curve]
	A B\'ezier curve is a smooth (i.e.\ infinitely differentiable) curve between two points, represented by a Bernstein polynomial.
	The coefficients of this Bernstein polynomial are known as the ``control points.''

	B\'ezier curves are typically rasterized using De Casteljau's algorithm\cite{foley1996computer}.
	Line segments are first-order B\'ezier curves.

	\item[B\'ezier Spline]
	A spline of order $n$ is a $C^{n-1}$ continuous piecewise function composed of polynomials of degree $\leq n$.
	In a B\'ezier spline, these polynomials are Bernstein polynomials, hence the spline is a curve made by joining B\'ezier curves
	end-to-end (in a manner which preserves some level of smoothness).

	Many vector graphics formats call B\'ezier splines of a given order (typically quadratic or cubic) ``paths'' and treat them as the
	fundamental type from which shapes are formed.
\end{description}
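The error-accumulator approach of Bresenham's algorithm can be sketched as follows (illustrative Python, integer endpoints only; the function name is our own).

```python
# A sketch of Bresenham-style line rasterization: step along the long axis,
# accumulating error and stepping along the short axis when it grows too large.

def bresenham(x0, y0, x1, y1):
    """Return the pixels of the line segment from (x0, y0) to (x1, y1)."""
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy  # error accumulator (scaled to avoid fractions)
    pixels = []
    while True:
        pixels.append((x0, y0))
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 > -dy:  # step along x
            err -= dy
            x0 += sx
        if e2 < dx:   # step along y
            err += dx
            y0 += sy
    return pixels
```

Note that the endpoints are integers throughout, which is precisely why na\"ive clipping (moving an endpoint without adjusting `err`) changes the rasterized slope.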
\subsection{GPU Rendering}
While rasterization was traditionally done entirely in software, modern computers and mobile devices have hardware support for rasterizing
lines and triangles, designed for use in rendering 3D scenes. This hardware is usually programmed with an
API like \texttt{OpenGL}\cite{openglspec}.
More complex shapes like B\'ezier curves can be rendered by combining bitmapped textures (possibly using signed-distance
fields\cite{leymarie1992fast}\cite{frisken2000adaptively}\cite{green2007improved}) stretched over a triangle mesh
approximating the curve's shape\cite{loop2005resolution}\cite{loop2007rendering}.
Indeed, there are several implementations of entire vector graphics systems using OpenGL:
\begin{itemize}
	\item the OpenVG standard\cite{robart2009openvg} has been implemented on top of OpenGL ES\cite{oh2007implementation};
	\item the Cairo\cite{worth2003xr} library, based around the PostScript/PDF rendering model, has the ``Glitz'' OpenGL backend\cite{nilsson2004glitz};
	\item and the SVG/PostScript GPU renderer by nVidia\cite{kilgard2012gpu} is available as an OpenGL extension\cite{kilgard300programming}.
\end{itemize}
\section{Numeric Formats}
On modern computer architectures, there are two basic number formats supported:
fixed-width integers and \emph{floating-point} numbers. Typically, computers
natively support integers of up to 64 bits, capable of representing all integers
between $0$ and $2^{64} - 1$, inclusive\footnote{Most machines also support \emph{signed} integers,
which have the same cardinality as their \emph{unsigned} counterparts, but which
represent integers in the range $[-(2^{63}), 2^{63} - 1]$}.
By introducing a fractional component (analogous to a decimal point), we can convert
integers to \emph{fixed-point} numbers, which have a more limited range, but a fixed, greater
precision. For example, a number in 4.4 fixed-point format would have four bits representing the integer
component, and four bits representing the fractional component:
\begin{equation}
	\underbrace{0101}_\text{integer component} . \underbrace{1100}_\text{fractional component} = 5.75
\end{equation}

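The 4.4 fixed-point example above can be expressed in code. This is an illustrative Python sketch; the helper names are our own.

```python
# 4.4 fixed-point sketch: values are stored as integers, with the low four
# bits holding the fractional component (a fixed scale factor of 2**4 = 16).

FRACTIONAL_BITS = 4
SCALE = 1 << FRACTIONAL_BITS  # 16

def to_fixed(value):
    """Encode a real number as a 4.4 fixed-point integer (with rounding)."""
    return round(value * SCALE)

def from_fixed(bits):
    """Decode a 4.4 fixed-point integer back to a float."""
    return bits / SCALE

assert to_fixed(5.75) == 0b01011100   # 0101.1100, as in the equation above
assert from_fixed(0b01011100) == 5.75
```
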
Floating-point numbers\cite{goldberg1992thedesign} are the binary equivalent of scientific notation:
each number consists of an exponent ($e$) and a mantissa ($m$) such that the number is given by
\begin{equation}
	n = 2^{e} \times m
\end{equation}
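This decomposition can be observed directly: Python's standard \texttt{math.frexp} returns a mantissa and exponent in exactly this form, with the mantissa normalised into $[0.5, 1)$.

```python
import math

# math.frexp decomposes a float as n = m * 2**e, with m normalised to [0.5, 1).
m, e = math.frexp(5.75)
assert (m, e) == (0.71875, 3)    # 5.75 = 0.71875 * 2**3
assert math.ldexp(m, e) == 5.75  # ldexp recombines the mantissa and exponent
```
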
The IEEE 754 standard\cite{ieee754std1985} defines several floating-point data types
which are used\footnote{Many systems implement the IEEE 754 standard's storage formats,
but do not implement arithmetic operations in accordance with this standard.} by most
computer systems. The standard defines 32-bit (8-bit exponent, 23-bit mantissa, 1 sign bit) and
64-bit (11-bit exponent, 52-bit mantissa, 1 sign bit) formats\footnote{The 2008
revision to this standard\cite{ieee754std2008} adds some additional formats, but is
less widely supported in hardware.}, which can store approximately 7 and 15 decimal digits
of precision respectively.

Floating-point numbers behave quite differently to integers or fixed-point numbers, as
the representable numbers are not evenly distributed: large numbers are stored to a lesser
precision than numbers close to zero. This can present problems in documents when zooming in
on objects far from the origin.

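This uneven spacing can be demonstrated with Python's \texttt{math.ulp} (available since Python 3.9), which gives the gap between a value and the next representable double.

```python
import math

# The gap between adjacent representable doubles ("unit in the last place")
# grows with magnitude: values far from the origin are stored less precisely.
assert math.ulp(1.0) == 2 ** -52        # gap between doubles near 1
assert math.ulp(2.0 ** 40) == 2 ** -12  # gap near 2**40 is 2**40 times larger
```
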
IEEE floating-point also has some interesting features, including values for negative zero,
positive and negative infinity, the ``Not a Number'' (NaN) value, and \emph{denormal} values, which
trade precision for range when dealing with very small numbers. Indeed, with these values,
IEEE 754 floating-point equality does not form an equivalence relation, which can cause issues
when not considered carefully.\cite{goldberg1991whatevery}

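These special values are easy to exhibit; for instance, NaN breaks the reflexivity that an equivalence relation requires, while negative zero compares equal to zero despite a different sign bit.

```python
import math

nan = float("nan")
assert nan != nan                      # reflexivity fails: not an equivalence
assert -0.0 == 0.0                     # ...yet negative zero equals zero,
assert math.copysign(1.0, -0.0) == -1  # although its sign bit is preserved
assert math.inf > 1e308                # positive infinity exceeds all floats
```
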
There also exist formats for storing numbers with arbitrary precision and/or range.
Some programming languages support ``big integer''\cite{java_bigint} types which can
represent any integer that can fit in the system's memory. Similarly, there are
arbitrary-precision floating-point data types\cite{java_bigdecimal}\cite{boost_multiprecision}
which can represent any number of the form
\begin{equation}
	\frac{n}{2^d} \qquad n,d \in \mathbb{Z}
\end{equation}
These types are typically built from several native data types such as integers and floats,
paired with custom routines implementing arithmetic primitives.\cite{priest1991algorithms}
They are therefore likely to be slower than the native types on which they are built.
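As an example, Python's built-in integers are arbitrary-precision ``big integers'', which makes the precision limit of the native 64-bit float visible by comparison.

```python
# Python integers are arbitrary-precision "big integers": this value is exact,
# while the same value in 64-bit floating-point loses the low-order digits.
exact = 2 ** 100 + 1
assert exact % 10 == 7              # exact arithmetic keeps the final +1
assert float(exact) == 2.0 ** 100   # the +1 is lost at double precision
```
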
Pairs of integers $(a \in \mathbb{Z}, b \in \mathbb{Z}\setminus\{0\})$ can be used to represent rationals. This allows
values such as $\frac{1}{3}$ to be represented exactly, whereas in fixed- or floating-point formats
this would have a recurring representation:
\begin{equation}
	\underbrace{0}_\text{integer part} . \underbrace{01}_\text{recurring part} \; 01 \; 01 \; 01 \dots
\end{equation}
With a rational type, this is simply $\frac{1}{3}$.
Rationals do not have a unique representation for each value; typically the reduced fraction is used
as a characteristic element.

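Python's standard \texttt{fractions.Fraction} is one such rational type, and illustrates both the exactness and the reduced-fraction canonical form.

```python
from fractions import Fraction

# A rational type represents 1/3 exactly, where binary floating-point cannot,
# and normalises each value to its reduced fraction.
third = Fraction(1, 3)
assert third + third + third == 1  # exact: no accumulated rounding error
assert Fraction(2, 6) == third     # 2/6 is stored as the reduced fraction 1/3
assert (Fraction(2, 6).numerator, Fraction(2, 6).denominator) == (1, 3)
```
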
While GPUs have traditionally supported some approximation of IEEE 754's 32-bit floats,
modern graphics processors also support 16-bit\cite{nv_half_float} and 64-bit\cite{arb_gpu_shader_fp64}
IEEE floats, though some features of IEEE floats, like denormals and NaNs, are not always supported.
Note, however, that some parts of the GPU can only use certain formats,
so precision will likely be truncated at some point before display.
Higher-precision numeric types can be implemented or used on the GPU, but are
slow.\cite{emmart2010high}
\section{Document Formats}

Most existing document formats, such as the venerable PostScript and PDF, are designed to imitate
existing paper documents, largely to allow for easy printing. In order to truly take advantage of the possibilities that operating in the digital
domain opens up to us, we must look to new formats.

Formats such as \texttt{HTML} allow for a greater scope of interactivity and for a more data-driven model, allowing
the content of the document to be explored in ways that perhaps the author had not anticipated.\cite{hayes2012pixels}
However, these data-driven formats typically do not support fixed layouts, and the display differs from renderer to
renderer.

\subsection{A Taxonomy of Document Formats}

The process of creating and displaying a document is a rather universal one (Figure~\ref{documenttimeline}), though
different document formats approach it slightly differently. A document often begins as raw content: text and images
(be they raster or vector), and it must end up as a stream of photons flying towards the reader's eyes.

\begin{figure}
	\centering \includegraphics[width=0.8\linewidth]{figures/documenttimeline}
	\caption{The lifecycle of a document}
	\label{documenttimeline}
\end{figure}
There are two fundamental stages by which all documents, digital or otherwise, are produced and displayed:
\emph{layout} and \emph{display}. The \emph{layout} stage is where the positions and sizes of text and other graphics are
determined. The text will be \emph{flowed} around graphics, and the positions of individual glyphs will be decided, ensuring
that there is no undesired overlap and that everything will fit on the page or screen.

The \emph{display} stage actually produces the final output, whether as ink on paper or pixels on a computer monitor.
Each graphical element is rasterized and composited into a single image of the target resolution.

Different document formats cover documents in different stages of this process. Bitmapped images,
for example, represent the output of the final stage, whereas markup languages typically specify
a document which has not yet been processed, ready for the layout stage.

Furthermore, some document formats treat the document as a program, written in
a (usually Turing-complete) document language with instructions which emit shapes to be displayed. These shapes are either displayed
immediately, as in PostScript, or stored in another file, such as with \TeX\ or \LaTeX, which emit a \texttt{DVI} file. Most other
forms of document use a \emph{Document Object Model}: a list or tree of objects to be rendered, as in \texttt{DVI}, \texttt{PDF},
\texttt{HTML}\footnote{Some of these formats, most notably \texttt{HTML}, implement a scripting language such as JavaScript,
which permits the DOM to be modified while the document is being viewed.} and SVG\cite{svg20111.1}. Of these, only \texttt{HTML} and \TeX\ typically
store documents in pre-layout stages, whereas even Turing-complete document formats such as PostScript typically encode documents
which already have their elements placed.

\begin{description}
	\item[\TeX\ and \LaTeX]
	Donald Knuth's typesetting language \TeX\ is one of the older computer typesetting systems, originally conceived in 1977\cite{texdraft}.
	It implements a Turing-complete language, is human-readable and -writable, and is still popular
	due to its excellent support for typesetting mathematics.
	\TeX\ only implements the ``layout'' stage of document display, and produces a typeset file,
	traditionally in \texttt{DVI} format, though modern implementations will often target \texttt{PDF} instead.

	This document was prepared in \LaTeXe.

	\item[DVI]
	\TeX\ traditionally outputs to the \texttt{DVI} (``DeVice Independent'') format: a binary format which consists of a
	simple stack machine with instructions for drawing glyphs and curves\cite{fuchs1982theformat}.

	A \texttt{DVI} file is a representation of a document which has been typeset, and \texttt{DVI}
	viewers will rasterize this for display or printing, or convert it to another similar format like PostScript
	to be rasterized.

	\item[HTML]
	The Hypertext Markup Language (HTML)\cite{html2rfc} is the widely used document format which underpins the
	world wide web. In order for web pages to adapt appropriately to different devices, the HTML format simply
	defines semantic parts of a document, such as headings, phrases requiring emphasis, and references to images or links
	to other pages, leaving the \emph{layout} up to the browser, which also rasterizes the final document.

	The HTML format has changed significantly since its introduction, and most of the layout and styling is now controlled
	by a set of style sheets in the CSS\cite{css2spec} format.

	\item[PostScript]
	Much like DVI, PostScript\cite{plrm} is a stack-based format for drawing vector graphics, though unlike DVI (but like \TeX), PostScript is
	text-based and Turing-complete. PostScript was traditionally run on a control board in laser printers, rasterizing pages at high resolution
	to be printed, though PostScript interpreters for desktop systems also exist, and are often used with printers which do not support PostScript natively.\cite{ghostscript}

	PostScript programs typically embody documents which have been typeset, though as a Turing-complete language, some layout can be performed by the document.

	\item[PDF]
	Adobe's Portable Document Format (PDF)\cite{pdfref17} takes the PostScript rendering model, but does not implement a Turing-complete language.
	Later versions of PDF also extend the PostScript rendering model to support translucent regions via Porter-Duff compositing\cite{porter1984compositing}.

	PDF documents represent a particular layout, and must be rasterized before display.

	\item[SVG]
	Scalable Vector Graphics (SVG) is a vector graphics document format\cite{svg20111.1} which uses the Document Object Model. It consists of a tree of matrix transforms,
	with objects such as vector paths (made up of B\'ezier curves) and text at the leaves.
\end{description}

\subsection{Precision in Document Formats}

Existing document formats, typically due to having been designed for documents printed on paper (which of course has
limited size and resolution), use numeric types which can only represent a fixed range and precision.
While this works fine with printed pages, users reading documents on computer screens using programs
with ``zoom'' functionality are prevented from working beyond a limited scale factor, lest artefacts appear due
to issues with numeric precision.

\TeX\ uses a 14.16 bit fixed-point type (implemented as a 32-bit integer type, with one sign bit and one bit used to detect overflow)\cite{beebe2007extending}.
This can represent values in the range $[-(2^{14}), 2^{14} - 1]$ with 16 binary digits of fractional precision.

The DVI files \TeX\ produces may use ``up to'' 32-bit signed integers\cite{fuchs1982theformat} to specify the document, but there is no requirement that
implementations support the full 32-bit type. It would be permissible, for example, to have a DVI viewer support only 24-bit signed integers, though
files which require the greater range would fail to render correctly.
PostScript\cite{plrm} supports two different numeric types: \emph{integers} and \emph{reals}, both of which are specified as strings. The interpreter's representation of numbers
is not exposed, though the representation of integers can be divined by a program through the use of bitwise operations. The PostScript specification lists some ``typical limits''
of numeric types, though the exact limits may differ from implementation to implementation. Integers typically must fall in the range $[-2^{31}, 2^{31} - 1]$,
and reals are listed as having largest and smallest values of $\pm 10^{38}$, values closest to $0$ of $\pm 10^{-38}$, and approximately $8$ decimal digits of precision,
derived from the IEEE 754 single-precision floating-point specification.

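These ``typical limits'' match IEEE 754 single precision, which can be probed by round-tripping values through a 32-bit float (an illustrative Python sketch; the helper name is our own).

```python
import math
import struct

# The limits quoted for PostScript reals match IEEE 754 single precision,
# which we can probe by round-tripping values through a 32-bit float.

def to_single(x):
    """Round-trip a Python double through IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", x))[0]

assert math.isfinite(to_single(1e38))    # ~1e38 is still representable
assert to_single(0.1) != 0.1             # only ~8 decimal digits survive
assert abs(to_single(0.1) - 0.1) < 1e-8
```
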
Similarly, the PDF specification\cite{pdfref17} stores \emph{integers} and \emph{reals} as strings, though in a more restricted format than PostScript.
The PDF specification gives limits for the internal representation of values. Integer limits have not changed from the PostScript specification, but numbers
representable with the \emph{real} type have been specified differently: the largest representable values are $\pm 3.403 \times 10^{38}$, the smallest non-zero representable values
are\footnote{The PDF specification mistakenly leaves out the negative sign in the exponent here.}
$\pm 1.175 \times 10^{-38}$, with approximately $5$ decimal digits of precision \emph{in the fractional part}.
Adobe's implementation of PDF uses both IEEE 754 single-precision floating-point numbers and (for some calculations, and in previous versions) 16.16 bit fixed-point values.

The SVG specification\cite{svg20111.1} specifies numbers as strings with a decimal representation of the number.
It states that a ``Conforming SVG Viewer'' must have ``all visual rendering accurate to within one device pixel to the mathematically correct result at the initial 1:1
zoom ratio'' and that ``it is suggested that viewers attempt to keep a high degree of accuracy when zooming.''
A ``Conforming High-Quality SVG Viewer'' must use ``double-precision floating point\footnote{Presumably the 64-bit IEEE 754 ``double'' type.}'' for computations involving
coordinate system transformations.
\section{Quadtrees}
When viewing or processing a small part of a large document, it may be helpful to
avoid processing, or \emph{cull}, parts of the document which are not on-screen.
The quadtree\cite{finkel1974quad} is a data structure, one of a family of \emph{spatial}
data structures, which recursively breaks down space into smaller subregions
which can be processed independently. Points (or other objects) are added to a single
node which, if certain criteria are met, is split into four equal-sized subregions, with
points attached to the region which contains them.
Quadtrees have been used in computer graphics for both culling (excluding objects in
nodes which are not visible) and ``level of detail'', where different levels of the quadtree store
different quality versions of objects or data\cite{zerbst2004game}.
Typically, the split is triggered by the number of points in a node
exceeding a maximum, though in our case it will likely be the level of precision required exceeding
that supported by the data type in use.

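The basic split-on-capacity behaviour described above can be sketched as follows. This is an illustrative Python sketch of a conventional point quadtree, not the per-node coordinate system variant this project will explore.

```python
# A minimal quadtree sketch: a node splits into four equal subregions once it
# holds more than `capacity` points, and its points are reattached to the
# subregions which contain them.

class QuadTree:
    def __init__(self, x, y, size, capacity=4):
        self.x, self.y, self.size = x, y, size  # square region: corner + side
        self.capacity = capacity
        self.points = []
        self.children = None                    # four subregions once split

    def insert(self, px, py):
        if not (self.x <= px < self.x + self.size
                and self.y <= py < self.y + self.size):
            return False                        # point lies outside this node
        if self.children is None:
            self.points.append((px, py))
            if len(self.points) > self.capacity:
                self._split()
            return True
        return any(c.insert(px, py) for c in self.children)

    def _split(self):
        half = self.size / 2
        self.children = [
            QuadTree(self.x, self.y, half, self.capacity),
            QuadTree(self.x + half, self.y, half, self.capacity),
            QuadTree(self.x, self.y + half, half, self.capacity),
            QuadTree(self.x + half, self.y + half, half, self.capacity),
        ]
        for p in self.points:                   # reattach points to subregions
            any(c.insert(*p) for c in self.children)
        self.points = []
```

In the precision-driven variant discussed above, the split criterion would instead be the precision required by the points exceeding that of the node's data type.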
In this project, we will be experimenting with a form of quadtree in which each
node has its own independent coordinate system, allowing us to store some spatial
information\footnote{One bit per coordinate, per level of the quadtree} within the