Merging to make sure I have the latest papers.bib

[ipdf/documents.git] / LitReviewDavid.tex
diff --git a/LitReviewDavid.tex b/LitReviewDavid.tex

index edd8b2b..31f4968 100644 (file)
--- a/LitReviewDavid.tex
+++ b/LitReviewDavid.tex
@@ -2,6 +2,8 @@
  \usepackage[utf8]{inputenc}
  \usepackage{hyperref}
  \usepackage{graphicx}
+\usepackage{amsmath}
+\usepackage{amssymb}
  
  %opening
  \title{Literature Review}
@@ -32,6 +34,65 @@ the content of the document to be explored in ways that perhaps the author had n
  However, these data-driven formats typically do not support fixed layouts, and the display differs from renderer to
  renderer.
  
+\subsection{A Taxonomy of Document formats}
+
+The process of creating and displaying a document is a rather universal one (Figure~\ref{documenttimeline}), though
+different document formats approach it slightly differently. A document often begins as raw content, namely text and images
+(be they raster or vector), and it must end up as a set of photons flying towards the reader's eyes.
+
+\begin{figure}
+       \centering \includegraphics[width=0.8\linewidth]{figures/documenttimeline}
+       \caption{The lifecycle of a document}
+       \label{documenttimeline}
+\end{figure}
+
+There are two fundamental stages by which all documents --- digital or otherwise --- are produced and displayed:
+\emph{layout} and \emph{rendering}. The \emph{layout} stage is where the positions and sizes of text and other graphics are
+determined. The text will be \emph{flowed} around graphics and individual glyphs will be positioned, ensuring
+that there is no undesired overlap and that everything will fit on the page or screen.
+
+The \emph{rendering} stage actually produces the final output, whether as ink on paper or pixels on a computer monitor.
+Each graphical element is rasterized and composited into a single image of the target resolution.
+
+
+Different document formats cover documents in different stages of this process. Bitmapped images,
+for example, represent the output of the final stage of the process, whereas markup languages typically specify
+a document which has not yet been processed, ready for the layout stage. 
+
+Furthermore, some document formats treat the document as a program, written in
+a (usually Turing-complete) document language with instructions which emit shapes to be displayed. These shapes are either displayed
+immediately, as in PostScript, or stored in another file, as with \TeX{} and \LaTeX, which emit a \texttt{DVI} file. Most other
+forms of document use a \emph{Document Object Model}, being a list or tree of objects to be rendered. Examples include \texttt{DVI}, \texttt{PDF},
+\texttt{HTML}\footnote{Some of these formats, most notably \texttt{HTML}, implement a scripting language such as JavaScript,
+which permits the DOM to be modified while the document is being viewed.} and SVG\cite{svg2011-1.1}. Of these, only \texttt{HTML} and \TeX{} typically
+store documents in pre-layout stages, whereas even Turing-complete document formats such as PostScript typically encode documents
+which already have their elements placed.
+
+\begin{description}
+       \item[\TeX{} and \LaTeX]
+       Donald Knuth's typesetting language \TeX{} is one of the older computer typesetting systems, originally conceived in 1977\cite{texdraft}.
+       It implements a Turing-complete language which is human-readable and writable, and it is still popular
+       due to its excellent support for typesetting mathematics.
+       \TeX{} only implements the ``layout'' stage of document display, and produces a typeset file,
+       traditionally in \texttt{DVI} format, though modern implementations will often target \texttt{PDF} instead.
+       
+       This document was prepared in \LaTeXe.
+       
+       \item[\texttt{DVI}]
+       \TeX{} traditionally outputs to the \texttt{DVI} format: a binary format consisting of instructions for a
+       simple stack machine, which place glyphs and rules (filled rectangles) on the page\cite{fuchs1982theformat}.
+       
+       A \texttt{DVI} file is a representation of a document which has been typeset, and \texttt{DVI}
+       viewers will rasterize this for display or printing, or convert it to another similar format like PostScript
+       to be rasterized.
+       
+       \item[\texttt{HTML}]
+       
+       
+\end{description}
+
+
+
  Existing document formats, due to being designed to model paper,
  have limited precision (8 decimal digits for PostScript\cite{plrm}, 5 decimal digits for PDF\cite{pdfref17}).
  This matches the limited resolution of printers and ink, but falls short of what ought to be possible
@@ -40,16 +101,22 @@ to issues with numeric precision.
  
  \section{Rendering}
  
-Computer graphics comes in two forms: bit-mapped (or raster) graphics, which is defined by an array of pixel colours, 
+Computer graphics comes in two forms: bit-mapped (or raster) graphics, which is defined by an array of pixel colours; 
  and \emph{vector} graphics, defined by mathematical descriptions of objects. Bit-mapped graphics are well suited to photographs
  and match how cameras, printers and monitors work. However, bitmap devices do not handle zooming beyond their
  ``native'' resolution (the resolution where one document pixel maps to one display pixel), exhibiting an artefact
  called pixelation where the pixel structure becomes evident. Attempts to use interpolation to hide this effect are
-never entirely successful, and sharp edges, such as those found in text and diagrams, are particularly effected.
+never entirely successful, and sharp edges, such as those found in text and diagrams, are particularly affected.
+
+\begin{figure}[h]
+       \centering \includegraphics[width=0.8\linewidth]{figures/vectorraster_example}
+       \caption{A circle as a vector image and a $32 \times 32$ pixel raster image}
+\end{figure}
+
  
  Vector graphics lack many of these problems: the representation is independent of the output resolution, and is rather
  an abstract description of what is being rendered, typically as a combination of simple geometric shapes like lines,
-arcs and ``B\'ezier curves''. 
+arcs and ``B\'ezier curves''\cite{catmull1974asubdivision}. 
  As existing displays (and printers) are bit-mapped devices, vector documents must be \emph{rasterized} into a bitmap at
  a given resolution. This bitmap is then displayed or printed. The resulting bitmap is then an approximation of the vector image
  at that resolution.
@@ -76,17 +143,26 @@ the Cairo\cite{worth2003xr} library, based around the PostScript/PDF rendering m
  renderer by nVidia\cite{kilgard2012gpu} as an OpenGL extension\cite{kilgard300programming}.
  
  
-\section{Floating-Point Precision}
+\section{Numeric formats}
  
  On modern computer architectures, there are two basic number formats supported:
  fixed-width integers and \emph{floating-point} numbers. Typically, computers
  natively support integers of up to 64 bits, capable of representing all integers
-between $0$ and $2^{64} - 1$\footnote{Most machines also support \emph{signed} integers,
+between $0$ and $2^{64} - 1$, inclusive\footnote{Most machines also support \emph{signed} integers,
  which have the same cardinality as their \emph{unsigned} counterparts, but which
-represent integers between $-(2^{63})$ and $2^{63} - 1$}.
+represent integers in the range $[-(2^{63}), 2^{63} - 1]$}.
  
  
-Floating-point numbers\cite{goldberg1991whatevery} are the binary equivalent of scientific notation:
-each number consisting of an exponent ($e$) and a mantissa $(m)$ such that a number is given by
+By introducing a fractional component (analogous to a decimal point), we can convert
+integers to \emph{fixed-point} numbers, which have a more limited range, but a fixed, greater
+precision. For example, a number in 4.4 fixed-point format would have four bits representing the integer
+component, and four bits representing the fractional component:
+\begin{equation}
+       \underbrace{0101}_\text{integer component}.\underbrace{1100}_\text{fractional component} = 5.75
+\end{equation}
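+As a worked illustration of the 4.4 example above (assuming an unsigned format, which the
+text does not specify), each bit contributes a power of two, and the stored bit pattern is
+simply the integer $92$ scaled by $2^{-4}$:
+\begin{align}
+       0101.1100_2 &= 2^{2} + 2^{0} + 2^{-1} + 2^{-2} = 5.75 \\
+       &= 01011100_2 \times 2^{-4} = \frac{92}{16}
+\end{align}
+Under the same assumption, the representable values run from $0$ to $\frac{255}{16} = 15.9375$
+in uniform steps of $2^{-4} = 0.0625$.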
+
+
+Floating-point numbers\cite{goldberg1992thedesign} are the binary equivalent of scientific notation:
+each number consisting of an exponent ($e$) and a mantissa ($m$) such that a number is given by
  \begin{equation}
         n = 2^{e} \times m
  \end{equation}
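+For example, assuming the common convention of a normalised mantissa with $1 \le m < 2$,
+the value $5.75$ has $e = 2$ and $m = 1.4375$:
+\begin{equation}
+       5.75 = 101.11_2 = 2^{2} \times 1.0111_2 = 4 \times 1.4375
+\end{equation}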
@@ -94,31 +170,79 @@ each number consisting of an exponent ($e$) and a mantissa $(m)$ such that a num
  The IEEE 754 standard\cite{ieee754std1985} defines several floating-point data types
  which are used\footnote{Many systems implement the IEEE 754 standard's storage formats,
  but do not implement arithmetic operations in accordance with this standard.} by most
-computer systems. The standard defines 32-bit (8-bit exponent, 23-bit mantissa) and 
-64-bit (11-bit exponent, 53-bit mantissa) formats\footnote{The 2008
+computer systems. The standard defines 32-bit (8-bit exponent, 23-bit mantissa, 1 sign bit) and 
+64-bit (11-bit exponent, 52-bit mantissa, 1 sign bit) formats\footnote{The 2008
  revision to this standard\cite{ieee754std2008} adds some additional formats, but is
-less widely supported in hardware.} 
-
-How floating-point works and what its behaviour is w/r/t range and precision
-\cite{goldberg1991whatevery}
-\cite{goldberg1992thedesign}
-
-Arb. precision exists
-
+less widely supported in hardware.}, which can store approximately 7 and 15 decimal digits
+of precision respectively.
+
+Floating-point numbers behave quite differently to integers or fixed-point numbers, as
+the representable numbers are not evenly distributed. Large numbers are stored to a lesser
+precision than numbers close to zero. This can present problems in documents when zooming in
+on objects far from the origin.
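+
+To make this concrete, consider a format with $p$ significant bits in the mantissa (the symbols $p$ and
+$\Delta$ are introduced here for illustration only). For $2^{e} \le |x| < 2^{e+1}$, adjacent
+representable numbers are separated by
+\begin{equation}
+       \Delta(x) = 2^{e - (p - 1)}
+\end{equation}
+Taking $p = 24$ for the 32-bit format (23 stored bits plus an implicit leading bit), this gives
+$\Delta = 2^{-23} \approx 1.2 \times 10^{-7}$ near $x = 1$, but $\Delta = 2^{-3} = 0.125$ near
+$x = 2^{20} \approx 10^{6}$, so positions that far from the origin can only be resolved to an eighth of a unit.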
+
+IEEE floating-point also defines several special values: negative zero,
+positive and negative infinity, ``Not a Number'' (NaN) and \emph{denormal} values, which
+trade precision for range when dealing with very small numbers. Because of these special values,
+IEEE 754 floating-point equality does not form an equivalence relation, which can cause issues
+when not considered carefully\cite{goldberg1991whatevery}.
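+
+Two comparisons illustrate the point. NaN is unequal to every value, including itself, so
+reflexivity fails. The two zeros compare equal, yet behave differently under division
+(assuming the default handling of division by zero):
+\begin{align}
+       \mathrm{NaN} &\neq \mathrm{NaN} \\
+       +0 = -0 \quad &\text{but} \quad \frac{1}{+0} = +\infty \neq -\infty = \frac{1}{-0}
+\end{align}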
+
+There also exist formats for storing numbers with arbitrary precision and/or range.
+Some programming languages support ``big integer''\cite{java_bigint} types which can
+represent any integer that can fit in the system's memory. Similarly, there are
+arbitrary-precision floating-point data types\cite{java_bigdecimal}\cite{boost_multiprecision}
+which can represent any number of the form
+\begin{equation}
+       \frac{n}{2^d}, \qquad n, d \in \mathbb{Z}
+\end{equation}
+These types are typically built from several native data types such as integers and floats,
+paired with custom routines implementing arithmetic primitives.\cite{priest1991algorithms}
+These, therefore, are likely slower than the native types they are built on.
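+
+As a small illustration of this form, $5.75$ is representable exactly, whereas $\frac{1}{10}$ is not
+for any choice of $n$ and $d$, since its denominator contains a factor of $5$:
+\begin{equation}
+       5.75 = \frac{23}{2^{2}}, \qquad \frac{1}{10} \neq \frac{n}{2^{d}} \quad \text{for all } n, d \in \mathbb{Z}
+\end{equation}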
+
+While traditionally, GPUs have supported some approximation of IEEE 754's 32-bit floats,
+modern graphics processors also support 16-bit\cite{nv_half_float} and 64-bit\cite{arb_gpu_shader_fp64}
+IEEE floats.  Note, however, that some parts of the GPU are only able to use some formats,
+so precision will likely be truncated at some point before display.
  Higher precision numeric types can be implemented or used on the GPU, but are
-slow.
-\cite{emmart2010high}
+slow.\cite{emmart2010high}
  
  
+Pairs of integers $(a, b)$ with $a \in \mathbb{Z}$ and $b \in \mathbb{Z}\setminus\{0\}$ can be used to represent
+rationals $\frac{a}{b}$. This allows values such as $\frac{1}{3}$ to be represented exactly, whereas in binary
+fixed or floating-point formats this value has a recurring representation which must be truncated:
+\begin{equation}
+       \underbrace{0}_\text{integer part} . \underbrace{01}_\text{recurring part} 01 \; \; 01 \; \; 01 \dots
+\end{equation}
+With a rational type, this is simply stored as the pair $(1, 3)$.
+Rationals do not have a unique representation for each value, so the reduced fraction is typically used
+as the canonical representative.
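+
+As a short worked example of rational arithmetic and reduction, using the usual
+cross-multiplication rule for addition and dividing through by the greatest common divisor:
+\begin{equation}
+       \frac{1}{3} + \frac{1}{6} = \frac{1 \times 6 + 1 \times 3}{3 \times 6} = \frac{9}{18} = \frac{1}{2}
+\end{equation}
+Here the unreduced result $(9, 18)$ and the reduced pair $(1, 2)$ denote the same value, and
+$\gcd(9, 18) = 9$ is divided out to obtain the reduced fraction.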
  
  
  \section{Quadtrees}
-The quadtree is a data structure which keeps
-\cite{finkel1974quad}
+When viewing or processing a small part of a large document, it may be helpful to
+\emph{cull} the parts of the document which are not on-screen, so that only the visible parts are processed.
  
  \begin{figure}[h]
-       \includegraphics[width=0.4\linewidth]{figures/quadtree_example}
+       \centering \includegraphics[width=0.4\linewidth]{figures/quadtree_example}
+       \caption{A simple quadtree.}
  \end{figure}
-
+The quadtree\cite{finkel1974quad} is a data structure, one of a family of \emph{spatial}
+data structures, which recursively breaks down space into smaller subregions
+which can be processed independently. Points (or other objects) are added to a single
+node. When certain criteria are met, the node is split into four equal-sized subregions, and
+each point is attached to the subregion which contains it. Typically the criterion is that the number
+of points in a node exceeds a maximum, though in our case it is more likely to be that the level of
+precision required exceeds that supported by the data type in use.
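+
+One possible way to write the split (a sketch only: the node bounds $[x_0, x_0 + w) \times [y_0, y_0 + w)$
+and the child numbering $c$ are assumptions made here, not taken from the cited work) is to compare
+a point $(x, y)$ against the centre of the node:
+\begin{equation}
+       c = b_x + 2 b_y, \qquad
+       b_x = \begin{cases} 0 & x < x_0 + \frac{w}{2} \\ 1 & \text{otherwise} \end{cases}, \qquad
+       b_y = \begin{cases} 0 & y < y_0 + \frac{w}{2} \\ 1 & \text{otherwise} \end{cases}
+\end{equation}
+The point is then attached to child $c \in \{0, 1, 2, 3\}$, whose region has side $\frac{w}{2}$.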
+
+In this project, we will be experimenting with a form of quadtree in which each
+node has its own independent coordinate system, allowing us to store some spatial
+information\footnote{One bit per coordinate, per level of the quadtree.} within the
+quadtree structure, eliminating redundancy in the coordinates of nearby objects.
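+
+To illustrate the footnote, suppose (as an assumption for this sketch) that every node maps its region onto
+the unit square $[0, 1)^2$. Descending one level then consumes the leading fractional bit $b$ of each
+coordinate, and the coordinate within the child is $x' = 2x - b$. For $x = 0.1011_2 = 0.6875$:
+\begin{align}
+       \text{level 1:} \quad b = 1, \quad x' &= 2(0.6875) - 1 = 0.375 = 0.011_2 \\
+       \text{level 2:} \quad b = 0, \quad x' &= 2(0.375) - 0 = 0.75 = 0.11_2 \\
+       \text{level 3:} \quad b = 1, \quad x' &= 2(0.75) - 1 = 0.5 = 0.1_2
+\end{align}
+The bits $1, 0, 1$ are recorded by the path through the tree rather than in the stored coordinate.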
+
+Other spatial data structures exist, such as the KD-tree\cite{bentley1975multidimensional},
+which partitions the space on any axis-aligned line; or the BSP tree\cite{fuchs1980onvisible},
+which splits along an arbitrary line which need not be axis aligned. We believe, however,
+that the simpler conversion from binary coordinates to the quadtree's binary split makes
+it a better avenue for initial research.
  
  \bibliographystyle{unsrt}
  \bibliography{papers}