Automatic commit of irc logs

[ipdf/documents.git] / LitReviewDavid.tex
diff --git a/LitReviewDavid.tex b/LitReviewDavid.tex

index 5605cef..5cde5c6 100644 (file)
--- a/LitReviewDavid.tex
+++ b/LitReviewDavid.tex
@@ -21,7 +21,180 @@ could be passed on from person to person without them ever meeting.
  
  And thus the document was born.
  
-Traditionally, documents have been static: just marks on paper, but with the advent of computers many more possibilities open up.
+Traditionally, documents have been static: just marks on rock, parchment or paper, but with the advent of computers many more possibilities open up.
+
+
+\section{Rendering}
+
+Computer graphics comes in two forms: bit-mapped (or raster) graphics, which is defined by an array of pixel colours; 
+and \emph{vector} graphics, defined by mathematical descriptions of objects. Bit-mapped graphics are well suited to photographs
+and match how cameras, printers and monitors work. 
+
+
+\begin{figure}[h]
+       \centering \includegraphics[width=0.8\linewidth]{figures/vectorraster_example}
+       \caption{A circle as a vector image and a $32 \times 32$ pixel raster image}
+\end{figure}
+
+
+However, bitmap devices do not handle zooming beyond their ``native'' resolution (the resolution where one document pixel maps
+to one display pixel), exhibiting an artefact called pixelation where the pixel structure becomes evident. Attempts to use
+interpolation to hide this effect are never entirely successful, and sharp edges, such as those found in text and diagrams, are particularly affected.
+
+Vector graphics avoid many of these problems: the representation is independent of the output resolution, and is instead
+an abstract description of what is being rendered, typically as a combination of simple geometric shapes like lines,
+arcs and glyphs. 
+
+As existing displays (and printers) are bit-mapped devices, vector documents must be \emph{rasterized} into a bitmap at
+a given resolution. This bitmap is then displayed or printed. The resulting bitmap is then an approximation of the vector image
+at that resolution.
+
+% Project specific line
+This project will be based around vector graphics, as these properties make them better suited to experimenting with zoom
+quality.
+
+\subsection{Rasterizing Vector Graphics}
+
+Before a vector document can be rasterized, the co-ordinates of any shapes must
+be transformed into \emph{screen space} or \emph{viewport space}\cite{blinn1992trip}.
+On a typical display, many of these screen-space coordinates require very little precision or range.
+However, care must be taken to ensure that precision is not lost during the transform itself.
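+
+As a minimal sketch of such a transform (the symbols here are illustrative and not taken from any particular renderer), suppose the
+visible region of the document has its corner at $(v_x, v_y)$ and size $v_w \times v_h$, and is drawn to a display of $W \times H$ pixels.
+A document-space point $(x, y)$ then maps to the screen-space point
+\begin{equation}
+       x_s = \frac{(x - v_x)}{v_w} W, \qquad y_s = \frac{(y - v_y)}{v_h} H
+\end{equation}
+When zoomed in far from the origin, $x$ and $v_x$ are both large and nearly equal, so the subtraction $x - v_x$ is where precision is most easily lost.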
+
+After this transformation, the image is decomposed into its separate shapes, which are rasterized
+and then composited together.
+Most graphics formats support Porter-Duff compositing\cite{porter1984compositing}.
+Porter-Duff compositing gives each element (typically a pixel) a ``coverage'' value,
+denoted $\alpha$, which represents the contribution of that element to the final scene.
+Completely transparent elements would have an $\alpha$ value of $0$, and completely opaque
+elements have an $\alpha$ of $1$. This permits arbitrary shapes to be layered over one another
+in the raster domain, while retaining soft edges.
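+
+As an illustration (the notation here is ours), the most commonly used Porter-Duff operator, ``over'', places an element $A$ with colour $C_A$ and
+coverage $\alpha_A$ on top of an element $B$. Assuming colours are \emph{not} premultiplied by coverage, the result $O$ is
+\begin{equation}
+       \alpha_O = \alpha_A + \alpha_B (1 - \alpha_A), \qquad
+       C_O = \frac{\alpha_A C_A + \alpha_B (1 - \alpha_A) C_B}{\alpha_O}
+\end{equation}
+so an opaque $A$ ($\alpha_A = 1$) hides $B$ entirely, while a completely transparent $A$ ($\alpha_A = 0$) leaves $B$ unchanged.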
+
+The rasterization process may then proceed on one object (or shape) at a time. There are special algorithms for rasterizing
+different shapes.
+
+\begin{description}
+       \item[Line Segment]
+       Straight lines between two points are easily rasterized using Bresenham's algorithm\cite{bresenham1965algorithm}.
+       Bresenham's algorithm draws a pixel for every point along the \emph{long} axis of the line, moving along the short
+       axis when the error exceeds $\frac{1}{2}$ a pixel.
+       
+       Bresenham's algorithm only operates on lines whose endpoints lie on integer pixel coordinates. Due to this, line ``clipping''
+       may be performed to find endpoints of the line segment such that the entire line will be on-screen. However, if line clipping is
+       performed na\"ively without also setting the error accumulator correctly, the line's slope will be altered slightly, becoming dependent
+       on the viewport.
+       
+       \item[B\'ezier Curve]
+       A B\'ezier curve is a smooth (i.e.\ infinitely differentiable) curve between two points, represented by a Bernstein polynomial.
+       The coefficients of this Bernstein polynomial are known as the ``control points.''
+       
+       B\'ezier curves are typically rasterized using De Casteljau's algorithm\cite{foley1996computer}.
+       A line segment is a first-order B\'ezier curve. 
+       
+       \item[B\'ezier Spline]
+       A spline of order $n$ is a $C^{n-1}$ smooth continuous piecewise function composed of polynomials of degree $\leq n$.
+       In a B\'ezier spline, these polynomials are Bernstein polynomials, hence the spline is a curve made by joining B\'ezier curves
+       end-to-end (in a manner which preserves some level of smoothness).
+       
+       Many vector graphics formats call B\'ezier splines of a given order (typically quadratic or cubic) ``paths'' and treat them as the
+       fundamental type from which shapes are formed.
+\end{description}
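+
+To make two of these concrete (the notation and the example endpoints below are illustrative): rasterizing the line segment from $(0,0)$ to $(5,2)$
+with Bresenham's algorithm steps along the $x$ axis, adding the slope $\frac{2}{5}$ to the error at each step and moving one pixel in $y$
+(subtracting $1$ from the error) whenever the error exceeds $\frac{1}{2}$. This sets the pixels
+\begin{equation}
+       (0,0), \; (1,0), \; (2,1), \; (3,1), \; (4,2), \; (5,2)
+\end{equation}
+A B\'ezier curve of degree $n$ with control points $P_0, \dots, P_n$ is
+\begin{equation}
+       B(t) = \sum_{i=0}^{n} \binom{n}{i} (1 - t)^{n-i} t^{i} P_i, \qquad t \in [0, 1]
+\end{equation}
+and De Casteljau's algorithm evaluates it by repeated linear interpolation: starting from $P_i^{(0)} = P_i$,
+\begin{equation}
+       P_i^{(k)} = (1 - t) P_i^{(k-1)} + t P_{i+1}^{(k-1)}, \qquad B(t) = P_0^{(n)}
+\end{equation}
+For $n = 1$ this reduces to $B(t) = (1 - t) P_0 + t P_1$, the line segment from $P_0$ to $P_1$.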
+
+
+%There are special algorithms
+%for rendering lines\cite{bresenham1965algorithm}, triangles\cite{giesen2013triangle}, polygons\cite{pineda1988parallel} and B\'ezier
+%Curves\cite{goldman_thefractal}. 
+
+\subsection{GPU Rendering}
+While rasterization was traditionally done entirely in software, modern computers and mobile devices have hardware support for rasterizing
+lines and triangles, designed for rendering 3D scenes. This hardware is usually programmed with an
+API like \texttt{OpenGL}\cite{openglspec}.
+
+More complex shapes like B\'ezier curves can be rendered by combining the use of bitmapped textures (possibly using signed-distance
+fields\cite{leymarie1992fast}\cite{frisken2000adaptively}\cite{green2007improved}) stretched over a triangle mesh
+approximating the curve's shape\cite{loop2005resolution}\cite{loop2007rendering}.
+
+Indeed, there are several implementations of entire vector graphics systems using OpenGL: 
+\begin{itemize}
+       \item The OpenVG standard\cite{robart2009openvg} has been implemented on top of OpenGL ES\cite{oh2007implementation};
+       \item the Cairo\cite{worth2003xr} library, based around the PostScript/PDF rendering model, has the ``Glitz'' OpenGL backend\cite{nilsson2004glitz};
+       \item and the SVG/PostScript GPU renderer by nVidia\cite{kilgard2012gpu} as an OpenGL extension\cite{kilgard300programming}.
+\end{itemize}
+
+
+\section{Numeric formats}
+
+On modern computer architectures, there are two basic number formats supported:
+fixed-width integers and \emph{floating-point} numbers. Typically, computers
+natively support integers of up to 64 bits, capable of representing all integers
+between $0$ and $2^{64} - 1$, inclusive\footnote{Most machines also support \emph{signed} integers,
+which have the same cardinality as their \emph{unsigned} counterparts, but which
+represent integers in the range $[-(2^{63}), 2^{63} - 1]$}.
+
+By introducing a fractional component (analogous to a decimal point), we can convert
+integers to \emph{fixed-point} numbers, which have a more limited range, but a fixed, greater
+precision. For example, a number in 4.4 fixed-point format would have four bits representing the integer
+component, and four bits representing the fractional component:
+\begin{equation}
+       \underbrace{0101}_\text{integer component}.\underbrace{1100}_\text{fractional component} = 5.75
+\end{equation}
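+As a check of this example, and assuming an unsigned format, the same eight bits read as an integer give $01011100_2 = 92$; dividing by $2^4$
+(one factor of two per fractional bit) gives
+\begin{equation}
+       \frac{92}{2^4} = 4 + 1 + \frac{1}{2} + \frac{1}{4} = 5.75
+\end{equation}
+An unsigned 4.4 format can therefore represent values from $0$ to $\frac{255}{16} = 15.9375$ in uniform steps of $\frac{1}{16} = 0.0625$.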
+
+
+Floating-point numbers\cite{goldberg1992thedesign} are the binary equivalent of scientific notation:
+each number consists of an exponent ($e$) and a mantissa ($m$) such that the number is given by
+\begin{equation}
+       n = 2^{e} \times m
+\end{equation}
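+For instance, assuming the mantissa is normalised so that $1 \leq m < 2$ (as in the IEEE 754 binary formats, ignoring denormals),
+the value $5.75$ is written
+\begin{equation}
+       5.75 = 2^{2} \times 1.4375 = 2^{2} \times 1.0111_2
+\end{equation}
+Because the leading bit of a normalised mantissa is always $1$, it need not be stored.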
+
+The IEEE 754 standard\cite{ieee754std1985} defines several floating-point data types
+which are used\footnote{Many systems implement the IEEE 754 standard's storage formats,
+but do not implement arithmetic operations in accordance with this standard.} by most
+computer systems. The standard defines 32-bit (8-bit exponent, 23-bit mantissa, 1 sign bit) and 
+64-bit (11-bit exponent, 52-bit mantissa, 1 sign bit) formats\footnote{The 2008
+revision to this standard\cite{ieee754std2008} adds some additional formats, but is
+less widely supported in hardware.}, which can store approximately 7 and 15 decimal digits
+of precision respectively.
+
+Floating-point numbers behave quite differently to integers or fixed-point numbers, as
+the representable numbers are not evenly distributed. Large numbers are stored to a lesser
+precision than numbers close to zero. This can present problems in documents when zooming in
+on objects far from the origin.
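+To make this concrete, consider the 32-bit format, which has 24 significant bits (23 stored, plus an implicit leading bit).
+For $2^{e} \leq |x| < 2^{e+1}$, adjacent representable values are separated by
+\begin{equation}
+       \Delta x = 2^{e - 23}
+\end{equation}
+Near $1$ this spacing is $2^{-23} \approx 1.2 \times 10^{-7}$, but near $2^{24} = 16777216$ it is $2$, so odd integers above $2^{24}$ cannot be represented at all.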
+
+IEEE floating-point has some interesting features as well, including values for negative zero,
+positive and negative infinity, the ``Not a Number'' (NaN) value and \emph{denormal} values, which
+trade precision for range when dealing with very small numbers. Indeed, with these values,
+IEEE 754 floating-point equality does not form an equivalence relation, which can cause issues
+when not considered carefully.\cite{goldberg1991whatevery}
+
+There also exist formats for storing numbers with arbitrary precision and/or range.
+Some programming languages support ``big integer''\cite{java_bigint} types which can
+represent any integer that can fit in the system's memory. Similarly, there are
+arbitrary-precision floating-point data types\cite{java_bigdecimal}\cite{boost_multiprecision}
+which can represent any number of the form
+\begin{equation}
+       \frac{n}{2^d} \; \; \; \; n,d \in \mathbb{Z} % This spacing is horrible, and I should be ashamed.
+\end{equation}
+These types are typically built from several native data types such as integers and floats,
+paired with custom routines implementing arithmetic primitives.\cite{priest1991algorithms}
+These, therefore, are likely slower than the native types they are built on.
+
+Pairs of integers $(a \in \mathbb{Z},b \in \mathbb{Z}\setminus \{0\})$ can be used to represent rationals. This allows
+values such as $\frac{1}{3}$ to be represented exactly, whereas in fixed or floating-point formats,
+this would have a recurring representation:
+\begin{equation}
+       \underbrace{0}_\text{integer part} . \underbrace{01}_\text{recurring part} 01 \; \; 01 \; \; 01 \dots
+\end{equation}
+With a rational type, this is simply $\frac{1}{3}$.
+Rationals do not have a unique representation for each value; typically the reduced fraction is used
+as a characteristic element.
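+As a small worked example, arithmetic on rationals stays exact by operating on the numerators and denominators and then dividing through
+by the greatest common divisor:
+\begin{equation}
+       \frac{1}{3} + \frac{1}{6} = \frac{1 \times 6 + 1 \times 3}{3 \times 6} = \frac{9}{18} = \frac{1}{2}
+\end{equation}
+Even with this reduction step, numerators and denominators can grow with each operation, so fixed-width integers may overflow.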
+
+While GPUs have traditionally supported some approximation of IEEE 754's 32-bit floats,
+modern graphics processors also support 16-bit\cite{nv_half_float} and 64-bit\cite{arb_gpu_shader_fp64}
+IEEE floats, though some features of IEEE floats, like denormals and NaNs, are not always supported.
+Note, however, that some parts of the GPU are only able to use some formats,
+so precision will likely be truncated at some point before display.
+Higher precision numeric types can be implemented or used on the GPU, but are
+slow.\cite{emmart2010high}
+
  
  \section{Document Formats}
  
@@ -38,7 +211,7 @@ renderer.
  
  The process of creating and displaying a document is a rather universal one (\ref{documenttimeline}), though
  different document formats approach it slightly differently. A document often begins as raw content: text and images
-(be they raster or vector) and it must end up as a set of photons flying towards the reader's eyes.
+(be they raster or vector) and it must end up as a stream of photons flying towards the reader's eyes.
  
  \begin{figure}
         \label{documenttimeline}
@@ -107,6 +280,11 @@ which already have their elements placed.
         Later versions of PDF also extend the PostScript rendering model to support translucent regions via Porter-Duff compositing\cite{porter1984compositing}.
         
         PDF documents represent a particular layout, and must be rasterized before display.
+       
+       \item[SVG]
+       Scalable Vector Graphics (SVG) is a vector graphics document format\cite{svg2011-1.1} which uses the Document Object Model. It consists of a tree of matrix transforms,
+       with objects such as vector paths (made up of B\'ezier curves) and text at the leaves.
+       
  \end{description}
  
  \subsection{Precision in Document Formats}
@@ -137,124 +315,11 @@ representable with the \emph{real} type have been specified differently: the lar
  $\pm1.175 \times 10^{-38}$ with approximately $5$ decimal digits of precision \emph{in the fractional part}.
  Adobe's implementation of PDF uses both IEEE 754 single precision floating-point numbers and (for some calculations, and in previous versions) 16.16 bit fixed-point values.
  
-
-\section{Rendering}
-
-Computer graphics comes in two forms: bit-mapped (or raster) graphics, which is defined by an array of pixel colours; 
-and \emph{vector} graphics, defined by mathematical descriptions of objects. Bit-mapped graphics are well suited to photographs
-and are match how cameras, printers and monitors work. However, bitmap devices do not handle zooming beyond their
-``native'' resolution --- the resolution where one document pixel maps to one display pixel ---, exhibiting an artefact
-called pixelation where the pixel structure becomes evident. Attempts to use interpolation to hide this effect are
-never entirely successful, and sharp edges, such as those found in text and diagrams, are particularly affected.
-
-\begin{figure}[h]
-       \centering \includegraphics[width=0.8\linewidth]{figures/vectorraster_example}
-       \caption{A circle as a vector image and a $32 \times 32$ pixel raster image}
-\end{figure}
-
-
-Vector graphics lack many of these problems: the representation is independent of the output resolution, and rather
-an abstract description of what it is being rendered, typically as a combination of simple geometric shapes like lines,
-arcs and ``B\'ezier curves''\cite{catmull1974asubdivision}. 
-As existing displays (and printers) are bit-mapped devices, vector documents must be \emph{rasterized} into a bitmap at
-a given resolution. This bitmap is then displayed or printed. The resulting bitmap is then an approximation of the vector image
-at that resolution.
-
-This project will be based around vector graphics, as these properties make it more suited to experimenting with zoom
-quality.
-
-
-The rasterization process typically operates on an individual ``object'' or ``shape'' at a time: there are special algorithms
-for rendering lines\cite{bresenham1965algorithm}, triangles\cite{giesen2013triangle}, polygons\cite{pineda1988parallel} and B\'ezier
-Curves\cite{goldman_thefractal}. Typically, these are rasterized independently and composited in the bitmap domain using Porter-Duff
-compositing\cite{porter1984compositing} into a single image. This allows complex images to be formed from many simple pieces, as well
-as allowing for layered translucent objects, which would otherwise require the solution of some very complex constructive geometry problems.
-
-While traditionally, rasterization was done entirely in software, modern computers and mobile devices have hardware support for rasterizing
-some basic primitives --- typically lines and triangles ---, designed for use rendering 3D scenes. This hardware is usually programmed with an
-API like \texttt{OpenGL}\cite{openglspec}.
-
-More complex shapes like B\'ezier curves can be rendered by combining the use of bitmapped textures (possibly using signed-distance
-fields\cite{leymarie1992fast}\cite{frisken2000adaptively}\cite{green2007improved}) with polygons approximating the curve's shape\cite{loop2005resolution}\cite{loop2007rendering}.
-
-Indeed, there are several implementations of entire vector graphics systems using OpenGL: OpenVG\cite{robart2009openvg} on top of OpenGL ES\cite{oh2007implementation};
-the Cairo\cite{worth2003xr} library, based around the PostScript/PDF rendering model, has the ``Glitz'' OpenGL backend\cite{nilsson2004glitz} and the SVG/PostScript GPU
-renderer by nVidia\cite{kilgard2012gpu} as an OpenGL extension\cite{kilgard300programming}.
-
-
-\section{Numeric formats}
-
-On modern computer architectures, there are two basic number formats supported:
-fixed-width integers and \emph{floating-point} numbers. Typically, computers
-natively support integers of up to 64 bits, capable of representing all integers
-between $0$ and $2^{64} - 1$, inclusive\footnote{Most machines also support \emph{signed} integers,
-which have the same cardinality as their \emph{unsigned} counterparts, but which
-represent integers in the range $[-(2^{63}), 2^{63} - 1]$}.
-
-By introducing a fractional component (analogous to a decimal point), we can convert
-integers to \emph{fixed-point} numbers, which have a more limited range, but a fixed, greater
-precision. For example, a number in 4.4 fixed-point format would have four bits representing the integer
-component, and four bits representing the fractional component:
-\begin{equation}
-       \underbrace{0101}_\text{integer component}.\underbrace{1100}_\text{fractional component} = 5.75
-\end{equation}
-
-
-Floating-point numbers\cite{goldberg1992thedesign} are the binary equivalent of scientific notation:
-each number consisting of an exponent ($e$) and a mantissa ($m$) such that a number is given by
-\begin{equation}
-       n = 2^{e} \times m
-\end{equation}
-
-The IEEE 754 standard\cite{ieee754std1985} defines several floating-point data types
-which are used\footnote{Many systems' implement the IEEE 754 standard's storage formats,
-but do not implement arithmetic operations in accordance with this standard.} by most
-computer systems. The standard defines 32-bit (8-bit exponent, 23-bit mantissa, 1 sign bit) and 
-64-bit (11-bit exponent, 53-bit mantissa, 1 sign bit) formats\footnote{The 2008
-revision to this standard\cite{ieee754std2008} adds some additional formats, but is
-less widely supported in hardware.}, which can store approximately 7 and 15 decimal digits
-of precision respectively.
-
-Floating-point numbers behave quite differently to integers or fixed-point numbers, as
-the representable numbers are not evenly distributed. Large numbers are stored to a lesser
-precision than numbers close to zero. This can present problems in documents when zooming in
-on objects far from the origin.
-
-IEEE floating-point has some interesting features as well, including values for negative zero,
-positive and negative infinity, the ``Not a Number'' (NaN) value and \emph{denormal} values, which
-trade precision for range when dealing with very small numbers. Indeed, with these values,
-IEEE 754 floating-point equality does not form an equivalence relation, which can cause issues
-when not considered carefully.\cite{goldberg1991whatevery}
-
-There also exist formats for storing numbers with arbitrary precising and/or range.
-Some programming languages support ``big integer''\cite{java_bigint} types which can
-represent any integer that can fit in the system's memory. Similarly, there are
-arbitrary-precision floating-point data types\cite{java_bigdecimal}\cite{boost_multiprecision}
-which can represent any number of the form
-\begin{equation}
-       \frac{n}{2^d} \; \; \; \; n,d \in \mathbb{Z} % This spacing is horrible, and I should be ashamed.
-\end{equation}
-These types are typically built from several native data types such as integers and floats,
-paired with custom routines implementing arithmetic primitives.\cite{priest1991algorithms}
-These, therefore, are likely slower than the native types they are built on.
-
-While traditionally, GPUs have supported some approximation of IEEE 754's 32-bit floats,
-modern graphics processors also support 16-bit\cite{nv_half_float} and 64-bit\cite{arb_gpu_shader_fp64}
-IEEE floats.  Note, however, that some parts of the GPU are only able to use some formats,
-so precision will likely be truncated at some point before display.
-Higher precision numeric types can be implemented or used on the GPU, but are
-slow.\cite{emmart2010high}
-
-Pairs of integers $(a \in \mathbb{Z},b \in \mathbb{Z}\setminus 0)$ can be used to represent rationals. This allows
-values such as $\frac{1}{3}$ to be represented exactly, whereas in fixed or floating-point formats,
-this would have a recurring representation:
-\begin{equation}
-       \underbrace{0}_\text{integer part} . \underbrace{01}_\text{recurring part} 01 \; \; 01 \; \; 01 \dots
-\end{equation}
-Whereas with a rational type, this is simply $\frac{1}{3}$.
-Rationals do not have a unique representation for each value, typically the reduced fraction is used
-as a characteristic element.
-
+The SVG specification\cite{svg2011-1.1} specifies numbers as strings with a decimal representation of the number.
+It is stated that a ``Conforming SVG Viewer'' must have ``all visual rendering accurate to within one device pixel to the mathematically correct result at the initial 1:1
+zoom ratio'' and that ``it is suggested that viewers attempt to keep a high degree of accuracy when zooming.''
+A ``Conforming High-Quality SVG Viewer'' must use ``double-precision floating point\footnote{Presumably the 64-bit IEEE 754 ``double'' type.}'' for computations involving
+coordinate system transformations.
  
  \section{Quadtrees}
  When viewing or processing a small part of a large document, it may be helpful to
@@ -267,11 +332,16 @@ only processs --- or \emph{cull} --- parts of the document which are not on-scre
  The quadtree\cite{finkel1974quad}is a data structure --- one of a family of \emph{spatial}
  data structures --- which recursively breaks down space into smaller subregions
  which can be processed independently. Points (or other objects) are added to a single
-node, which if certain criteria are met --- typically the number of points in a node
-exceeding a maximum, though in our case likely the level of precision required exceeding
-that supported by the data type in use --- is split into four equal-sized subregions, and
+node which (if certain criteria are met) is split into four equal-sized subregions, and
  points attached to the region which contains them.
  
+Quadtrees have been used in computer graphics for both culling --- excluding objects in
+nodes which are not visible --- and ``level of detail'', where different levels of the quadtree store
+different quality versions of objects or data\cite{zerbst2004game}.
+A split is typically triggered when the number of points in a node
+exceeds a maximum, though in our case it will more likely be triggered when the level of precision required exceeds
+that supported by the data type in use. 
+
  In this project, we will be experimenting with a form of quadtree in which each
  node has its own independent coordinate system, allowing us to store some spatial
  information\footnote{One bit per-coordinate, per-level of the quadtree} within the