davidfinal.tex

   1 \documentclass[a4paper,10pt]{cshonours}
   2 \bibliographystyle{acm}
   3 \usepackage[utf8]{inputenc}
   4 \usepackage{hyperref}
   5 \usepackage{graphicx}
   6 \usepackage{lastpage}
   7 \usepackage{fancyhdr}
   8 %\usepackage{listing}
   9 \usepackage{subcaption}
  10 \usepackage{amsmath}
  11 \usepackage{amssymb}
  12 \usepackage{amsthm}
  13 \usepackage{minted}
  14 \pagestyle{fancyplain}
  15
  16 %opening
  17 \title{Precision in vector documents: a spatial approach}
  18 \author{David Gow}
  19
  20 \begin{document}
  21 \cfoot{}
  22 \lfoot{\thepage\ of \pageref{LastPage}}
  23 \rfoot{DRAFT}
  24 \rhead{David Gow (20513694)}
  25 \lhead{\today}
  26
  27 \maketitle
  28
  29 \tableofcontents
  30
  31 \chapter{Introduction}
  32
  33 Documents are an important part of day-to-day life: we use them for business and pleasure,
  34 to inform and entertain. We often use images or diagrams to illustrate our ideas, and to convey
  35 spatial or visual information.
  36
  37 On the printed page, there is a practical limit to the size and resolution of these images.
  38 In order to store, for example, maps of large areas with fine detail, we resort to an atlas:
  39 splitting the map up into several pages, each of which covers a different part of the map.
  40
  41 With the advent of computers, we can begin to alleviate these problems. The fixed size and resolution
  42 of the screen is worked around by having the screen be a \emph{viewport} into a larger document, letting
  43 the document be \emph{panned} (translated laterally) and \emph{zoomed} (scaled to a particular magnification).
  44
  45 However, limits in the range and precision of the numbers used to store and manipulate documents
  46 artificially restrict the size and zoom-level of documents. While changing the numeric types used
  47 can solve these issues, here we investigate the use of a quadtree to take advantage of the spatial
  48 nature of the data. This is akin to maintaining the atlas of several different pages with different views
  49 of the map, but with the advantage of greater ease-of-use and continuity.
  50
  51 \chapter{Background}
  52
  53 \section{Rendering}
  54
  55 Computer graphics comes in two forms: bit-mapped (or raster) graphics, which is defined by an array of pixel colours;
  56 and \emph{vector} graphics, defined by mathematical descriptions of objects. Bit-mapped graphics are well suited to photographs
  57 and match how cameras, printers and monitors work.
  58
  59
  60 \begin{figure}[h]
  61         \centering \includegraphics[width=0.8\linewidth]{figures/vectorraster_example}
  62         \caption{\label{vectorraster}A circle as a vector image and a $32 \times 32$ pixel raster image}
  63 \end{figure}
  64
  65
  66 However, bitmap devices do not handle zooming beyond their ``native'' resolution (the resolution where one document pixel maps
  67 to one display pixel), exhibiting an artefact called pixelation (see Figure \ref{vectorraster}) where the pixel structure becomes evident. Attempts to use
  68 interpolation to hide this effect are never entirely successful, and sharp edges, such as those found in text and diagrams, are particularly affected.
  69
  70 Vector graphics avoid many of these problems: the representation is independent of the output resolution, and rather
  71 an abstract description of what is being rendered, typically as a combination of simple geometric shapes like lines,
  72 arcs and glyphs.
  73
  74 As existing displays (and printers) are bit-mapped devices, vector documents must be \emph{rasterized} into a bitmap at
  75 a given resolution.  The resulting bitmap is then an approximation of the vector image
  76 at that resolution. This bitmap is then displayed or printed.
  77
  78 % Project specific line
  79 We will discuss the use of vector graphics, as they lend themselves more naturally to scaling,
  80 though many of the techniques will also apply to raster graphics. Indeed, the use of quadtrees
  81 to compress bitmapped data is well established\cite{salomon2007data}, and the view-reparenting
  82 techniques have been used to facilitate an infinite zoom into a (procedurally generated) document\cite{munroe2014pixels}.
  83
  84 \subsection{Rasterizing Vector Graphics}
  85
  86 Before an vector document can be rasterized, the co-ordinates of any shapes must
  87 be transformed into \emph{screen space} or \emph{viewport space}\cite{blinn1992trip}.
  88 On a typical display, many of these screen-space coordinates require very little precision or range.
  89 However, the co-ordinate transform must take care to ensure that precision is preserved during this transform.
  90
  91 After this transformation, the image is decomposed into its separate shapes, which are rasterized
  92 and then composited together.
  93 Most graphics formats support Porter-Duff compositing\cite{porter1984compositing}.
  94 Porter-Duff compositing gives each element (typically a pixel) a ``coverage'' value,
  95 denoted $\alpha$ which represents the contribution of that element to the final scene.
  96 Completely transparent elements would have an $\alpha$ value of $0$, and completely opaque
  97 elements have an $\alpha$ of $1$. This permits arbitrary shapes to be layered over one another
  98 in the raster domain, while retaining soft-edges.
  99
 100 The rasterization process may then proceed on one object (or shape) at a time. There are special algorithms for rasterizing
 101 different shapes.
 102
 103 \begin{description}
 104         \item[Line Segment]
 105         Straight lines between two points are easily rasterized using Bresenham's algorithm\cite{bresenham1965algorithm}.
 106         Bresenham's algorithm draws a pixel for every point along the \emph{long} axis of the line, moving along the short
 107         axis when the error exceeds $\frac{1}{2}$ a pixel.
 108
 109         Bresenham's algorithm only operates on lines whose endpoints lie on integer pixel coordinates. Due to this, line ``clipping''
 110         may be performed to find endpoints of the line segment such that the entire line will be on-screen. However, if line clipping is
 111         performed na\"ively without also setting the error accumulator correctly, the line's slope will be altered slightly, becoming dependent
 112         on the viewport.
 113
 114         \item[B\'ezier Curve]
 115         A B\'ezier curve is a smooth (i.e.\ infinitely differentiable) curve between two points, represented by a Bernstein polynomial.
 116         The coefficients of this Bernstein polynomial are known as the ``control points.''
 117
 118         B\'ezier curves are typically rasterized using De Casteljau's algorithm\cite{foley1996computer}
 119         Line Segments are a first-order B\'ezier curve.
 120
 121 \end{description}
 122
 123
 124 %There are special algorithms
 125 %for rendering lines\cite{bresenham1965algorithm}, triangles\cite{giesen2013triangle}, polygons\cite{pineda1988parallel} and B\'ezier
 126 %Curves\cite{goldman_thefractal}.
 127
 128 \subsection{GPU Rendering}
 129 While traditionally, rasterization was done entirely in software, modern computers and mobile devices have hardware support for rasterizing
 130 lines and triangles designed for use rendering 3D scenes. This hardware is usually programmed with an
 131 API like \texttt{OpenGL}\cite{openglspec}.
 132
 133 More complex shapes like B\'ezier curves can be rendered by combining the use of bitmapped textures (possibly using signed-distance
 134 fields\cite{leymarie1992fast}\cite{frisken2000adaptively}\cite{green2007improved}) strtched over a triangle mesh
 135 approximating the curve's shape\cite{loop2005resolution}\cite{loop2007rendering}.
 136
 137 Indeed, there are several implementations of entire vector graphics systems using OpenGL:
 138 \begin{itemize}
 139         \item The OpenVG standard\cite{robart2009openvg} has been implemented on top of OpenGL ES\cite{oh2007implementation};
 140         \item the Cairo\cite{worth2003xr} library, based around the PostScript/PDF rendering model, has the ``Glitz'' OpenGL backend\cite{nilsson2004glitz}
 141         \item and the SVG/PostScript GPU renderer by nVidia\cite{kilgard2012gpu} as an OpenGL extension\cite{kilgard300programming}.
 142 \end{itemize}
 143
 144
 145 \section{Numeric formats}
 146
 147 On modern computer architectures, there are two basic number formats supported:
 148 fixed-width integers and \emph{floating-point} numbers. Typically, computers
 149 natively support integers of up to 64 bits, capable of representing all integers
 150 between $0$ and $2^{64} - 1$, inclusive\footnote{Most machines also support \emph{signed} integers,
 151 which have the same cardinality as their \emph{unsigned} counterparts, but which
 152 represent integers in the range $[-(2^{63}), 2^{63} - 1]$}.
 153
 154 By introducing a fractional component (analogous to a decimal point), we can convert
 155 integers to \emph{fixed-point} numbers, which have a more limited range, but a fixed, greater
 156 precision. For example, a number in 4.4 fixed-point format would have four bits representing the integer
 157 component, and four bits representing the fractional component:
 158 \begin{equation}
 159         \underbrace{0101}_\text{integer component}.\underbrace{1100}_\text{fractional component} = 5.75
 160 \end{equation}
 161
 162
 163 Floating-point numbers\cite{goldberg1992thedesign, taocp2} are a superset of scientific notation,
 164 originally used by the Babylonians (in base 60) perhaps as early as 1750 BC.
 165 Each number $n$ (in a base $\beta$) consists of integers $e$ (the \emph{exponent}) and $m$ (the \emph{mantissa}) such that
 166 \begin{equation}
 167         n = \beta^{e} \times m_f
 168 \end{equation}
 169
 170 Both $e$ and $m$ typically have a fixed width $q$ and $p$ respectively in base $\beta$. For notational convenience, we also
 171 represent $m$ as a fraction $m_f = \beta^{-p} m$ so $|m_f| < 1$. We further call a floating point number
 172 \emph{normalised} if the first digit of $m$ is nonzero. The exponent $e$ is usually allowed to be
 173 either positive or negative (either by having a sign bit, or being offset a fixed amount). The value
 174 of a floating point number $n$ must therefore be $-\beta^{e_max} < n < \beta^{e_max}$. The smallest
 175 possible magnitude of a normalised floating point number is $\beta^{e_min-1}$, as $m_f$ must be at least $0.1$.
 176 Non-normalised numbers may represent $0$, and many normalised floating-point include zero in some way, too.
 177
 178 The IEEE 754 standard\cite{ieee754std1985} defines several floating-point data types
 179 which are used\footnote{Many systems implement the IEEE 754 standard's storage formats,
 180 but do not implement arithmetic operations in accordance with this standard\cite{openglspec, ARBshaderprecision}.} by most
 181 computer systems. The standard defines 32-bit (8-bit exponent, 23-bit mantissa, 1 sign bit) and
 182 64-bit (11-bit exponent, 53-bit mantissa, 1 sign bit) formats\footnote{The 2008
 183 revision to this standard\cite{ieee754std2008} adds some additional formats, but is
 184 less widely supported in hardware.}, which can store approximately 7 and 15 decimal digits
 185 of precision respectively. The IEEE 754 standard also introduces an implicit $1$ bit in the most
 186 significant place of the mantissa when the exponent is not $e_min$.
 187
 188 Floating-point numbers behave quite differently to integers or fixed-point numbers, as
 189 the representable numbers are not evenly distributed. Large numbers are stored to a lesser
 190 precision than numbers close to zero. This can present problems in documents when zooming in
 191 on objects far from the origin. Furthermore, due to the limited precision, and the different
 192 ``alignment'' of operands in arithmetic, several algebraic properties of rings we take for
 193 granted in the integers do not exist, including the property of assosciativity\cite{taocp2}.
 194
 195 IEEE floating-point has some interesting features as well, including values for negative zero,
 196 positive and negative infinity, the ``Not a Number'' (NaN) value and \emph{subnormal} values, which
 197 trade precision for range when dealing with very small numbers by not normalising
 198 numbers when the exponent is $e_min$. Indeed, with these values,
 199 IEEE 754 floating-point equality does not form an equivalence relation, which can cause issues
 200 when not considered carefully\cite{goldberg1991whatevery}.
 201
 202 There also exist formats for storing numbers with arbitrary precising and/or range.
 203 Some programming languages support ``big integer''\cite{java_bigint} types which can
 204 represent any integer that can fit in the system's memory. Similarly, there are
 205 arbitrary-precision floating-point data types\cite{java_bigdecimal}\cite{boost_multiprecision}
 206 which can represent any number of the form
 207 \begin{equation}
 208         \frac{n}{2^d} \; \; \; \; n,d \in \mathbb{Z} % This spacing is horrible, and I should be ashamed.
 209 \end{equation}
 210 These types are typically built from several native data types such as integers and floats,
 211 paired with custom routines implementing arithmetic primitives.\cite{priest1991algorithms}
 212 Operations on these types, therefore, are usually slower than those performed on native types.
 213
 214 Pairs of integers $(a \in \mathbb{Z},b \in \mathbb{Z}\setminus 0)$ can be used to represent rationals. This allows
 215 values such as $\frac{1}{3}$ to be represented exactly, whereas in fixed or floating-point formats,
 216 this would have a recurring representation:
 217 \begin{equation}
 218         \underbrace{0}_\text{integer part} . \underbrace{01}_\text{recurring part} 01 \; \; 01 \; \; 01 \dots
 219 \end{equation}
 220 Whereas with a rational type, this is simply $\frac{1}{3}$.
 221 Rationals do not have a unique representation for each value, typically the reduced fraction is used
 222 as a characteristic element.
 223
 224 While traditionally, GPUs have supported some approximation of IEEE 754's 32-bit floats,
 225 modern graphics processors also support 16-bit\cite{nv_half_float} and 64-bit\cite{arb_gpu_shader_fp64}
 226 IEEE floats, though some features of IEEE floats, like denormals and NaNs are not always supported.
 227 Note, however, that some parts of the GPU are not able to use all formats,
 228 so precision will likely be truncated at some point before display.
 229 Higher precision numeric types can be implemented or used on the GPU, but are
 230 slow.\cite{emmart2010high}
 231
 232
 233 \section{Document Formats}
 234
 235 Most existing document formats --- such as the venerable PostScript and PDF --- are, however, designed to imitate
 236 existing paper documents, largely to allow for easy printing. In order to truly take advantage of the possibilities operating in the digital
 237 domain opens up to us, we must look to new formats.
 238
 239 Formats such as \texttt{HTML} allow for a greater scope of interactivity and for a more data-driven model, allowing
 240 the content of the document to be explored in ways that perhaps the author had not anticipated.\cite{hayes2012pixels}
 241 However, these data-driven formats typically do not support fixed layouts, and the display differs from renderer to
 242 renderer.
 243
 244 \subsection{A Taxonomy of Document formats}
 245
 246 The process of creating and displaying a document is a rather universal one (\ref{documenttimeline}), though
 247 different document formats approach it slightly differently. A document often begins as raw content: text and images
 248 (be they raster or vector) and it must end up as a stream of photons flying towards the reader's eyes.
 249
 250 \begin{figure}
 251         \label{documenttimeline}
 252         \centering \includegraphics[width=0.8\linewidth]{figures/documenttimeline}
 253         \caption{\label{doclifecycle}The lifecycle of a document}
 254 \end{figure}
 255
 256 There are two fundamental stages (as shown in Figure \ref{doclifecycle}) by which all documents --- digital or otherwise --- are produced and displayed:
 257 \emph{layout} and \emph{rendering}. The \emph{layout} stage is where the positions and sizes of text and other graphics are
 258 determined. The text will be \emph{flowed} around graphics, the positions of individual glyphs will be placed, ensuring
 259 that there is no undesired overlap and that everything will fit on the page or screen.
 260
 261 The \emph{display} stage actually produces the final output, whether as ink on paper or pixels on a computer monitor.
 262 Each graphical element is rasterized and composited into a single image of the target resolution.
 263
 264
 265 Different document formats cover documents in different stages of this project. Bitmapped images,
 266 for example, would represent the output of the final stage of the process, whereas markup languages typically specify
 267 a document which has not yet been processed, ready for the layout stage.
 268
 269 Furthermore, some document formats treat the document as a program, written in
 270 a (usually Turing complete) document language with instructions which emit shapes to be displayed. These shapes are either displayed
 271 immediately, as in PostScript, or stored in another file, such as with \TeX or \LaTeX, which emit a \texttt{DVI} file. Most other
 272 forms of document use a \emph{Document Object Model}, being a list or tree of objects to be rendered. \texttt{DVI}, \texttt{PDF},
 273 \texttt{HTML}\footnote{Some of these formats --- most notably \texttt{HTML} --- implement a scripting lanugage such as JavaScript,
 274 which permit the DOM to be modified while the document is being viewed.} and SVG\cite{svg2011-1.1}. Of these, only \texttt{HTML} and \TeX typically
 275 store documents in pre-layout stages, whereas even Turing complete document formats such as PostScript typically encode documents
 276 which already have their elements placed.
 277
 278 \begin{description}
 279         \item[\TeX \, and \LaTeX]
 280         Donald Knuth's typesetting language \TeX \, is one of the older computer typesetting systems, originally conceived in 1977\cite{texdraft}.
 281         It implements a Turing-complete language and is human-readable and writable, and is still popular
 282         due to its excellent support for typesetting mathematics.
 283         \TeX only implements the ``layout'' stage of document display, and produces a typeset file,
 284         traditionally in \texttt{DVI} format, though modern implementations will often target \texttt{PDF} instead.
 285
 286         This document was prepared in \LaTeXe.
 287
 288         \item[DVI]
 289         \TeX \, traditionally outputs to the \texttt{DVI} (``DeVice Independent'') format: a binary format which consists of a
 290         simple stack machine with instructions for drawing glyphs and curves\cite{fuchs1982theformat}.
 291
 292         A \texttt{DVI} file is a representation of a document which has been typeset, and \texttt{DVI}
 293         viewers will rasterize this for display or printing, or convert it to another similar format like PostScript
 294         to be rasterized.
 295
 296         \item[HTML]
 297         The Hypertext Markup Language (HTML)\cite{html2rfc} is the widely used document format which underpins the
 298         world wide web. In order for web pages to adapt appropriately to different devices, the HTML format simply
 299         defined semantic parts of a document, such as headings, phrases requiring emphasis, references to images or links
 300         to other pages, leaving the \emph{layout} up to the browser, which would also rasterize the final document.
 301
 302         The HTML format has changed significantly since its introduction, and most of the layout and styling is now controlled
 303         by a set of style sheets in the CSS\cite{css2spec} format.
 304
 305         \item[PostScript]
 306         Much like DVI, PostScript\cite{plrm} is a stack-based format for drawing vector graphics, though unlike DVI (but like \TeX), PostScript is
 307         text-based and Turing complete. PostScript was traditionally run on a control board in laser printers, rasterizing pages at high resolution
 308         to be printed, though PostScript interpreters for desktop systems also exist, and are often used with printers which do not support PostScript natively.\cite{ghostscript}
 309
 310         PostScript programs typically embody documents which have been typeset, though as a Turing-complete language, some layout can be performed by the document.
 311
 312         \item[PDF]
 313         Adobe's Portable Document Format (PDF)\cite{pdfref17} takes the PostScript rendering model, but does not implement a Turing-complete language.
 314         Later versions of PDF also extend the PostScript rendering model to support translucent regions via Porter-Duff compositing\cite{porter1984compositing}.
 315
 316         PDF documents represent a particular layout, and must be rasterized before display.
 317
 318         \item[SVG]
 319         Scalable Vector Graphics (SVG) is a vector graphics document format\cite{svg2011-1.1} which uses the Document Object Model. It consists of a tree of matrix transforms,
 320         with objects such as vector paths (made up of B\'ezier curves) and text at the leaves.
 321
 322 \end{description}
 323
 324 \subsection{Precision in Document Formats}
 325
 326 Existing document formats --- typically due to having been designed for documents printed on paper, which of course has
 327 limited size and resolution --- use numeric types which can only represent a fixed range and precision.
 328 While this works fine with printed pages, users reading documents on computer screens using programs
 329 with ``zoom'' functionality are prevented from working beyond a limited scale factor, lest artefacts appear due
 330 to issues with numeric precision.
 331
 332 \TeX uses a $14.16$ bit fixed point type (implemented as a 32-bit integer type, with one sign bit and one bit used to detect overflow)\cite{beebe2007extending}.
 333 This can represent values in the range $[-(2^{14}), 2^{14} - 1]$ with 16 binary digits of fractional precision.
 334
 335 The DVI files \TeX \, produces may use ``up to'' 32-bit signed integers\cite{fuchs1982theformat} to specify the document, but there is no requirement that
 336 implementations support the full 32-bit type. It would be permissible, for example, to have a DVI viewer support only 24-bit signed integers, though many
 337 files which require greater range may fail to render correctly.
 338
 339 PostScript\cite{plrm} supports two different numeric types: \emph{integers} and \emph{reals}, both of which are specified as strings. The interpreter's representation of numbers
 340 is not exposed, though the representation of integers can be divined by a program by the use of bitwise operations. The PostScript specification lists some ``typical limits''
 341 of numeric types, though the exact limits may differ from implementation to implementation. Integers typically must fall in the range $[-2^{31}, 2^{31} - 1]$,
 342 and reals are listed to have largest and smallest values of $\pm10^{38}$, values closest to $0$ of $\pm10^{-38}$ and approximately $8$ decimal digits of precision,
 343 derived from the IEEE 754 single-precision floating-point specification.
 344
 345 Similarly, the PDF specification\cite{pdfref17} stores \emph{integers} and \emph{reals} as strings, though in a more restricted format than PostScript.
 346 The PDF specification gives limits for the internal representation of values. Integer limits have not changed from the PostScript specification, but numbers
 347 representable with the \emph{real} type have been specified differently: the largest representable values are $\pm 3.403\times 10^{38}$, the smallest non-zero representable values are
 348 $\pm1.175 \times 10^{-38}$ with approximately $5$ decimal digits of precision \emph{in the fractional part}.
 349 \footnote{The PDF specification mistakenly leaves out the negative in the exponent here.}
 350 Adobe's implementation of PDF uses both IEEE 754 single precision floating-point numbers and (for some calculations, and in previous versions) 16.16 bit fixed-point values.
 351
 352 The SVG specification\cite{svg2011-1.1} specifies numbers as strings with a decimal representation of the number.
 353 It is stated that a ``Conforming SVG Viewer'' must have ``all visual rendering accurate to within one device pixel to the mathematically correct result at the initial 1:1
 354 zoom ratio'' and that ``it is suggested that viewers attempt to keep a high degree of accuracy when zooming.''
 355 A ``Conforming High-Quality SVG Viewer'' must use ``double-precision floating point\footnote{Presumably the 64-bit IEEE 754 ``double'' type.}'' for computations involving
 356 coordinate system transformations.
 357
 358 \section{Quadtrees}
 359 When viewing or processing a small part of a large document, it may be helpful to
 360 only process --- or \emph{cull} --- parts of the document which are not on-screen.
 361
 362 \begin{figure}[h]
 363         \centering \includegraphics[width=0.4\linewidth]{figures/quadtree_example}
 364         \caption{A simple quadtree.}
 365 \end{figure}
 366 The quadtree\cite{finkel1974quad}is a data structure --- one of a family of \emph{spatial}
 367 data structures --- which recursively breaks down space into smaller subregions
 368 which can be processed independently. Points (or other objects) are added to a single
 369 node which (if certain criteria are met) is split into four equal-sized subregions, and
 370 points attached to the region which contains them.
 371
 372 Quadtrees have been used in computer graphics for both culling --- excluding objects in
 373 nodes which are not visible --- and ``level of detail'', where different levels of the quadtree store
 374 different quality versions of objects or data\cite{zerbst2004game}.
 375 Typically the number of points in a node
 376 exceeding a maximum triggers this split, though in our case likely the level of precision required exceeding
 377 that supported by the data type in use.
 378
 379 In this project, we will be experimenting with a form of quadtree in which each
 380 node has its own independent coordinate system, allowing us to store some spatial
 381 information\footnote{One bit per-coordinate, per-level of the quadtree} within the
 382 quadtree structure, eliminating redundancy in the coordinates of nearby objects.
 383
 384 Other spatial data structures exist, such as the KD-tree\cite{bentley1975multidimensional},
 385 which partitions the space on any axis-aligned line; or the BSP tree\cite{fuchs1980onvisible},
 386 which splits along an arbitrary line which need not be axis aligned. We believe, however,
 387 that the simpler conversion from binary coordinates to the quadtree's binary split make
 388 it a better avenue for initial research to explore.
 389
 390 \chapter{\texttt{IPDF}: A Document Precision Playground}
 391 In order to investigate the precision issues present in document formats and viewers,
 392 we developed a document viewer \texttt{IPDF}. \texttt{IPDF} is a \texttt{C++} program
 393 which supports rendering TrueType font outlines and a subset of the \texttt{SVG}
 394 format.
 395
 396 At its core, \texttt{IPDF} breaks documents down into a set of \emph{objects}, each
 397 with rectangular bounds $(x, y, w, h)$ and with several \emph{types}:
 398 \begin{enumerate}
 399         \item \textbf{Rectangle:} A simple, axis-aligned rectangle (either filled with a solid colour or
 400         left as an outline) covering the object bounds.
 401         \item \textbf{Ellipse:} A filled, axis-aligned ellipse, rendered parametrically.
 402         \item \textbf{B\'ezier Curve:} A cubic B\'ezier curve. B\'ezier curve
 403         control points are stored relative to the coordinate system provided by the
 404         document bounds, and several objects can share these coefficients.
 405 \end{enumerate}
 406
 407 \texttt{IPDF} can be compiled using different number representations, to allow for
 408 comparison between not only different-precision floating point numbers, but also
 409 arbitrary-precision types such as rationals.
 410
 411 Documents in \texttt{IPDF} may be rendered either on the CPU, using a custom software
 412 rasterizer built on the numeric types supported, or on the GPU with OpenGL 3.2 shaders, using
 413 the GPU's default numeric representation.
 414
 415 Furthermore, \texttt{IPDF} allows \texttt{SVG} files (and text rendered with TrueType fonts) to be inserted at any point and scale in
 416 the document, allowing for the same images to be compared at different scales.
 417
 418 \section{GPU Floating Point Rounding Behaviour}
 419 While the IEEE-754 standard specifies both the format of floating-point numbers and the operations they perform.
 420 However, the OpenGL specification\cite{openglspec}, while requiring the same basic format for single-precision floats,
 421 does not specify the behaviour of denormals, nor requires any support for NaN or infinite values. Similarly, no support
 422 for floating point exceptions is required, with the note that no operation should ever halt the GPU.
 423
 424 However, an extension to the specification, \texttt{GL\_ARB\_shader\_precision}\cite{ARBshaderprecision} in 2010
 425 allows programs to require stricter precision requirements. Notably, support for infinite values is required and
 426 maximum relative error in ULPs.
 427
 428 Different GPU vendors and drivers have different rounding behaviour, and most hardware (both CPU and GPU) provide
 429 ways of disabling support for some IEEE features to improve performance\cite{intelgpuspec, dawson2012notnormal}.
 430
 431 \begin{figure}[H]
 432 \centering
 433 \includegraphics[width=0.8\textwidth]{figures/circles_gpu_vs_cpu/comparison.pdf}
 434 \caption{\label{fig:circles}The edges of a unit circle viewed through bounds (x,y,w,h) =  (0.0869386,0.634194,2.63295e-07,2.63295e-07)}
 435 \end{figure}
 436
 437 Many GPUs now support double-precision floating-point numbers\footnote{Some drivers, however, do not yet support this feature, including one of our test machines.}
 438 as specified in the \texttt{GL\_ARB\_gpu\_shader\_fp64}\cite{arb_gpu_shader_fp64} OpenGL extension.
 439 However, many of the fixed-function parts of the GPU do not support double-precision floats,
 440 making it impractical to use them.
 441
 442 To see the issues, we rendered the edge of a circle (calculated by discarding pixels with $x^2 + y^2 > 1$) on several GPUs, as well as an \texttt{x86-64} CPU,
 443 as seen in \autoref{fig:circles}.
 444 Of these, the nVidia GPU came closest to the CPU rendering, whereas Intel's hardware clearly performs some optimisation which produces quite different artefacts.
 445 The diagonal distortion in the AMD rendering may be a result of different rounding across the two triangles across which the circle was rendered.
 446 \section{Distortion and Quantisation at the limits of precision}
 447 TODO: Fix these figures, explain properly.
 448 \begin{figure}[h]
 449
 450         \begin{subfigure}{.5\linewidth}
 451                 \includegraphics[width=\linewidth]{cpu0}
 452                 \subcaption{Intended rendering}
 453         \end{subfigure}
 454         \begin{subfigure}{.5\linewidth}
 455                 \includegraphics[width=\linewidth]{cpu100}
 456                 \subcaption{With artefacts}
 457         \end{subfigure}
 458         \caption{\label{fig:precisionerrors}Floating point errors affecting the rendering of an image}
 459
 460 \end{figure}
 461
 462 When trying to insert fine detail into a document using fixed-width floats, some precision is lost. In particular, the
 463 control points of the curves making up the image get rounded to the nearest\footnote{Other rounding modes are available,
 464 but they all suffer from similar artefacts.} representable point. \autoref{fig:precisionerrors} shows what happens when an
 465 image is small enough to be affected by this quantisation. The grid structure becomes very apparent.
 466
 467 These artefacts also become prevalent when an object is far from the origin, as more bits would be required to store the
 468 position, so the lower order bits of the position must be discarded. As the position of the viewport and the object will share
 469 many of the same initial digits, catastrophic cancellation\cite{goldberg1991whatevery} can occur.
 470
 471 %\subsection{Big Integers}
 472 %Computers typically operate on \emph{fixed-width} integers\footnote{The term ``machine word'' has traditionally been used
 473 %for a CPUs default numeric type, though for historical reasons this is sometimes also used to refer to a 16-bit type, regardless
 474 %of the preferred type of the underlying machine.} for which they have hardware adders.
 475
 476 %These types are represented by $n$ bits (typically $n$ is 32 or 64), and represents the
 477 %ring $\mathbb{Z}_n$: i.e.\ any number in the range $[0,2^{n})$ can be represented, and operations
 478 %are performed $\mod 2^n$.
 479 \chapter{View Reparenting and the Quadtree}
 480 One of the important parts of document rendering is that of coordinate transforms.
 481
 482 Traditionally, the document exists in its own coordinate system, $\mathbf{b}$.
 483 We then define a coordinate system $\mathbf{v}$ to represent the view.
 484
 485 To eliminate visible error due to floating point precision, we need to ensure that
 486 any point (including control points) must be representable as a float both in the
 487 document and in view coordinates, i.e:
 488 \begin{enumerate}
 489         \item have a limited magnitude,
 490         \item be precise enough to uniquely identify a pixel on the display.
 491 \end{enumerate}
 492
 493 Despite a point not being representable with floats in one coordinate system,
 494 it may be representable in another. We can therefore split a document up into
 495 several coordinate systems, such that each point is completely representable.
 496
 497 Similarly, the points making up the visible document need to all be representable
 498 (to at least one pixel's worth of precision). To achieve this, we use a quadtree where
 499 each node stores points within its bounds in its own coordinate system. Objects
 500 which span multiple nodes are clipped, such that no points\footnote{except some
 501 control points} lie outside the quadtree node.
 502
 503 Setting aside the possibility that an object might span multiple nodes for the time
 504 being, let's investigate how the coordinates of a point are affected by placing it
 505 in the quadtree.
 506
 507 Suppose $p = (p_x, p_y)$ is a point in a global document coordinate system, which we'll
 508 consider the root node of our quadtree. $p_x$ and $p_y$ are represented by a finite
 509 list of binary digits $x_0 \dots x_n$, $y_0 \dots y_n$.
 510 Consider now the pair $(x_0, y_0)$:
 511 \begin{align*}
 512         x_0 &= \begin{cases}
 513                 0 & \text{if }p_x\text{ is in the left half of }d\\
 514                 1 & \text{if }p_x\text{ is in the right half of }d
 515               \end{cases} \\
 516         y_0 &= \begin{cases}
 517                 0 & \text{if }p_y\text{ is in the bottom half of }d\\
 518                 1 & \text{if }p_y\text{ is in the top half of }d
 519               \end{cases}
 520 \end{align*}
 521
 522 We have therefore found which child node of our quadtree contains $p$. The coordinates
 523 of $p$ within that node are $(x_1 \dots x_n, y_1 \dots y_n)$. By the principle of mathematical
 524 induction, we can repeat the process to move more bits from the coordinates of $p$ into the
 525 structure of the quadtree, until the remaining coordinates may be precisely represented.
 526
 527 This implies that the quadtree is equivalent to an arbitrary precision integer datatype. By reversing
 528 the process, any point representable in the quadtree can be stored as a fixed-length bit string.
 529
 530 %It is natural to treat these coordinates as being between $0$ and $1$.
 531
 532 Furthermore, we can get approximations of points by inserting them into higher levels of the quadtree,
 533 with their coordinates rounded. By then viewing the document at a certain level of the quadtree, we
 534 then not only do not need to consider bits of precision which are not required to represent the point to
 535 pixel resolution, we can also cull objects outside the visible nodes at that level.
 536
 537 Indeed, we can guarantee that, by selecting the level of the quadtree we view such that
 538 the view width and height lie within the range $(0.5,1]$, we can ensure that at most
 539 four nodes need to be rendered.
 540
 541
 542 \section{Clipping cubic B\'eziers}
 543 In order to ensure that the quadtree maintains precision in this fashion, we need to ensure
 544 that all objects are entirely contained within a quadtree node. This is not always the case and,
 545 indeed, when zooming in on any object, eventually the object will span multiple quadtree nodes.
 546
 547 {\bf TODO: We can show this by showing that, given $(a,b) \in \mathbb{R}, \exists $ binary fraction $c$ such that $a < c < b$.}
 548
 549 We therefore need a way to subdivide objects into several objects, each of which is contained within one node.
 550 To do this, we need to \emph{clip} the cubic B\'ezier curves from which our document is formed to a rectangle.
 551
 552 Cubic B\'ezier curves are defined parametrically as two cubics: $x(t)$ and $y(t)$, with $0 \leq t \leq 1$.
 553 We clip these to a rectangular bounding box in stages. We first find the intersections of the curve with
 554 the clipping rectangle by finding the roots\footnote{While there is a method for solving cubics exactly\cite{cardano1545artis},
 555 we instead use numeric methods to avoid the need for square root operations, which cannot be done exactly
 556 on some of the numeric types we used.}
 557 of the cubic shifted to match the corners of the rectangle. This produces some spurious points, as it assumes the edges
 558 of the clipping rectangle are lines with infinite extent, but this at worst introduces some minor inefficiency in the process and does
 559 not affect the result.
 560
 561 Once the values of the parameter $t$ which intersect the clipping rectangle have
 562 been determined, the curve is split into segments. To do this, the values $0$ and $1$
 563 (representing the endpoints) are added to the list of intersecting $t$ values, which is then
 564 sorted. Adjacent values $t_0$ and $t_1$ form a segment. The midpoint of that segment (with the
 565 value $\frac{t_0 + t_1}{2}$) is evaluate and if it falls
 566 outside the clipping rectangle, the segment is discarded.
 567
 568 Finally, the segments are re-parametrised by subdividing the curve using De Casteljau's
 569 algorithm\cite{sederberg2007cad}. By re-parameterising the curves such that $0 \leq t \leq 1$,
 570 we ensure that the first and last coefficients have the endpoints' coodinates, and therefore
 571 lie in the quadtree node.
 572
 573 {\bf TODO: Prove that the other control points' magnitude is reduced, and try to quantify it, prove that
 574 it will never overflow.}
 575
 576 \section{Implementation Details}
 577 \begin{itemize}
 578         \item Store object ID ranges.
 579         \item Pointers to children and parent.
 580         \item Linked-list of ``overlay'' nodes for mutation.
 581         \item Have billions of bugs.
 582 \end{itemize}
 583
 584
 585
 586 \chapter{Experimental Results}
 587 These are all iPython-y at the moment.
 588
 589 Roughly 3s/frame for GMP rationals, 16ms for Quadtree which is still slightly broken.
 590 \section{Performance per object}
 591
 592 \section{Performance per onscreen object}
 593
 594 \section{Performance per zoom-level}
 595
 596 \section{Stability of performance}
 597
 598 \chapter{Further Work and Conclusion}
 599 The use of spatial data structures to specify documents with arbitrary precision
 600 has been validated as a viable alternative to the use of arbitrary precision numeric
 601 types where an arbitrary (rather than infinite) amount of precision is needed.
 602 Once constructed, they are faster in many circumstanced, and the structure
 603 can also be used for culling. When the viewport moves and scales smoothly, the cost
 604 of constructing new nodes is amortised over the movement.
 605 Unfortunately, the mutation of the quadtree is difficult and slow, and discontinuous
 606 movement can result in a large number of nodes needing to be created.
 607
 608 Quadtree seems to be viable and is really performant.
 609
 610 Loop-blinn shading.
 611
 612
 613 \bibliography{papers}
 614
 615 \end{document}