\documentclass[a4paper,10pt]{cshonours}
\bibliographystyle{acm}
\usepackage[utf8]{inputenc}
\usepackage{hyperref}
\usepackage{graphicx}
\usepackage{lastpage}
\usepackage{fancyhdr}
%\usepackage{listing}
\usepackage{subcaption}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{minted}
\pagestyle{fancyplain}
%opening
\title{Precision in vector documents: a spatial approach}
\author{David Gow}
\begin{document}
\cfoot{}
\lfoot{\thepage\ of \pageref{LastPage}}
\rfoot{DRAFT}
\rhead{David Gow (20513694)}
\lhead{\today}
\maketitle
\tableofcontents
\chapter{Introduction}
Documents are an important part of day-to-day life: we use them for business and pleasure,
to inform and entertain. We often use images or diagrams to illustrate our ideas, and to convey
spatial or visual information.
On the printed page, there is a practical limit to the size and resolution of these images.
In order to store, for example, maps of large areas with fine detail, we resort to an atlas:
splitting the map up into several pages, each of which covers a different part of the map.
With the advent of computers, we can begin to alleviate these problems. The fixed size and resolution
of the screen is worked around by having the screen be a \emph{viewport} into a larger document, letting
the document be \emph{panned} (translated laterally) and \emph{zoomed} (scaled to a particular magnification).
However, limits in the range and precision of the numbers used to store and manipulate documents
artificially restrict the size and zoom-level of documents. While changing the numeric types used
can solve these issues, here we investigate the use of a quadtree to take advantage of the spatial
nature of the data. This is akin to maintaining the atlas of several different pages with different views
of the map, but with the advantage of greater ease-of-use and continuity.
\chapter{Background}
\section{Rendering}
Computer graphics comes in two forms: bit-mapped (or raster) graphics, which is defined by an array of pixel colours;
and \emph{vector} graphics, defined by mathematical descriptions of objects. Bit-mapped graphics are well suited to photographs
and match how cameras, printers and monitors work.
\begin{figure}[h]
\centering \includegraphics[width=0.8\linewidth]{figures/vectorraster_example}
\caption{\label{vectorraster}A circle as a vector image and a $32 \times 32$ pixel raster image}
\end{figure}
However, bitmap devices do not handle zooming beyond their ``native'' resolution (the resolution where one document pixel maps
to one display pixel), exhibiting an artefact called pixelation (see Figure \ref{vectorraster}) where the pixel structure becomes evident. Attempts to use
interpolation to hide this effect are never entirely successful, and sharp edges, such as those found in text and diagrams, are particularly affected.
Vector graphics avoid many of these problems: the representation is independent of the output resolution, and rather
an abstract description of what is being rendered, typically as a combination of simple geometric shapes like lines,
arcs and glyphs.
As existing displays (and printers) are bit-mapped devices, vector documents must be \emph{rasterized} into a bitmap at
a given resolution. The resulting bitmap is then an approximation of the vector image
at that resolution. This bitmap is then displayed or printed.
% Project specific line
We will discuss the use of vector graphics, as they lend themselves more naturally to scaling,
though many of the techniques will also apply to raster graphics. Indeed, the use of quadtrees
to compress bitmapped data is well established\cite{salomon2007data}, and the view-reparenting
techniques have been used to facilitate an infinite zoom into a (procedurally generated) document\cite{munroe2014pixels}.
\subsection{Rasterizing Vector Graphics}
Before an vector document can be rasterized, the co-ordinates of any shapes must
be transformed into \emph{screen space} or \emph{viewport space}\cite{blinn1992trip}.
On a typical display, many of these screen-space coordinates require very little precision or range.
However, the co-ordinate transform must take care to ensure that precision is preserved during this transform.
After this transformation, the image is decomposed into its separate shapes, which are rasterized
and then composited together.
Most graphics formats support Porter-Duff compositing\cite{porter1984compositing}.
Porter-Duff compositing gives each element (typically a pixel) a ``coverage'' value,
denoted $\alpha$ which represents the contribution of that element to the final scene.
Completely transparent elements would have an $\alpha$ value of $0$, and completely opaque
elements have an $\alpha$ of $1$. This permits arbitrary shapes to be layered over one another
in the raster domain, while retaining soft-edges.
The rasterization process may then proceed on one object (or shape) at a time. There are special algorithms for rasterizing
different shapes.
\begin{description}
\item[Line Segment]
Straight lines between two points are easily rasterized using Bresenham's algorithm\cite{bresenham1965algorithm}.
Bresenham's algorithm draws a pixel for every point along the \emph{long} axis of the line, moving along the short
axis when the error exceeds $\frac{1}{2}$ a pixel.
Bresenham's algorithm only operates on lines whose endpoints lie on integer pixel coordinates. Due to this, line ``clipping''
may be performed to find endpoints of the line segment such that the entire line will be on-screen. However, if line clipping is
performed na\"ively without also setting the error accumulator correctly, the line's slope will be altered slightly, becoming dependent
on the viewport.
\item[B\'ezier Curve]
A B\'ezier curve is a smooth (i.e.\ infinitely differentiable) curve between two points, represented by a Bernstein polynomial.
The coefficients of this Bernstein polynomial are known as the ``control points.''
B\'ezier curves are typically rasterized using De Casteljau's algorithm\cite{foley1996computer}
Line Segments are a first-order B\'ezier curve.
\end{description}
%There are special algorithms
%for rendering lines\cite{bresenham1965algorithm}, triangles\cite{giesen2013triangle}, polygons\cite{pineda1988parallel} and B\'ezier
%Curves\cite{goldman_thefractal}.
\subsection{GPU Rendering}
While traditionally, rasterization was done entirely in software, modern computers and mobile devices have hardware support for rasterizing
lines and triangles designed for use rendering 3D scenes. This hardware is usually programmed with an
API like \texttt{OpenGL}\cite{openglspec}.
More complex shapes like B\'ezier curves can be rendered by combining the use of bitmapped textures (possibly using signed-distance
fields\cite{leymarie1992fast}\cite{frisken2000adaptively}\cite{green2007improved}) strtched over a triangle mesh
approximating the curve's shape\cite{loop2005resolution}\cite{loop2007rendering}.
Indeed, there are several implementations of entire vector graphics systems using OpenGL:
\begin{itemize}
\item The OpenVG standard\cite{robart2009openvg} has been implemented on top of OpenGL ES\cite{oh2007implementation};
\item the Cairo\cite{worth2003xr} library, based around the PostScript/PDF rendering model, has the ``Glitz'' OpenGL backend\cite{nilsson2004glitz}
\item and the SVG/PostScript GPU renderer by nVidia\cite{kilgard2012gpu} as an OpenGL extension\cite{kilgard300programming}.
\end{itemize}
\section{Numeric formats}
On modern computer architectures, there are two basic number formats supported:
fixed-width integers and \emph{floating-point} numbers. Typically, computers
natively support integers of up to 64 bits, capable of representing all integers
between $0$ and $2^{64} - 1$, inclusive\footnote{Most machines also support \emph{signed} integers,
which have the same cardinality as their \emph{unsigned} counterparts, but which
represent integers in the range $[-(2^{63}), 2^{63} - 1]$}.
By introducing a fractional component (analogous to a decimal point), we can convert
integers to \emph{fixed-point} numbers, which have a more limited range, but a fixed, greater
precision. For example, a number in 4.4 fixed-point format would have four bits representing the integer
component, and four bits representing the fractional component:
\begin{equation}
\underbrace{0101}_\text{integer component}.\underbrace{1100}_\text{fractional component} = 5.75
\end{equation}
Floating-point numbers\cite{goldberg1992thedesign, taocp2} are a superset of scientific notation,
originally used by the Babylonians (in base 60) perhaps as early as 1750 BC.
Each number $n$ (in a base $\beta$) consists of integers $e$ (the \emph{exponent}) and $m$ (the \emph{mantissa}) such that
\begin{equation}
n = \beta^{e} \times m_f
\end{equation}
Both $e$ and $m$ typically have a fixed width $q$ and $p$ respectively in base $\beta$. For notational convenience, we also
represent $m$ as a fraction $m_f = \beta^{-p} m$ so $|m_f| < 1$. We further call a floating point number
\emph{normalised} if the first digit of $m$ is nonzero. The exponent $e$ is usually allowed to be
either positive or negative (either by having a sign bit, or being offset a fixed amount). The value
of a floating point number $n$ must therefore be $-\beta^{e_max} < n < \beta^{e_max}$. The smallest
possible magnitude of a normalised floating point number is $\beta^{e_min-1}$, as $m_f$ must be at least $0.1$.
Non-normalised numbers may represent $0$, and many normalised floating-point include zero in some way, too.
The IEEE 754 standard\cite{ieee754std1985} defines several floating-point data types
which are used\footnote{Many systems implement the IEEE 754 standard's storage formats,
but do not implement arithmetic operations in accordance with this standard\cite{openglspec, ARBshaderprecision}.} by most
computer systems. The standard defines 32-bit (8-bit exponent, 23-bit mantissa, 1 sign bit) and
64-bit (11-bit exponent, 53-bit mantissa, 1 sign bit) formats\footnote{The 2008
revision to this standard\cite{ieee754std2008} adds some additional formats, but is
less widely supported in hardware.}, which can store approximately 7 and 15 decimal digits
of precision respectively. The IEEE 754 standard also introduces an implicit $1$ bit in the most
significant place of the mantissa when the exponent is not $e_min$.
Floating-point numbers behave quite differently to integers or fixed-point numbers, as
the representable numbers are not evenly distributed. Large numbers are stored to a lesser
precision than numbers close to zero. This can present problems in documents when zooming in
on objects far from the origin. Furthermore, due to the limited precision, and the different
``alignment'' of operands in arithmetic, several algebraic properties of rings we take for
granted in the integers do not exist, including the property of assosciativity\cite{taocp2}.
IEEE floating-point has some interesting features as well, including values for negative zero,
positive and negative infinity, the ``Not a Number'' (NaN) value and \emph{subnormal} values, which
trade precision for range when dealing with very small numbers by not normalising
numbers when the exponent is $e_min$. Indeed, with these values,
IEEE 754 floating-point equality does not form an equivalence relation, which can cause issues
when not considered carefully\cite{goldberg1991whatevery}.
There also exist formats for storing numbers with arbitrary precising and/or range.
Some programming languages support ``big integer''\cite{java_bigint} types which can
represent any integer that can fit in the system's memory. Similarly, there are
arbitrary-precision floating-point data types\cite{java_bigdecimal}\cite{boost_multiprecision}
which can represent any number of the form
\begin{equation}
\frac{n}{2^d} \; \; \; \; n,d \in \mathbb{Z} % This spacing is horrible, and I should be ashamed.
\end{equation}
These types are typically built from several native data types such as integers and floats,
paired with custom routines implementing arithmetic primitives.\cite{priest1991algorithms}
Operations on these types, therefore, are usually slower than those performed on native types.
Pairs of integers $(a \in \mathbb{Z},b \in \mathbb{Z}\setminus 0)$ can be used to represent rationals. This allows
values such as $\frac{1}{3}$ to be represented exactly, whereas in fixed or floating-point formats,
this would have a recurring representation:
\begin{equation}
\underbrace{0}_\text{integer part} . \underbrace{01}_\text{recurring part} 01 \; \; 01 \; \; 01 \dots
\end{equation}
Whereas with a rational type, this is simply $\frac{1}{3}$.
Rationals do not have a unique representation for each value, typically the reduced fraction is used
as a characteristic element.
While traditionally, GPUs have supported some approximation of IEEE 754's 32-bit floats,
modern graphics processors also support 16-bit\cite{nv_half_float} and 64-bit\cite{arb_gpu_shader_fp64}
IEEE floats, though some features of IEEE floats, like denormals and NaNs are not always supported.
Note, however, that some parts of the GPU are not able to use all formats,
so precision will likely be truncated at some point before display.
Higher precision numeric types can be implemented or used on the GPU, but are
slow.\cite{emmart2010high}
\section{Document Formats}
Most existing document formats --- such as the venerable PostScript and PDF --- are, however, designed to imitate
existing paper documents, largely to allow for easy printing. In order to truly take advantage of the possibilities operating in the digital
domain opens up to us, we must look to new formats.
Formats such as \texttt{HTML} allow for a greater scope of interactivity and for a more data-driven model, allowing
the content of the document to be explored in ways that perhaps the author had not anticipated.\cite{hayes2012pixels}
However, these data-driven formats typically do not support fixed layouts, and the display differs from renderer to
renderer.
\subsection{A Taxonomy of Document formats}
The process of creating and displaying a document is a rather universal one (\ref{documenttimeline}), though
different document formats approach it slightly differently. A document often begins as raw content: text and images
(be they raster or vector) and it must end up as a stream of photons flying towards the reader's eyes.
\begin{figure}
\label{documenttimeline}
\centering \includegraphics[width=0.8\linewidth]{figures/documenttimeline}
\caption{\label{doclifecycle}The lifecycle of a document}
\end{figure}
There are two fundamental stages (as shown in Figure \ref{doclifecycle}) by which all documents --- digital or otherwise --- are produced and displayed:
\emph{layout} and \emph{rendering}. The \emph{layout} stage is where the positions and sizes of text and other graphics are
determined. The text will be \emph{flowed} around graphics, the positions of individual glyphs will be placed, ensuring
that there is no undesired overlap and that everything will fit on the page or screen.
The \emph{display} stage actually produces the final output, whether as ink on paper or pixels on a computer monitor.
Each graphical element is rasterized and composited into a single image of the target resolution.
Different document formats cover documents in different stages of this project. Bitmapped images,
for example, would represent the output of the final stage of the process, whereas markup languages typically specify
a document which has not yet been processed, ready for the layout stage.
Furthermore, some document formats treat the document as a program, written in
a (usually Turing complete) document language with instructions which emit shapes to be displayed. These shapes are either displayed
immediately, as in PostScript, or stored in another file, such as with \TeX or \LaTeX, which emit a \texttt{DVI} file. Most other
forms of document use a \emph{Document Object Model}, being a list or tree of objects to be rendered. \texttt{DVI}, \texttt{PDF},
\texttt{HTML}\footnote{Some of these formats --- most notably \texttt{HTML} --- implement a scripting lanugage such as JavaScript,
which permit the DOM to be modified while the document is being viewed.} and SVG\cite{svg2011-1.1}. Of these, only \texttt{HTML} and \TeX typically
store documents in pre-layout stages, whereas even Turing complete document formats such as PostScript typically encode documents
which already have their elements placed.
\begin{description}
\item[\TeX \, and \LaTeX]
Donald Knuth's typesetting language \TeX \, is one of the older computer typesetting systems, originally conceived in 1977\cite{texdraft}.
It implements a Turing-complete language and is human-readable and writable, and is still popular
due to its excellent support for typesetting mathematics.
\TeX only implements the ``layout'' stage of document display, and produces a typeset file,
traditionally in \texttt{DVI} format, though modern implementations will often target \texttt{PDF} instead.
This document was prepared in \LaTeXe.
\item[DVI]
\TeX \, traditionally outputs to the \texttt{DVI} (``DeVice Independent'') format: a binary format which consists of a
simple stack machine with instructions for drawing glyphs and curves\cite{fuchs1982theformat}.
A \texttt{DVI} file is a representation of a document which has been typeset, and \texttt{DVI}
viewers will rasterize this for display or printing, or convert it to another similar format like PostScript
to be rasterized.
\item[HTML]
The Hypertext Markup Language (HTML)\cite{html2rfc} is the widely used document format which underpins the
world wide web. In order for web pages to adapt appropriately to different devices, the HTML format simply
defined semantic parts of a document, such as headings, phrases requiring emphasis, references to images or links
to other pages, leaving the \emph{layout} up to the browser, which would also rasterize the final document.
The HTML format has changed significantly since its introduction, and most of the layout and styling is now controlled
by a set of style sheets in the CSS\cite{css2spec} format.
\item[PostScript]
Much like DVI, PostScript\cite{plrm} is a stack-based format for drawing vector graphics, though unlike DVI (but like \TeX), PostScript is
text-based and Turing complete. PostScript was traditionally run on a control board in laser printers, rasterizing pages at high resolution
to be printed, though PostScript interpreters for desktop systems also exist, and are often used with printers which do not support PostScript natively.\cite{ghostscript}
PostScript programs typically embody documents which have been typeset, though as a Turing-complete language, some layout can be performed by the document.
\item[PDF]
Adobe's Portable Document Format (PDF)\cite{pdfref17} takes the PostScript rendering model, but does not implement a Turing-complete language.
Later versions of PDF also extend the PostScript rendering model to support translucent regions via Porter-Duff compositing\cite{porter1984compositing}.
PDF documents represent a particular layout, and must be rasterized before display.
\item[SVG]
Scalable Vector Graphics (SVG) is a vector graphics document format\cite{svg2011-1.1} which uses the Document Object Model. It consists of a tree of matrix transforms,
with objects such as vector paths (made up of B\'ezier curves) and text at the leaves.
\end{description}
\subsection{Precision in Document Formats}
Existing document formats --- typically due to having been designed for documents printed on paper, which of course has
limited size and resolution --- use numeric types which can only represent a fixed range and precision.
While this works fine with printed pages, users reading documents on computer screens using programs
with ``zoom'' functionality are prevented from working beyond a limited scale factor, lest artefacts appear due
to issues with numeric precision.
\TeX uses a $14.16$ bit fixed point type (implemented as a 32-bit integer type, with one sign bit and one bit used to detect overflow)\cite{beebe2007extending}.
This can represent values in the range $[-(2^{14}), 2^{14} - 1]$ with 16 binary digits of fractional precision.
The DVI files \TeX \, produces may use ``up to'' 32-bit signed integers\cite{fuchs1982theformat} to specify the document, but there is no requirement that
implementations support the full 32-bit type. It would be permissible, for example, to have a DVI viewer support only 24-bit signed integers, though many
files which require greater range may fail to render correctly.
PostScript\cite{plrm} supports two different numeric types: \emph{integers} and \emph{reals}, both of which are specified as strings. The interpreter's representation of numbers
is not exposed, though the representation of integers can be divined by a program by the use of bitwise operations. The PostScript specification lists some ``typical limits''
of numeric types, though the exact limits may differ from implementation to implementation. Integers typically must fall in the range $[-2^{31}, 2^{31} - 1]$,
and reals are listed to have largest and smallest values of $\pm10^{38}$, values closest to $0$ of $\pm10^{-38}$ and approximately $8$ decimal digits of precision,
derived from the IEEE 754 single-precision floating-point specification.
Similarly, the PDF specification\cite{pdfref17} stores \emph{integers} and \emph{reals} as strings, though in a more restricted format than PostScript.
The PDF specification gives limits for the internal representation of values. Integer limits have not changed from the PostScript specification, but numbers
representable with the \emph{real} type have been specified differently: the largest representable values are $\pm 3.403\times 10^{38}$, the smallest non-zero representable values are
$\pm1.175 \times 10^{-38}$ with approximately $5$ decimal digits of precision \emph{in the fractional part}.
\footnote{The PDF specification mistakenly leaves out the negative in the exponent here.}
Adobe's implementation of PDF uses both IEEE 754 single precision floating-point numbers and (for some calculations, and in previous versions) 16.16 bit fixed-point values.
The SVG specification\cite{svg2011-1.1} specifies numbers as strings with a decimal representation of the number.
It is stated that a ``Conforming SVG Viewer'' must have ``all visual rendering accurate to within one device pixel to the mathematically correct result at the initial 1:1
zoom ratio'' and that ``it is suggested that viewers attempt to keep a high degree of accuracy when zooming.''
A ``Conforming High-Quality SVG Viewer'' must use ``double-precision floating point\footnote{Presumably the 64-bit IEEE 754 ``double'' type.}'' for computations involving
coordinate system transformations.
\section{Quadtrees}
When viewing or processing a small part of a large document, it may be helpful to
only process --- or \emph{cull} --- parts of the document which are not on-screen.
\begin{figure}[h]
\centering \includegraphics[width=0.4\linewidth]{figures/quadtree_example}
\caption{A simple quadtree.}
\end{figure}
The quadtree\cite{finkel1974quad}is a data structure --- one of a family of \emph{spatial}
data structures --- which recursively breaks down space into smaller subregions
which can be processed independently. Points (or other objects) are added to a single
node which (if certain criteria are met) is split into four equal-sized subregions, and
points attached to the region which contains them.
Quadtrees have been used in computer graphics for both culling --- excluding objects in
nodes which are not visible --- and ``level of detail'', where different levels of the quadtree store
different quality versions of objects or data\cite{zerbst2004game}.
Typically the number of points in a node
exceeding a maximum triggers this split, though in our case likely the level of precision required exceeding
that supported by the data type in use.
In this project, we will be experimenting with a form of quadtree in which each
node has its own independent coordinate system, allowing us to store some spatial
information\footnote{One bit per-coordinate, per-level of the quadtree} within the
quadtree structure, eliminating redundancy in the coordinates of nearby objects.
Other spatial data structures exist, such as the KD-tree\cite{bentley1975multidimensional},
which partitions the space on any axis-aligned line; or the BSP tree\cite{fuchs1980onvisible},
which splits along an arbitrary line which need not be axis aligned. We believe, however,
that the simpler conversion from binary coordinates to the quadtree's binary split make
it a better avenue for initial research to explore.
\chapter{\texttt{IPDF}: A Document Precision Playground}
In order to investigate the precision issues present in document formats and viewers,
we developed a document viewer \texttt{IPDF}. \texttt{IPDF} is a \texttt{C++} program
which supports rendering TrueType font outlines and a subset of the \texttt{SVG}
format.
At its core, \texttt{IPDF} breaks documents down into a set of \emph{objects}, each
with rectangular bounds $(x, y, w, h)$ and with several \emph{types}:
\begin{enumerate}
\item \textbf{Rectangle:} A simple, axis-aligned rectangle (either filled with a solid colour or
left as an outline) covering the object bounds.
\item \textbf{Ellipse:} A filled, axis-aligned ellipse, rendered parametrically.
\item \textbf{B\'ezier Curve:} A cubic B\'ezier curve. B\'ezier curve
control points are stored relative to the coordinate system provided by the
document bounds, and several objects can share these coefficients.
\end{enumerate}
\texttt{IPDF} can be compiled using different number representations, to allow for
comparison between not only different-precision floating point numbers, but also
arbitrary-precision types such as rationals.
Documents in \texttt{IPDF} may be rendered either on the CPU, using a custom software
rasterizer built on the numeric types supported, or on the GPU with OpenGL 3.2 shaders, using
the GPU's default numeric representation.
Furthermore, \texttt{IPDF} allows \texttt{SVG} files (and text rendered with TrueType fonts) to be inserted at any point and scale in
the document, allowing for the same images to be compared at different scales.
\section{GPU Floating Point Rounding Behaviour}
While the IEEE-754 standard specifies both the format of floating-point numbers and the operations they perform.
However, the OpenGL specification\cite{openglspec}, while requiring the same basic format for single-precision floats,
does not specify the behaviour of denormals, nor requires any support for NaN or infinite values. Similarly, no support
for floating point exceptions is required, with the note that no operation should ever halt the GPU.
However, an extension to the specification, \texttt{GL\_ARB\_shader\_precision}\cite{ARBshaderprecision} in 2010
allows programs to require stricter precision requirements. Notably, support for infinite values is required and
maximum relative error in ULPs.
Different GPU vendors and drivers have different rounding behaviour, and most hardware (both CPU and GPU) provide
ways of disabling support for some IEEE features to improve performance\cite{intelgpuspec, dawson2012notnormal}.
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{figures/circles_gpu_vs_cpu/comparison.pdf}
\caption{\label{fig:circles}The edges of a unit circle viewed through bounds (x,y,w,h) = (0.0869386,0.634194,2.63295e-07,2.63295e-07)}
\end{figure}
Many GPUs now support double-precision floating-point numbers\footnote{Some drivers, however, do not yet support this feature, including one of our test machines.}
as specified in the \texttt{GL\_ARB\_gpu\_shader\_fp64}\cite{arb_gpu_shader_fp64} OpenGL extension.
However, many of the fixed-function parts of the GPU do not support double-precision floats,
making it impractical to use them.
To see the issues, we rendered the edge of a circle (calculated by discarding pixels with $x^2 + y^2 > 1$) on several GPUs, as well as an \texttt{x86-64} CPU,
as seen in \autoref{fig:circles}.
Of these, the nVidia GPU came closest to the CPU rendering, whereas Intel's hardware clearly performs some optimisation which produces quite different artefacts.
The diagonal distortion in the AMD rendering may be a result of different rounding across the two triangles across which the circle was rendered.
\section{Distortion and Quantisation at the limits of precision}
TODO: Fix these figures, explain properly.
\begin{figure}[h]
\begin{subfigure}{.5\linewidth}
\includegraphics[width=\linewidth]{cpu0}
\subcaption{Intended rendering}
\end{subfigure}
\begin{subfigure}{.5\linewidth}
\includegraphics[width=\linewidth]{cpu100}
\subcaption{With artefacts}
\end{subfigure}
\caption{\label{fig:precisionerrors}Floating point errors affecting the rendering of an image}
\end{figure}
When trying to insert fine detail into a document using fixed-width floats, some precision is lost. In particular, the
control points of the curves making up the image get rounded to the nearest\footnote{Other rounding modes are available,
but they all suffer from similar artefacts.} representable point. \autoref{fig:precisionerrors} shows what happens when an
image is small enough to be affected by this quantisation. The grid structure becomes very apparent.
These artefacts also become prevalent when an object is far from the origin, as more bits would be required to store the
position, so the lower order bits of the position must be discarded. As the position of the viewport and the object will share
many of the same initial digits, catastrophic cancellation\cite{goldberg1991whatevery} can occur.
%\subsection{Big Integers}
%Computers typically operate on \emph{fixed-width} integers\footnote{The term ``machine word'' has traditionally been used
%for a CPUs default numeric type, though for historical reasons this is sometimes also used to refer to a 16-bit type, regardless
%of the preferred type of the underlying machine.} for which they have hardware adders.
%These types are represented by $n$ bits (typically $n$ is 32 or 64), and represents the
%ring $\mathbb{Z}_n$: i.e.\ any number in the range $[0,2^{n})$ can be represented, and operations
%are performed $\mod 2^n$.
\chapter{View Reparenting and the Quadtree}
One of the important parts of document rendering is that of coordinate transforms.
Traditionally, the document exists in its own coordinate system, $\mathbf{b}$.
We then define a coordinate system $\mathbf{v}$ to represent the view.
To eliminate visible error due to floating point precision, we need to ensure that
any point (including control points) must be representable as a float both in the
document and in view coordinates, i.e:
\begin{enumerate}
\item have a limited magnitude,
\item be precise enough to uniquely identify a pixel on the display.
\end{enumerate}
Despite a point not being representable with floats in one coordinate system,
it may be representable in another. We can therefore split a document up into
several coordinate systems, such that each point is completely representable.
Similarly, the points making up the visible document need to all be representable
(to at least one pixel's worth of precision). To achieve this, we use a quadtree where
each node stores points within its bounds in its own coordinate system. Objects
which span multiple nodes are clipped, such that no points\footnote{except some
control points} lie outside the quadtree node.
Setting aside the possibility that an object might span multiple nodes for the time
being, let's investigate how the coordinates of a point are affected by placing it
in the quadtree.
Suppose $p = (p_x, p_y)$ is a point in a global document coordinate system, which we'll
consider the root node of our quadtree. $p_x$ and $p_y$ are represented by a finite
list of binary digits $x_0 \dots x_n$, $y_0 \dots y_n$.
Consider now the pair $(x_0, y_0)$:
\begin{align*}
x_0 &= \begin{cases}
0 & \text{if }p_x\text{ is in the left half of }d\\
1 & \text{if }p_x\text{ is in the right half of }d
\end{cases} \\
y_0 &= \begin{cases}
0 & \text{if }p_y\text{ is in the bottom half of }d\\
1 & \text{if }p_y\text{ is in the top half of }d
\end{cases}
\end{align*}
We have therefore found which child node of our quadtree contains $p$. The coordinates
of $p$ within that node are $(x_1 \dots x_n, y_1 \dots y_n)$. By the principle of mathematical
induction, we can repeat the process to move more bits from the coordinates of $p$ into the
structure of the quadtree, until the remaining coordinates may be precisely represented.
This implies that the quadtree is equivalent to an arbitrary precision integer datatype. By reversing
the process, any point representable in the quadtree can be stored as a fixed-length bit string.
%It is natural to treat these coordinates as being between $0$ and $1$.
Furthermore, we can get approximations of points by inserting them into higher levels of the quadtree,
with their coordinates rounded. By then viewing the document at a certain level of the quadtree, we
then not only do not need to consider bits of precision which are not required to represent the point to
pixel resolution, we can also cull objects outside the visible nodes at that level.
Indeed, we can guarantee that, by selecting the level of the quadtree we view such that
the view width and height lie within the range $(0.5,1]$, we can ensure that at most
four nodes need to be rendered.
\section{Clipping cubic B\'eziers}
In order to ensure that the quadtree maintains precision in this fashion, we need to ensure
that all objects are entirely contained within a quadtree node. This is not always the case and,
indeed, when zooming in on any object, eventually the object will span multiple quadtree nodes.
{\bf TODO: We can show this by showing that, given $(a,b) \in \mathbb{R}, \exists $ binary fraction $c$ such that $a < c < b$.}
We therefore need a way to subdivide objects into several objects, each of which is contained within one node.
To do this, we need to \emph{clip} the cubic B\'ezier curves from which our document is formed to a rectangle.
Cubic B\'ezier curves are defined parametrically as two cubics: $x(t)$ and $y(t)$, with $0 \leq t \leq 1$.
We clip these to a rectangular bounding box in stages. We first find the intersections of the curve with
the clipping rectangle by finding the roots\footnote{While there is a method for solving cubics exactly\cite{cardano1545artis},
we instead use numeric methods to avoid the need for square root operations, which cannot be done exactly
on some of the numeric types we used.}
of the cubic shifted to match the corners of the rectangle. This produces some spurious points, as it assumes the edges
of the clipping rectangle are lines with infinite extent, but this at worst introduces some minor inefficiency in the process and does
not affect the result.
Once the values of the parameter $t$ which intersect the clipping rectangle have
been determined, the curve is split into segments. To do this, the values $0$ and $1$
(representing the endpoints) are added to the list of intersecting $t$ values, which is then
sorted. Adjacent values $t_0$ and $t_1$ form a segment. The midpoint of that segment (with the
value $\frac{t_0 + t_1}{2}$) is evaluate and if it falls
outside the clipping rectangle, the segment is discarded.
Finally, the segments are re-parametrised by subdividing the curve using De Casteljau's
algorithm\cite{sederberg2007cad}. By re-parameterising the curves such that $0 \leq t \leq 1$,
we ensure that the first and last coefficients have the endpoints' coodinates, and therefore
lie in the quadtree node.
{\bf TODO: Prove that the other control points' magnitude is reduced, and try to quantify it, prove that
it will never overflow.}
\section{Implementation Details}
\begin{itemize}
\item Store object ID ranges.
\item Pointers to children and parent.
\item Linked-list of ``overlay'' nodes for mutation.
\item Have billions of bugs.
\end{itemize}
\chapter{Experimental Results}
These are all iPython-y at the moment.
Roughly 3s/frame for GMP rationals, 16ms for Quadtree which is still slightly broken.
\section{Performance per object}
\section{Performance per onscreen object}
\section{Performance per zoom-level}
\section{Stability of performance}
\chapter{Further Work and Conclusion}
The use of spatial data structures to specify documents with arbitrary precision
has been validated as a viable alternative to the use of arbitrary precision numeric
types where an arbitrary (rather than infinite) amount of precision is needed.
Once constructed, they are faster in many circumstanced, and the structure
can also be used for culling. When the viewport moves and scales smoothly, the cost
of constructing new nodes is amortised over the movement.
Unfortunately, the mutation of the quadtree is difficult and slow, and discontinuous
movement can result in a large number of nodes needing to be created.
Quadtree seems to be viable and is really performant.
Loop-blinn shading.
\bibliography{papers}
\end{document}