\documentclass[a4paper,10pt]{article}
\usepackage[utf8]{inputenc}
\usepackage{hyperref}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{amssymb}

%opening
\title{Literature Review}

could be passed on from person to person without them ever meeting.
And thus the document was born.

Traditionally, documents have been static: just marks on rock, parchment or paper. With the advent of computers, however, many more possibilities open up.

\section{Rendering}
Computer graphics comes in two forms: bit-mapped (or raster) graphics, defined by an array of pixel colours,
and \emph{vector} graphics, defined by mathematical descriptions of objects. Bit-mapped graphics are well suited to photographs
and match how cameras, printers and monitors work.

\begin{figure}[h]
	\centering \includegraphics[width=0.8\linewidth]{figures/vectorraster_example}
	\caption{A circle as a vector image and a $32 \times 32$ pixel raster image}
\end{figure}

However, bitmapped images cannot be zoomed beyond their ``native'' resolution (the resolution where one document pixel maps
to one display pixel) without exhibiting an artefact called pixelation, where the pixel structure becomes evident. Attempts to use
interpolation to hide this effect are never entirely successful, and sharp edges, such as those found in text and diagrams, are particularly affected.

Vector graphics avoid many of these problems: the representation is independent of the output resolution, being instead
an abstract description of what is to be rendered, typically as a combination of simple geometric shapes like lines,
arcs and glyphs.

As existing displays (and printers) are bit-mapped devices, vector documents must be \emph{rasterized} into a bitmap at
a given resolution before being displayed or printed. The resulting bitmap is an approximation of the vector image
at that resolution.

% Project specific line
This project will be based around vector graphics, as the properties above make them better suited to experimenting with zoom
quality.

\subsection{Rasterizing Vector Graphics}

Before a vector document can be rasterized, the co-ordinates of its shapes must
be transformed into \emph{screen space} or \emph{viewport space}\cite{blinn1992trip}.
On a typical display, the resulting screen-space coordinates require very little precision or range;
the transformation itself, however, must take care to ensure that precision is not lost along the way.

After this transformation, the image is decomposed into its separate shapes, which are rasterized
and then composited together.
Most graphics formats support Porter-Duff compositing\cite{porter1984compositing}.
Porter-Duff compositing gives each element (typically a pixel) a ``coverage'' value,
denoted $\alpha$, which represents the contribution of that element to the final scene.
Completely transparent elements have an $\alpha$ value of $0$, and completely opaque
elements have an $\alpha$ of $1$. This permits arbitrary shapes to be layered over one another
in the raster domain, while retaining soft edges.
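The most commonly used of the Porter-Duff operators is ``over'', which places one element on top of another. A minimal sketch of it in C, assuming premultiplied RGBA pixels with each channel stored as a float in $[0,1]$ (the type and function names here are illustrative, not taken from any particular library):

\begin{verbatim}
/* One pixel, with premultiplied alpha: r, g and b have
 * already been multiplied by the coverage value a.     */
typedef struct { float r, g, b, a; } Pixel;

/* Porter-Duff "over": composite src on top of dst. With
 * premultiplied alpha, every channel (including alpha itself)
 * uses the same equation: out = src + (1 - src.a) * dst.      */
Pixel over(Pixel src, Pixel dst)
{
    float k = 1.0f - src.a;
    Pixel out = { src.r + k * dst.r, src.g + k * dst.g,
                  src.b + k * dst.b, src.a + k * dst.a };
    return out;
}
\end{verbatim}

The remaining operators defined by Porter and Duff (``in'', ``out'', ``atop'', ``xor'' and ``plus'') differ only in the factors applied to the two operands.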
The rasterization process may then proceed on one object (or shape) at a time. There are specialised algorithms for rasterizing
the different kinds of shape.

\begin{description}
	\item[Line Segment]
	Straight lines between two points are easily rasterized using Bresenham's algorithm\cite{bresenham1965algorithm}.
	Bresenham's algorithm draws a pixel for every point along the \emph{long} axis of the line, moving along the short
	axis when the accumulated error exceeds half a pixel (a sketch of the algorithm follows this list).

	Bresenham's algorithm only operates on lines whose endpoints lie on integer pixel coordinates. Because of this, line ``clipping''
	may be performed to find endpoints of the line segment such that the entire line will be on-screen. However, if line clipping is
	performed na\"ively, without also setting the error accumulator correctly, the line's slope will be altered slightly, becoming dependent
	on the viewport.

	\item[B\'ezier Curve]
	A B\'ezier curve is a smooth (i.e.\ infinitely differentiable) curve between two points, represented by a Bernstein polynomial
	whose coefficients are known as the ``control points.''

	B\'ezier curves are typically rasterized using De Casteljau's algorithm\cite{foley1996computer}, also sketched below.
	Line segments are first-order B\'ezier curves.

	\item[B\'ezier Spline]
	A spline of order $n$ is a $C^{n-1}$ smooth continuous piecewise function composed of polynomials of degree $\leq n$.
	In a B\'ezier spline, these polynomials are Bernstein polynomials, hence the spline is a curve made by joining B\'ezier curves
	end-to-end (in a manner which preserves some level of smoothness).

	Many vector graphics formats call B\'ezier splines of a given order (typically quadratic or cubic) ``paths'' and treat them as the
	fundamental type from which shapes are formed.
\end{description}

%There are special algorithms
%for rendering lines\cite{bresenham1965algorithm}, triangles\cite{giesen2013triangle}, polygons\cite{pineda1988parallel} and B\'ezier
%Curves\cite{goldman_thefractal}.
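The integer-only inner loop that makes Bresenham's algorithm fast is easy to see in code. A sketch in C for the first octant ($x_0 \leq x_1$ and a gradient between $0$ and $1$; the other octants follow by symmetry), where \texttt{set\_pixel} stands in for whatever framebuffer write is being used:

\begin{verbatim}
void set_pixel(int x, int y);  /* framebuffer write, defined elsewhere */

/* Bresenham line for the first octant. The decision variable err
 * tracks the distance from the ideal line, scaled by 2*dx so that
 * the loop needs nothing but integer addition and comparison.    */
void bresenham(int x0, int y0, int x1, int y1)
{
    int dx = x1 - x0, dy = y1 - y0;
    int err = 2 * dy - dx;
    for (int x = x0, y = y0; x <= x1; ++x) {
        set_pixel(x, y);
        if (err > 0) {          /* error passed half a pixel */
            ++y;
            err -= 2 * dx;
        }
        err += 2 * dy;
    }
}
\end{verbatim}

De Casteljau's algorithm evaluates a point on a B\'ezier curve by repeated linear interpolation between adjacent control points; a rasterizer can then approximate the curve by evaluating it at many parameter values (or by recursive subdivision) and joining the results with line segments. A sketch for the cubic case:

\begin{verbatim}
typedef struct { double x, y; } Point;

static Point lerp(Point a, Point b, double t)
{
    Point p = { a.x + t * (b.x - a.x), a.y + t * (b.y - a.y) };
    return p;
}

/* De Casteljau: evaluate a cubic Bezier with control points c[0..3]
 * at parameter t in [0,1] by three rounds of interpolation.       */
Point bezier3(const Point c[4], double t)
{
    Point ab  = lerp(c[0], c[1], t);
    Point bc  = lerp(c[1], c[2], t);
    Point cd  = lerp(c[2], c[3], t);
    Point abc = lerp(ab, bc, t);
    Point bcd = lerp(bc, cd, t);
    return lerp(abc, bcd, t);
}
\end{verbatim}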
\subsection{GPU Rendering}
While rasterization was traditionally done entirely in software, modern computers and mobile devices have hardware support for rasterizing
lines and triangles, designed for use in rendering 3D scenes. This hardware is usually programmed with an
API like \texttt{OpenGL}\cite{openglspec}.

More complex shapes like B\'ezier curves can be rendered by combining the use of bitmapped textures (possibly using signed-distance
fields\cite{leymarie1992fast}\cite{frisken2000adaptively}\cite{green2007improved}) stretched over a triangle mesh
approximating the curve's shape\cite{loop2005resolution}\cite{loop2007rendering}; a sketch of the signed-distance idea follows the list below.

Indeed, there are several implementations of entire vector graphics systems using OpenGL:
\begin{itemize}
	\item the OpenVG standard\cite{robart2009openvg} has been implemented on top of OpenGL ES\cite{oh2007implementation};
	\item the Cairo\cite{worth2003xr} library, based around the PostScript/PDF rendering model, has the ``Glitz'' OpenGL backend\cite{nilsson2004glitz};
	\item and the SVG/PostScript GPU renderer by nVidia\cite{kilgard2012gpu} is available as an OpenGL extension\cite{kilgard300programming}.
\end{itemize}
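To illustrate the signed-distance idea in isolation: the distance field of a circle can be evaluated directly, and clamping the distance over the width of one pixel yields an anti-aliased coverage value suitable for the Porter-Duff compositing described earlier. A CPU sketch in C (a real implementation would evaluate this, or a texture approximating it, per fragment on the GPU; the names are illustrative):

\begin{verbatim}
#include <math.h>

/* Signed distance from the point (px,py) to a circle of radius r
 * centred at (cx,cy): negative inside, positive outside.         */
double circle_sdf(double px, double py, double cx, double cy, double r)
{
    return hypot(px - cx, py - cy) - r;
}

/* Map a distance in pixels to a coverage value in [0,1],
 * fading over one pixel's width to soften the edge.      */
double coverage(double dist)
{
    double a = 0.5 - dist;
    if (a < 0.0) return 0.0;
    if (a > 1.0) return 1.0;
    return a;
}
\end{verbatim}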
\section{Numeric formats}

On modern computer architectures, two basic number formats are supported:
fixed-width integers and \emph{floating-point} numbers. Typically, computers
natively support integers of up to 64 bits, capable of representing all integers
between $0$ and $2^{64} - 1$, inclusive\footnote{Most machines also support \emph{signed} integers,
which have the same cardinality as their \emph{unsigned} counterparts, but which
represent integers in the range $[-(2^{63}), 2^{63} - 1]$.}.

By introducing a fractional component (analogous to a decimal point), we can convert
integers to \emph{fixed-point} numbers, which have a more limited range, but a fixed, greater
precision. For example, a number in 4.4 fixed-point format would have four bits representing the integer
component, and four bits representing the fractional component:
\begin{equation}
	\underbrace{0101}_\text{integer component}.\underbrace{1100}_\text{fractional component} = 5.75
\end{equation}
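A minimal sketch of 4.4 fixed-point arithmetic in C, assuming the numbers are kept in an unsigned 8-bit container (the type and function names are invented for illustration). Addition is ordinary integer addition; multiplication needs a wider intermediate and a shift, which is where fractional precision is discarded:

\begin{verbatim}
#include <stdint.h>

typedef uint8_t fix44;        /* 4 integer bits . 4 fractional bits */

fix44  fix_from_int(int n)    { return (fix44)(n << 4); }
double fix_to_double(fix44 f) { return f / 16.0; }

fix44 fix_add(fix44 a, fix44 b) { return (fix44)(a + b); }

/* The product has 8 fractional bits, so shift 4 of them away. */
fix44 fix_mul(fix44 a, fix44 b)
{
    return (fix44)(((uint16_t)a * b) >> 4);
}
\end{verbatim}

In this encoding, the example above is simply the byte \texttt{0x5C}: $5$ in the integer bits and $12/16 = 0.75$ in the fractional bits.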
Floating-point numbers\cite{goldberg1992thedesign} are the binary equivalent of scientific notation:
each number consists of an exponent ($e$) and a mantissa ($m$), with the value of the number given by
\begin{equation}
	n = 2^{e} \times m
\end{equation}

The IEEE 754 standard\cite{ieee754std1985} defines several floating-point data types
which are used\footnote{Many systems implement the IEEE 754 standard's storage formats,
but do not implement arithmetic operations in accordance with this standard.} by most
computer systems. The standard defines 32-bit (8-bit exponent, 23-bit mantissa, 1 sign bit) and
64-bit (11-bit exponent, 52-bit mantissa, 1 sign bit) formats\footnote{The 2008
revision to this standard\cite{ieee754std2008} adds some additional formats, but is
less widely supported in hardware.}, which can store approximately 7 and 15 decimal digits
of precision respectively.

Floating-point numbers behave quite differently to integers or fixed-point numbers, as
the representable numbers are not evenly distributed. Large numbers are stored to a lesser
precision than numbers close to zero, as demonstrated below. This can present problems in documents when zooming in
on objects far from the origin.

IEEE floating-point has some interesting features as well, including values for negative zero,
positive and negative infinity, the ``Not a Number'' (NaN) value and \emph{denormal} values, which
trade precision for range when dealing with very small numbers. Indeed, with these values,
IEEE 754 floating-point equality does not form an equivalence relation, which can cause issues
when not considered carefully.\cite{goldberg1991whatevery}
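The uneven distribution is easy to demonstrate: the standard C function \texttt{nextafterf} returns the adjacent representable value, so the gap between neighbouring floats (the precision available at that magnitude) can be printed directly. A short C sketch, with the single-precision gaps noted in the comments:

\begin{verbatim}
#include <math.h>
#include <stdio.h>

/* Print the gap between x and the next representable float above it. */
static void gap(float x)
{
    printf("gap at %12g = %g\n", x, nextafterf(x, INFINITY) - x);
}

int main(void)
{
    gap(1.0f);        /* ~1.2e-07: ample precision near one    */
    gap(1048576.0f);  /* 0.125: only eighths of a unit at 2^20 */
    gap(1.0e30f);     /* ~7.6e+22: enormous gaps far from zero */
    return 0;
}
\end{verbatim}

An object at a coordinate of $2^{20}$ document units can thus be positioned only to the nearest eighth of a unit in single precision, however far the user zooms in.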
There also exist formats for storing numbers with arbitrary precision and/or range.
Some programming languages support ``big integer''\cite{java_bigint} types which can
represent any integer that can fit in the system's memory. Similarly, there are
arbitrary-precision floating-point data types\cite{java_bigdecimal}\cite{boost_multiprecision}
which can represent any number of the form
\begin{equation}
	\frac{n}{2^d} \qquad n, d \in \mathbb{Z}
\end{equation}
These types are typically built from several native data types such as integers and floats,
paired with custom routines implementing arithmetic primitives.\cite{priest1991algorithms}
They are therefore likely to be slower than the native types they are built upon.

Pairs of integers $(a \in \mathbb{Z}, b \in \mathbb{Z}\setminus\{0\})$ can be used to represent rationals. This allows
values such as $\frac{1}{3}$ to be represented exactly, whereas in fixed or floating-point formats
this value would have a recurring representation:
\begin{equation}
	\underbrace{0}_\text{integer part} . \underbrace{01}_\text{recurring part} 01 \; \; 01 \; \; 01 \dots
\end{equation}
Rationals do not have a unique representation for each value; typically the reduced fraction is used
as a canonical representative.

While GPUs have traditionally supported some approximation of IEEE 754's 32-bit floats,
modern graphics processors also support 16-bit\cite{nv_half_float} and 64-bit\cite{arb_gpu_shader_fp64}
IEEE floats, though some features of IEEE floats, like denormals and NaNs, are not always supported.
Note, however, that some parts of the GPU are only able to use some formats,
so precision will likely be truncated at some point before display.
Higher precision numeric types can be implemented or used on the GPU, but are
slow.\cite{emmart2010high}

\section{Document Formats}
Most existing document formats --- such as the venerable PostScript and PDF --- are designed to imitate
existing paper documents, largely to allow for easy printing. In order to truly take advantage of the possibilities operating in the digital
domain opens up to us, we must look to new formats.

Formats such as \texttt{HTML} allow for a greater scope of interactivity and for a more data-driven model, allowing
the content of the document to be explored in ways that perhaps the author had not anticipated.\cite{hayes2012pixels}
However, these data-driven formats typically do not support fixed layouts, and the display differs from renderer to
renderer.

\subsection{A Taxonomy of Document Formats}

The process of creating and displaying a document is a rather universal one (Figure~\ref{documenttimeline}), though
different document formats approach it slightly differently. A document often begins as raw content: text and images
(be they raster or vector), and it must end up as a stream of photons flying towards the reader's eyes.

\begin{figure}
	\centering \includegraphics[width=0.8\linewidth]{figures/documenttimeline}
	\caption{The lifecycle of a document}
	\label{documenttimeline}
\end{figure}

There are two fundamental stages by which all documents --- digital or otherwise --- are produced and displayed:
\emph{layout} and \emph{rendering}. The \emph{layout} stage is where the positions and sizes of text and other graphics are
determined: the text is \emph{flowed} around graphics and the positions of individual glyphs are fixed, ensuring
that there is no undesired overlap and that everything will fit on the page or screen.

The \emph{rendering} stage actually produces the final output, whether as ink on paper or pixels on a computer monitor.
Each graphical element is rasterized and composited into a single image of the target resolution.

Different document formats capture documents at different stages of this process. Bitmapped images,
for example, represent the output of the final stage of the process, whereas markup languages typically specify
a document which has not yet been processed, ready for the layout stage.

Furthermore, some document formats treat the document as a program, written in
a (usually Turing-complete) document language with instructions which emit shapes to be displayed. These shapes are either displayed
immediately, as in PostScript, or stored in another file, as with \TeX\ or \LaTeX, which emit a \texttt{DVI} file. Most other
forms of document use a \emph{Document Object Model}: a list or tree of objects to be rendered (a toy sketch of such a
representation follows the list below). \texttt{DVI}, \texttt{PDF},
\texttt{HTML}\footnote{Some of these formats --- most notably \texttt{HTML} --- implement a scripting language such as JavaScript,
which permits the DOM to be modified while the document is being viewed.} and SVG\cite{svg2011-1.1} take this approach. Of these, only \texttt{HTML} and \TeX\ typically
store documents in pre-layout stages, whereas even Turing-complete document formats such as PostScript typically encode documents
whose elements have already been placed.

\begin{description}
	\item[\TeX\ and \LaTeX]
	Donald Knuth's typesetting language \TeX\ is one of the older computer typesetting systems, originally conceived in 1977\cite{texdraft}.
	It implements a Turing-complete language, is human-readable and writable, and is still popular
	due to its excellent support for typesetting mathematics.
	\TeX\ only implements the ``layout'' stage of document display, and produces a typeset file,
	traditionally in \texttt{DVI} format, though modern implementations will often target \texttt{PDF} instead.

	This document was prepared in \LaTeXe.

	\item[DVI]
	\TeX\ traditionally outputs to the \texttt{DVI} (``DeVice Independent'') format: a binary format encoding a
	simple stack machine, with instructions for drawing glyphs and curves\cite{fuchs1982theformat}.

	A \texttt{DVI} file is a representation of a document which has already been typeset, and \texttt{DVI}
	viewers will rasterize this for display or printing, or convert it to another similar format, like PostScript,
	to be rasterized.

	\item[HTML]
	The Hypertext Markup Language (HTML)\cite{html2rfc} is the widely used document format which underpins the
	World Wide Web. In order for web pages to adapt appropriately to different devices, the HTML format simply
	defines the semantic parts of a document, such as headings, phrases requiring emphasis, references to images or links
	to other pages, leaving the \emph{layout} up to the browser, which also rasterizes the final document.

	The HTML format has changed significantly since its introduction, and most of the layout and styling is now controlled
	by a set of style sheets in the CSS\cite{css2spec} format.

	\item[PostScript]
	Much like DVI, PostScript\cite{plrm} is a stack-based format for drawing vector graphics, though unlike DVI (but like \TeX) PostScript is
	text-based and Turing complete. PostScript was traditionally run on a control board in laser printers, rasterizing pages at high resolution
	to be printed, though PostScript interpreters for desktop systems also exist, and are often used with printers which do not support PostScript natively.\cite{ghostscript}

	PostScript programs typically embody documents which have already been typeset, though as a Turing-complete language, some layout can be performed by the document itself.

	\item[PDF]
	Adobe's Portable Document Format (PDF)\cite{pdfref17} takes the PostScript rendering model, but does not implement a Turing-complete language.
	Later versions of PDF also extend the PostScript rendering model to support translucent regions via Porter-Duff compositing\cite{porter1984compositing}.

	PDF documents represent a particular layout, and must be rasterized before display.

	\item[SVG]
	Scalable Vector Graphics (SVG) is a vector graphics document format\cite{svg2011-1.1} which uses the Document Object Model. It consists of a tree of matrix transforms,
	with objects such as vector paths (made up of B\'ezier curves) and text at the leaves.

\end{description}
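To make the Document Object Model concrete, here is a toy sketch in C of such a representation (the names and fields are invented for illustration; real formats carry far more state). Rendering is a recursive walk over a tree of objects, rather than the execution of a program:

\begin{verbatim}
/* A toy DOM: a tree of nodes, each carrying a transform
 * and (for leaf nodes) a shape to draw.                 */
typedef struct Node {
    double transform[6];            /* 2x3 affine matrix        */
    enum { GROUP, PATH, TEXT } kind;
    const void *shape;              /* path or text data        */
    struct Node **children;         /* non-empty for GROUP only */
    int n_children;
} Node;

void render(const Node *n)
{
    /* ...compose n->transform onto the current transform,
     * rasterize n->shape if this is a leaf node...       */
    for (int i = 0; i < n->n_children; ++i)
        render(n->children[i]);
}
\end{verbatim}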
\subsection{Precision in Document Formats}

Existing document formats --- typically due to having been designed for documents printed on paper, which of course has
limited size and resolution --- use numeric types which can only represent a fixed range and precision.
While this works fine with printed pages, users reading documents on computer screens using programs
with ``zoom'' functionality are prevented from working beyond a limited scale factor, lest artefacts appear due
to issues with numeric precision.

\TeX\ uses a $14.16$ bit fixed-point type (implemented as a 32-bit integer type, with one sign bit and one bit used to detect overflow)\cite{beebe2007extending}.
This can represent values in the range $[-(2^{14}), 2^{14} - 1]$ with 16 binary digits of fractional precision.

The DVI files \TeX\ produces may use ``up to'' 32-bit signed integers\cite{fuchs1982theformat} to specify the document, but there is no requirement that
implementations support the full 32-bit type. It would be permissible, for example, for a DVI viewer to support only 24-bit signed integers, though
files which require the greater range may then fail to render correctly.
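The practical consequence of this fixed-point type is a hard limit on the dimensions a document can express. A short C calculation, assuming the conventional encoding of \TeX\ dimensions as integer counts of ``scaled points'' ($2^{16}$ scaled points to the printer's point, with magnitudes capped at $2^{30} - 1$, matching the $14.16$ type described above):

\begin{verbatim}
#include <stdio.h>

int main(void)
{
    /* TeX dimensions count "scaled points": 65536 sp = 1 pt,
     * with magnitude limited to 2^30 - 1 sp.                 */
    long   max_sp = (1L << 30) - 1;
    double max_pt = max_sp / 65536.0;    /* ~16383.99998 pt   */
    double max_m  = max_pt * 0.35146e-3; /* 1 pt ~ 0.35146 mm */
    printf("largest TeX dimension: %.5f pt (about %.2f m)\n",
           max_pt, max_m);
    return 0;
}
\end{verbatim}

No distance longer than roughly $5.76$ metres can be expressed, regardless of how the document might later be scaled.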
PostScript\cite{plrm} supports two different numeric types: \emph{integers} and \emph{reals}, both of which are specified as strings. The interpreter's representation of numbers
is not exposed, though the representation of integers can be divined by a program through the use of bitwise operations. The PostScript specification lists some ``typical limits''
of numeric types, though the exact limits may differ from implementation to implementation. Integers typically must fall in the range $[-2^{31}, 2^{31} - 1]$,
and reals are listed as having largest and smallest values of $\pm10^{38}$, values closest to $0$ of $\pm10^{-38}$ and approximately $8$ decimal digits of precision,
derived from the IEEE 754 single-precision floating-point specification.

Similarly, the PDF specification\cite{pdfref17} stores \emph{integers} and \emph{reals} as strings, though in a more restricted format than PostScript.
The PDF specification gives limits for the internal representation of values. Integer limits have not changed from the PostScript specification, but numbers
representable with the \emph{real} type have been specified differently: the largest representable values are $\pm 3.403\times 10^{38}$, the smallest non-zero representable values are
$\pm 1.175 \times 10^{-38}$\footnote{The PDF specification mistakenly leaves out the negative sign in the exponent here.},
with approximately $5$ decimal digits of precision \emph{in the fractional part}.
Adobe's implementation of PDF uses both IEEE 754 single-precision floating-point numbers and (for some calculations, and in previous versions) 16.16 bit fixed-point values.

The SVG specification\cite{svg2011-1.1} specifies numbers as strings with a decimal representation of the number.
It is stated that a ``Conforming SVG Viewer'' must have ``all visual rendering accurate to within one device pixel to the mathematically correct result at the initial 1:1
zoom ratio'' and that ``it is suggested that viewers attempt to keep a high degree of accuracy when zooming.''
A ``Conforming High-Quality SVG Viewer'' must use ``double-precision floating point\footnote{Presumably the 64-bit IEEE 754 ``double'' type.}'' for computations involving
coordinate system transformations.

\section{Quadtrees}
When viewing or processing a small part of a large document, it may be helpful to
avoid processing --- that is, to \emph{cull} --- the parts of the document which are not on-screen.

\begin{figure}[h]
	\centering \includegraphics[width=0.4\linewidth]{figures/quadtree_example}
	\caption{A simple quadtree.}
\end{figure}
The quadtree\cite{finkel1974quad} is a data structure --- one of a family of \emph{spatial}
data structures --- which recursively breaks down space into smaller subregions
which can be processed independently. Points (or other objects) are added to a single
node, which (if certain criteria are met) is split into four equal-sized subregions, with each
point attached to the subregion containing it. Typically the split is triggered by the number of points in a
node exceeding a maximum; in our case, it will more likely be triggered by the required level of precision exceeding
that supported by the data type in use.

Quadtrees have been used in computer graphics both for culling --- excluding objects in
nodes which are not visible --- and for ``level of detail'', where different levels of the quadtree store
different quality versions of objects or data\cite{zerbst2004game}.

In this project, we will be experimenting with a form of quadtree in which each
node has its own independent coordinate system, allowing us to store some spatial
information\footnote{One bit per coordinate, per level of the quadtree.} within the
quadtree structure, eliminating redundancy in the coordinates of nearby objects.

Other spatial data structures exist, such as the KD-tree\cite{bentley1975multidimensional},
which partitions the space on any axis-aligned line, or the BSP tree\cite{fuchs1980onvisible},
which splits along an arbitrary line which need not be axis-aligned. We believe, however,
that the simple correspondence between binary coordinates and the quadtree's binary splits makes
it a better avenue for initial research to explore.

\bibliographystyle{unsrt}
\bibliography{papers}