Tidy up a bit

[ipdf/sam.git] / chapters / Background.tex
diff --git a/chapters/Background.tex b/chapters/Background.tex

index 605b4a2..ee8da21 100644 (file)
--- a/chapters/Background.tex
+++ b/chapters/Background.tex
@@ -1,88 +1,165 @@
  \chapter{Literature Review}\label{Background}
  
-This chapter will \rephrase{review the literature. It will also include some figures created by us from our test programs to aid with conceptual understanding of the literature.}
+The first half of this chapter will be devoted to documents themselves, including: the representation and displaying of graphics primitives\cite{computergraphics2}, and how collections of these primitives are represented in document formats, focusing on widely used standards\cite{plrm, pdfref17, svg2011-1.1}.
  
-\rephrase{TODO: Decide exactly what figures to make, then go make them; so far I have some ideas for a few about Floating Point operations, but none about the other stuff}.
+We will find that although there has been a great deal of research into the rendering, storing, editing, manipulation, and extension of document formats, modern standards are content to specify at best single precision IEEE-754 floating point arithmetic.
  
-\rephrase{TODO: Actually (re)write this entire chapter}.
-\rephrase{????: Do I really want to make this go down to} \verb/\subsubsection/
+The research on arbitrary precision arithmetic applied to documents is rather sparse; however arbitrary precision arithmetic itself is a very active field of research. Therefore, the second half of this chapter will be devoted to considering fixed precision floating point numbers as specified by the IEEE-754 standard, possible limitations in precision, and alternative number representations for increased or arbitrary precision arithmetic.
  
-A paper by paper summary of the literature is also available at: \\ \url{http://szmoore.net/ipdf/documents/LiteratureNotes.pdf}.
+In Chapter \ref{Progress}, we will discuss our findings so far with regards to arbitrary precision arithmetic applied to document formats, and expand upon the goals outlined in Chapture \ref{Proposal}.
  
+\section{Raster and Vector Images}\label{Raster and Vector Images}
+\input{chapters/Background_Raster-vs-Vector}
  
-\rephrase{TODO: Actually make that readable or just remove the link}.
+\section{Rendering Vector Images}\label{Rasterising Vector Images}
  
-\section{Document Formats}
+Hearn and Baker's textbook ``Computer Graphics''\cite{computergraphics2} gives a comprehensive overview of graphics from physical display technologies through fundamental drawing algorithms to popular graphics APIs. This section will examine algorithms for drawing two dimensional geometric primitives on raster displays as discussed in ``Computer Graphics'' and the relevant literature. This section is by no means a comprehensive survey of the literature but intends to provide some idea of the computations which are required to render a document.
  
-\subsection{History}
+It is of some historical significance that vector display devices were popular during the 70s and 80s, and papers oriented towards drawing on these devices can be found\cite{brassel1979analgorithm}. Whilst curves can be drawn at high resolution on vector displays, a major disadvantage was shading; by the early 90s the vast majority of computer displays were raster based\cite{computergraphics2}.
  
-Since mankind climbed down from the trees... \rephrase{plagiarism alert!}
+\subsection{Straight Lines}\label{Rasterising Straight Lines}
+\input{chapters/Background_Lines}
  
-\subsection{Vector Graphics vs Raster Graphics}
+\subsection{Spline Curves and B{\'e}ziers}\label{Spline Curves}
+\input{chapters/Background_Spline}
  
-Raster Graphics: Stores the exact pixels as they would appear on a device. Causes obvious issues with scaling.
-Vector Graphics: Stores relative position of primitives - scales better. BUT still can't scale forever.
+\subsection{Font Glyphs}\label{Font Rendering}
+\input{chapters/Background_Fonts}
  
-\rephrase{Figures: Raster and Vector graphics at different scales}
+%\subsection{Shading}\label{Shading}
  
-\subsection{Document Format Categories}
  
-Main reference: Pixels or Perish\cite{hayes2012pixels}
+%\cite{brassel1979analgorithm}; %\cite{lane1983analgorithm}.
  
-\begin{enumerate}
-       \item DOM - eg: HTML/XMLish - defined in terms of elements that can contain other elements
-       \item Programming Language - eg: PostScript - programmer (or program) produces a program that is interpreted
-       \item Combination - eg: Javascript with HTML/XML - Program is interpreted that modifies the DOM. 
-       \begin{itemize}
-               \item The lines are becomming increasingly blurred between 1. and 2.
-               \item In the case of Javascript/HTML, a special DOM element \verb/<canvas>/ allows even lower level manipulation of graphics.
-       \end{itemize}
-\end{enumerate}
+\subsection{Compositing}\label{Compositing and the Painter's Model}
+
+%So far we have discussed techniques for rendering vector graphics primitives in isolation, with no regard to the overall structure of a document which may contain many thousands of primitives. A straight forward approach would be to render all elements sequentially to the display, with the most recently drawn pixels overwriting lower elements. Such an approach is particularly inconvenient for anti-aliased images where colours must appear to smoothly blur between the edge of a primitive and any drawn underneath it.
+
+Colour raster displays are based on an additive red-green-blue $(r,g,b)$ colour representation which matches the human eye's response to light\cite{computergraphics2}. In 1984, Porter and Duff introduced a fourth colour channel for rasterised images called the ``alpha'' channel, analogous to the transparency of a pixel\cite{porter1984compositing}. In compositing models, elements can be rendered seperately, with the four colour channels of successively drawn elements being combined according to one of several possible operations.
+
+In the ``painter's model'' as described by the SVG standard the ``over'' operation is used when rendering one primitive over another\cite{svg2011-1.1}.
+Given an existing pixel $P_1$ with colour values $(r_1, g_1, b_1, a_1)$ and a pixel $P_2$ with colours $(r_2, g_2, b_2, a_2)$ to be painted over $P_1$, the resultant pixel $P_T$ has colours given by:
+\begin{align}
+       a_T &= 1 - (1-a_1)(1-a_2) \\
+       r_T &= (1 - a_2)r_1 + r_2 \quad \text{(similar for $g_T$ and $b_T$)}
+\end{align}
+It should be apparent that alpha values of $1$ correspond to an opaque pixel; that is, when $a_2 = 1$ the resultant pixel $P_T$ is the same as $P_2$.
+When the final pixel is actually drawn on an rgb display, the $(r, g, b)$ components are $(r_T/a_T, g_T/a_T, b_T/a_T)$.
+
+The PostScript and PDF standards, as well as the OpenGL API also use a painter's model for compositing. However, PostScript does not include an alpha channel, so $P_T = P_2$ always\cite{plrm}. Figure \ref{SVG} illustrates the painter's model for partially transparent shapes as they would appear in both the SVG and PDF models.
+
+\subsection{Rasterisation on the CPU and GPU}
+
+Traditionally, vector images have been rasterized by the CPU before being sent to a specialised Graphics Processing Unit (GPU) for drawing\cite{computergraphics2}. Rasterisation of simple primitives such as lines and triangles have been supported directly by GPUs for some time through the OpenGL standard\cite{openglspec}. However complex shapes (including those based on B{\'e}zier curves such as font glyphs) must either be rasterised entirely by the CPU or decomposed into simpler primitives that the GPU itself can directly rasterise. There is a significant body of research devoted to improving the performance of rendering such primitives using the latter approach, mostly based around the OpenGL API\cite{robart2009openvg, leymarie1992fast, frisken2000adaptively, green2007improved, loop2005resolution, loop2007rendering}. Recently Mark Kilgard of the NVIDIA Corporation described an extension to OpenGL for NVIDIA GPUs capable of drawing and shading vector paths\cite{kilgard2012gpu,kilgard300programming}. From this development it seems that rasterization of vector graphics may eventually become possible upon the GPU.openglspec
+
+It is not entirely clear how well supported the IEEE-754 standard for floating point computation (which we will discuss in Section \ref{}) is amongst GPUs\footnote{Informal technical articles are prevelant on the internet --- Eg: Regarding the Dolphin Wii GPU Emulator: \url{https://dolphin-emu.org/blog} (accessed 2014-05-22)}. Although the OpenGL API does use IEEE-754 number representations, research by Hillesland and Lastra in 2004 suggested that many GPUs were not internally compliant with the standard\cite{hillesland2004paranoia}. %Arbitrary precision arithmetic, is provided by many software libraries for CPU based calculations
+
+ \pagebreak
+\section{Document Representations}\label{Document Representations}
+
+The representation of information, particularly for scientific purposes, has changed dramatically over the last few decades. For example, Brassel's 1979 paper referenced earlier\cite{brassel1979analgorithm} has been produced on a mechanical type writer. Although the paper discusses an algorithm for shading on computer displays, the figures illustrating this algorithm have not been generated by a computer, but drawn by Brassel's assistant. In contrast, modern papers such as Barnes et. al's 2013 paper on embedding 3d images in PDF documents\cite{barnes2013embedding} can themselves be an interactive proof of concept.
+
+Haye's 2012 article ``Pixels or Perish'' discusses the recent history and current state of the art in documents for scientific publications\cite{hayes2012pixels}. Hayes argued that there are currently two different approaches to representing a document: As a sequence of static sheets of paper (Programmed Documents) or as a dynamic and interactive way to convey information, using the Document Object Model. We will now explore these two approaches and the extent to which they overlap.
+
+
+\subsection{Programmed Documents}
+\input{chapters/Background_Interpreted}
+
+\pagebreak
+\subsection{Document Object Model}\label{Document Object Model}
+\input{chapters/Background_DOM}
+
+\subsection{The Portable Document Format}
+
+Adobe's Portable Document Format (PDF) is currently used almost universally for sharing documents; the ability to export or print to PDF can be found in most graphical document editors and even some plain text editors\cite{cheng2002finally}. 
+
+Hayes describes PDF as ``... essentially 'flattened' PostScript; it’s what’s left when you remove all the procedures and loops in a program, replacing them with sequences of simple drawing commands.''\cite{hayes2012pixels}. Consultation of the PDF 1.7 standard shows that this statement does not a give a complete picture --- despite being based on the Adobe PostScript model of a document as a series of ``pages'' to be printed by executing sequential instructions, from version 1.5 the PDF standard began to borrow some ideas from the Document Object Model. For example, interactive elements such as forms may be included as XHTML objects and styled using CSS. ``Actions'' are objects used to modify the data structure dynamically. In particular, it is possible to include Javascript Actions. Adobe defines the API for Javascript actions seperately to the PDF standard\cite{js_3d_pdf}. There is some evidence in the literature of attempts to exploit these features, with mixed success\cite{barnes2013embedding, hayes2012pixels}.
+
+%\subsection{Scientific Computation Packages}
  
-Can be either human readable\footnote{For some definition of human and some definition of readable} or binary\footnote{So, our viewer is basically a DOM style but stored in a binary format}
  
-\subsection{Modern Graphics Formats}
+\section{Precision required by Document Formats}
  
-PostScript: Not actually widely used now, but PDF is basically the same thing.
-PDF: A way for adobe to make money
-SVG: Based on the DOM model, vector format
-BMP,PNG,GIF: Are these even worth mentioning?
+We briefly summarise the requirements of the standards discussed so far in regards to the precision of mathematical operations.
  
-HTML: With the evolution of JavaScript and CSS, this has actually become a competitive document format; it can display much more complex dynamic content than eg: PDF.
+\subsection{PostScript}
+The PostScript reference describes a ``Real'' object for representing coordinates and values as follows: ``Real objects approximate mathematical real numbers within a much larger interval, but with limited precision; they are implemented as floating-point numbers''\cite{plrm}. There is no reference to the precision of mathematical operations, but the implementation limits \emph{suggest} a range of $\pm10^{38}$ ``approximate'' and the smallest values not rounded to zero are $\pm10^{-38}$ ``approximate''.
  
-Web based documents still suffer from precision issues (JavaScript is notorious for its representation of everything, even integers, as floats, so you get rounding errors even with integer maths), and it also has other limitations (requires reasonably skilled programmer to create Javascript and/or a disgusting GUI tool that auto generates 10000 lines to display a button (I have seen one of these)). \rephrase{Hate on javascript a bit less maybe.}
+\subsection{PDF}
+PDF defines ``Real'' objects in a similar way to PostScript, but suggests a range of $\pm3.403\times10^{38}$ and smallest non-zero values of $\pm1.175\times10^{38}$\cite{pdfref17}. A note in the PDF 1.7 manual mentions that Acrobat 6 now uses IEEE-754 single precision floats, but ``previous versions used 32-bit fixed point numbers'' and ``... Acrobat 6 still converts floating-point numbers to fixed point for some components''.
  
-\subsection{Precision Limitations}
+\subsection{\TeX and METAFONT}
  
-\section{Representation of Numbers}
+In ``The METAFONT book'' Knuth appears to describe coordinates as fixed point numbers: ``The computer works internally with coordinates that are integer multiples of $\frac{1}{65536} \approx 0.00002$ of the width of a pixel''\cite{knuth1983metafont}. There is no mention of precision in ``The \TeX book''. In 2007 Beebe claimed that {\TeX} uses a $14.16$ fixed point encoding, and that this was due to the lack of standardised floating point arithmetic on computers at the time; a problem that the IEEE-754 was designed to solve\cite{beebe2007extending}. Beebe also suggests that \TeX and METAFONT could now be modified to use IEEE-754 arithmetic.
  
-Although this project has been motivated by a desire for more flexible document formats, the fundamental source of limited precision in vector document formats such as PDF and PostScript is the use of floating point numbers to represent the coordinates of vertex positions. In particular, implementations of PostScript and PDF must by definition restrict themselves to IEEE binary32 ``single precision'' floating point number representations in order to conform to the standards\cite{plrm, pdfref17}.
+\subsection{SVG}
  
-Whilst David Gow will be focusing on how to structure a document format so as to avoid or reduce these limitations\cite{proposalGow}, the focus of our own research will be \rephrase{NUMBERS}.
+The SVG standard specifies a minimum precision equivelant to that of ``single precision floats'' (presumably referring to IEEE-754) with a range of \verb/-3.4e+38F/ to \verb/+3.4e+38F/, and states ``It is recommended that higher precision floating point storage and computation be performed on operations such as
+coordinate system transformations to provide the best possible precision and to prevent round-off errors.''\cite{svg2011-1.1} An SVG Viewer may refer to itself as ``High Quality'' if it uses a minimum of ``double precision'' floats.
  
-\subsection{Floating Point Number Representations}
+\subsection{Javascript}
+We include Javascript here due to its relation with the SVG, HTML5 and PDF standards.
+
+According to the EMCA-262 standard, ``The Number type has exactly 18437736874454810627 (that is, $2^64-^53+3$) values, 
+representing the double-precision 64-bit format IEEE 754 values as specified in the IEEE Standard for Binary Floating-Point Arithmetic''\cite{ecma-262}. 
+The Number type does differ slightly from IEEE-754 in that there is only a single valid representation of ``Not a Number'' (NaN). The EMCA-262 does not define an ``integer'' representation.
+       
+
+\section{Number Representations}
+
+
+\subsection{Integers and Fixed Point Numbers}
+
+
+
+\subsection{Floating Points}
+
+A floating point number $x$ is commonly represented by a tuple of values $(s, e, m)$ in base $B$ as\cite{HFP, ieee2008-754}:
  
  \begin{align*}
         x &= (-1)^{s} \times m \times B^{e}
  \end{align*}
  
-$B = 2$, although IEEE also defines decimal representations for $B = 10$ --- these are useful in financial software\cite{ieee2008-754}.
+Where $s$ is the sign and may be zero or one, $m$ is commonly called the ``mantissa'' and $e$ is the exponent. Whilst $e$ is an integer in some range $\pm e_max$, the mantissa $m$ is actually a fixed point value in the range $0 < m < B$. The name ``floating point'' refers to the equivelance of the $\times B^e$ operation to a shifting of the ``fixed point'' along the mantissa.
+
+For example, the value $7.25$ can be expressed as:
+\begin{enumerate}
+       \item 
+       
+\end{enumerate}
+
+The choice of base $B = 2$, closely matches the nature of modern hardware. It has also been found that this base in general gives the smallest rounding errors\cite{HFP}. Early computers had in fact used a variety of representations including $B=3$ or even $B=7$\cite{goldman1991whatevery}, and the revised IEEE-754 standard specifies a decimal representation $B = 10$ intended for use in financial applications\cite{ieee754std2008}. From now on we will restrict ourselves to considering base 2 floats.
+
+Figure \ref{minifloat.pdf} shows the positive real numbers which can be represented exactly by an 8 bit floating point number encoded in the IEEE-754 format, and the distance between successive floating point numbers. We show two encodings using (1,2,5) and (1,3,4) bits to encode (sign, exponent, mantissa) respectively.
+
+For each distinct value of the exponent, the successive floating point representations lie on a straight line with constant slope. As the exponent increases, larger values are represented, but the distance between successive values increases\footnote{A plot of fixed point numbers or integers (which we omit for space considerations) would show points lying on a straight line with a constant slope between points}.
+
+In the graph of the difference between representations, a single isolated point should be visible; this is not an error, but due to the greater discontinuity between the denormalised and normalised values ($e = 0$ and $1$ respectively). 
+
+
+\begin{figure}[H]
+       \centering
+       \includegraphics[width=0.8\textwidth]{figures/minifloat.pdf} \\
+       \includegraphics[width=0.8\textwidth]{figures/minifloat_diff.pdf}
+       \caption{The mapping of 8 bit floats to reals}
+\end{figure}
+
+\subsection{Floating Point Operations}
  
-\rephrase{Aside: Are decimal representations for a document format eg: CAD also useful because you can then use metric coordinate systems?}
+Floating point operations can in principle be performed using integer operations, but specialised Floating Point Units (FPUs) are an almost universal component of modern processors\cite{kelley1997acmos}. The improvement of FPUs remains highly active in several areas including: efficiency\cite{seidel2001onthe}; accuracy of operations\cite{dieter2007lowcost}; and even the adaptation of algorithms originally used in software for reducing the overal error of a sequence of operations\cite{kadric2013accurate}. In this section we will briefly describe the algorithms for floating point operations without focusing on the hardware implementation of these algorithms.
  
-\subsubsection{Precision}
  
-The floats map an infinite set of real numbers onto a discrete set of representations.
+\subsection{Precision and Rounding} 
  
-\rephrase{Figure: 8 bit ``minifloats'' (all 255 of them) clearly showing the ``precision vs range'' issue}
+Real values which cannot be represented exactly in a floating point representation must be rounded to the nearest floating point value. The results of a floating point operation will in general be such values and thus there is a rounding error possible in any floating point operation. Referring to Figure \ref{minifloat.pdf} it can be seen that the largest possible rounding error, or ``units in last place'' (ulp) is half the distance between successive floats; this means that rounding errors increase as the value to be represented increases. The IEEE-754 standard specifies the rounding conventions for floating point arithmetic\cite{ieee754std2008}.
  
-The most a result can be rounded in conversion to a floating point number is the units in last place; $m_{N} \times B^{e}$.
  
-\rephrase{Even though that paper that claims double is the best you will ever need because the error can be as much as the size of a bacterium relative to the distance to the moon}\cite{} \rephrase{there are many cases where increased number of bits will not save you}.\cite{HFP}
+Goldberg's assertively titled 1991 paper ``What Every Computer Scientist Needs to Know about Floating Point Arithmetic''\cite{goldberg1991whatevery} provides a comprehensive overview of issues in floating point arithmetic and relates these to requirements of the IEEE-754 1985 standard\cite{ieee754std1985}. More recently, after the release of the revised IEEE-754 standard in 2008\cite{ieee754std2008}, a textbook ``Handbook Of Floating Point Arithmetic'' has been published which provides a thourough review of literature relating to floating point arithmetic in both software and hardware\cite{HFP}.
  
+William Kahan, one of the architects of the IEEE-754 standard in 1984 and a contributor to its revision in 2010, has also published many articles on his website explaining the more obscure features of the IEEE-754 standard and calling out software which fails to conform to the standard\footnote{In addition to encodings and acceptable rounding errors, the standard also specifies ``exceptions'' --- mechanisms by which a program can detect an error such as division by zero --- which are sometimes neglected, as in the ECMA-256}\cite{kahanweb, kahan1996ieee754}, as well as examples of the limitations of floating point computations\cite{kahan2007wrong}. 
  
-\subsection{Alternate Number Representations}
+\subsection{Arbitrary Precision Floating Point Numbers}
  
-\rephrase{They exist\cite{HFP}}.
+Fouse described