Submission time

[ipdf/sam.git] / chapters / Background.tex
diff --git a/chapters/Background.tex b/chapters/Background.tex

index 5d63d80..3cf44a8 100644 (file)
--- a/chapters/Background.tex
+++ b/chapters/Background.tex
@@ -1,10 +1,10 @@
  \chapter{Literature Review}\label{Background}
  
-The first half of this chapter will be devoted to documents themselves, including: the representation and displaying of graphics primitives\cite{computergraphics2}, and how collections of these primitives are represented in document formats, focusing on widely used standards\cite{plrm, pdfref17, svg2011-1.1}.
+The first part of this chapter will be devoted to documents themselves, including: the representation and displaying of graphics primitives, and how collections of these primitives are represented in document formats, focusing on widely used standards.
  
  We will find that although there has been a great deal of research into the rendering, storing, editing, manipulation, and extension of document formats, modern standards are content to specify at best single precision IEEE-754 floating point arithmetic.
  
-The research on arbitrary precision arithmetic applied to documents is rather sparse; however arbitrary precision arithmetic itself is a very active field of research. Therefore, the second half of this chapter will be devoted to considering fixed precision floating point numbers as specified by the IEEE-754 standard, possible limitations in precision, and alternative number representations for increased or arbitrary precision arithmetic.
+The research on arbitrary precision arithmetic applied to documents is rather sparse; however arbitrary precision arithmetic itself is a very active field of research. Therefore, remainder of this chapter will be devoted to considering fixed precision floating point numbers as specified by the IEEE-754 standard, possible limitations in precision, and alternative number representations for increased or arbitrary precision arithmetic.
  
  In Chapter \ref{Progress}, we will discuss our findings so far with regards to arbitrary precision arithmetic applied to document formats, and expand upon the goals outlined in Chapture \ref{Proposal}.
  
@@ -15,9 +15,9 @@ In Chapter \ref{Progress}, we will discuss our findings so far with regards to a
  
  Hearn and Baker's textbook ``Computer Graphics''\cite{computergraphics2} gives a comprehensive overview of graphics from physical display technologies through fundamental drawing algorithms to popular graphics APIs. This section will examine algorithms for drawing two dimensional geometric primitives on raster displays as discussed in ``Computer Graphics'' and the relevant literature. This section is by no means a comprehensive survey of the literature but intends to provide some idea of the computations which are required to render a document.
  
-It is of some historical significance that vector display devices were popular during the 70s and 80s, and papers oriented towards drawing on these devices can be found\cite{brassel1979analgorithm}. Whilst curves can be drawn at high resolution on vector displays, a major disadvantage was shading; by the early 90s the vast majority of computer displays were raster based\cite{computergraphics2}.
+It is of some historical significance that vector display devices were popular during the 70s and 80s, and papers oriented towards drawing on these devices can be found\cite{brassel1979analgorithm}. Whilst curves can be drawn at high resolution on vector displays, a major disadvantage was shading\cite{lane1983analgorithm}; by the early 90s the vast majority of computer displays were raster based\cite{computergraphics2}.
  
-\subsection{Straight Lines}\label{Rasterising Straight Lines}
+\subsection{Straight Lines}\label{Straight Lines}
  \input{chapters/Background_Lines}
  
  \subsection{Spline Curves and B{\'e}ziers}\label{Spline Curves}
@@ -31,7 +31,7 @@ It is of some historical significance that vector display devices were popular d
  
  %\cite{brassel1979analgorithm}; %\cite{lane1983analgorithm}.
  
-\subsection{Compositing}\label{Compositing and the Painter's Model}
+\subsection{Compositing}\label{Compositing}
  
  %So far we have discussed techniques for rendering vector graphics primitives in isolation, with no regard to the overall structure of a document which may contain many thousands of primitives. A straight forward approach would be to render all elements sequentially to the display, with the most recently drawn pixels overwriting lower elements. Such an approach is particularly inconvenient for anti-aliased images where colours must appear to smoothly blur between the edge of a primitive and any drawn underneath it.
  
@@ -50,9 +50,9 @@ The PostScript and PDF standards, as well as the OpenGL API also use a painter's
  
  \subsection{Rasterisation on the CPU and GPU}
  
-Traditionally, vector images have been rasterized by the CPU before being sent to a specialised Graphics Processing Unit (GPU) for drawing\cite{computergraphics2}. Rasterisation of simple primitives such as lines and triangles have been supported directly by GPUs for some time through the OpenGL standard\cite{openglspec}. However complex shapes (including those based on B{\'e}zier curves such as font glyphs) must either be rasterised entirely by the CPU or decomposed into simpler primitives that the GPU itself can directly rasterise. There is a significant body of research devoted to improving the performance of rendering such primitives using the latter approach, mostly based around the OpenGL API\cite{robart2009openvg, leymarie1992fast, frisken2000adaptively, green2007improved, loop2005resolution, loop2007rendering}. Recently Mark Kilgard of the NVIDIA Corporation described an extension to OpenGL for NVIDIA GPUs capable of drawing and shading vector paths\cite{kilgard2012gpu,kilgard300programming}. From this development it seems that rasterization of vector graphics may eventually become possible upon the GPU.openglspec
+Traditionally, vector images have been rasterized by the CPU before being sent to a specialised Graphics Processing Unit (GPU) for drawing\cite{computergraphics2}. Rasterisation of simple primitives such as lines and triangles have been supported directly by GPUs for some time through the OpenGL standard\cite{openglspec}. However complex shapes (including those based on B{\'e}zier curves such as font glyphs) must either be rasterised entirely by the CPU or decomposed into simpler primitives that the GPU itself can directly rasterise. There is a significant body of research devoted to improving the performance of rendering such primitives using the latter approach, mostly based around the OpenGL\cite{openglspec} API\cite{robart2009openvg, leymarie1992fast, frisken2000adaptively, green2007improved, loop2005resolution, loop2007rendering}. Recently Mark Kilgard of the NVIDIA Corporation described an extension to OpenGL for NVIDIA GPUs capable of drawing and shading vector paths\cite{kilgard2012gpu,kilgard300programming}. From this development it seems that rasterization of vector graphics may eventually become possible upon the GPU.
  
-It is not entirely clear how well supported the IEEE-754 standard for floating point computation (which we will discuss in Section \ref{}) is amongst GPUs\footnote{Informal technical articles are prevelant on the internet --- Eg: Regarding the Dolphin Wii GPU Emulator: \url{https://dolphin-emu.org/blog} (accessed 2014-05-22)}. Although the OpenGL API does use IEEE-754 number representations, research by Hillesland and Lastra in 2004 suggested that many GPUs were not internally compliant with the standard\cite{hillesland2004paranoia}. %Arbitrary precision arithmetic, is provided by many software libraries for CPU based calculations
+It is not entirely clear how well supported the IEEE-754 standard for floating point computation is amongst GPUs\footnote{Informal technical articles are abundant on the internet --- Eg: Regarding the Dolphin Wii GPU Emulator: \url{https://dolphin-emu.org/blog} (accessed 2014-05-22)}. Although the OpenGL API does use IEEE-754 number representations, research by Hillesland and Lastra in 2004 suggested that many GPUs were not internally compliant with the standard\cite{hillesland2004paranoia}. %Arbitrary precision arithmetic, is provided by many software libraries for CPU based calculations
  
   \pagebreak
  \section{Document Representations}\label{Document Representations}
@@ -71,7 +71,7 @@ Haye's 2012 article ``Pixels or Perish'' discusses the recent history and curren
  
  \subsection{The Portable Document Format}
  
-Adobe's Portable Document Format (PDF) is currently used almost universally for sharing documents; the ability to export or print to PDF can be found in most graphical document editors and even some plain text editors\cite{cheng2002finally}. 
+Adobe's Portable Document Format (PDF) is currently used almost universally for sharing documents; the ability to export or print to PDF can be found in most graphical document editors and even some plain text editors\cite{cheng2002portable}. 
  
  Hayes describes PDF as ``... essentially 'flattened' PostScript; it’s what’s left when you remove all the procedures and loops in a program, replacing them with sequences of simple drawing commands.''\cite{hayes2012pixels}. Consultation of the PDF 1.7 standard shows that this statement does not a give a complete picture --- despite being based on the Adobe PostScript model of a document as a series of ``pages'' to be printed by executing sequential instructions, from version 1.5 the PDF standard began to borrow some ideas from the Document Object Model. For example, interactive elements such as forms may be included as XHTML objects and styled using CSS. ``Actions'' are objects used to modify the data structure dynamically. In particular, it is possible to include Javascript Actions. Adobe defines the API for Javascript actions seperately to the PDF standard\cite{js_3d_pdf}. There is some evidence in the literature of attempts to exploit these features, with mixed success\cite{barnes2013embedding, hayes2012pixels}.
  
@@ -88,9 +88,9 @@ The PostScript reference describes a ``Real'' object for representing coordinate
  \subsection{PDF}
  PDF defines ``Real'' objects in a similar way to PostScript, but suggests a range of $\pm3.403\times10^{38}$ and smallest non-zero values of $\pm1.175\times10^{38}$\cite{pdfref17}. A note in the PDF 1.7 manual mentions that Acrobat 6 now uses IEEE-754 single precision floats, but ``previous versions used 32-bit fixed point numbers'' and ``... Acrobat 6 still converts floating-point numbers to fixed point for some components''.
  
-\subsection{\TeX and METAFONT}
+\subsection{{\TeX} and METAFONT}
  
-In ``The METAFONT book'' Knuth appears to describe coordinates as fixed point numbers: ``The computer works internally with coordinates that are integer multiples of $\frac{1}{65536} \approx 0.00002$ of the width of a pixel''\cite{knuth1983metafont}. There is no mention of precision in ``The \TeX book''. In 2007 Beebe claimed that {\TeX} uses a $14.16$ fixed point encoding, and that this was due to the lack of standardised floating point arithmetic on computers at the time; a problem that the IEEE-754 was designed to solve\cite{beebe2007extending}. Beebe also suggests that \TeX and METAFONT could now be modified to use IEEE-754 arithmetic.
+In ``The METAFONT book'' Knuth appears to describe coordinates as fixed point numbers: ``The computer works internally with coordinates that are integer multiples of $\frac{1}{65536} \approx 0.00002$ of the width of a pixel''\cite{knuth1983metafont}. \footnote{This corresponds to using $16$ bits for the fractional component of a fixed point representation} There is no mention of precision in ``The {\TeX} book''. In 2007 Beebe claimed that {\TeX} uses a $14.16$ fixed point encoding, and that this was due to the lack of standardised floating point arithmetic on computers at the time; a problem that the IEEE-754 was designed to solve\cite{beebe2007extending}. Beebe also suggested that {\TeX} and METAFONT could now be modified to use IEEE-754 arithmetic.
  
  \subsection{SVG}
  
@@ -98,40 +98,37 @@ The SVG standard specifies a minimum precision equivelant to that of ``single pr
  coordinate system transformations to provide the best possible precision and to prevent round-off errors.''\cite{svg2011-1.1} An SVG Viewer may refer to itself as ``High Quality'' if it uses a minimum of ``double precision'' floats.
  
  \subsection{Javascript}
-We include Javascript here due to its relation with the SVG, HTML5 and PDF standards.
-
+%We include Javascript here due to its relation with the SVG, HTML5 and PDF standards.
  According to the EMCA-262 standard, ``The Number type has exactly 18437736874454810627 (that is, $2^64-^53+3$) values, 
  representing the double-precision 64-bit format IEEE 754 values as specified in the IEEE Standard for Binary Floating-Point Arithmetic''\cite{ecma-262}. 
  The Number type does differ slightly from IEEE-754 in that there is only a single valid representation of ``Not a Number'' (NaN). The EMCA-262 does not define an ``integer'' representation.
         
  
-\section{Number Representations}
  
-Consider a value of $7.25 = 2^2 + 2^1 + 2^0 + 2^{-2}$. In binary, this can be written as $111.01_2$ Such a value would require 5 binary digits (bits) of memory to represent exactly. On the other hand a rational value such as $7\frac{1}{3}$ could not be represented exactly; the sequence of bits $111.0111 \text{ ...}_2$ never terminates.
+\section{Number Representations}\label{Number Representations}
  
-Integer representations do not allow for a ``point'' as shown in these examples, that is, no bits are reserved for the fractional part of a number. $7.25$ would be rounded to the nearest integer value, $7$. A fixed point representation keeps the decimal place at the same position for all values. Floating point representations can be thought of as analogous to scientific notation; an ``exponent'' and fixed point value are encoded, with multiplication by the exponent moving the position of the point.
+\subsection{Exact Representations}
  
-Modern computer hardware typically supports integer and floating-point number representations. Due to physical limitations, the size of these representations is limited; this is the fundamental source of both limits on range and precision in computer based calculations. 
+Consider a value of $7.25 = 2^2 + 2^1 + 2^0 + 2^{-2}$. In binary (base 2), this could be written as $111.01_2$ Such a value would require 5 binary digits (bits) of memory to represent exactly in computer hardware. Some values, for example $7.3$ can not be represented exactly in one base (decimal) but not another; in binary the sequence $111.010\text{...}_2$ will never terminate. A rational value such as $\frac{7}{3}$ could not be represented exactly in any base, but could be represented by the combination of a numerator $7 = 111_2$ and denominator $3 = 11_2$. Lastly, some values such as $e \approx 2.718\text{...}$ can only be expressed exactly using a symbolical system --- in this case as the result of an infinite summation $e = \displaystyle\sum_n^{\infty} 1/n!$
  
+Modern computer hardware typically supports integer and floating-point number representations and operations. Due to physical limitations, the size of these representations is limited; this is the fundamental source of both limits on range and precision in computer based calculations. 
  
-\subsection{Floating Point Definition}
+\subsection{Floating Point Definitions}
  
-A floating point number $x$ is commonly represented by a tuple of values $(s, e, m)$ in base $B$ as\cite{HFP, ieee2008-754}:
+Whilst a Fixed Point representation keeps the ``point'' at the same position in a string of bits, Floating point representations can be thought of as analogous to scientific notation; an ``exponent'' and fixed point value are encoded, with multiplication by the exponent moving the position of the point.
  
-\begin{align*}
-       x &= (-1)^{s} \times m \times B^{e}
-\end{align*}
+A floating point number $x$ is commonly represented by a tuple of values $(s, e, m)$ in base $B$ as\cite{HFP, ieee2008-754}: $x = (-1)^{s} \times m \times B^{e}$
  
  Where $s$ is the sign and may be zero or one, $m$ is commonly called the ``mantissa'' and $e$ is the exponent. Whilst $e$ is an integer in some range $\pm e_max$, the mantissa $m$ is a fixed point value in the range $0 < m < B$. 
  
  
-The choice of base $B = 2$ in the original IEEE-754 standard matches the nature of modern hardware. It has also been found that this base in general gives the smallest rounding errors\cite{HFP}. Early computers had in fact used a variety of representations including $B=3$ or even $B=7$\cite{goldman1991whatevery}, and the revised IEEE-754 standard specifies a decimal representation $B = 10$ intended for use in financial applications\cite{ieee754std2008}. From now on we will restrict ourselves to considering base 2 floats.
+The choice of base $B = 2$ in the original IEEE-754 standard matches the nature of modern hardware. It has also been found that this base in general gives the smallest rounding errors\cite{HFP}. Early computers had in fact used a variety of representations including $B=3$ or even $B=7$\cite{goldman1991whatevery}, and the revised IEEE-754 standard specifies a decimal representation $B = 10$ intended for use in financial applications\cite{ieee754std2008}\footnote{Eg: The smallest valid unit of currency \$0.01 could not be represented exactly in base 2}. From now on we will restrict ourselves to considering base 2 floats.
  
  The IEEE-754 encoding of $s$, $e$ and $m$ requires a fixed number of continuous bits dedicated to each value. Originally two encodings were defined: binary32 and binary64. $s$ is always encoded in a single leading bit, whilst (8,23) and (11,53) bits are used for the (exponent, mantissa) encodings respectively. 
  
-$m$ as encoded in the IEEE-754 standard is not exactly equivelant to a fixed point value. By assuming an implicit leading bit (ie: restricting $1 \leq m < 2$) except for when $e = 0$, floating point values are gauranteed to have unique representations. When $e = 0$ the leading bit is not implied; these values are called ``denormals''. This idea, which allows for 1 extra bit of precision for the normalised values appears to have been considered by Goldberg as early as 1967\cite{goldbern1967twentyseven}.
+The encoding of $m$ in the IEEE-754 standard is not exactly equivelant to a fixed point value. By assuming an implicit leading bit (ie: restricting $1 \leq m < 2$) except for when $e = 0$, floating point values are gauranteed to have a unique representations; these representations are said to be ``normalised''. When $e = 0$ the leading bit is not implied; these representations are called ``denormals'' because multiple representations may map to the same real value. The idea of using an implicit bit appears to have been considered by Goldberg as early as 1967\cite{goldbern1967twentyseven}.
  
-Figure \ref{float.pdf} shows the positive real numbers which can be represented exactly by an 8 bit floating point number encoded in the IEEE-754 format\footnote{Not quite; we are ignoring the IEEE-754 definitions of NaN and Infinity for simplicity}, and the distance between successive floating point numbers. We show two encodings using (1,2,5) and (1,3,4) bits to encode (sign, exponent, mantissa) respectively. We have also plotted a fixed point representation. For each distinct value of the exponent, the successive floating point representations lie on a straight line with constant slope. As the exponent increases, larger values are represented, but the distance between successive values increases; this can be seen on the right. The marked single point discontinuity at \verb/0x10/ and \verb/0x20/ occur when $e$ leaves the denormalised region and the encoding of $m$ changes.
+Figure \ref{floats.pdf}\footnote{In a digital PDF viewer we suggest increasing the zoom level --- the graphs were created from SVG images} shows the positive real numbers which can be represented exactly by an 8 bit floating point number encoded in the IEEE-754 format\footnote{Not quite; we are ignoring the IEEE-754 definitions of NaN and Infinity for simplicity}, and the distance between successive floating point numbers. We show two encodings using (1,2,5) and (1,3,4) bits to encode (sign, exponent, mantissa) respectively. For each distinct value of the exponent, the successive floating point representations lie on a straight line with constant slope. As the exponent increases, larger values are represented, but the distance between successive values increases; this can be seen on the right. The marked single point discontinuity at \verb/0x10/ and \verb/0x20/ occur when $e$ leaves the denormalised region and the encoding of $m$ changes. We have also plotted a fixed point representation for comparison; fixed point and integer representations appear as straight lines - the distance between points is always constant.
  
  The earlier example $7.25$ would be converted to a (1,3,4) floating point representation as follows:
  \begin{enumerate}
@@ -165,7 +162,7 @@ This particular example can be encoded exactly; however as there are an infinite
  
  
  
-\subsection{Precision and Rounding} 
+\subsection{Precision and Rounding}\label{Precision and Rounding}
  
  Real values which cannot be represented exactly in a floating point representation must be rounded to the nearest floating point value. The results of a floating point operation will in general be such values and thus there is a rounding error possible in any floating point operation. Referring to Figure \ref{floats.pdf} it can be seen that the largest possible rounding error is half the distance between successive floats; this means that rounding errors increase as the value to be represented increases.
  
@@ -187,13 +184,13 @@ Floating point operations can in principle be performed using integer operations
  
  In 1971 Dekker formalised a series of algorithms including the Fast2Sum method for calculating the correction term due to accumulated rounding errors\cite{dekker1971afloating}. The exact result of $x + y$ may be expressed in terms of floating point operations with rounding as follows:
  \begin{align*}
-       z &= \text{RN}(x + y) w &= \text{RN}(z - x) \\
-       zz &= \text{RN}(y - w) \implies x + y &= zz
+       z = \text{RN}(x + y) &\quad w = \text{RN}(z - x) \\
+       zz = \text{RN}(y - w) &\quad \implies x + y = zz
  \end{align*}
  
  \subsection{Arbitrary Precision Floating Point Numbers}
  
-Arbitrary precision floating point numbers are implemented in a variety of software libraries which will dynamically allocate extra bits for the exponent or mantissa as required. An example is the GNU MPFR library discussed by Fousse in 2007\cite{fouse2007mpfr}. Although many arbitrary precision libraries already existed, MPFR intends to be fully compliant with some of the more obscure IEEE-754 requirements such as rounding rules and exceptions. 
+Arbitrary precision floating point numbers are implemented in a variety of software libraries which will dynamically allocate extra bits for the exponent or mantissa as required. An example is the GNU MPFR library discussed by Fousse in 2007\cite{fousse2007mpfr}. Although many arbitrary precision libraries already existed, MPFR intends to be fully compliant with some of the more obscure IEEE-754 requirements such as rounding rules and exceptions. 
  
  As we have seen, it is trivial to find real numbers that would require an infinite number of bits to represent exactly. Implementations of ``arbitrary'' precision must carefully determine at what point rounding should occur so as to balance performance with memory usage.