Tidy up a bit

diff --git a/chapters/Background.tex b/chapters/Background.tex

index bde2b7c..ee8da21 100644
--- a/chapters/Background.tex
+++ b/chapters/Background.tex
@@ -2,7 +2,7 @@
  
  The first half of this chapter will be devoted to documents themselves, including: the representation and displaying of graphics primitives\cite{computergraphics2}, and how collections of these primitives are represented in document formats, focusing on widely used standards\cite{plrm, pdfref17, svg2011-1.1}.
  
-We will find that although there has been a great deal of research into the rendering, storing, editing, manipulation, and extension of document formats, modern standards are content to specify --- at best --- single precision IEEE-754 floating point arithmetic.
+We will find that although there has been a great deal of research into the rendering, storing, editing, manipulation, and extension of document formats, modern standards are content to specify at best single precision IEEE-754 floating point arithmetic.
  
  The research on arbitrary precision arithmetic applied to documents is rather sparse; however arbitrary precision arithmetic itself is a very active field of research. Therefore, the second half of this chapter will be devoted to considering fixed precision floating point numbers as specified by the IEEE-754 standard, possible limitations in precision, and alternative number representations for increased or arbitrary precision arithmetic.
  
@@ -52,7 +52,7 @@ The PostScript and PDF standards, as well as the OpenGL API also use a painter's
  
  Traditionally, vector images have been rasterised by the CPU before being sent to a specialised Graphics Processing Unit (GPU) for drawing\cite{computergraphics2}. Rasterisation of simple primitives such as lines and triangles has been supported directly by GPUs for some time through the OpenGL standard\cite{openglspec}. However, complex shapes (including those based on B{\'e}zier curves such as font glyphs) must either be rasterised entirely by the CPU or decomposed into simpler primitives that the GPU itself can directly rasterise. There is a significant body of research devoted to improving the performance of rendering such primitives using the latter approach, mostly based around the OpenGL API\cite{robart2009openvg, leymarie1992fast, frisken2000adaptively, green2007improved, loop2005resolution, loop2007rendering}. Recently Mark Kilgard of the NVIDIA Corporation described an extension to OpenGL for NVIDIA GPUs capable of drawing and shading vector paths\cite{kilgard2012gpu,kilgard300programming}. From this development it seems that rasterisation of vector graphics may eventually become possible upon the GPU.
  
-It is not entirely clear how well supported the IEEE-754 standard for floating point computation (which we will discuss in Section \ref{}) is amongst GPUs. Although the OpenGL API does use IEEE-754 number representations, research by Hillesland and Lastra in 2004 suggested that many GPUs were not internally compliant with the standard\cite{hillesland2004paranoia}. Arbitrary precision arithmetic, whilst provided by many libraries for CPU based calculations, is virtually unheard of in the context of GPU rendering.
+It is not entirely clear how well supported the IEEE-754 standard for floating point computation (which we will discuss in Section \ref{}) is amongst GPUs\footnote{Informal technical articles on this topic are prevalent on the internet; see, e.g., the blog of the Dolphin Wii GPU emulator: \url{https://dolphin-emu.org/blog} (accessed 2014-05-22)}. Although the OpenGL API does use IEEE-754 number representations, research by Hillesland and Lastra in 2004 suggested that many GPUs were not internally compliant with the standard\cite{hillesland2004paranoia}. %Arbitrary precision arithmetic, is provided by many software libraries for CPU based calculations
  
   \pagebreak
  \section{Document Representations}\label{Document Representations}
@@ -90,7 +90,7 @@ PDF defines ``Real'' objects in a similar way to PostScript, but suggests a rang
  
  \subsection{\TeX and METAFONT}
  
-In ``The METAFONT book'' Knuth appears to describe coordinates as fixed point numbers: ``The computer works internally with coordinates that are integer multiples of $\frac{1}{65536} \approx 0.00002$ of the width of a pixel''\cite{knuth1983metafont}. There is no mention of precision in ``The \TeX book''. In 2007 Beebe claimed this was due to the lack of standardised floating point arithmetic on computers at the time; a problem that the IEEE-754 was designed to solve\cite{beebe2007extending}. Beebe also suggests that \TeX and METAFONT could now be modified to use IEEE-754 arithmetic.
+In ``The METAFONT book'' Knuth appears to describe coordinates as fixed point numbers: ``The computer works internally with coordinates that are integer multiples of $\frac{1}{65536} \approx 0.00002$ of the width of a pixel''\cite{knuth1983metafont}. There is no mention of precision in ``The \TeX book''. In 2007 Beebe claimed that {\TeX} uses a $14.16$ fixed point encoding, and that this was due to the lack of standardised floating point arithmetic on computers at the time; a problem that the IEEE-754 standard was designed to solve\cite{beebe2007extending}. Beebe also suggests that {\TeX} and METAFONT could now be modified to use IEEE-754 arithmetic.
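+
+As a sketch of what such an encoding implies (assuming, as our own reading, that ``$14.16$'' denotes 14 integer bits and 16 fractional bits plus a sign), a value $x$ is stored as an integer $n$ with:
+\begin{align*}
+       x &= n \times 2^{-16}, & |n| &< 2^{30} \\
+       \Delta x &= 2^{-16} = \frac{1}{65536} \approx 1.53 \times 10^{-5} \\
+       |x|_{\max} &= 2^{14} - 2^{-16} \approx 16383.99998
+\end{align*}
+The resolution $\Delta x$ is constant over the whole range, in contrast to a floating point representation.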
  
  \subsection{SVG}
  
@@ -98,46 +98,44 @@ The SVG standard specifies a minimum precision equivelant to that of ``single pr
  coordinate system transformations to provide the best possible precision and to prevent round-off errors.''\cite{svg2011-1.1} An SVG Viewer may refer to itself as ``High Quality'' if it uses a minimum of ``double precision'' floats.
  
  \subsection{JavaScript}
-Although Javascript is not a stand alone document format, we include it here due to its relation with the SVG, HTML5 and PDF standards.
-
+We include JavaScript here because of its relationship to the SVG, HTML5 and PDF standards.
  
  According to the ECMA-262 standard, ``The Number type has exactly 18437736874454810627 (that is, $2^{64}-2^{53}+3$) values, 
  representing the double-precision 64-bit format IEEE 754 values as specified in the IEEE Standard for Binary Floating-Point Arithmetic''\cite{ecma-262}. 
  The Number type does differ slightly from IEEE-754 in that there is only a single valid representation of ``Not a Number'' (NaN). The ECMA-262 standard does not define an ``integer'' representation.
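+
+The quoted count can be checked directly (a worked sketch; the breakdown assumes the binary64 layout of 1 sign bit, 11 exponent bits and 52 mantissa bits, and the symbols $N$ are our own notation). A NaN is any bit pattern with all exponent bits set and a non-zero mantissa, so:
+\begin{align*}
+       N_{\mathrm{patterns}} &= 2^{64} \\
+       N_{\mathrm{NaN}} &= 2 \times (2^{52} - 1) = 2^{53} - 2 \\
+       N_{\mathrm{Number}} &= 2^{64} - (2^{53} - 2) + 1 = 2^{64} - 2^{53} + 3
+\end{align*}
+The final $+1$ accounts for the single NaN value that replaces all of the IEEE-754 NaN encodings.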
         
  
-\section{Real Number Representations}
-
-We have found that PostScript, PDF, and SVG document standards all restrict themselves to IEEE floating point number representations of coordinates. This is unsurprising as the IEEE standard has been successfully adopted almost universally by hardware manufactures and programming language standards since the early 1990s. In the traditional view of a document as a static, finite sheet of paper, there is little motivation for enhanced precision.
+\section{Number Representations}
  
-In this section we will begin by investigating floating point numbers as defined in the IEEE standard and their limitations. We will then consider alternative number representations including fixed point numbers, arbitrary precision floats, rational numbers, p-adic numbers and symbolic representations. \rephrase{Oh god I am still writing about IEEE floats let alone all those other things}
  
-\rephrase{Reorder to start with Integers, General Floats, then go to IEEE, then other things}
+\subsection{Integers and Fixed Point Numbers}
  
-\subsection{IEEE Floating Points}
  
-Although the concept of a floating point representation has been attributed to various early computer scientists including Charles Babbage\cite{citationneeded}, it is widely accepted that William Kahan and his colleagues working on the IEEE-754 standard in the 1980s are the ``fathers of modern floating point computation''\cite{citationneeded}. The original IEEE-754 standard specified the encoding, number of bits, rounding methods, and maximum acceptable errors for the basic floating point operations for base $B = 2$ floats. It also specifies ``exceptions'' --- mechanisms by which a program can detect an error such as division by zero\footnote{Kahan has argued that exceptions in IEEE-754 are conceptually different to Exceptions as defined in several programming languages including C++ and Java. An IEEE exception is intended to prevent an error by its detection, whilst an exception in those languages is used to indicate an error has already occurred\cite{}}. We will restrict ourselves to considering $B = 2$, since it was found that this base in general gives the smallest rounding errors\cite{HFP}, although it is worth noting that different choices of base had been used historically\cite{goldman1991whatevery}, and the IEEE-854 and later the revised IEEE-754 standard specify a decimal representation $B = 10$ intended for use in financial applications.
  
-\subsection{Floating Point Definition}
+\subsection{Floating Points}
  
-A floating point number $x$ is commonly represented by a tuple of integers $(s, e, m)$ in base $B$ as\cite{HFP, ieee2008-754}:
+A floating point number $x$ is commonly represented by a tuple of values $(s, e, m)$ in base $B$ as\cite{HFP, ieee2008-754}:
  
  \begin{align*}
         x &= (-1)^{s} \times m \times B^{e}
  \end{align*}
  
-Where $s$ is the sign and may be zero or one, $m$ is commonly called the ``mantissa'' and $e$ is the exponent.
-The name ``floating point'' refers to the equivelance of the $\times B^e$ operation to a shifting of a decimal point along the mantissa. This contrasts with a ``fixed point'' representation where $x$ is the sum of two fixed size numbers representing the integer and fractional part.
+Where $s$ is the sign and may be zero or one, $m$ is commonly called the ``mantissa'' and $e$ is the exponent. Whilst $e$ is an integer in some range $\pm e_{\max}$, the mantissa $m$ is actually a fixed point value in the range $0 < m < B$. The name ``floating point'' refers to the equivalence of the $\times B^e$ operation to a shifting of the ``fixed point'' along the mantissa.
  
-In the IEEE-754 standard, for a base of $B = 2$, numbers are encoded in continuous memory by a fixed number of bits, with $s$ occupying 1 bit, followed by $e$ and $m$ occupying a number of bits specified by the precision; 5 and 10 for a binary16 or ``half precision'' float, 8 and 23 for a binary32 or ``single precision'' and 15 and 52 for a binary64 or ``double precision'' float\cite{HFP, ieee2008-754}.
+For example, the value $7.25$ can be expressed as:
+\begin{enumerate}
+       \item 
+       
+\end{enumerate}
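+
+As an illustrative sketch (the particular decompositions are our own choice, taking $B = 2$ and $s = 0$), the value $7.25 = 111.01_2$ can be written with several equivalent $(s, e, m)$ tuples:
+\begin{align*}
+       7.25 &= (-1)^{0} \times 1.8125 \times 2^{2} = 1.1101_2 \times 2^{2} \\
+            &= (-1)^{0} \times 0.90625 \times 2^{3} = 0.11101_2 \times 2^{3} \\
+            &= (-1)^{0} \times 0.453125 \times 2^{4} = 0.011101_2 \times 2^{4}
+\end{align*}
+Each increment of $e$ shifts the point one place to the left along the mantissa. Only the first form, with $1 \leq m < 2$, has a leading digit of one.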
  
+The choice of base $B = 2$ closely matches the nature of modern hardware. It has also been found that this base in general gives the smallest rounding errors\cite{HFP}. Early computers had in fact used a variety of representations including $B=3$ or even $B=7$\cite{goldman1991whatevery}, and the revised IEEE-754 standard specifies a decimal representation $B = 10$ intended for use in financial applications\cite{ieee754std2008}. From now on we will restrict ourselves to considering base 2 floats.
  
-\subsection{Precision and Rounding}
+Figure \ref{minifloat.pdf} shows the positive real numbers which can be represented exactly by an 8 bit floating point number encoded in the IEEE-754 format, and the distance between successive floating point numbers. We show two encodings using (1,2,5) and (1,3,4) bits to encode (sign, exponent, mantissa) respectively.
  
-Real values which cannot be represented exactly in a floating point representation must be rounded. The results of a floating point operation will in general be such values and thus there is a rounding error possible in any floating point operation. Goldberg's assertively titled 1991 paper ``What Every Computer Scientist Needs to Know about Floating Point Arithmetic'' provides a comprehensive overview of issues in floating point arithmetic and relates these to the 1984 version of the IEEE-754 standard\cite{goldberg1991whatevery}. More recently, after the release of the revised IEEE-754 standard in 2008, a textbook ``Handbook Of Floating Point Arithmetic'' has been published which provides a thourough review of literature relating to floating point arithmetic in both software and hardware\cite{HFP}.
+For each distinct value of the exponent, the successive floating point representations lie on a straight line with constant slope. As the exponent increases, larger values are represented, but the distance between successive values increases\footnote{A plot of fixed point numbers or integers (which we omit for space considerations) would show points lying on a straight line with a constant slope between points}.
  
+In the graph of the difference between representations, a single isolated point should be visible; this is not an error, but due to the greater discontinuity between the denormalised and normalised values ($e = 0$ and $1$ respectively). 
  
-Figure \ref{minifloat.pdf} shows the positive real numbers which can be represented exactly by an 8 bit base $B = 2$ floating point number; and illustrates that a set of fixed precision floating point numbers forms a discrete approximation of the reals. There are only $2^7 = 256$ numbers in this set, which means it is easier to see some of the properties of floats that would be unclear using one of the IEEE-754 encodings. The first set of points corresponds to using 2 and 5 bits to encode $e$ and $m$ whilst the second set of points corresponds to a 3 and 4 bit encoding. This allows us to see the trade off between the precision and range of real values represented. 
  
  \begin{figure}[H]
         \centering
@@ -148,45 +146,20 @@ Figure \ref{minifloat.pdf} shows the positive real numbers which can be represen
  
  \subsection{Floating Point Operations}
  
-Floating point operations can in principle be performed using integer operations, but specialised Floating Point Units (FPUs) are an almost universal component of modern processors\cite{citationneeded}. The improvement of FPUs remains highly active in several areas including: efficiency\cite{seidel2001onthe}; accuracy of operations\cite{dieter2007lowcost}; and even the adaptation of algorithms originally used in software for reducing the overal error of a sequence of operations\cite{kadric2013accurate}. In this section we will consider the algorithms for floating point operations without focusing on the hardware implementation of these algorithms.
-
-
-\subsection{Some sort of Example(s) or Floating Point Mayhem}
-
-\rephrase{Eg: $f(x) = |x|$ calculated from sqrt and squaring}
+Floating point operations can in principle be performed using integer operations, but specialised Floating Point Units (FPUs) are an almost universal component of modern processors\cite{kelley1997acmos}. The improvement of FPUs remains an active area of research, including work on: efficiency\cite{seidel2001onthe}; accuracy of operations\cite{dieter2007lowcost}; and even the adaptation of algorithms originally used in software for reducing the overall error of a sequence of operations\cite{kadric2013accurate}. In this section we will briefly describe the algorithms for floating point operations without focusing on the hardware implementation of these algorithms.
  
-\rephrase{Eg: Massive rounding errors from calculatepi}
  
-\rephrase{Eg: Actual graphics things :S}
+\subsection{Precision and Rounding} 
  
+Real values which cannot be represented exactly in a floating point representation must be rounded to the nearest floating point value. The results of a floating point operation will in general be such values and thus there is a rounding error possible in any floating point operation. Referring to Figure \ref{minifloat.pdf} it can be seen that the largest possible rounding error is half the distance between successive floats, that is, half a ``unit in the last place'' (ulp); this means that rounding errors increase as the value to be represented increases. The IEEE-754 standard specifies the rounding conventions for floating point arithmetic\cite{ieee754std2008}.
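+
+To make the growth of the rounding error concrete (a sketch assuming the binary32 format with a 23 bit stored mantissa and round-to-nearest; the symbols $\delta$ and $\epsilon_{\max}$ are our own notation), for a value in the range $2^{e} \leq |x| < 2^{e+1}$ the spacing $\delta$ between successive floats and the largest rounding error $\epsilon_{\max}$ are:
+\begin{align*}
+       \delta &= 2^{e - 23}, & \epsilon_{\max} &= \frac{\delta}{2} = 2^{e - 24} \\
+       e = 0: \quad \delta &= 2^{-23} \approx 1.19 \times 10^{-7} \\
+       e = 24: \quad \delta &= 2^{1} = 2
+\end{align*}
+So for $|x| \geq 2^{24} = 16777216$ successive binary32 values are at least $2$ apart, and not every integer can be represented exactly.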
  
-\subsection{Limitations Imposed By Graphics APIs and/or GPUs}
  
-Traditionally algorithms for drawing vector graphics are performed on the CPU; the image is rasterised and then sent to the GPU for rendering\cite{}. Recently there has been a great deal of literature relating to implementation of algorithms such as B{\'e}zier curve rendering\cite{} or shading\cite{} on the GPU. As it seems the trend is to move towards GPU 
+Goldberg's assertively titled 1991 paper ``What Every Computer Scientist Needs to Know about Floating Point Arithmetic''\cite{goldberg1991whatevery} provides a comprehensive overview of issues in floating point arithmetic and relates these to requirements of the IEEE-754 1985 standard\cite{ieee754std1985}. More recently, after the release of the revised IEEE-754 standard in 2008\cite{ieee754std2008}, a textbook ``Handbook Of Floating Point Arithmetic'' has been published which provides a thorough review of literature relating to floating point arithmetic in both software and hardware\cite{HFP}.
  
-\rephrase{6. Here are ways GPU might not be IEEE-754 --- This goes *somewhere* in here but not sure yet}
-
-\begin{itemize}
-       \item Internal representations are GPU dependent and may not match IEEE\cite{hillesland2004paranoia}
-       \item OpenGL standards specify: binary16, binary32, binary64
-       \item OpenVG aims to become a standard API for SVG viewers but the API only uses binary32 and hardware implementations may use less than this internally\cite{rice2008openvg}
-       \item It seems that IEEE has not been entirely successful; although all modern CPUs and GPUs are able to read and write IEEE floating point types, many do not conform to the IEEE standard in how they represent floating point numbers internally. 
-       \item \rephrase{Blog post alert} \url{https://dolphin-emu.org/blog/2014/03/15/pixel-processing-problems/}
-\end{itemize}
-
-
-
-\rephrase{7. Sod all that, let's just use an arbitrary precision library (AND THUS WE FINALLY GET TO THE POINT)}
+William Kahan, one of the architects of the IEEE-754 standard in 1985 and a contributor to its revision in 2008, has also published many articles on his website explaining the more obscure features of the IEEE-754 standard and calling out software which fails to conform to the standard\footnote{In addition to encodings and acceptable rounding errors, the standard also specifies ``exceptions'' (mechanisms by which a program can detect an error such as division by zero), which are sometimes neglected, as in ECMA-262}\cite{kahanweb, kahan1996ieee754}, as well as examples of the limitations of floating point computations\cite{kahan2007wrong}. 
  
  \subsection{Arbitrary Precision Floating Point Numbers}
  
-An arbitrary precision floating point number simply uses extra bits to store extra precision. Do it all using MFPR\cite{fousse2007mpfr}, she'll be right.
-
-\rephrase{8. Here is a brilliant summary of sections 7- above}
-
-Dear reader, thankyou for your persistance in reading this mangled excuse for a Literature Review.
-Hopefully we have brought together the radically different areas of interest together in some sort of coherant fashion.
-In the next chapter we will talk about how we have succeeded in rendering a rectangle. It will be fun. I am looking forward to it.
+Fouse described 
  
-\rephrase{Oh dear this is not going well}