LiteratureNotes.tex

   1 \documentclass[8pt]{extreport}
   2 \usepackage{graphicx}
   3 \usepackage{caption}
   4 \usepackage{amsmath} % needed for math align
   5 \usepackage{bm} % needed for maths bold face
   6  \usepackage{graphicx}    % needed for including graphics e.g. EPS, PS
   7 \usepackage{fancyhdr}   % needed for header
   8 %\usepackage{epstopdf} % Converts eps to pdf before including. Do it manually instead.
   9 \usepackage{float}
  10 \usepackage{hyperref}
  11
  12  \topmargin -1.5cm        % read Lamport p.163
  13  \oddsidemargin -0.04cm   % read Lamport p.163
  14  \evensidemargin -0.04cm  % same as oddsidemargin but for left-hand pages
  15  \textwidth 16.59cm
  16  \textheight 21.94cm
  17  %\pagestyle{empty}       % Uncomment if don't want page numbers
  18  \parskip 8.2pt           % sets spacing between paragraphs
  19  %\renewcommand{\baselinestretch}{1.5}  % Uncomment for 1.5 spacing between lines
  20  \parindent 0pt           % sets leading space for paragraphs
  21 \linespread{1}
  22
  23
  24 \newcommand{\vect}[1]{\boldsymbol{#1}} % Draw a vector
  25 \newcommand{\divg}[1]{\nabla \cdot #1} % divergence
  26 \newcommand{\curl}[1]{\nabla \times #1} % curl
  27 \newcommand{\grad}[1]{\nabla #1} %gradient
  28 \newcommand{\pd}[3][ ]{\frac{\partial^{#1} #2}{\partial #3^{#1}}} %partial derivative
  29 %\newcommand{\d}[3][ ]{\frac{d^{#1} #2}{d #3^{#1}}} %full derivative
  30 \newcommand{\phasor}[1]{\tilde{#1}} % make a phasor
  31 \newcommand{\laplacian}[1]{\nabla^2 {#1}} % The laplacian operator
  32
  33 \usepackage{color}
  34 \usepackage{listings}
  35
  36 \definecolor{darkgray}{rgb}{0.95,0.95,0.95}
  37 \definecolor{darkred}{rgb}{0.75,0,0}
  38 \definecolor{darkblue}{rgb}{0,0,0.75}
  39 \definecolor{pink}{rgb}{1,0.5,0.5}
  40 \lstset{language=XML}
  41 \lstset{backgroundcolor=\color{darkgray}}
  42 %\lstset{numbers=left, numberstyle=\tiny, stepnumber=1, numbersep=5pt}
  43 %\lstset{keywordstyle=\color{darkred}\bfseries}
  44 %\lstset{commentstyle=\color{darkblue}}
  45 %\lstset{stringsyle=\color{red}}
  46 %\lstset{showstringspaces=false}
  47 %\lstset{basicstyle=\small}
  48
  49 \newcommand{\shell}[1]{\texttt{#1}}
  50 \newcommand{\code}[1]{\texttt{#1}}
  51
  52 \begin{document}
  53
  54
  55 \pagestyle{fancy}
  56 \fancyhead{}
  57 \fancyfoot{}
  58
  59 \fancyhead[LO, L]{}
  60 \fancyfoot[CO, C]{\thepage}
  61
  62 \begin{center}
  63         B.Eng. Final Year Project \par
  64         {\bf \Large Literature Notes} \par
  65         Sam Moore, David Gow \\
  66         Faculty of Engineering, Computing and Mathematics, University of Western Australia \\
  67         March 2014
  68 \end{center}
  69
  70 \tableofcontents
  71
  72 \chapter{Literature Summaries}
  73
  74 \section{Postscript Language Reference Manual  \cite{plrm}}
  75
  76 Adobe's official reference manual for PostScript.
  77
  78 It is big.
  79
  80 \begin{itemize}
  81         \item First version was published BEFORE the IEEE standard and used smaller floats than binary32
  82         \item Now uses binary32 floats.
  83 \end{itemize}
  84
  85 \section{Portable Document Format Reference Manual  \cite{pdfref17}}
  86
  87 Adobe's official reference for PDF.
  88
  89 It is also big.
  90
  91 \begin{itemize}
  92         \item Early versions did not use IEEE binary32 but 16-16 exponent/mantissa encodings (Why?)
  93         \item Current standard is restricted to binary32
  94         \item It specifically says PDF creators must use at most binary32 because higher precision is not supported by Adobe Reader.
  95 \end{itemize}
  96
  97 \section{IEEE Standard for Floating-Point Arithmetic \cite{ieee2008-754}}
  98
  99 The IEEE (revised) 754 standard.
 100
 101 It is also big.
 102
 103 Successes:
 104 \begin{itemize}
 105         \item Has been adopted by CPUs
 106         \item Standardised floats for programmers --- accomplishes goal of allowing non-numerical experts to write reasonably sophisticated platform independent programs that may perform complex numerical operations
 107 \end{itemize}
 108
 109 Failures:
 110 \begin{itemize}
 111         \item Adoption by GPUs slower\cite{hillesland2004paranoia}
 112         \item It specifies the maximum errors for operations using IEEE types but nothing about internal representations
 113         \item Many hardware devices (GPUs and CPUs) use non-IEEE representations internally and simply truncate/round the result
 114         \begin{itemize}
 115                 \item This isn't so much of a problem when the device uses additional bits but it is misleading when GPUs use less than binary32 and act as if they are using binary32 from the programmer's perspective.
 116                 \item Devices using {\bf less} bits internally aren't IEEE compliant
 117         \end{itemize}
 118         \item Thus the same program compiled and run on different architectures may give completely different results\cite{HFP}
 119         \begin{itemize}
 120                 \item The ultimate goal of allowing people to write numerical programs in total ignorance of the hardware is not entirely realised
 121         \end{itemize}
 122         \item This is the sort of thing that makes people want to use a virtual machine, and thus Java
 123         \begin{itemize}
 124                 \item Objectively I probably shouldn't say that using Java is in itself a failure
 125         \end{itemize}
 126         \item Standards such as PostScript and PDF were slow to adopt IEEE representations
 127         \item The OpenVG standard accepts IEEE binary32 in the API but specifically states that hardware may use less than this\cite{rice2008openvg}
 128 \end{itemize}
 129
 130
 131
 132 \pagebreak
 133
 134 \section{Portable Document Format (PDF) --- Finally...  \cite{cheng2002portable}}
 135
 136 This is not spectacularly useful, is basically an advertisement for Adobe software.
 137
 138 {\bf Intro}
 139 \begin{itemize}
 140         \item Visual communications has been revolutionised by computing
 141         \item BUT there have always been problems in exchanging formats
 142         \item Filetypes like text, rich text, IGES, DXF, TIFF, JPEG, GIFF solve problems for particular types of files only
 143         \item PDF solves everything for everyone; can include text, images, animation, sound, etc
 144 \end{itemize}
 145 {\bf PDF Features}
 146 \begin{itemize}
 147         \item Raster Image Process (RIP) --- For printing (presumably also displaying on screen)
 148         \item Originally needed to convert to PS then RIP, with PS 3 can now RIP directly.
 149         \item Reduced filesize due to compression
 150         \item Four major applications - Stoy 1999\cite{stoy1999}
 151                 \begin{enumerate}
 152                         \item Download files from internet
 153                         \item Files on CDs
 154                         \item Files for outputing to printers
 155                         \item Conventional [commercial scale?] printing
 156                 \end{enumerate}
 157         \item List of various (Adobe) PDF related software
 158         \begin{itemize}
 159                 \item Includes software for PS that converts to/from PDF
 160                 \item So PS was obviously pretty popular before PDF
 161         \end{itemize}
 162         \item Can Optimize for screen/printer [not clear how]
 163         \item Can compress for size
 164 \end{itemize}
 165
 166 \pagebreak
 167 \section{Pixels or Perish  \cite{hayes2012pixels}}
 168
 169 ``The art of scientific illustration will have to adapt to the new age of online publishing''
 170 And therefore, JavaScript libraries ($\text{D}^3$) are the future.
 171
 172 The point is that we need to change from thinking about documents as paper to thinking of them as pixels.
 173 This kind of makes it related to our paper, because it is the same way we are justifying our project. It does mention precision, but doesn't say we need to get more of it.
 174
 175 I get the feeling from this that Web based documents are a whole bunch of completely different design philosophies hacked together
 176 with JavaScript.
 177
 178 This paper uses Metaphors a lot. I never met a phor that didn't over extend itself.
 179
 180
 181 {\bf Intro}
 182 \begin{itemize}
 183         \item Drawings/Pictures are ornaments in science but they are not just ornamental
 184         \item Processes have changed a lot; eg: photographic plates $\to$ digital images
 185         \item ``we are about to turn the page --- if not close the book --- on yet another chapter in publishing history.'' (HO HO HO)
 186         \item It would be cool to have animated figures in documents (eg: Population pyramid; changes with time); not just as ``supplements''
 187         \item In the beginning, there was PostScript, 1970s and 1980s, John Warnock and Charles Geschke, Adobe Systems
 188         \item PS is a language for vector graphics; objects are constructed from geometric primitives rather than a discrete array of pixels
 189         \item PS is a complete programming language; an image is also a program; can exploit this to control how images are created based on data (eg: Faces)
 190         \item PDF is ``flattened'' PS. No longer programable. Aspires to be ``virtual paper''.
 191         \item But why are we using such powerful computing machines just to emulate sheets paper? (the author asks)
 192 \end{itemize}
 193
 194 {\bf Web based Documents}
 195 \begin{itemize}
 196         \item HTML, CSS, JavaScript - The Axis of Web Documents
 197         \begin{itemize}
 198                 \item HTML - Defines document structure
 199                 \item CSS - Defines presentation of elements in document
 200                 \item JavaScript - Encodes actions, allows dynamic content (change the HTML/CSS)
 201         \end{itemize}
 202         \item \texttt{<canvas>} will let you draw anything (So in principle don't even need all of HTML/CSS)
 203         \begin{itemize}
 204                 \item Not device independent
 205                 \item ``Coordinates can be specified with precision finer than pixel resolution'' {\bf (TODO: Investigate this?)}
 206                 \item JavaScript operators to draw things on canvas are very similar to the PostScript model
 207         \end{itemize}
 208         \item SVG --- Same structure (Document Object Model (DOM)) as HTML
 209         \begin{itemize}
 210                 \item ``Noun language''
 211                 \item Nouns define lines/curves etc, rather than paragraphs/lists
 212                 \item Also borrows things from PostScript (eg: line caps and joints)
 213                 \item IS device independent, ``very high precision'' {\bf (TODO: Investigate)}
 214                 \item JavaScript can be used to interact with SVG too
 215         \end{itemize}
 216         \item $\text{D}^{3}$ (Data Driven Documents) - A JavaScript library
 217         \begin{itemize}
 218                 \item Idea is to create or modify elements of a DOM document using supplied data
 219                 \item \url{https://github.com/mbostock/d3/wiki}
 220         \end{itemize}
 221         \item We are in a new Golden Age of data visualisation
 222         \item Why do we still use PDFs?
 223         \begin{itemize}
 224                 \item PDFs are ``owned'' by the author/reader; you download it, store it, you can print it, etc
 225                 \item HTML documents are normally on websites. They are not self contained. They often rely on remote content from other websites (annoying to download the whole document).
 226         \end{itemize}
 227         \item {\bf Conclusion} Someone should open up PDF to accept things like $\text{D}^{3}$ and other graphics formats (links nicely with \cite{barnes2013embedding})
 228         \item Also, Harry Potter reference
 229
 230 \end{itemize}
 231
 232 \section{Embedding and Publishing Interactive, 3D Figures in PDF Files  \cite{barnes2013embedding}}
 233
 234 \begin{itemize}
 235         \item Linkes well with \cite{hayes2012pixels}; I heard you liked figures so I put a figure in your PDF
 236         \item Title pretty much summarises it; similar to \cite{hayes2012pixels} except these guys actually did something practical
 237 \end{itemize}
 238
 239 \section{27 Bits are not enough for 8 digit accuracy  \cite{goldberg1967twentyseven}}
 240
 241 Proves with maths, that rounding errors mean that you need at least $q$ bits for $p$ decimal digits. $10^p < 2^{q-1}$
 242
 243 \begin{itemize}
 244         \item Eg: For 8 decimal digits, since $10^8 < 2^{27}$ would expect to be able to represent with 27 binary digits
 245         \item But: Integer part requires digits bits (regardless of fixed or floating point represenetation)
 246         \item Trade-off between precision and range
 247         \begin{itemize}
 248                 \item 9000000.0 $\to$ 9999999.9 needs 24 digits for the integer part $2^{23} = 83886008$
 249         \end{itemize}
 250         \item Floating point zero = smallest possible machine exponent
 251         \item Floating point representation:
 252         \begin{align*}
 253                 y &= 0.y_1 y_2 \text{...} y_q \times 2^{n}
 254         \end{align*}
 255         \item Can eliminate a bit by considering whether $n = -e$ for $-e$ the smallest machine exponent (???)
 256         \begin{itemize}
 257                 \item Get very small numbers with the same precision
 258                 \item Get large numbers with the extra bit of precision
 259         \end{itemize}
 260 \end{itemize}
 261
 262 \section{What every computer scientist should know about floating-point arithmetic  \cite{goldberg1991whatevery}}
 263
 264 \begin{itemize}
 265         \item Book: \emph{Floating Point Computation} by Pat Sterbenz (out of print... in 1991)
 266     \item IEEE floating point standard becoming popular (introduced in 1987, this is 1991)
 267     \begin{itemize}
 268                 \item As well as structure, defines the algorithms for addition, multiplication, division and square root
 269                 \item Makes things portable because results of operations are the same on all machines (following the standard)
 270                 \item Alternatives to floating point: Floating slasi and Signed Logarithm (TODO: Look at these, although they will probably not be useful)
 271
 272         \end{itemize}
 273         \item Base $\beta$ and precision $p$ (number of digits to represent with) - powers of the base can be represented exactly.
 274         \item Largest and smallest exponents $e_{min}$ and $e_{max}$
 275         \item Need bits for exponent and fraction, plus one for sign
 276         \item ``Floating point number'' is one that can be represented exactly.
 277         \item Representations are not unique! $0.01 \times 10^1 = 1.00 \times 10^{-1}$ Leading digit of one $\implies$ ``normalised''
 278         \item Requiring the representation to be normalised makes it unique, {\bf but means it is impossible to represent zero}.
 279         \begin{itemize}
 280                 \item Represent zero as $1 \times \beta^{e_{min}-1}$ - requires extra bit in the exponent
 281         \end{itemize}
 282         \item {\bf Rounding Error}
 283         \begin{itemize}
 284                 \item ``Units in the last place'' eg: 0.0314159 compared to 0.0314 has ulp error of 0.159
 285                 \item If calculation is the nearest floating point number to the result, it will still be as much as 1/2 ulp in error
 286                 \item Relative error corresponding to 1/2 ulp can vary by a factor of $\beta$ ``wobble''. Written in terms of $\epsilon$
 287                 \item Maths $\implies$ {\bf Relative error is always bounded by $\epsilon = (\beta/2)\beta^{-p}$}
 288                 \item Fixed relative error $\implies$ ulp can vary by a factor of $\beta$ . Vice versa
 289                 \item Larger $\beta \implies$ larger errors
 290         \end{itemize}
 291         \item {\bf Guard Digits}
 292         \begin{itemize}
 293                 \item In subtraction: Could compute exact difference and then round; this is expensive
 294                 \item Keep fixed number of digits but shift operand right; discard precision. Lead to relative error up to $\beta - 1$
 295                 \item Guard digit: Add extra digits before truncating. Leads to relative error of less than $2\epsilon$. This also applies to addition
 296         \end{itemize}
 297         \item {\bf Catastrophic Cancellation} - Operands are subject to rounding errors - multiplication
 298         \item {\bf Benign Cancellation} - Subtractions. Error $< 2\epsilon$
 299         \item Rearrange formula to avoid catastrophic cancellation
 300         \item Historical interest only - speculation on why IBM used $\beta = 16$ for the system/370 - increased range? Avoids shifting
 301         \item Precision: IEEE defines extended precision (a lower bound only)
 302         \item Discussion of the IEEE standard for operations (TODO: Go over in more detail)
 303         \item NaN allow continuing with underflow and Infinity with overflow
 304         \item ``Incidentally, some people think that the solution to such anomalies is never to compare floating-point numbers for equality but instead to consider them equal if they are within some error bound E. This is hardly a cure all, because it raises as many questions as it answers.'' - On equality of floating point numbers
 305
 306 \end{itemize}
 307
 308
 309 %%%%
 310 % David's Stuff
 311 %%%%
 312 \section{Compositing Digital Images  \cite{porter1984compositing}}
 313
 314
 315
 316 Perter and Duff's classic paper "Compositing Digital Images" lays the
 317 foundation for digital compositing today. By providing an "alpha channel,"
 318 images of arbitrary shapes — and images with soft edges or sub-pixel coverage
 319 information — can be overlayed digitally, allowing separate objects to be
 320 rasterized separately without a loss in quality.
 321
 322 Pixels in digital images are usually represented as 3-tuples containing
 323 (red component, green component, blue component). Nominally these values are in
 324 the [0-1] range. In the Porter-Duff paper, pixels are stored as $(R,G,B,\alpha)$
 325 4-tuples, where alpha is the fractional coverage of each pixel. If the image
 326 only covers half of a given pixel, for example, its alpha value would be 0.5.
 327
 328 To improve compositing performance, albeit at a possible loss of precision in
 329 some implementations, the red, green and blue channels are premultiplied by the
 330 alpha channel. This also simplifies the resulting arithmetic by having the
 331 colour channels and alpha channels use the same compositing equations.
 332
 333 Several binary compositing operations are defined:
 334 \begin{itemize}
 335 \item over
 336 \item in
 337 \item out
 338 \item atop
 339 \item xor
 340 \item plus
 341 \end{itemize}
 342
 343 The paper further provides some additional operations for implementing fades and
 344 dissolves, as well as for changing the opacity of individual elements in a
 345 scene.
 346
 347 The method outlined in this paper is still the standard system for compositing
 348 and is implemented almost exactly by modern graphics APIs such as \texttt{OpenGL}. It is
 349 all but guaranteed that this is the method we will be using for compositing
 350 document elements in our project.
 351
 352 {\bf Some issues with the statements made here...}
 353
 354 When introducing Bresenham's algorithm below you say modern graphics systems ``will often use Wu's line-drawing algorithm instead, \emph{as it produces antialiased lines}'' (and don't give a citation). Here you say OpenGL uses Porter-Duff compositing ``almost exactly''. But in their introduction they say: ``
 355 \begin{enumerate}
 356 \
 357
 358 \section{Bresenham's Algorithm: Algorithm for computer control of a digital plotter  \cite{bresenham1965algorithm}}
 359 Bresenham's line drawing algorithm is a fast, high quality line rasterization
 360 algorithm which is still the basis for most (aliased) line drawing today. The
 361 paper, while originally written to describe how to control a particular plotter,
 362 is uniquely suited to rasterizing lines for display on a pixel grid.
 363
 364 Lines drawn with Bresenham's algorithm must begin and end at integer pixel
 365 coordinates, though one can round or truncate the fractional part. In order to
 366 avoid multiplication or division in the algorithm's inner loop,
 367
 368 The algorithm works by scanning along the long axis of the line, moving along
 369 the short axis when the error along that axis exceeds 0.5px. Because error
 370 accumulates linearly, this can be achieved by simply adding the per-pixel
 371 error (equal to (short axis/long axis)) until it exceeds 0.5, then incrementing
 372 the position along the short axis and subtracting 1 from the error accumulator.
 373
 374 As this requires nothing but addition, it is very fast, particularly on the
 375 older CPUs used in Bresenham's time. Modern graphics systems will often use Wu's
 376 line-drawing algorithm instead, as it produces antialiased lines, taking
 377 sub-pixel coverage into account. Bresenham himself extended this algorithm to
 378 produce Bresenham's circle algorithm. The principles behind the algorithm have
 379 also been used to rasterize other shapes, including B\'{e}zier curves.
 380
 381 \section{Quad Trees: A Data Structure for Retrieval on Composite Keys  \cite{finkel1974quad}}
 382
 383 This paper introduces the ``quadtree'' spatial data structure. The quadtree structure is
 384 a search tree in which every node has four children representing the north-east, north-west,
 385 south-east and south-west quadrants of its space.
 386
 387 \section{Xr: Cross-device Rendering for Vector Graphics  \cite{worth2003xr}}
 388
 389 Xr (now known as Cairo) is an implementation of the PDF v1.4 rendering model,
 390 independent of the PDF or PostScript file formats, and is now widely used
 391 as a rendering API. In this paper, Worth and Packard describe the PDF v1.4 rendering
 392 model, and their PostScript-derived API for it.
 393
 394 The PDF v1.4 rendering model is based on the original PostScript model, based around
 395 a set of \emph{paths} (and other objects, such as raster images) each made up of lines
 396 and B\'{e}zier curves, which are transformed by the ``Current Transformation Matrix.''
 397 Paths can be \emph{filled} in a number of ways, allowing for different handling of self-intersecting
 398 paths, or can have their outlines \emph{stroked}.
 399 Furthermore, paths can be painted with RGB colours and/or patterns derived from either
 400 previously rendered objects or external raster images.
 401 PDF v1.4 extends this to provide, amongst other features, support for layering paths and
 402 objects using Porter-Duff compositing\cite{porter1984compositing}, giving each painted path
 403 the option of having an $\alpha$ value and a choice of any of the Porter-Duff compositing
 404 methods.
 405
 406 The Cairo library approximates the rendering of some objects (particularly curved objects
 407 such as splines) with a set of polygons. An \texttt{XrSetTolerance} function allows the user
 408 of the library to set an upper bound on the approximation error in fractions of device pixels,
 409 providing a trade-off between rendering quality and performance. The library developers found
 410 that setting the tolerance to greater than $0.1$ device pixels resulted in errors visible to the
 411 user.
 412
 413 \section{Glitz: Hardware Accelerated Image Compositing using OpenGL  \cite{nilsson2004glitz}}
 414
 415 This paper describes the implementation of an \texttt{OpenGL} based rendering backend for
 416 the \texttt{Cairo} library.
 417
 418 The paper describes how OpenGL's Porter-Duff compositing is easily suited to the Cairo/PDF v1.4
 419 rendering model. Similarly, traditional OpenGL (pre-version 3.0 core) support a matrix stack
 420 of the same form as Cairo.
 421
 422 The ``Glitz'' backend will emulate support for tiled, non-power-of-two patterns/textures if
 423 the hardware does not support it.
 424
 425 Glitz can render both triangles and trapezoids (which are formed from pairs of triangles).
 426 However, it cannot guarantee that the rasterization is pixel-precise, as OpenGL does not proveide
 427 this consistently.
 428
 429 Glitz also supports multi-sample anti-aliasing, convolution filters for raster image reads (implemented
 430 with shaders).
 431
 432 Performance was much improved over the software rasterization and over XRender accellerated rendering
 433 on all except nVidia hardware. However, nVidia's XRender implementation did slow down significantly when
 434 some transformations were applied.
 435
 436 In \cite{kilgard2012gpu}, Kilgard mentions that Glitz has been abandoned. He describes it as ''GPU assisted'' rather than GPU accelerated, since it used the XRender (??) extension.
 437
 438 %% Sam again
 439
 440 \section{Boost Multiprecision Library  \cite{boost_multiprecision}}
 441
 442 \begin{itemize}
 443         \item ``The Multiprecision Library provides integer, rational and floating-point types in C++ that have more range and precision than C++'s ordinary built-in types.''
 444         \item Specify number of digits for precision as a template argument.
 445         \item Precision is fixed... {\bf possible approach to project:} Use \verb/boost::mpf_float<N>/ and increase \verb/N/ as more precision is required?
 446 \end{itemize}
 447
 448
 449 % Some hardware related sounding stuff...
 450
 451 \section{A CMOS Floating Point Unit  \cite{kelley1997acmos}}
 452
 453 The paper describes the implentation of a FPU for PowerPC using a particular Hewlett Packard process (HP14B 0.5$\mu$m, 3M, 3.3V).
 454 It implements a ``subset of the most commonly used double precision floating point instructions''. The unimplemented operations are compiled for the CPU.
 455
 456 The paper gives a description of the architecture and design methods.
 457 This appears to be an entry to a student design competition.
 458
 459 Standard is IEEE 754, but the multiplier tree is a 64-bit tree instead of a 54 bit tree.
 460 `` The primary reason for implementing a larger tree is for future additions of SIMD [Single Instruction Multiple Data (?)] instructions similar to Intel's MMX and Sun's VIS instructions''.
 461
 462 HSPICE simulations used to determine transistor sizing.
 463
 464 Paper has a block diagram that sort of vaguely makes sense to me.
 465 The rest requires more background knowledge.
 466
 467 \section{Simply FPU\cite{filiatreault2003simply}}
 468
 469 This is a webpage at one degree of seperation from wikipedia.
 470
 471 It talks about FPU internals, but mostly focuses on the instruction sets.
 472 It includes FPU assembly code examples (!)
 473
 474 It is probably not that useful, I don't think we'll end up writing FPU assembly?
 475
 476 FPU's typically have 80 bit registers so they can support REAL4, REAL8 and REAL10 (single, double, extended precision).
 477
 478 Note: Presumably this is referring to the x86 80 bit floats that David was talking about?
 479
 480
 481 \section{Floating Point Package User's Guide  \cite{bishop2008floating}}
 482
 483 This is a technical report describing floating point VHDL packages \url{http://www.vhdl.org/fphdl/vhdl.html}
 484
 485 In theory I know VHDL (cough) so I am interested in looking at this further to see how FPU hardware works.
 486 It might be getting a bit sidetracked from the ``document formats'' scope though.
 487
 488 The report does talk briefly about the IEEE standard and normalised / denormalised numbers as well.
 489
 490 See also: Java Optimized Processor\cite{jop} (it has a VHDL implementation of a FPU).
 491
 492 \section{Low-Cost Microarchitectural Support for Improved Floating-Point Accuracy\cite{dieter2007lowcost}}
 493
 494 Mentions how GPUs offer very good floating point performance but only for single precision floats. (NOTE: This statement seems to contradict \cite{hillesland2004paranoia}.
 495
 496 Has a diagram of a Floating Point adder.
 497
 498 Talks about some magical technique called "Native-pair Arithmetic" that somehow makes 32-bit floating point accuracy ``competitive'' with 64-bit floating point numbers.
 499
 500 \section{Accurate Floating Point Arithmetic through Hardware Error-Free Transformations  \cite{kadric2013accurate}}
 501
 502 From the abstract: ``This paper presents a hardware approach to performing ac-
 503 curate floating point addition and multiplication using the idea of error-
 504 free transformations. Specialized iterative algorithms are implemented
 505 for computing arbitrarily accurate sums and dot products.''
 506
 507 The references for this look useful.
 508
 509 It also mentions VHDL.
 510
 511 So whenever hardware papers come up, VHDL gets involved...
 512 I guess it's time to try and work out how to use the Opensource VHDL implementations.
 513
 514 This is about reduction of error in hardware operations rather than the precision or range of floats.
 515 But it is probably still relevant.
 516
 517 This has the Fast2Sum algorithm but for the love of god I cannot see how you can compute anything other than $a + b = 0 \forall a,b$ using the algorithm as written in their paper. It references Dekker\cite{dekker1971afloating} and Kahn; will look at them instead.
 518
 519 \section{Floating Point Unit from JOP  \cite{jop}}
 520
 521 This is a 32 bit floating point unit developed for JOP in VHDL.
 522 I have been able to successfully compile it and the test program using GHDL\cite{ghdl}.
 523
 524 Whilst there are constants (eg: \verb/FP_WIDTH = 32, EXP_WIDTH = 8, FRAC_WIDTH = 23/) defined, the actual implementation mostly uses magic numbers, so
 525 some investigation is needed into what, for example, the "52" bits used in the sqrt units are for.
 526
 527 \section{GHDL  \cite{ghdl}}
 528
 529 GHDL is an open source GPL licensed VHDL compiler written in Ada. It had packages in debian up until wheezy when it was removed. However the sourceforge site still provides a \shell{deb} file for wheezy.
 530
 531 This reference explains how to use the \shell{ghdl} compiler, but not the VHDL language itself.
 532
 533 GHDL is capable of compiling a ``testbench'' - essentially an executable which simulates the design and ensures it meets test conditions.
 534 A common technique is using a text file to provide the inputs/outputs of the test. The testbench executable can be supplied an argument to save a \shell{vcd} file which can be viewed in \shell{gtkwave} to see timing diagrams.
 535
 536 Sam has successfully compiled the VHDL design for an FPU in JOP\cite{jop} into a ``testbench'' executable which uses standard i/o instead of a regular file.
 537 Using unix domain sockets we can execute the FPU as a child process and communicate with it from our document viewing test software. This means we can potentially simulate alternate hardware designs for FPUs and witness the effect they will have on precision in the document viewer.
 538
 539 Using \shell{ghdl} the testbench can also be linked as part a C/C++ program and run using a function; however there is still no way to communicate with it other than forking a child process and using a unix domain socket anyway. Also, compiling the VHDL FPU as part of our document viewer would clutter the code repository and probably be highly unportable. The VHDL FPU has been given a seperate repository.
 540
 541 \section{On the design of fast IEEE floating-point adders  \cite{seidel2001onthe}}
 542
 543 This paper gives an overview of the ``Naive'' floating point addition/subtraction algorithm and gives several optimisation techniques:
 544
 545 TODO: Actually understand these...
 546
 547 \begin{itemize}
 548         \item Use parallel paths (based on exponent)
 549         \item Unification of significand result ranges
 550         \item Reduction of IEEE rounding modes
 551         \item Sign-magnitude computation of a difference
 552         \item Compound Addition
 553         \item Approximate counting of leading zeroes
 554         \item Pre-computation of post-normalization shift
 555 \end{itemize}
 556
 557 They then give an implementation that uses these optimisation techniques including very scary looking block diagrams.
 558
 559 They simulated the FPU. Does not mention what simulation method was used directly, but cites another paper (TODO: Look at this. I bet it was VHDL).
 560
 561 The paper concludes by summarising the optimisation techniques used by various adders in production (in 2001).
 562
 563 This paper does not specifically mention the precision of the operations, but may be useful because a faster adder design might mean you can increase the precision.
 564
 565 \section{Re: round32 ( round64 ( X ) ) ?= round32 ( X )  \cite{beebe2011round32}}
 566
 567 I included this just for the quote by Nelson H. F. Beebe:
 568
 569 ``It is too late now to repair the mistakes of the past that are present
 570 in millions of installed systems, but it is good to know that careful
 571 research before designing hardware can be helpful.''
 572
 573 This is in regards to the problem of double rounding. It provides a reference for a paper that discusses a rounding mode that eliminates the problem, and a software implementation.
 574
 575 It shows that the IEEE standard can be fallible!
 576
 577 Not sure how to work this into our literature review though.
 578
 579 % Back to software
 580 \section{Basic Issues in Floating Point Arithmetic and Error Analysis  \cite{demmel1996basic}}
 581
 582 These are lecture notes from U.C Berkelye CS267 in 1996.
 583
 584
 585 \section{Charles Babbage  \cite{dodge_babbage, nature1871babbage}}
 586
 587 Tributes to Charles Babbage. Might be interesting for historical background. Don't mention anything about floating point numbers though.
 588
 589 \section{GPU Floating-Point Paranoia  \cite{hillesland2004paranoia}}
 590
 591 This paper discusses floating point representations on GPUs. They have reproduced the program \emph{Paranoia} by William Kahan for characterising floating point behaviour of computers (pre IEEE) for GPUs.
 592
 593
 594 There are a few remarks about GPU vendors not being very open about what they do or do not do with
 595
 596
 597 Unfortunately we only have the extended abstract, but a pretty good summary of the paper (written by the authors) is at: \url{www.cs.unc.edu/~ibr/projects/paranoia/}
 598
 599 From the abstract:
 600
 601 ``... [GPUs are often similar to IEEE] However, we have found
 602 that GPUs do not adhere to IEEE standards for floating-point op-
 603 erations, nor do they give the information necessary to establish
 604 bounds on error for these operations ... ''
 605
 606 and ``...Our goal is to determine the error bounds on floating-point op-
 607 eration results for quickly evolving graphics systems. We have cre-
 608 ated a tool to measure the error for four basic floating-point opera-
 609 tions: addition, subtraction, multiplication and division.''
 610
 611 The implementation is only for windows and uses glut and glew and things.
 612 Implement our own version?
 613
 614 \section{A floating-point technique for extending the available precision  \cite{dekker1971afloating}}
 615
 616 This is Dekker's formalisation of the Fast2Sum algorithm originally implemented by Kahn.
 617
 618 \begin{align*}
 619         z &= \text{RN}(x + y) \\
 620         w &= \text{RN}(z - x) \\
 621         zz &= \text{RN}(y - w) \\
 622         \implies z + zz &= x + y
 623 \end{align*}
 624
 625 There is a version for multiplication.
 626
 627 I'm still not quite sure when this is useful. I haven't been able to find an example for $x$ and $y$ where $x + y \neq \text{Fast2Sum}(x, y)$.
 628
 629 \section{Handbook of Floating-Point Arithmetic \cite{HFP}}
 630
 631 This book is amazingly useful and pretty much says everything there is to know about Floating Point numbers.
 632 It is much easier to read than Goldberg or Priest's papers.
 633
 634 I'm going to start working through it and compile their test programs.
 635
 636 \subsection{A sequence that seems to converge to a wrong limit  - pgs 9-10, \cite{HFP}}
 637
 638 \begin{align*}
 639         u_n &= \left\{ \begin{array}{c} u_0 = 2 \\ u_1 = -4 \\ u_n = 111 - \frac{1130}{u_{n-1}} + \frac{3000}{u_{n-1}u_{n-2}}\end{array}\right.
 640 \end{align*}
 641
 642 The limit of the series should be $6$ but when calculated with IEEE floats it is actually $100$
 643 The authors show that the limit is actually $100$ for different starting values, and the error in floating point arithmetic causes the series to go to that limit instead.
 644
 645 \begin{figure}[H]
 646         \centering
 647         \includegraphics[width=0.8\textwidth]{figures/handbook1-1.pdf}
 648         \caption{Output of Program 1.1 from \emph{Handbook of Floating-Point Arithmetic}\cite{HFP} for various IEEE types}
 649         \label{HFP-1-1}
 650 \end{figure}
 651
 652 \subsection{Mr Gullible and the Chaotic Bank Society pgs 10-11 \cite{HFP}}
 653
 654 This is an example of a sequence involving $e$. Since $e$ cannot be represented exactly with FP, even though the sequence should go to $0$ for $a_0 = e - 1$, the representation of $a_0 \neq e - 1$ so the sequence goes to $\pm \infty$.
 655
 656 To eliminate these types of problems we'd need an \emph{exact} representation of all real numbers.
 657 For \emph{any} FP representation, regardless of precision (a finite number of digits) there will be numbers that can't be represented exactly hence you could find a similar sequence that would explode.
 658
 659 IE: The more precise the representation, the slower things go wrong, but they still go wrong, {\bf even with errorless operations}.
 660
 661
 662 \subsection{Rump's example pg 12 \cite {HFP}}
 663
 664 This is an example where the calculation of a function $f(a,b)$ is not only totally wrong, it gives completely different results depending on the CPU. Despite the CPU conforming to IEEE.
 665
 666 \section{Scalable Vector Graphics (SVG) 1.1 (Second Edition) \cite{svg2011-1.1}}
 667
 668 Scalable Vector Graphics (SVG) is a XML language for describing two dimensional graphics. This document is \url{http://www.w3.org/TR/2011/REC-SVG11-20110816/}, the latest version of the standard at the time of writing.
 669
 670 \subsubsection{Three types of object}
 671 \begin{enumerate}
 672         \item Vector graphics shapes (paths)
 673         \item Images (embedded bitmaps)
 674         \item Text
 675 \end{enumerate}
 676
 677 \subsubsection{Rendering Model and Precision}
 678
 679 SVG uses the ``painter's model''. Paint is applied to regions of the page in an order, covering the paint below it according to rules for alpha blending.
 680
 681 ``Implementations of SVG are expected to behave as though they implement a rendering (or imaging) model cor-
 682 responding to the one described in this chapter. A real implementation is not required to implement the model in
 683 this way, but the result on any device supported by the implementation shall match that described by this model.''
 684
 685 SVG uses {\bf single precision floats}. Unlike PDF and PostScript, the standard specifies this as a {\bf minimum} range from \verb/-3.4e+38F/ to \verb/+3.4e+38F/
 686
 687 ``It is recommended that higher precision floating point storage and computation be performed on operations
 688 such as coordinate system transformations to provide the best possible precision and to prevent round-off errors.''
 689
 690 There is also a ``High Quality SVG Viewers'' requirement to use at least {\bf double} precision floats.
 691
 692 \section{OpenVG Specification 1.1 \cite{rice2008openvg}}
 693 ``OpenVG is an application programming interface (API) for hardware-accelerated two-
 694 dimensional vector and raster graphics developed under the auspices of the Khronos
 695 Group \url{www.khronos.org}.''
 696
 697 Specifically, provides the same drawing functionality required by a high-performance SVG document viewer (SVG Tiny 1.2)
 698 TODO: Look at that $\neq$ SVG 1.1
 699
 700 It is designed to be similar to OpenGL.
 701
 702 \subsubsection{Precision}
 703 ``All floating-point values are specified in standard IEEE 754 format. However,
 704 implementations may clamp extremely large or small values to a restricted
 705 range, and internal processing may be performed with lesser precision. At least
 706 16 bits of mantissa, 6 bits of exponent, and a sign bit must be present, allowing
 707 values from $\pm 2^{-30}$ to $\pm2^{31}$ to be represented with a fractional precision of at least 1
 708 in $2^{16}$.''
 709
 710 IEEE but with a non standard number of bits.
 711
 712 Presumably the decreased precision is for increased efficiency the standard states that example applications are for ebooks.
 713
 714
 715 \section{Document Object Model --- pugixml 1.4 manual \cite{pugixmlDOM}}
 716
 717 Pugixml is a C++ library for parsing XML documents\cite{kapoulkine2014pugixml}. XML is based on the Document Object Model (DOM); this is explained pretty well by the pugixml manual.
 718
 719 The document is the root node of the tree. Each child node has a type. These may
 720 \begin{itemize}
 721         \item Have attributes
 722         \item Have child nodes of their own
 723         \item Contain data
 724 \end{itemize}
 725
 726 In the case of HTML/SVG an XML parser within the browser/viewer creates the DOM tree from the XML (plain text) and then interprets that tree to produce the objects that will be rendered.
 727
 728 Example of XML $\to$ DOM tree given at\cite{pugixmlDOM}.
 729 Example of XML parsing using pugixml is in \shell{code/src/tests/xml.cpp}
 730
 731
 732 \begin{figure}[H]
 733         \centering
 734 \begin{lstlisting}[language=XML,basicstyle=\ttfamily]
 735 <?xml version="1.0"?>
 736 <mesh name="mesh_root">
 737     <!-- here is a mesh node -->
 738     some text
 739     <![CDATA[someothertext]]>
 740     some more text
 741     <node attr1="value1" attr2="value2" />
 742     <node attr1="value2">
 743         <innernode/>
 744     </node>
 745 </mesh>
 746 <?include somedata?>
 747 \end{lstlisting}
 748
 749         \includegraphics[width=0.6\textwidth]{references/pugixmlDOM-dom_tree.png}
 750         \caption{Tree representation of the above listing \cite{pugixmlDOM}}
 751 \end{figure}
 752
 753 \section{An Algorithm For Shading of Regions on Vector Display Devices \cite{brassel1979analgorithm}}
 754
 755 All modern display devices are raster based and therefore this paper is mainly of historical interest. It provides some references for shading on a raster display.
 756
 757 The algorithm described will shade an arbitrary simply-connected polygon using one or two sets of parallel lines.
 758
 759 The ``traditional'' method is:
 760 \begin{enumerate}
 761         \item Start with a $N$ vertex polygon, rotate coords by the shading angle
 762         \item Determine a bounding rectangle
 763         \item For $M$ equally spaced parallel lines, compute the intersections with the boundaries of the polygon
 764         \item Rotate coordinates back
 765         \item Render the $M$ lines
 766 \end{enumerate}
 767
 768 This is pretty much exactly how an artist would shade a pencil drawing. It is $O(M\times N)$.
 769
 770 The algorithm in this paper does:
 771 \begin{enumerate}
 772         \item Rotate polygon coords by shading angle
 773         \item Subdivide the polygon into trapezoids (special case triangle)
 774         \item Shade the trapezoids independently
 775         \item Rotate it all back
 776 \end{enumerate}
 777 It is more complicated than it seems. The subdivision requires a sort to be performed on the vertices of the polygon based on their rotated $x$ and $y$ coordinates.
 778
 779 \section{An Algorithm For Filling Regions on Graphics Display Devices \cite{lane1983analgorithm}}
 780
 781 This gives an algorithm for for polygons (which may have ``holes'').
 782 It requires the ability to ``subtract'' fill from a region; this is (as far as I can tell) difficult for vector graphics devices but simple on raster graphics devices, so the paper claims it is oriented to the raster graphics devices.
 783
 784 If the polygon is defined by $(x_i, y_i)$ then this algorithm iterates from $i = 2$ and alternates between filling and erasing the triangles $[(x_i, y_i), (x_{i+1}, y_{i+1}), (x_1, y_1)]$. It requires no sorting of the points.
 785
 786 The paper provides a proof that the algorithm is correct and is ``optimal in the number of pixel updates required for convex polygons''.
 787 In the conclusion it is noted that trapezoids could be used from a fixed line and edge of the polygon, but this is not pixel optimal.
 788
 789 This paper doesn't have a very high citation count but it is cited by the NVIDIA article \cite{kilgard2012gpu}.
 790 Apparently someone else adapted this algorithm for use with the stencil buffer.
 791
 792 \section{GPU-accelerated path rendering \cite{kilgard2012gpu, kilgard300programming}}
 793
 794 Vector graphics on the GPU; an NVIDIA extension. \cite{kilgard300programming} is the API.
 795
 796 Motivations:
 797 \begin{itemize}
 798         \item The focus has been on 3D acceleration in GPUs; most path rendering is done by the CPU.
 799         \item Touch devices allow the screen to be transformed rapidly; CPU rastering of the path becomes inefficient
 800         \begin{itemize}
 801                 \item The source of the ugly pixelated effects on a smartphone when scaling?
 802         \end{itemize}
 803         \item Especially when combined with increased resolution of these devices
 804         \item Standards such as HTML5, SVG, etc, expose path rendering
 805         \item Javascript is getting fast enough that we can't blame it anymore (the path rendering is the bottleneck not the JS)
 806         \item GPU is more power efficient than the CPU
 807 \end{itemize}
 808
 809 Results show the extension is faster than almost every renderer it was compared with for almost every test image.
 810
 811 Comparisons to other attempts:
 812 \begin{itemize}
 813         \item Cairo and Glitz \cite{nilsson2004glitz} (abandoned)
 814 \       \item Direct2D from Microsoft uses CPU to tesselate trapezoids and then renders these on the GPU
 815         \item Skia in Android/Chrome uses CPU but now has Ganesh which is also hybrid CPU/GPU
 816         \item Khronos Group created OpenVG\cite{rice2008openvg} with several companies creating hardware units to implement the standard. Performance is not as good as ``what we report''
 817 \end{itemize}
 818
 819
 820 \section{A Multiple Precision Binary Floating Point Library With Correct Rounding \cite{fousse2007mpfr}}
 821
 822 This is what is now the GNU MPFR C library; it has packages in debian wheezy.
 823
 824 The library was motivated by the lack of existing arbitrary precision libraries which conformed to IEEE rounding standards.
 825 Examples include Mathematica, GNU MP (which this library is actually built upon), Maple (which is an exception but buggy).
 826
 827 TODO: Read up on IEEE rounding to better understand the first section
 828
 829 Data:
 830 \begin{itemize}
 831         \item $(s, m, e)$ where $e$ is signed
 832         \item Precision $p$ is number of bits in $m$
 833         \item $\frac{1}{2} \leq m < 1$
 834         \item The leading bit of the mantissa is always $1$ but it is not implied
 835         \item There are no denormalised numbers
 836         \item Mantissa is stored as an array of ``limbs'' (unsigned integers) as in GMP
 837 \end{itemize}
 838
 839 The paper includes performance comparisons with several other libraries, and a list of literature using the MPFR (the dates indicating that the library has been used reasonably widely before this paper was published).
 840
 841 \section{Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic \cite{kahan1996ieee754}}
 842
 843 11 years after the IEEE 754 standard becomes official, Kahan discusses various misunderstood features of the standard and laments at the failure of compilers and microprocessors to conform to some of these.
 844
 845 I am not sure how relevant these complaints are today, but it makes for interesting reading as Kahan is clearly very passionate about the need to conform \emph{exactly} to IEEE 754.
 846
 847 Issues considered are: Rounding rules, Exception handling and NaNs (eg: The payload of NaNs is not used in any software Kahan is aware of), Bugs in compilers (mostly Fortran) where an expression contains floats of different precisions (the compiler may attempt to optimise an expression resulting in a failure to round a double to a single), Infinity (which is unsupported by many compilers though it is supported in hardware)...
 848
 849
 850 An example is this Fortran compiler ``nasty bug'' where the compiler replaces $s$ with $x$ in line 4 and thus a rounding operation is lost.
 851 \begin{lstlisting}[language=Fortran, basicstyle=\ttfamily]
 852         real(8) :: x, y % double precision (or extended as real(12))
 853         real(4) :: s, t % single precision
 854         s = x % s should be rounded
 855         t = (s - y) / (...) % Compiler incorrectly replaces s with x in this line
 856 \end{lstlisting}
 857
 858 \subsection{The Baleful Influence of Benchmarks \cite{kahan1996ieee754} pg 20}
 859
 860 This section discusses the tendency to rate hardware (or software) on their speed performance and neglect accuracy.
 861
 862 Is this complaint still relevant now when we consider the eagerness to perform calculations on the GPU?
 863 ie: Doing the coordinate transforms on the GPU is faster, but less accurate (see our program).
 864
 865 Benchmarks need to be well designed when considering accuracy; how do we know an inaccurate computer hasn't gotten the right answer for a benchmark by accident?
 866
 867 A proposed benchmark for determining the worst case rounding error is given. This is based around computing the roots to a quadratic equation.
 868 A quick scan of the source code for paranoia does not reveal such a test.
 869
 870 As we needed benchmarks for CPUs perhaps we need benchmarks for GPUs. The GPU Paranoia paper\cite{hillesland2004paranoia} is the only one I have found so far.
 871
 872 \section{Prof W. Kahan's Web Pages \cite{kahanweb}}
 873
 874 William Kahan, architect of the IEEE 754-1985 standard, creator of the program ``paranoia'', the ``Kahan Summation Algorithm'' and contributor to the revised standard.
 875
 876 Kahan's web page has more examples of errors in floating point arithmetic (and places where things have not conformed to the IEEE 754 standard) than you can poke a stick at.
 877
 878 Wikipedia's description is pretty accurate: ``He is an outspoken advocate of better education of the general computing population about floating-point issues, and regularly denounces decisions in the design of computers and programming languages that may impair good floating-point computations.''
 879
 880 Kahan's articles are written with almost religious zeal but they are backed up by examples and results, a couple of which I have confirmed.\footnote{One that doesn't work is an example of wierd arithmetic in Excel 2000 when repeated in LibreOffice Calc 4.1.5.3} This is the unpublished work. I haven't read any of the published papers yet.
 881
 882 The articles are filled sporadically with lamentation for the decline of experts in numerical analysis (and error) which is somewhat ironic; if there were no IEEE 754 standard meaning any man/woman and his/her dog could write floating point arithmetic and expect it to produce platform independent results\footnote{Even if, as Kahan's articles often point out, platforms aren't actually following IEEE 754} this decline would probably not have happened.
 883
 884 These examples would be of real value if the topic of the project were on accuracy of numerical operations. They also explain some of the more bizarre features of IEEE 754 (in a manner attacking those who dismiss these features as being too bizarre to care about of course).
 885
 886 It's somewhat surprising he hasn't written anything (that I could see from scanning the list) about the lack of IEEE in PDF/PostScript (until Adobe Reader 6) and further that the precision is only binary32.
 887
 888 I kind of feel really guilty saying this, but since our aim is to obtain arbitrary scaling, it will be inevitable that we break from the IEEE 754 standard. Or at least, I (Sam) will. Probably. Also there is no way I will have time to read and understand the thousands of pages that Kahan has written.
 889
 890 Therefore we might end up doing some things Kahan would not approve of.
 891
 892 Still this is a very valuable reference to go in the overview of floating point arithmetic section.
 893
 894 \chapter{General Notes}
 895
 896 \section{The DOM Model}
 897
 898 A document is modelled as a tree. The root of the tree is the document. This has nodes of varying types. Some nodes may have children, attributes, and data.
 899
 900 \section{Floating-Point Numbers\cite{HFP,goldberg1991whatevery,goldberg1992thedesign,priest1991algorithms}}
 901
 902 A set of FP numbers is characterised by:
 903 \begin{enumerate}
 904         \item Radix (base) $\beta \geq 2$
 905         \item Precision %$p \req 2$ ``number of sig digits''
 906         \item Two ``extremal`` exponents $e_min < 0 < e_max$ (generally, don't have to have the $0$ in there)
 907 \end{enumerate}
 908
 909 Numbers are represented by {\bf integers}: $(M, e)$ such that $x = M \times \beta^{e - p + 1}$
 910
 911 Require: $|M| \leq \beta^{p}-1$ and $e_min \leq e \leq e_max$.
 912
 913 Representations are not unique; set of equivelant representations is a cohort.
 914
 915 $\beta^{e-p+1}$ is the quantum, $e-p+1$ is the quantum exponent.
 916
 917 Alternate represetnation: $(s, m, e)$ such that $x = (-1)^s \times m \times \beta^{e}$
 918 $m$ is the significand, mantissa, or fractional part. Depending on what you read.
 919
 920
 921
 922
 923
 924 \section{Rounding Errors}
 925
 926
 927 They happen. There is ULP and I don't mean a political party.
 928
 929 TODO: Probably say something more insightful. Other than "here is a graph that shows errors and we blame rounding".
 930
 931 \subsection{Results of calculatepi}
 932
 933 We can calculate pi by numerically solving the integral:
 934 \begin{align*}
 935         \int_0^1 \left(\frac{4}{1+x^2}\right) dx &= \pi
 936 \end{align*}
 937
 938 Results with Simpson Method:
 939 \begin{figure}[H]
 940         \centering
 941         \includegraphics[width=0.8\textwidth]{figures/calculatepi.pdf}
 942         \caption{Example of accumulated rounding errors in a numerical calculation}
 943 \end{figure}
 944
 945 Tests with \verb/calculatepi/ show it's not quite as simple as just blindly replacing all your additions with Fast2Sum from Dekker\cite{dekker1971afloating}.
 946 ie: The graph looks exactly the same for single precision. \verb/calculatepi/ obviously also has multiplication ops in it which I didn't change. Will look at after sleep maybe.
 947
 948
 949 \pagebreak
 950 \bibliographystyle{unsrt}
 951 \bibliography{papers}
 952
 953 \end{document}
 954