From: Sam Moore Date: Thu, 22 May 2014 17:29:10 +0000 (+0800) Subject: That's it I give up X-Git-Url: https://git.ucc.asn.au/?a=commitdiff_plain;h=23e6982a39b77ea440300bfeafbccb8a15f48c5d;p=ipdf%2Fsam.git That's it I give up I might try and find the spelling/grammar errors if I am alive. But I am not going to write any more even though the second "half" is woefully horrible. I don't care about page limits if they don't like it they can just... not read the extra pages. No really, the quality definitely goes down towards the end. --- diff --git a/chapters/Background.tex b/chapters/Background.tex index ee8da21..5d63d80 100644 --- a/chapters/Background.tex +++ b/chapters/Background.tex @@ -107,12 +107,14 @@ The Number type does differ slightly from IEEE-754 in that there is only a singl \section{Number Representations} +Consider a value of $7.25 = 2^2 + 2^1 + 2^0 + 2^{-2}$. In binary, this can be written as $111.01_2$ Such a value would require 5 binary digits (bits) of memory to represent exactly. On the other hand a rational value such as $7\frac{1}{3}$ could not be represented exactly; the sequence of bits $111.0111 \text{ ...}_2$ never terminates. -\subsection{Integers and Fixed Point Numbers} +Integer representations do not allow for a ``point'' as shown in these examples, that is, no bits are reserved for the fractional part of a number. $7.25$ would be rounded to the nearest integer value, $7$. A fixed point representation keeps the decimal place at the same position for all values. Floating point representations can be thought of as analogous to scientific notation; an ``exponent'' and fixed point value are encoded, with multiplication by the exponent moving the position of the point. +Modern computer hardware typically supports integer and floating-point number representations. Due to physical limitations, the size of these representations is limited; this is the fundamental source of both limits on range and precision in computer based calculations. -\subsection{Floating Points} +\subsection{Floating Point Definition} A floating point number $x$ is commonly represented by a tuple of values $(s, e, m)$ in base $B$ as\cite{HFP, ieee2008-754}: @@ -120,46 +122,78 @@ A floating point number $x$ is commonly represented by a tuple of values $(s, e, x &= (-1)^{s} \times m \times B^{e} \end{align*} -Where $s$ is the sign and may be zero or one, $m$ is commonly called the ``mantissa'' and $e$ is the exponent. Whilst $e$ is an integer in some range $\pm e_max$, the mantissa $m$ is actually a fixed point value in the range $0 < m < B$. The name ``floating point'' refers to the equivelance of the $\times B^e$ operation to a shifting of the ``fixed point'' along the mantissa. +Where $s$ is the sign and may be zero or one, $m$ is commonly called the ``mantissa'' and $e$ is the exponent. Whilst $e$ is an integer in some range $\pm e_max$, the mantissa $m$ is a fixed point value in the range $0 < m < B$. -For example, the value $7.25$ can be expressed as: -\begin{enumerate} - \item - -\end{enumerate} -The choice of base $B = 2$, closely matches the nature of modern hardware. It has also been found that this base in general gives the smallest rounding errors\cite{HFP}. Early computers had in fact used a variety of representations including $B=3$ or even $B=7$\cite{goldman1991whatevery}, and the revised IEEE-754 standard specifies a decimal representation $B = 10$ intended for use in financial applications\cite{ieee754std2008}. From now on we will restrict ourselves to considering base 2 floats. +The choice of base $B = 2$ in the original IEEE-754 standard matches the nature of modern hardware. It has also been found that this base in general gives the smallest rounding errors\cite{HFP}. Early computers had in fact used a variety of representations including $B=3$ or even $B=7$\cite{goldman1991whatevery}, and the revised IEEE-754 standard specifies a decimal representation $B = 10$ intended for use in financial applications\cite{ieee754std2008}. From now on we will restrict ourselves to considering base 2 floats. -Figure \ref{minifloat.pdf} shows the positive real numbers which can be represented exactly by an 8 bit floating point number encoded in the IEEE-754 format, and the distance between successive floating point numbers. We show two encodings using (1,2,5) and (1,3,4) bits to encode (sign, exponent, mantissa) respectively. +The IEEE-754 encoding of $s$, $e$ and $m$ requires a fixed number of continuous bits dedicated to each value. Originally two encodings were defined: binary32 and binary64. $s$ is always encoded in a single leading bit, whilst (8,23) and (11,53) bits are used for the (exponent, mantissa) encodings respectively. -For each distinct value of the exponent, the successive floating point representations lie on a straight line with constant slope. As the exponent increases, larger values are represented, but the distance between successive values increases\footnote{A plot of fixed point numbers or integers (which we omit for space considerations) would show points lying on a straight line with a constant slope between points}. +$m$ as encoded in the IEEE-754 standard is not exactly equivelant to a fixed point value. By assuming an implicit leading bit (ie: restricting $1 \leq m < 2$) except for when $e = 0$, floating point values are gauranteed to have unique representations. When $e = 0$ the leading bit is not implied; these values are called ``denormals''. This idea, which allows for 1 extra bit of precision for the normalised values appears to have been considered by Goldberg as early as 1967\cite{goldbern1967twentyseven}. -In the graph of the difference between representations, a single isolated point should be visible; this is not an error, but due to the greater discontinuity between the denormalised and normalised values ($e = 0$ and $1$ respectively). +Figure \ref{float.pdf} shows the positive real numbers which can be represented exactly by an 8 bit floating point number encoded in the IEEE-754 format\footnote{Not quite; we are ignoring the IEEE-754 definitions of NaN and Infinity for simplicity}, and the distance between successive floating point numbers. We show two encodings using (1,2,5) and (1,3,4) bits to encode (sign, exponent, mantissa) respectively. We have also plotted a fixed point representation. For each distinct value of the exponent, the successive floating point representations lie on a straight line with constant slope. As the exponent increases, larger values are represented, but the distance between successive values increases; this can be seen on the right. The marked single point discontinuity at \verb/0x10/ and \verb/0x20/ occur when $e$ leaves the denormalised region and the encoding of $m$ changes. + +The earlier example $7.25$ would be converted to a (1,3,4) floating point representation as follows: +\begin{enumerate} + \item Determine the fixed point representation $7.25 = 111.01_2$ + \item Determine the sign bit; in this case $s = 0$ + \item Calculate the exponent by shifting the point $111.01_2 = 1.1101_2 \times 2^2 \implies e = 2 = 10_2$ + \item Determine the exponent encoding; in IEEE-754 equal to the number of exponent bits is added so $e_{enc} = e+3 = 5 = 101_2$ + \item Remove the implicit bit if the encoded exponent $\neq 0$; $1.1101_2 \to .1101_2$ + \item Combine the three bit strings$0,101,1101$ + \item The final encoding is $01011101 \equiv \text{0x5D}$ +\end{enumerate} +This particular example can be encoded exactly; however as there are an infinite number of real values and only a finite number of floats, in general a value must be $7.26$ must be rounded or truncated at Step 3. \begin{figure}[H] \centering - \includegraphics[width=0.8\textwidth]{figures/minifloat.pdf} \\ - \includegraphics[width=0.8\textwidth]{figures/minifloat_diff.pdf} - \caption{The mapping of 8 bit floats to reals} +\begin{minipage}[t]{0.45\textwidth} + \begin{figure}[H] + \centering + \includegraphics[width=1\textwidth]{figures/floats.pdf} \\ + \end{figure} +\end{minipage} +\begin{minipage}[t]{0.45\textwidth} + \begin{figure}[H] + \centering + \includegraphics[width=1\textwidth]{figures/floats_diff.pdf} \\ + \end{figure} +\end{minipage} + \caption{8 bit float and fixed point representations a) As mapped to real values b) The distance between each representation}\label{floats.pdf} \end{figure} -\subsection{Floating Point Operations} - -Floating point operations can in principle be performed using integer operations, but specialised Floating Point Units (FPUs) are an almost universal component of modern processors\cite{kelley1997acmos}. The improvement of FPUs remains highly active in several areas including: efficiency\cite{seidel2001onthe}; accuracy of operations\cite{dieter2007lowcost}; and even the adaptation of algorithms originally used in software for reducing the overal error of a sequence of operations\cite{kadric2013accurate}. In this section we will briefly describe the algorithms for floating point operations without focusing on the hardware implementation of these algorithms. \subsection{Precision and Rounding} -Real values which cannot be represented exactly in a floating point representation must be rounded to the nearest floating point value. The results of a floating point operation will in general be such values and thus there is a rounding error possible in any floating point operation. Referring to Figure \ref{minifloat.pdf} it can be seen that the largest possible rounding error, or ``units in last place'' (ulp) is half the distance between successive floats; this means that rounding errors increase as the value to be represented increases. The IEEE-754 standard specifies the rounding conventions for floating point arithmetic\cite{ieee754std2008}. - +Real values which cannot be represented exactly in a floating point representation must be rounded to the nearest floating point value. The results of a floating point operation will in general be such values and thus there is a rounding error possible in any floating point operation. Referring to Figure \ref{floats.pdf} it can be seen that the largest possible rounding error is half the distance between successive floats; this means that rounding errors increase as the value to be represented increases. Goldberg's assertively titled 1991 paper ``What Every Computer Scientist Needs to Know about Floating Point Arithmetic''\cite{goldberg1991whatevery} provides a comprehensive overview of issues in floating point arithmetic and relates these to requirements of the IEEE-754 1985 standard\cite{ieee754std1985}. More recently, after the release of the revised IEEE-754 standard in 2008\cite{ieee754std2008}, a textbook ``Handbook Of Floating Point Arithmetic'' has been published which provides a thourough review of literature relating to floating point arithmetic in both software and hardware\cite{HFP}. -William Kahan, one of the architects of the IEEE-754 standard in 1984 and a contributor to its revision in 2010, has also published many articles on his website explaining the more obscure features of the IEEE-754 standard and calling out software which fails to conform to the standard\footnote{In addition to encodings and acceptable rounding errors, the standard also specifies ``exceptions'' --- mechanisms by which a program can detect an error such as division by zero --- which are sometimes neglected, as in the ECMA-256}\cite{kahanweb, kahan1996ieee754}, as well as examples of the limitations of floating point computations\cite{kahan2007wrong}. +William Kahan, one of the architects of the IEEE-754 standard in 1984 and a contributor to its revision in 2010, has also published many articles on his website explaining the more obscure features of the IEEE-754 standard and calling out software which fails to conform to the standard\footnote{In addition to encodings and acceptable rounding behaviour, the standard also specifies ``exceptions'' --- mechanisms by which a program can detect and report an error such as division by zero}\cite{kahanweb, kahan1996ieee754}, as well as examples of the limitations of floating point computations\cite{kahan2007wrong}. + +In Figure \ref{calculatepi.pdf} we show the effect of accumulated rounding errors on the computation of $\pi$ through a numerical integration\footnote{This is not intended to be an example of a good way to calculate $\pi$} using 32 bit ``single precision'' floats and 64 bit ``double precision'' floats. + +\begin{figure}[H] + \centering + \includegraphics[width=0.6\textwidth]{figures/calculatepi.pdf} + \caption{Numerical calculation of $\pi$}\label{calculatepi.pdf} +\end{figure} + +\subsection{Floating Point Operations} + +Floating point operations can in principle be performed using integer operations, but specialised Floating Point Units (FPUs) are an almost universal component of modern processors\cite{kelley1997acmos}. The improvement of FPUs remains highly active in several areas including: efficiency\cite{seidel2001onthe}; accuracy of operations\cite{dieter2007lowcost}; and even the adaptation of algorithms originally used in software, such as Kahan's Fast2Sum algorithm\cite{kadric2013accurate}. + +In 1971 Dekker formalised a series of algorithms including the Fast2Sum method for calculating the correction term due to accumulated rounding errors\cite{dekker1971afloating}. The exact result of $x + y$ may be expressed in terms of floating point operations with rounding as follows: +\begin{align*} + z &= \text{RN}(x + y) w &= \text{RN}(z - x) \\ + zz &= \text{RN}(y - w) \implies x + y &= zz +\end{align*} \subsection{Arbitrary Precision Floating Point Numbers} -Fouse described +Arbitrary precision floating point numbers are implemented in a variety of software libraries which will dynamically allocate extra bits for the exponent or mantissa as required. An example is the GNU MPFR library discussed by Fousse in 2007\cite{fouse2007mpfr}. Although many arbitrary precision libraries already existed, MPFR intends to be fully compliant with some of the more obscure IEEE-754 requirements such as rounding rules and exceptions. +As we have seen, it is trivial to find real numbers that would require an infinite number of bits to represent exactly. Implementations of ``arbitrary'' precision must carefully determine at what point rounding should occur so as to balance performance with memory usage. diff --git a/chapters/Progress.tex b/chapters/Progress.tex index a88e002..93b9b76 100644 --- a/chapters/Progress.tex +++ b/chapters/Progress.tex @@ -1,10 +1,9 @@ \chapter{Progress Report}\label{Progress} -We describe the current state of our research in relation to the aims outlined in Chapter \ref{Introduction}. +We describe the current state of our project in relation to the aims outlined in Chapter \ref{Introduction}. At this stage work on the project has been done in collaboration with David Gow; however the Project Proposals and Literature Reviews were produced individually. \section{Literature Review} - -We have examined a range of literature that can be broadly classed into three different areas (with major references indicated): +The literature examined in Chapter\ref{Background} can broadly classed into three different areas (with major references indicated): \begin{enumerate} \item Rendering Vector Graphics \cite{computergraphics2, knuth1983metafont, kilgard2012gpu} \begin{itemize} @@ -22,17 +21,21 @@ We have examined a range of literature that can be broadly classed into three di \item Most document standards either specify, suggest, or imply a IEEE-754 floating point representation ({\TeX} is an exception) \item IEEE-754 is widely used, although there are instances of languages or processors which do not conform exactly to the standard \item Some GPUs in particular may not conform to IEEE-754, possibly trading some accuracy for performance + \item Arbitrary precision floating point arithmetic is possible through several libraries \end{itemize} \end{enumerate} To improve the Literature Review we could consider the following topics in more detail: \begin{enumerate} - \item Additional approaches to arbitrary or infinite precision, possibly including symbolic computation + \item Additional approaches to arbitrary precision possibly including symbolic computation \item Floating point errors in the context of computing B\'{e}zier Curves or similar + \item Algorithms for reducing overall error other than Fast2Sum + \item Alternative number representations such as rationals (eg: $\frac{1}{3}$) \item How well GPUs conform or do not conform to IEEE-754 in more detail \item Additional aspects of rendering vector documents including shading \end{enumerate} + \section{Development of Testbed Software} We have produced a basic Document Viewer capable of rendering simple primitives under translation and scaling. The OpenGL 3.1 API is used to interface with graphics hardware. This software has the following features: @@ -83,9 +86,9 @@ Deadlines enforced by the faculty of Engineering Computing and Mathematics are \ $26^{\text{th}}$ May & \emph{Progress Report and Literature Review due.}\\ \hline $9^{\text{th}}$ June & Demonstrations of limitations of floating point precision in the Testbed software. \\ - $1^{\text{st}}$ July & At least one implementation of infinite precision for basic primitives (lines, polygons, curves) completed. Other implementations, advanced features, and areas for more detailed research identified. \\ + $1^{\text{st}}$ July & At least one implementation of arbitrary precision for basic primitives (lines, polygons, curves) completed. Other implementations, advanced features, and areas for more detailed research identified. \\ \hline - $1^{\text{st}}$ August & Experiments and comparison of various infinite precision implementations completed. \\ + $1^{\text{st}}$ August & Experiments and comparison of various arbitrary precision implementations completed. \\ \hline $1^{\text{st}}$ September & Advanced features implemented and tested, work underway on Final Report. \\ \hline diff --git a/figures/calculatepi.pdf b/figures/calculatepi.pdf new file mode 100644 index 0000000..98be1ff Binary files /dev/null and b/figures/calculatepi.pdf differ diff --git a/figures/floats.pdf b/figures/floats.pdf new file mode 100644 index 0000000..7ee07ca Binary files /dev/null and b/figures/floats.pdf differ diff --git a/figures/floats.svg b/figures/floats.svg new file mode 100644 index 0000000..96db4d5 --- /dev/null +++ b/figures/floats.svg @@ -0,0 +1,508 @@ + + + + +Gnuplot +Produced by GNUPLOT 4.6 patchlevel 0 + + + + + + + + + + + + + + + + + + + + + + + + + + + 0 + + + 5 + + + 10 + + + 15 + + + 20 + + + 25 + + + 30 + + + 35 + + + 0x00 + + + 0x14 + + + 0x28 + + + 0x3c + + + 0x50 + + + 0x64 + + + 0x78 + + + Real Value + + + Representation in Hexadecimal + + + 8 Bit Floats + + + + gnuplot_plot_2 + + + (1,3,4 encoding) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + gnuplot_plot_3 + + + (4.4 fixed point) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + gnuplot_plot_1 + + + (1,2,5 encoding) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/figures/floats_diff.pdf b/figures/floats_diff.pdf new file mode 100644 index 0000000..2db7447 Binary files /dev/null and b/figures/floats_diff.pdf differ diff --git a/figures/floats_diff.svg b/figures/floats_diff.svg new file mode 100644 index 0000000..c69cd07 --- /dev/null +++ b/figures/floats_diff.svg @@ -0,0 +1,2277 @@ + + + + + + image/svg+xml + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Gnuplot + Produced by GNUPLOT 4.6 patchlevel 0 + + + + + + + + + + + + + + + + + + + + + + + + 0 + + + + 0.1 + + + + 0.2 + + + + 0.3 + + + + 0.4 + + + + 0.5 + + + + 0.6 + + + + 0.7 + + + + 0.8 + + + + 0.9 + + + + 1 + + + + 0x00 + + + + 0x14 + + + + 0x28 + + + + 0x3c + + + + 0x50 + + + + 0x64 + + + + 0x78 + + + + Difference + + + Representation in Hexadecimal + + + Difference Between Successive 8 Bit Floats + + from previous + + + gnuplot_plot_3 + + + (1,3.4 fixed point) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + gnuplot_plot_2 + + + (1,3,4 encoding) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + gnuplot_plot_1 + + + (1,2,5 encoding) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 1/16 + 1/8 + 1/4 + + 1/2 + 1 + 1/8 + 1/32 + 1/64 + 1/128 + + + + 0x10 + 0x20 + diff --git a/meta/Abstract.tex b/meta/Abstract.tex index 31289eb..b3239f2 100644 --- a/meta/Abstract.tex +++ b/meta/Abstract.tex @@ -1,6 +1,6 @@ \section*{Abstract} -At the fundamental level, a document is a means to convey information. The limitations on a digital document format therefore restrict the types and quality of information that can be communicated. Whilst modern document formats are now able to include increasingly complex dynamic content, they still suffer from early views of a document as a static page; to be viewed at a fixed scale and position. In this report, we focus on the limitations of modern document formats (including PDF, PostScript, SVG) with regards to the level of detail, or precision at which primatives can be drawn. We propose a research project to investigate whether it is possible to obtain an ``infinite precision'' document format, capable of including primitives created at an arbitrary level of zoom. +At the fundamental level, a document is a means to convey information. The limitations on a digital document format therefore restrict the types and quality of information that can be communicated. Whilst modern document formats are now able to include increasingly complex dynamic content, they still suffer from early views of a document as a static page; to be viewed at a fixed scale and position. In this report, we focus on the limitations of modern document formats (including PDF, PostScript, SVG) with regards to the level of detail, or precision at which primatives can be drawn. We propose a research project to investigate whether it is possible to obtain an ``arbitrary precision'' document format, capable of including primitives created at an arbitrary level of zoom. {\bf Keywords:} \emph{document formats, precision, floating point, vector images, graphics, OpenGL, VHDL, PostScript, PDF, {\TeX}, SVG, HTML5, Javascript } diff --git a/thesis.pdf b/thesis.pdf index b6e93a1..4fa29e3 100644 Binary files a/thesis.pdf and b/thesis.pdf differ