From: Sam Moore Date: Mon, 27 Oct 2014 11:21:08 +0000 (+0800) Subject: THE FINAL COUNTDOWN X-Git-Url: https://git.ucc.asn.au/?p=ipdf%2Fsam.git;a=commitdiff_plain;h=e699a78987125e89a9f976067ecfb149409bb423 THE FINAL COUNTDOWN After all this, I still have to come back next semester anyway... --- diff --git a/chapters/Background/FixedPoint.tex b/chapters/Background/FixedPoint.tex index 15e005a..6d7a15f 100644 --- a/chapters/Background/FixedPoint.tex +++ b/chapters/Background/FixedPoint.tex @@ -24,15 +24,14 @@ individual digits. In practice we will still be limited by the memory and proces For example, we can represent $5682_{10}$ as a single 16 bit digit or as the sum of two 8 bit digits. Each digit is being written in base 2 or 10 because there is not a universal base with $\ge 2^8$ unique symbols. \begin{align*} - 5682_{10} &= 1011000110010_2 = 10110_2 \times 2^{8} + 110010_{2} \times 2^{0} + 5682_{10} &= 1011000110010_2 = 10110_2 \times 2^{8} + 110010_{2} \times 2^{0} % = 22_{10} \times 256^{1} + 50_{10} \times 256^{0} \end{align*} When performing an operation involving two $m$ digit integers, the result will in general require at most $2m$ digits. A straight forward big integer implementation merely needs to allocate memory for leading zeroes -Big Integers are implemented on the CPU as part of the standard for several languages including Python\cite{python_pep0237} and Java\cite{java_bigint}. Most implementations are based on the GNU Multiple Precision library (GMP) \cite{gmp2014}. There have also been implementations of Big Integer arithmetic for GPUs\cite{zhao2010GPUMP}. - - During this project a custom Big Integer type was implemented, but was found to be vastly inferior to the GMP implementation\cite{documentsArbitraryIntegers}. +Big Integers are implemented on the CPU as part of the standard for several languages including Python\cite{python_pep0237} and Java\cite{java_bigint}. 
Most implementations are based on the GNU Multiple Precision library (GMP) \cite{granlund2004GMP}. There have also been implementations of Big Integer arithmetic for GPUs\cite{zhao2010GPUMP}. + During this project a custom Big Integer type was implemented, but was found to be vastly inferior to the GMP implementation\cite{documentsArbitraryIntegers} %{\bf FIXME} Add Maths reference (Cantor's Diagonal argument) without going into all the Pure maths details diff --git a/chapters/Background/Floats/Definition.tex b/chapters/Background/Floats/Definition.tex index 5002963..76b3551 100644 --- a/chapters/Background/Floats/Definition.tex +++ b/chapters/Background/Floats/Definition.tex @@ -5,7 +5,7 @@ Whilst a Fixed Point representation keeps the ``point'' (the location considered -A floating point number $x$ is commonly represented by a tuple of values $(s, e, m)$ in base $B$ as\cite{HFP, ieee2008-754}: $x = (-1)^{s} \times m \times B^{e}$ +A floating point number $x$ is commonly represented by a tuple of values $(s, e, m)$ in base $B$ as\cite{HFP, ieee754std2008}: $x = (-1)^{s} \times m \times B^{e}$ Where $s$ is the sign and may be zero or one, $m$ is commonly called the ``mantissa'' and $e$ is the exponent. Whilst $e$ is an integer in some range $\pm e_max$, the mantissa $m$ is a fixed point value in the range $0 < m < B$. The choice of base $B = 2$ in the original IEEE-754 standard matches the nature of modern hardware. It has also been found that this base in general gives the smallest rounding errors\cite{HFP}. @@ -15,3 +15,5 @@ The IEEE-754 encoding of $s$, $e$ and $m$ requires a fixed number of continuous The encoding of $m$ in the IEEE-754 standard is not exactly equivelant to a fixed point value. By assuming an implicit leading bit (ie: restricting $1 \leq m < 2$) except for when $e = 0$, floating point values are gauranteed to have a unique representations; these representations are said to be ``normalised''. 
When $e = 0$ the leading bit is not implied; these representations are called ``denormals'' because multiple representations may map to the same real value. The idea of using an implicit bit appears to have been considered by Goldberg as early as 1967\cite{goldbern1967twentyseven}, and it leads to an increase of precision near the origin. +The IEEE-754 also defines $e$ with a biased encoding and allows representation of the special values $\pm \infty$ and different types of \texttt{NaN} (Not a number) which can occur due to invalid operations (such as division by zero). A more detailed overview of IEEE-754 can be found in the ``Handbook of Floating Point Arithmetic'' \cite{HFP}. + diff --git a/chapters/Background/Standards/Precision.tex b/chapters/Background/Standards/Precision.tex index 2d2a3b4..f533b98 100644 --- a/chapters/Background/Standards/Precision.tex +++ b/chapters/Background/Standards/Precision.tex @@ -19,7 +19,7 @@ coordinate system transformations to provide the best possible precision and to %\begin{comment} \subsection{Javascript} We include Javascript here due to its relation with the SVG, HTML5 and PDF standards. -According to the EMCA-262 standard, ``The Number type has exactly 18437736874454810627 (that is, $2^64-^53+3$) values, +According to the EMCA-262 standard, ``The Number type has exactly 18437736874454810627 (that is, $2^{64}-2^{53}+3$) values, representing the double-precision 64-bit format IEEE 754 values as specified in the IEEE Standard for Binary Floating-Point Arithmetic''\cite{ecma-262}. The Number type does differ slightly from IEEE-754 in that there is only a single valid representation of ``Not a Number'' (NaN). The EMCA-262 does not define an ``integer'' representation. 
%\end{comment} diff --git a/chapters/Process.tex b/chapters/Process.tex index db56861..dab9146 100644 --- a/chapters/Process.tex +++ b/chapters/Process.tex @@ -48,7 +48,7 @@ All results presented in Chapter \ref{Results and Discussion} were obtained on a \begin{figure}[H] \centering \includegraphics[width=0.3\textwidth]{figures/controlpanel_screenshot.png} - \caption{The Qt4 Control Panel provides basic interactivity} \label{controlpanel_screenshot.png} + \caption{The Qt4 Control Panel provides basic interactivity - inserting an SVG} \label{controlpanel_screenshot.png} \end{figure} diff --git a/chapters/Results.tex b/chapters/Results.tex index 47b46fa..f13a33b 100644 --- a/chapters/Results.tex +++ b/chapters/Results.tex @@ -17,7 +17,7 @@ In this case, the precision loss occurs when the test SVG is added to the docume X = V_{w} \times \text{SVG}_x + V_{x} \end{align*} -Where $V$ represents the view, $X$ is the coordinate in the document, and $\text{SVG}_x$ is the coordinate in the test SVG at original scale. In Figure \ref{qualitative-rendering-fox}, the multiplication $V_{w} \times \text{SVG}_x$ has a smaller exponent than $V_{x}$. The error of the addition operation is comparable to one ulp, ie: $\frac{V_{x}}{2}$. In this case, the rounding error is dominating the calculation. The division by $V_{w} = 10^{6}$ in \eqref{view-transformation} is merely increasing this rounding error. +Where $V$ represents the view, $X$ is the coordinate in the document, and $\text{SVG}_x$ is the coordinate in the test SVG at original scale. In Figure \ref{qualitative-rendering-fox}, the multiplication $V_{w} \times \text{SVG}_x$ has a smaller exponent than $V_{x}$. The error of the addition operation is comparable to one ulp, ie: $\frac{V_{x}}{2}$. In this case, the rounding error is dominating the calculation. The division by $V_{w} = 10^{6}$ in \eqref{view-transformation} is merely increasing this rounding error as the coordinates are converted to display space. 
\begin{figure}[H] \centering @@ -28,7 +28,7 @@ Where $V$ represents the view, $X$ is the coordinate in the document, and $\text \subsection{Applying cumulative transformations to all B\'{e}ziers}\label{cumulative_transform} -Rather than applying \eqref{view-transformation} to object coordinates specified relative to the document, we can store the bounds of objects relative to the view and modify these bounds according to the transformations discussed in Section \ref{Coordinate Systems and Transformations} as the view is changed. This is convenient for an interactive document, as detail is typically added by inserting objects into the document within the view rectangle. As a result this approach makes the rendering of detail added to the document independent of the view coordinates --- until the view is moved. +Rather than applying \eqref{view-transformation} to object coordinates specified relative to the document, we can store the bounds of objects in display space (relative to the view) and modify these bounds according to the transformations discussed in Section \ref{Coordinate Systems and Transformations} as the view is changed. This is convenient for an interactive document, as detail is typically added by inserting objects into the document within the view rectangle. As a result this approach makes the rendering of detail added to the document independent of the view coordinates --- until the view is moved. Repeated transformations on the view will cause an accumulated error on the coordinates of object bounds. This is most noticable when zooming \emph{out} and then back into the document; the object coordinates will gradually underflow and eventually round to zero. 
An example of this effect is shown in Figure \ref{qualitative-rendering-fox-cumulative} b) %label start @@ -48,7 +48,7 @@ Repeated transformations on the view will cause an accumulated error on the coor \subsection{Applying cumulative transformations to Paths}\label{path_transform} -In Figure \ref{qualitative-rendering-fox}, transformations are applied to the bounds of each B\'{e}zier. Figure \ref{qualitative-rendering-fox-cumulative-relative} a) shows the effect of introducing an intermediate coordinate system expressing B\'{e}zier coordinates relative to the path which contains them. In this case, the rendering of a single path is accurate, but the overall positions of the paths drift as the view is moved. +In Figure \ref{qualitative-rendering-fox}, transformations are applied to the bounds of each B\'{e}zier. Figure \ref{qualitative-rendering-fox-cumulative-relative} a) shows the effect of introducing an intermediate coordinate system expressing B\'{e}zier bounding box coordinates relative to the path which contains them. In this case, the rendering of a single path is accurate, but the overall positions of the paths drift as the view is moved. We can correct this drift whilst maintaining performance by using an arbitrary or high precision number representation to express the coordinates of the paths - but maintaining the floating point coordinates for B\'{e}zier curves relative to their path. This is shown in Figure \ref{qualitative-rendering-fox-cumulative-relative} b). @@ -88,7 +88,7 @@ We should note that with the view top left corner close to $(0,0)$ as in Figure By counting the number of distinctly representable lines within a particular view, we can show the degradation of precision quantitatively. The test grid is added to each view rectangle with increasingly smaller width and height. 
-Figure \ref{loss_of_precision_grid_0.5.pdf} shows how precision degrades with $(V_x, V_y) = (0.5,0.5)$ for different precision settings using MPFR floating point values to represent the view coordinates. A constant line at $1401$ grid locations indicates no loss of precision. From this figure it should be clear how merely setting the precision of the floating point representation to a higher (but fixed) value will not allow insertion of detail at an arbitrary point; using 1024 bits of precision will still leave no lines representable above magnifications of $10^{300}$. +Figure \ref{loss_of_precision_grid_0.5.pdf} shows how precision degrades with $(V_x, V_y) = (0.5,0.5)$ for different precision settings using MPFR floating point values to represent the view coordinates. A constant line at $1401$ grid locations indicates no loss of precision. From this figure it should be clear how merely setting the precision of the floating point representation to a higher (but fixed) value will not allow insertion of detail at an arbitrary point; using 1024 bits of precision will still leave no lines representable above magnifications of $M \approx10^{310}$. \begin{figure}[H] @@ -102,10 +102,11 @@ Figure \ref{loss_of_precision_grid_0.5.pdf} shows how precision degrades with $( Using the cumulative transformation approach discussed in Section \ref{cumulative_transform} means that detail inserted into a fixed view will always render correctly. A fairer test of this approach is to test the rendering accuracy after applying repeated scaling to the document. -Figure \ref{cumulative_error_grid.pdf} shows the total error in the coordinates of each line in the grid after the view is scaled (zooming \emph{out}) by repeated transformations. A constant line at $0$ would indicate no accumulated error. 
+Figure \ref{cumulative_error_grid.pdf} shows the total error in the coordinates of each line in the grid after the view is scaled by repeated transformations (zooming \emph{out} and then back in by the same amount). A constant line at $0$ would indicate no accumulated error. In this case, using an arbitrary precision representation such as GMP Rationals (\texttt{path-rat}) does not totally eliminate error. This is simply because the final coordinate transformation requires the conversion of rationals to IEEE-754 floats before rendering. Since the total final error for $1042$ lines is less than $10^{-2}$, and the width of the display is $1$, this would represent a negligable difference in the rendering of the grid. +The legend of Figure \ref{cumulative_error_grid.pdf} should be interpreted as follows: A prefix of \texttt{path} indicates use of intermediate Path coordinate systems (Section \ref{path_transform}), \texttt{cumul} indicates cumulative transforms applied to B\'{e}ziers (Section \ref{cumulative_transform}) and no prefix indicates the direct approach (Section \ref{direct_transform}). The type of number representation used is also indicated. In the case of the Path transformations, only the bounds of the Path are expressed with the indicated representation; all other operations are done using IEEE-754 single precision floats. These results agree with those discussed qualitatively above. \begin{figure}[H] @@ -120,7 +121,7 @@ In this case, using an arbitrary precision representation such as GMP Rationals \subsection{Performance of Static Detail at Different View Locations} As discussed above, we succeeded in preserving rendering accuracy as defined above for extremely large ranges of coordinates in the document. -However this comes at a performance cost, as the size of the Rational number representation must grow accordingly. 
Figures \ref{memory.pdf} a) and b) were obtained by repeatedly resetting the document, scaling, and adding a fixed number of B\'{e}zier curves. It appears that the GMP representation increases memory usage linearly, with the speed decreasing faster than linear. The \texttt{mpfr-1024} number representation performs much better in terms of a static memory usage and speed; however as discussed in Section \ref{Precision for Fixed View}, due to the fixed precision it cannot represent detail seperated by a truly arbitrary distance. +However this comes at a performance cost, as the size of the Rational number representation must grow accordingly. Figures \ref{memory.pdf} a) and b) were obtained by repeatedly resetting the document, scaling, and adding a fixed number of B\'{e}zier curves. It appears that the GMP representation increases memory usage linearly, with the speed decreasing faster than linear. The \texttt{mpfr-1024} number representation performs much better in terms of a fixed memory usage and a slower increase in time taken; however as discussed in Section \ref{Precision for Fixed View}, due to the fixed precision it cannot represent detail seperated by a truly arbitrary distance. \begin{figure}[H] @@ -134,7 +135,7 @@ However this comes at a performance cost, as the size of the Rational number rep For a static document containing only a few imported test SVGs, the use of GMP rationals for path coordinates was not a noticable performance detriment compared to the implementations using floating point coordinates. Figure \ref{adding_things} measures the time taken for a script to scale the document to a point at which it will insert an additional copy of a test SVG (Figure \ref{turtle.pdf}). -We have included the Na\"{i}ve approach discussed in Section \ref{Naive Approach} with GMP rationals (\texttt{Gmprat}) and MPFR using 1024 bits of precision (\texttt{mpfr-1024}) to illustrate its impracticality. 
The \texttt{Gmprat} is removed from Figure \ref{adding_things} b). +We have included the Na\"{i}ve approach discussed in Section \ref{Naive Approach} with GMP rationals (\texttt{Gmprat}) and MPFR using 1024 bits of precision (\texttt{mpfr-1024}) to illustrate its impracticality. The \texttt{Gmprat} data is removed from Figure \ref{adding_things} b). \begin{figure}[H] \centering diff --git a/meta/Abstract.tex b/meta/Abstract.tex index 4b1af23..1a45561 100644 --- a/meta/Abstract.tex +++ b/meta/Abstract.tex @@ -20,7 +20,7 @@ IEEE-754 floats. {\bf Keywords:} \emph{document formats, precision, floating point, vector images, graphics, OpenGL, SDL2, PostScript, PDF, {\TeX}, SVG, HTML5, Javascript } -{\bf Note:} This report is best viewed digitally as a PDF. The digital version is available at \url{http://szmoore.net/ipdf/sam/thesis.pdf} +{\bf Note:} This report is best viewed digitally as a PDF. The digital version is available at \\ \url{http://szmoore.net/ipdf/sam/thesis.pdf} {\bf Word Count: } 7620 (9335 with appendices) diff --git a/meta/letter.pdf b/meta/letter.pdf index 0724e4c..3b86a47 100644 Binary files a/meta/letter.pdf and b/meta/letter.pdf differ diff --git a/meta/letter.svg b/meta/letter.svg new file mode 100644 index 0000000..955c54a --- /dev/null +++ b/meta/letter.svg @@ -0,0 +1,225 @@ + + + +image/svg+xmlSamuel Z. Moore +45 Wheyland Street +Willagee, WA, 6156 +27th October, 2014 +Winthrop Professor John Dell +Dean +Faculty of Engineering, Computing and Mathematics +University of Western Australia +35 Stirling Highway +Crawley, WA, 6009 +Dear Professor Dell, +I am pleased to submit this thesis, entitled "Number Representations and Precision in Vector +Graphics", as part of the requirement for the Engineering component of the degree of Bachelor of +Science and Engineering. +Yours Sincerely, +Samuel Z. 
Moore
+20503628
+
\ No newline at end of file
diff --git a/presentation/cumulative_float_SHORT.mkv b/presentation/cumulative_float_SHORT.mkv
deleted file mode 100644
index b7d9ffd..0000000
Binary files a/presentation/cumulative_float_SHORT.mkv and /dev/null differ
diff --git a/presentation/float.mkv b/presentation/float.mkv
deleted file mode 100644
index 9a19dea..0000000
Binary files a/presentation/float.mkv and /dev/null differ
diff --git a/presentation/path-Gmprat.mkv b/presentation/path-Gmprat.mkv
deleted file mode 100644
index 9db5fce..0000000
Binary files a/presentation/path-Gmprat.mkv and /dev/null differ
diff --git a/thesis.pdf b/thesis.pdf
index 06e8f5c..6c4b888 100644
Binary files a/thesis.pdf and b/thesis.pdf differ
diff --git a/thesis.tex b/thesis.tex
index 28ca99b..a3f7010 100644
--- a/thesis.tex
+++ b/thesis.tex
@@ -20,6 +20,7 @@
 \usepackage{amsmath, amsthm,amssymb}
 \usepackage{mathrsfs}
 \usepackage{hyperref}
+\usepackage{url}
 \usepackage{epstopdf}
 \usepackage{float}
 \usepackage{anyfontsize}
@@ -85,7 +86,7 @@
 \newcommand{\phasor}[1]{\tilde{#1}} % make a phasor
 \newcommand{\laplacian}[1]{\nabla^2 {#1}} % The laplacian operator
-\renewcommand{\url}[1]{$\langle$\href{#1}{\underline{\color{blue}{#1}}}$\rangle$}
+%\renewcommand{\url}[1]{$\langle$\href{#1}{\underline{\color{blue}{#1}}}$\rangle$}
 \newcommand{\rephrase}[1]{ \textcolor{red}{#1}}
 %\usepackage{endfloat}
diff --git a/videos/cumulative_float_SHORT.mkv b/videos/cumulative_float_SHORT.mkv
new file mode 100644
index 0000000..b7d9ffd
Binary files /dev/null and b/videos/cumulative_float_SHORT.mkv differ
diff --git a/videos/float.mkv b/videos/float.mkv
new file mode 100644
index 0000000..9a19dea
Binary files /dev/null and b/videos/float.mkv differ
diff --git a/videos/path-Gmprat.mkv b/videos/path-Gmprat.mkv
new file mode 100644
index 0000000..9db5fce
Binary files /dev/null and b/videos/path-Gmprat.mkv differ
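
The precision loss described in the chapters/Results.tex hunk of this patch, where $X = V_{w} \times \text{SVG}_x + V_{x}$ and the product has a much smaller exponent than $V_{x}$, can be illustrated with a minimal sketch. This is not code from the ipdf repository. The function name `to_document` and all numeric values are hypothetical, and Python's built-in `float` (IEEE-754 double precision) and `fractions.Fraction` are assumed here as stand-ins for the floating point and GMP rational types the thesis discusses.

```python
from fractions import Fraction


def to_document(v_w, v_x, svg_x):
    """Apply X = V_w * SVG_x + V_x with whatever number type is passed in."""
    return v_w * svg_x + v_x


# Hypothetical view: offset V_x = 1, width V_w = 1e-20 (zoomed far in).
# Two distinct SVG coordinates, 0.25 and 0.75, are inserted into this view.
float_a = to_document(1e-20, 1.0, 0.25)
float_b = to_document(1e-20, 1.0, 0.75)

# The same transformation with exact rationals.
v_w = Fraction(1, 10**20)
exact_a = to_document(v_w, Fraction(1), Fraction(1, 4))
exact_b = to_document(v_w, Fraction(1), Fraction(3, 4))

# With doubles, both products are far below one ulp of V_x, so the
# addition rounds both results to V_x and the two points coincide.
print("floats coincide:", float_a == float_b)
# With rationals the two points stay distinct.
print("rationals differ by:", exact_b - exact_a)
```

Under these assumed values the rounding error dominates the floating point calculation, which is the situation the Results chapter describes. The rational version keeps the two coordinates apart, at the cost of a representation whose size grows with the magnification.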