From: Sam Moore
Date: Wed, 30 Apr 2014 04:41:03 +0000 (+0800)
Subject: Uncommitted notes on HFPA and symlink to Sam's Lit Review
X-Git-Url: https://git.ucc.asn.au/?p=ipdf%2Fdocuments.git;a=commitdiff_plain;h=f1069d4e2ac6473ff167bb7385f22b446a91b2dc

Uncommitted notes on HFPA and symlink to Sam's Lit Review

Sam's Lit Review is in the ipdf/sam repository so as not to sully this repository with its disgusting git commit messages.
But it might be easier for Tim to find if it is in this one.
---

diff --git a/LitReviewSam.pdf b/LitReviewSam.pdf
new file mode 120000
index 0000000..7c77013
--- /dev/null
+++ b/LitReviewSam.pdf
@@ -0,0 +1 @@
+../sam/thesis.pdf
\ No newline at end of file
diff --git a/LiteratureNotes.pdf b/LiteratureNotes.pdf
index 4a11913..20d22eb 100644
Binary files a/LiteratureNotes.pdf and b/LiteratureNotes.pdf differ
diff --git a/LiteratureNotes.tex b/LiteratureNotes.tex
index 0e423ca..a9292dc 100644
--- a/LiteratureNotes.tex
+++ b/LiteratureNotes.tex
@@ -585,10 +585,65 @@ It is much easier to read than Goldberg or Priest's papers.
 
 I'm going to start working through it and compile their test programs.
 
+\subsection{A sequence that seems to converge to a wrong limit - pgs 9-10, \cite{HFP}}
+
+\begin{align*}
+    u_n &= \left\{ \begin{array}{c} u_0 = 2 \\ u_1 = -4 \\ u_n = 111 - \frac{1130}{u_{n-1}} + \frac{3000}{u_{n-1}u_{n-2}}\end{array}\right.
+\end{align*}
+
+The limit of the sequence should be $6$, but when calculated with IEEE floats it converges to $100$ instead.
+The authors show that $100$ is the limit for other starting values, and that rounding error in floating-point arithmetic pushes the computed sequence to that limit instead.
+
+\begin{figure}[H]
+    \centering
+    \includegraphics[width=0.8\textwidth]{figures/handbook1-1.pdf}
+    \caption{Output of Program 1.1 from \emph{Handbook of Floating-Point Arithmetic}\cite{HFP} for various IEEE types}
+    \label{HFP-1-1}
+\end{figure}
+
+\subsection{Mr Gullible and the Chaotic Bank Society pgs 10-11 \cite{HFP}}
+
+This is an example of a sequence involving $e$. The sequence should go to $0$ for $a_0 = e - 1$, but since $e$ cannot be represented exactly in FP, the stored $a_0 \neq e - 1$ and the computed sequence goes to $\pm \infty$.
+
+To eliminate these types of problems we'd need an \emph{exact} representation of all real numbers.
+For \emph{any} FP representation, regardless of precision (a finite number of digits), there will be numbers that can't be represented exactly, hence you could find a similar sequence that would explode.
+
+IE: The more precise the representation, the slower things go wrong, but they still go wrong, {\bf even with errorless operations}.
+
+
+\subsection{Rump's example pg 12 \cite{HFP}}
+
+This is an example where the calculation of a function $f(a,b)$ is not only totally wrong, it gives completely different results depending on the CPU, despite the CPUs conforming to IEEE.
+
 
 \chapter{General Notes}
 
+\section{Floating-Point \cite{HFP,goldberg1991whatevery,goldberg1992thedesign,priest1991algorithms}}
+
+A set of FP numbers is characterised by:
+\begin{enumerate}
+    \item Radix (base) $\beta \geq 2$
+    \item Precision $p \geq 2$ (``number of sig digits'')
+    \item Two ``extremal'' exponents $e_{\min} < 0 < e_{\max}$ (generally; the $0$ doesn't have to lie between them)
+\end{enumerate}
+
+Numbers are represented by {\bf integers}: $(M, e)$ such that $x = M \times \beta^{e - p + 1}$
+
+Require: $|M| \leq \beta^{p}-1$ and $e_{\min} \leq e \leq e_{\max}$.
+
+Representations are not unique; the set of equivalent representations of a value is a cohort.
+
+$\beta^{e-p+1}$ is the quantum, $e-p+1$ is the quantum exponent.
+
+Alternate representation: $(s, m, e)$ such that $x = (-1)^s \times m \times \beta^{e}$.
+$m$ is the significand, mantissa, or fractional part, depending on what you read.
+
+
+
+
+
 \section{Rounding Errors}
+
 They happen. There is ULP and I don't mean a political party.
 
 TODO: Probably say something more insightful. Other than "here is a graph that shows errors and we blame rounding".
@@ -610,20 +665,6 @@ Results with Simpson Method:
 
 Tests with \verb/calculatepi/ show it's not quite as simple as just blindly replacing all your additions with Fast2Sum from Dekker\cite{dekker1971afloating}. ie: The graph looks exactly the same for single precision. \verb/calculatepi/ obviously also has multiplication ops in it which I didn't change. Will look at after sleep maybe.
 
-\subsection{A sequence that seems to converge to a wrong limit - pgs 9-10, \cite{HFP}}
-
-\begin{align*}
-    u_n &= \left\{ \begin{array}{c} u_0 = 2 \\ u_1 = -4 \\ u_n = 111 - \frac{1130}{u_{n-1}} + \frac{3000}{u_{n-1}u_{n-2}}\end{array}\right.
-\end{align*}
-
-The limit of the series should be $6$ but when calculated with IEEE floats it is actually $100$
-The authors show that the limit is actually $100$ for different starting values, and the error in floating point arithmetic causes the series to go to that limit instead.
-
-\begin{figure}[H]
-    \centering
-    \includegraphics[width=0.8\textwidth]{figures/handbook1-1.pdf}
-    \caption{Output of Program 1.1 from \emph{Handbook of Floating-Point Arithmetic}\cite{HFP} for various IEEE types}
-\end{figure}
 
 \pagebreak
 \bibliographystyle{unsrt}
diff --git a/papers.bib b/papers.bib
index ee4d247..bb7bbf4 100644
--- a/papers.bib
+++ b/papers.bib
@@ -607,4 +607,3 @@ language={English}
   price = "US\$90 (est.)",
   acknowledgement = ack-nhfb,
 }
-