Automatic commit of irc logs

diff --git a/ArbitraryIntegers.tex b/ArbitraryIntegers.tex

index 414abe4..fb84a7f 100644
--- a/ArbitraryIntegers.tex
+++ b/ArbitraryIntegers.tex
@@ -45,12 +45,45 @@ Each digit is being written in base 2 or 10 because there is not a universal bas
  
  \section{Addition Algorithms}
  
+Addition $s = a + b$ is done by adding digits from least to most significant. 
+\begin{align*}
+       s = \displaystyle\sum_{i=0}^{\infty} (a_i + b_i) \beta^{i}
+\end{align*}
+
+Considering the contributions to the sum of the $i^\text{th}$ and $(i+1)^\text{th}$ digits:
+\begin{align}
+       s_i\beta^i + s_{i+1}\beta^{i+1} &= (a_i+b_i)\beta^i + (a_{i+1}+b_{i+1})\beta^{i+1} \\
+       \implies s_i + s_{i+1}\beta &= (a_i+b_i) + (a_{i+1}+b_{i+1})\beta
+\end{align}
+
+If the sum $a_i + b_i \geq \beta$, i.e. it cannot be represented as a single digit in base $\beta$, then we can rewrite this as:
+\begin{align}
+       s_i + s_{i+1}\beta &= \beta + (a_i+b_i-\beta) + (a_{i+1}+b_{i+1})\beta \\
+       &= (a_i+b_i-\beta) + (a_{i+1}+b_{i+1}+1)\beta
+\end{align}
+
+So we can use the digits $s_i = (a_i+b_i-\beta) < \beta$ and $s_{i+1} = (a_{i+1}+b_{i+1}+1)$; if $s_{i+1} \geq \beta$, the same step is applied again at the next digit.
+This operation is the \emph{carry}\footnote{I'm pretty sure that is not a rigorous definition but close enough}.
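+
+As an illustrative example (the numbers are chosen here purely for demonstration), take $\beta = 10$, $a = 47$ and $b = 85$:
+\begin{align*}
+       a_0 + b_0 &= 7 + 5 = 12 \geq 10 &&\implies s_0 = 12 - 10 = 2, \text{ carry } 1 \\
+       a_1 + b_1 + 1 &= 4 + 8 + 1 = 13 \geq 10 &&\implies s_1 = 13 - 10 = 3, \text{ carry } 1 \\
+       a_2 + b_2 + 1 &= 0 + 0 + 1 = 1 < 10 &&\implies s_2 = 1
+\end{align*}
+giving $s = 132$.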
+
+The x64 instruction set includes an \emph{add with carry} instruction \verb/adc/, which adds two fixed-size operands together with the current carry flag and sets the carry flag again if the result overflows. This makes it straightforward to add two arrays of digits representing arbitrary-sized integers, one digit at a time.
+
+
  
  
  \section{Subtraction Algorithms}
  
+Similarly, subtraction $s = a - b$ is done from least to most significant digit. If $a_i - b_i < 0$ then we \emph{borrow} from the next higher digit.
+
+\begin{align*}
+       s_i + s_{i+1}\beta &= (a_i-b_i+\beta) + (a_{i+1}-b_{i+1}-1)\beta
+\end{align*}
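+
+For instance, with $\beta = 10$, $a = 52$ and $b = 37$ (values chosen here only for illustration):
+\begin{align*}
+       a_0 - b_0 &= 2 - 7 = -5 < 0 &&\implies s_0 = -5 + 10 = 5, \text{ borrow } 1 \\
+       a_1 - b_1 - 1 &= 5 - 3 - 1 = 1 \geq 0 &&\implies s_1 = 1
+\end{align*}
+giving $s = 15$.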
+
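+The x64 instruction set also includes a \emph{subtract with borrow} instruction \verb/sbb/, which subtracts both the source operand and the carry flag from the destination and sets the carry flag when a borrow is required.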
+
  \section{Multiplication Algorithms}
  
+In general, the result of multiplying two $n$ digit numbers may require up to $2n$ digits.
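+
+One way to see this bound (a short sketch, not taken from the original notes): the largest $n$ digit number in base $\beta$ is $\beta^n - 1$, so
+\begin{align*}
+       (\beta^n - 1)^2 = \beta^{2n} - 2\beta^n + 1 < \beta^{2n}
+\end{align*}
+and the product always fits in $2n$ digits. For example, in base 10, $99 \times 99 = 9801$ uses all four digits.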
+
  \section{Division Algorithms}
  
  \subsection{Naive Algorithm}
@@ -62,9 +95,31 @@ Each digit is being written in base 2 or 10 because there is not a universal bas
  Since humans are not very good at understanding binary, it is convenient to convert integer representations from one base to another. 
  
  
-\section{IPDF Integer Representations}
+\section{Performance Comparison of IPDF::Arbint and GMP Integers}
+
+We ran 1000 trials of each of the four basic operations on arbitrary integers initialised from \verb/rand(3)/.
+
+The average IR (instructions read) cost per operation, collected using the \emph{callgrind} tool of the \emph{valgrind} instrumentation framework, is shown below.
+
+\begin{figure}[H]
+       \centering\begin{tabular}{|c|c|c|c|}
+       \hline
+       {\bf Operation} & {\bf IR Cost Arbint} & {\bf IR Cost Gmpint} & {\bf Arbint/Gmpint}\\ \hline
+       *= & 3957 & 255 & 15.6 \\ \hline
+       /= & 395008 & 388 & 1018.1\\ \hline
+       += & 252 & 98 & 2.5 \\ \hline
+       -= & 458 & 102 & 4.5\\
+       \hline  
+\end{tabular}
+       \caption{Average IR cost per operation for Arbint and Gmpint; GMP wins in every case}
+\end{figure}
+
+Clearly we are not as good at implementing arbitrary integer arithmetic as the GMP project. We are particularly bad at division, which costs roughly 1000 times more than GMP's. This is probably because we used the second algorithm on Wikipedia.
  
+Examining the GMP source code shows that the library is mostly implemented using highly optimised assembly, which is selected based on the build target. We have used C++ classes with all their overhead. We also used a slower division algorithm, although our addition and subtraction are pretty similar.
  
+\section{Conclusion}
  
+Just use GMP.
  
  \end{document}