Notes on how GPUs suck at rendering circles

author Sam Moore <[email protected]>

Thu, 24 Jul 2014 11:34:01 +0000 (19:34 +0800)

committer Sam Moore <[email protected]>

Thu, 24 Jul 2014 11:34:01 +0000 (19:34 +0800)
diff --git a/FloatingPointCPUvsGPU.pdf b/FloatingPointCPUvsGPU.pdf

new file mode 100644 (file)

index 0000000..a2b5b31

Binary files /dev/null and b/FloatingPointCPUvsGPU.pdf differ
diff --git a/FloatingPointCPUvsGPU.tex b/FloatingPointCPUvsGPU.tex

new file mode 100644 (file)

index 0000000..0ad0df8
--- /dev/null
+++ b/FloatingPointCPUvsGPU.tex
@@ -0,0 +1,92 @@
+\documentclass[11pt]{article}
+\input{template}
+
+\begin{document}
+\title{Floating Point and CPU vs GPU Rendering}
+\author{Sam Moore, David Gow}
+\maketitle
+
+\section*{Abstract}
+
+Modern GPUs do not appear to be compliant with IEEE-754 when performing floating point operations. There are also differences in behaviour between GPU models.
+We compare the rendering of filled circles on an x86-64 CPU and a GPU (AMD/ATI Whistler LE [Radeon HD 6610M/7610M]).
+
+\section{Introduction}
+
+Although it is well known that the behaviour of GPU drivers is inconsistent, there is little research into the behaviour of floating point operations using such drivers. 
+
+In 2004, Hillesland and Lastra adapted ``Paranoia'', Kahan's well known 1980s program for testing floating point arithmetic on CPUs, to GPUs and found that many GPUs did not appear to be compliant with IEEE-754\cite{hillesland2004paranoia}.
+
+Given the recent interest in the use of GPUs for vector graphics\cite{kilgard2012gpu}, this seems worthy of further investigation.
+Using the same algorithm implemented in C/C++ and GLSL, we show that a particular GPU using a particular driver exhibits less precision than IEEE-754 binary32 floats.
+
+\section{Algorithm}
+
+For each integer-valued $(x,y)$ in the bounding rectangle, with coordinates taken relative to the centre of the circle, if $x^2 + y^2 \leq r^2$ where $r$ is the radius of the circle, then $(x,y)$ should be filled.
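+
+As a worked instance of this test, take $(x,y)$ relative to the centre of the circle and assume an illustrative radius of $r = 2$ (a value chosen here for the example, not one used for the figures):
+\[
+\begin{array}{lll}
+(1,1): & 1^2 + 1^2 = 2 \leq 4 & \mbox{filled} \\
+(2,0): & 2^2 + 0^2 = 4 \leq 4 & \mbox{filled (on the boundary)} \\
+(2,1): & 2^2 + 1^2 = 5 > 4 & \mbox{not filled}
+\end{array}
+\]
+Points on or near the boundary, such as $(2,0)$, are the ones whose classification can change if $x^2 + y^2$ or $r^2$ is rounded.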
+
+There are two sources of error: the coordinate transforms of vertices \emph{before} rendering, and the operations required to subsequently render the circle at that position. The difference between using the CPU and GPU for the former is apparent but is only easily demonstrated in a live demo. The images below demonstrate the precision issues with the latter.
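+
+To indicate where binary32 itself runs out of precision in this test, consider the following illustration. It assumes IEEE-754 round-to-nearest-even and is not a measurement from the hardware used here. A binary32 float has a 24 bit significand, so representable values between $2^{24}$ and $2^{25}$ are spaced 2 apart. For $x = 4097$,
+\[
+x^2 = 16785409 = 2^{24} + 8193,
+\]
+which is odd and therefore not representable. It lies midway between $16785408$ and $16785410$ and rounds to $16785408$. A compliant implementation can thus already misplace $x^2$ by one unit for some $|x| > 2^{12}$, and a GPU with less precision than binary32 would be expected to reach the same problem at smaller coordinates.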
+
+\section{Results}
+
+Each pair of figures compares rendering using an x86-64 CPU with rendering using OpenGL shaders running on the AMD/ATI Whistler LE [Radeon HD 6610M/7610M] with the proprietary \emph{fglrx} graphics drivers.
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/cpu0.png}
+\caption{CPU, original scale. A circle, sort of.}
+\end{figure}
+
+The images produced by GPU and CPU rendering are indistinguishable at the original scale. Note that ``CPU rendering'' essentially produces a bitmap on the CPU and then sends it to the GPU, so all floating point operations are still done on the CPU, whereas ``GPU rendering'' involves passing floats representing vertex positions to a GLSL shader program.
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/cpu1.png}
+\caption{CPU, zoomed}
+\end{figure}
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/gpu1.png}
+\caption{GPU, zoomed}
+\end{figure}
+
+The rounding errors begin to become apparent.
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/cpu2.png}
+\caption{CPU, zoomed more}
+\end{figure}
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/gpu2.png}
+\caption{GPU, zoomed more}
+\end{figure}
+
+At a higher zoom the errors are even worse.
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/cpu3.png}
+\caption{CPU, zoomed further still}
+\end{figure}
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/gpu3.png}
+\caption{GPU, zoomed further still}
+\end{figure}
+
+The image is no longer recognisable as a circle.
+Note that at this scale there are also issues with translation using the CPU (i.e.\ it was not possible to position the circle so that it covered the same fraction of the screen as in the earlier images).
+
+It is hard to tell how much of this is due to bugs in the fglrx driver and how much is due to physical limitations of the GPU hardware.
+
+David found code for a graphics driver with a \verb/USE_IEEE_FLOATS/ define that was \verb/false/ by default...\footnote{David say more here?}
+
+\section{Conclusions}
+
+fglrx is pretty terrible. GPUs are probably not as good at floating point as CPUs.
+
+\end{document}
diff --git a/figures/circles_gpu_vs_cpu/cpu0.png b/figures/circles_gpu_vs_cpu/cpu0.png

new file mode 100644 (file)

index 0000000..27e1f29

Binary files /dev/null and b/figures/circles_gpu_vs_cpu/cpu0.png differ
diff --git a/figures/circles_gpu_vs_cpu/cpu1.png b/figures/circles_gpu_vs_cpu/cpu1.png

new file mode 100644 (file)

index 0000000..b10ac75

Binary files /dev/null and b/figures/circles_gpu_vs_cpu/cpu1.png differ
diff --git a/figures/circles_gpu_vs_cpu/cpu2.png b/figures/circles_gpu_vs_cpu/cpu2.png

new file mode 100644 (file)

index 0000000..982e22b

Binary files /dev/null and b/figures/circles_gpu_vs_cpu/cpu2.png differ
diff --git a/figures/circles_gpu_vs_cpu/cpu3.png b/figures/circles_gpu_vs_cpu/cpu3.png

new file mode 100644 (file)

index 0000000..c438aaa

Binary files /dev/null and b/figures/circles_gpu_vs_cpu/cpu3.png differ
diff --git a/figures/circles_gpu_vs_cpu/gpu0.png b/figures/circles_gpu_vs_cpu/gpu0.png

new file mode 100644 (file)

index 0000000..d18919c

Binary files /dev/null and b/figures/circles_gpu_vs_cpu/gpu0.png differ
diff --git a/figures/circles_gpu_vs_cpu/gpu1.png b/figures/circles_gpu_vs_cpu/gpu1.png

new file mode 100644 (file)

index 0000000..e18e014

Binary files /dev/null and b/figures/circles_gpu_vs_cpu/gpu1.png differ
diff --git a/figures/circles_gpu_vs_cpu/gpu2.png b/figures/circles_gpu_vs_cpu/gpu2.png

new file mode 100644 (file)

index 0000000..954a229

Binary files /dev/null and b/figures/circles_gpu_vs_cpu/gpu2.png differ
diff --git a/figures/circles_gpu_vs_cpu/gpu3.png b/figures/circles_gpu_vs_cpu/gpu3.png

new file mode 100644 (file)

index 0000000..55d8d00

Binary files /dev/null and b/figures/circles_gpu_vs_cpu/gpu3.png differ
FloatingPointCPUvsGPU.pdf	[new file with mode: 0644]
FloatingPointCPUvsGPU.tex	[new file with mode: 0644]
figures/circles_gpu_vs_cpu/cpu0.png	[new file with mode: 0644]
figures/circles_gpu_vs_cpu/cpu1.png	[new file with mode: 0644]
figures/circles_gpu_vs_cpu/cpu2.png	[new file with mode: 0644]
figures/circles_gpu_vs_cpu/cpu3.png	[new file with mode: 0644]
figures/circles_gpu_vs_cpu/gpu0.png	[new file with mode: 0644]
figures/circles_gpu_vs_cpu/gpu1.png	[new file with mode: 0644]
figures/circles_gpu_vs_cpu/gpu2.png	[new file with mode: 0644]
figures/circles_gpu_vs_cpu/gpu3.png	[new file with mode: 0644]