From: Sam Moore
Date: Thu, 24 Jul 2014 11:34:01 +0000 (+0800)
Subject: Notes on how GPUs suck at rendering circles
X-Git-Url: https://git.ucc.asn.au/?a=commitdiff_plain;h=ea6e56f6cce62a8ab8168c61e273dff83340b836;p=ipdf%2Fdocuments.git

Notes on how GPUs suck at rendering circles

Not very well written but at least they exist.
---

diff --git a/FloatingPointCPUvsGPU.pdf b/FloatingPointCPUvsGPU.pdf
new file mode 100644
index 0000000..a2b5b31
Binary files /dev/null and b/FloatingPointCPUvsGPU.pdf differ
diff --git a/FloatingPointCPUvsGPU.tex b/FloatingPointCPUvsGPU.tex
new file mode 100644
index 0000000..0ad0df8
--- /dev/null
+++ b/FloatingPointCPUvsGPU.tex
@@ -0,0 +1,92 @@
+\documentclass[11pt]{article}
+\input{template}
+
+\begin{document}
+\title{Floating Point and CPU vs GPU Rendering}
+\author{Sam Moore, David Gow}
+\maketitle
+
+\section*{Abstract}
+
+Modern GPUs do not appear to be compliant with IEEE-754 when performing floating point operations. There are also differences between the behaviour of different GPU models.
+We compare the rendering of filled circles on an x86-64 CPU and a GPU (AMD/ATI Whistler LE [Radeon HD 6610M/7610M]).
+
+\section{Introduction}
+
+Although it is well known that the behaviour of GPU drivers is inconsistent, there is little research into the behaviour of floating point operations using such drivers.
+
+In 2004 Hillesland and Lastra adapted ``Paranoia'', Kahan's well known program from the 1980s for testing floating point arithmetic on CPUs, for GPUs and found that many GPUs did not appear to be compliant with IEEE-754\cite{hillesland2004paranoia}.
+
+Given the recent interest in the use of the GPU for vector graphics\cite{kilgard2012gpu}, this seems worthy of further investigation.
+Using the same algorithm implemented in C/C++ and GLSL, we have shown that a particular GPU using a particular driver exhibits less precision than IEEE-754 binary32 floats.
+
+\section{Algorithm}
+
+For each integer valued $(x,y)$ in the bounding rectangle, if $x^2 + y^2 \leq r^2$ where $r$ is the radius of the circle, then $(x,y)$ should be filled.
+
+There are two sources of error: the coordinate transforms of vertices \emph{before} rendering, and the operations required to subsequently render the circle at that position. The difference between using the CPU and GPU for the former is apparent, but is only easily demonstrated in a live demo. The images below demonstrate the precision issues with the latter\footnote{Me write um good english}.
+
+\section{Results}
+
+Each pair of figures compares rendering using an x86-64 CPU and OpenGL shaders running on the AMD/ATI Whistler LE [Radeon HD 6610M/7610M] using the \emph{fglrx} proprietary graphics drivers.
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/cpu0.png}
+\caption{A circle. Sort of.}
+\end{figure}
+
+The images produced by GPU and CPU rendering are indistinguishable at the original scale. Note that ``CPU rendering'' essentially produces a bitmap and then sends it to the GPU, so all floating point operations are still done on the CPU, whereas ``GPU rendering'' involves passing floats representing vertex positions to a GLSL shader program.
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/cpu1.png}
+\caption{CPU, zoomed}
+\end{figure}
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/gpu1.png}
+\caption{GPU, zoomed}
+\end{figure}
+
+The rounding errors begin to become apparent.
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/cpu2.png}
+\caption{CPU, zoomed more}
+\end{figure}
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/gpu2.png}
+\caption{GPU, zoomed more}
+\end{figure}
+
+Even worse...
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/cpu3.png}
+\caption{CPU, zoomed more}
+\end{figure}
+
+\begin{figure}[H]
+\centering
+\includegraphics[width=\textwidth]{figures/circles_gpu_vs_cpu/gpu3.png}
+\caption{GPU, zoomed more}
+\end{figure}
+
+The image is no longer recognisable as a circle.
+Note that at this scale there are also issues with translation using the CPU (i.e. it was not possible to position the circle so that it covered the same fraction of the screen as in the earlier images).
+
+It is hard to tell how much of this is due to bugs in the fglrx driver and how much is actually due to physical limitations of the GPU hardware.
+
+David found code for a graphics driver with a \verb/USE_IEEE_FLOATS/ define that was \verb/false/ by default...\footnote{David say more here?}
+
+\section{Conclusions}
+
+fglrx is pretty terrible. GPUs are probably not as good at floating point as CPUs.
+
+\end{document}
diff --git a/figures/circles_gpu_vs_cpu/cpu0.png b/figures/circles_gpu_vs_cpu/cpu0.png
new file mode 100644
index 0000000..27e1f29
Binary files /dev/null and b/figures/circles_gpu_vs_cpu/cpu0.png differ
diff --git a/figures/circles_gpu_vs_cpu/cpu1.png b/figures/circles_gpu_vs_cpu/cpu1.png
new file mode 100644
index 0000000..b10ac75
Binary files /dev/null and b/figures/circles_gpu_vs_cpu/cpu1.png differ
diff --git a/figures/circles_gpu_vs_cpu/cpu2.png b/figures/circles_gpu_vs_cpu/cpu2.png
new file mode 100644
index 0000000..982e22b
Binary files /dev/null and b/figures/circles_gpu_vs_cpu/cpu2.png differ
diff --git a/figures/circles_gpu_vs_cpu/cpu3.png b/figures/circles_gpu_vs_cpu/cpu3.png
new file mode 100644
index 0000000..c438aaa
Binary files /dev/null and b/figures/circles_gpu_vs_cpu/cpu3.png differ
diff --git a/figures/circles_gpu_vs_cpu/gpu0.png b/figures/circles_gpu_vs_cpu/gpu0.png
new file mode 100644
index 0000000..d18919c
Binary files /dev/null and b/figures/circles_gpu_vs_cpu/gpu0.png differ
diff --git a/figures/circles_gpu_vs_cpu/gpu1.png b/figures/circles_gpu_vs_cpu/gpu1.png
new file mode 100644
index 0000000..e18e014
Binary files /dev/null and b/figures/circles_gpu_vs_cpu/gpu1.png differ
diff --git a/figures/circles_gpu_vs_cpu/gpu2.png b/figures/circles_gpu_vs_cpu/gpu2.png
new file mode 100644
index 0000000..954a229
Binary files /dev/null and b/figures/circles_gpu_vs_cpu/gpu2.png differ
diff --git a/figures/circles_gpu_vs_cpu/gpu3.png b/figures/circles_gpu_vs_cpu/gpu3.png
new file mode 100644
index 0000000..55d8d00
Binary files /dev/null and b/figures/circles_gpu_vs_cpu/gpu3.png differ
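Editorial sketch (not part of the commit above): the circle test from the Algorithm section, $x^2 + y^2 \leq r^2$, can be tried at two precisions to see how the fill decision degrades once the working precision cannot resolve the pixel positions. This is only a rough illustration under stated assumptions. It is written in Python rather than the C/C++ and GLSL the notes mention, it emulates IEEE-754 binary32 on the CPU by rounding every intermediate result, and it says nothing about what fglrx or the GPU hardware actually does. The function names (`f32`, `inside_f64`, `inside_f32`, `count_mismatches`), the explicit circle centre `(cx, cy)` and the sampling grid are all hypothetical and do not come from the original code.

```python
import struct


def f32(x):
    """Round a Python float (binary64) to the nearest IEEE-754 binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def inside_f64(px, py, cx, cy, r):
    """Circle test dx^2 + dy^2 <= r^2 evaluated in binary64."""
    dx = px - cx
    dy = py - cy
    return dx * dx + dy * dy <= r * r


def inside_f32(px, py, cx, cy, r):
    """Same test, with the inputs and every intermediate rounded to binary32."""
    px, py, cx, cy, r = (f32(v) for v in (px, py, cx, cy, r))
    dx = f32(px - cx)
    dy = f32(py - cy)
    d2 = f32(f32(dx * dx) + f32(dy * dy))
    return d2 <= f32(r * r)


def count_mismatches(cx, cy, r, n=64):
    """Sample an n-by-n grid over the bounding square of the circle and
    count the samples where the two precisions disagree (assumed setup)."""
    step = 2.0 * r / (n - 1)
    mismatches = 0
    for j in range(n):
        for i in range(n):
            px = cx - r + i * step
            py = cy - r + j * step
            if inside_f64(px, py, cx, cy, r) != inside_f32(px, py, cx, cy, r):
                mismatches += 1
    return mismatches


if __name__ == "__main__":
    # Circle at the origin: binary32 has plenty of precision here.
    print(count_mismatches(0.0, 0.0, 1.0))
    # Small circle far from the origin: binary32 cannot resolve the samples.
    print(count_mismatches(16777216.0, 16777216.0, 0.25))
```

With the circle centred on the origin the two precisions agree on every sample. With a radius of 0.25 centred at $2^{24}$, every sample position rounds to the centre in binary32, so the whole bounding square is reported as filled. That is the same kind of failure as in the most zoomed-in figures, although the real cause on the GPU may be different.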