X-Git-Url: https://git.ucc.asn.au/?p=ipdf%2Fsam.git;a=blobdiff_plain;f=chapters%2FBackground.tex;h=87d18cebee3a9caf81c1600a8dbdc8f8a034de2a;hp=e5720aebc21f7cd608aaf6f22c1558645e595668;hb=3cc6f72b6bbdde973827f4f3cd47563d240cc345;hpb=5e4b9d22c1d1077d1179b5ee20c55e8662ea723a diff --git a/chapters/Background.tex b/chapters/Background.tex index e5720ae..87d18ce 100644 --- a/chapters/Background.tex +++ b/chapters/Background.tex @@ -15,7 +15,7 @@ In Chapter \ref{Progress}, we will discuss our findings so far with regards to a Throughout Section \ref{vector-vs-raster-graphics} we were careful to refer to ``modern'' display devices, which are raster based. It is of some historical significance that vector display devices were popular during the 70s and 80s, and papers oriented towards drawing on these devices can be found\cite{brassel1979analgorithm}. Whilst curves can be drawn at high resolution on vector displays, a major disadvantage was shading; by the early 90s the vast majority of computer displays were raster based\cite{computergraphics2}. -Hearn and Baker's textbook ``Computer Graphics''\cite{computergraphics2} gives a comprehensive overview of graphics from physical display technologies through fundamental drawing algorithms to popular graphics APIs. This section will examine algorithms for drawing two dimensional geometric primitives on raster displays as discussed in ``Computer Graphics'' and the relevant literature. Informal tutorials are abundant on the internet\cite{elias2000graphics}. This section is by no means a comprehensive survey of the literature but intends to provide some idea computations which are required to render a document. +Hearn and Baker's textbook ``Computer Graphics''\cite{computergraphics2} gives a comprehensive overview of graphics from physical display technologies through fundamental drawing algorithms to popular graphics APIs. 
This section will examine algorithms for drawing two dimensional geometric primitives on raster displays as discussed in ``Computer Graphics'' and the relevant literature. Informal tutorials are abundant on the internet\cite{elias2000graphics}. This section is by no means a comprehensive survey of the literature but intends to provide some idea of the computations which are required to render a document. \subsection{Straight Lines}\label{Straight Lines} \input{chapters/Background_Lines} @@ -30,31 +30,54 @@ Splines are continuous curves formed from piecewise polynomial segments. A polyn A straight line is simply a polynomial of $1$st degree. Splines may be rasterised by sampling $y(x)$ at a number of points $x_i$ and rendering straight lines between $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$ as discussed in Section \ref{Straight Lines}. More direct algorithms for drawing splines based upon Bresenham and Wu's algorithms also exist\cite{citationneeded}. -There are many different ways to define a spline. One approach is to specify ``knots'' on the spline and solve for the cooefficients to generate a cubic spline ($n = 3$) passing through the points. Alternatively, there are many ways to specify a spline using ``control'' points which themselves are not part of the curve; these are convenient for graphical based editors. - +There are many different ways to define a spline. One approach is to specify ``knots'' on the spline and solve for the coefficients to generate a cubic spline ($n = 3$) passing through the points. Alternatively, special polynomials may be defined using ``control'' points which themselves are not part of the curve; these are convenient for graphical editors. 
Bezier splines are the most straightforward way to define a curve in the standards considered in Section \ref{Document Representations}. \subsubsection{Bezier Curves} \input{chapters/Background_Bezier} +\subsection{Font Rendering} + +Donald Knuth's 1986 textbook ``Metafont'' blargh + + + \subsection{Shading} Algorithms for shading on vector displays involved drawing equally spaced lines in the region with endpoints defined by the boundaries of the region\cite{brassel1979analgorithm}. Apart from being unrealistic, these techniques required a computationally expensive sorting of vertices\cite{lane1983analgorithm}. On raster displays, shading is typically based upon Lane's algorithm of 1983\cite{lane1983analgorithm}. Lane's algorithm relies on the ability to ``subtract'' fill from a region. This algorithm is now implemented in the GPU \rephrase{stencil buffer-y and... stuff} \cite{kilgard2012gpu} -\subsection{Compositing} -\input{chapters/Background_Compositing} +\subsection{Compositing and the Painter's Model}\label{Compositing and the Painter's Model} + +So far we have discussed techniques for rendering vector graphics primitives in isolation, with no regard to the overall structure of a document which may contain many thousands of primitives. A straightforward approach would be to render all elements sequentially to the display, with the most recently drawn pixels overwriting lower elements. Such an approach is particularly inconvenient for anti-aliased images where colours must appear to smoothly blur between the edge of a primitive and any drawn underneath it. + +Colour raster displays are based on an additive red-green-blue $(r,g,b)$ colour representation which matches the human eye's response to light\cite{computergraphics2}. In 1984, Porter and Duff introduced a fourth colour channel for rasterised images called the ``alpha'' channel, analogous to the opacity of a pixel\cite{porter1984compositing}. 
In compositing models, elements can be rendered separately, with the four colour channels of successively drawn elements being combined according to one of several possible operations. + +In the ``painter's model'' as described by the SVG standard, Porter and Duff's ``over'' operation is used when rendering one primitive over another\cite{svg2011-1.1}. +Given an existing pixel $P_1$ with colour values $(r_1, g_1, b_1, a_1)$ and a pixel $P_2$ with colours $(r_2, g_2, b_2, a_2)$ to be painted over $P_1$, where the colour components are premultiplied by alpha, the resultant pixel $P_T$ has colours given by: +\begin{align} + a_T &= 1 - (1-a_1)(1-a_2) \\ + r_T &= (1 - a_2)r_1 + r_2 \quad \text{(similar for $g_T$ and $b_T$)} +\end{align} +It should be apparent that alpha values of $1$ correspond to an opaque pixel; that is, when $a_2 = 1$ the resultant pixel $P_T$ is the same as $P_2$. +When the final pixel is actually drawn on an RGB display, the $(r, g, b)$ components are $(r_T/a_T, g_T/a_T, b_T/a_T)$. + +The PostScript and PDF standards, as well as the OpenGL API, also use a painter's model for compositing. However, PostScript does not include an alpha channel, so $P_T = P_2$ always\cite{plrm}. Figure \ref{SVG} illustrates the painter's model for partially transparent shapes as they would appear in both the SVG and PDF models. + +\subsection{Rasterisation on the CPU and GPU} Traditionally, vector graphics have been rasterized by the CPU before being sent to the GPU for drawing\cite{kilgard2012gpu}. A number of authors have sought to change this \cite{worth2003xr, loop2007rendering, rice2008openvg, kilgard2012gpu, green2007improved}. \rephrase{2. Here are the ways documents are structured ... we got here eventually} -\section{Document Representations} +\section{Document Representations}\label{Document Representations} The representation of information, particularly for scientific purposes, has changed dramatically over the last few decades. 
For example, Brassel's 1979 paper referenced earlier was produced on a mechanical typewriter. Although the paper discusses an algorithm for shading on computer displays, the figures illustrating this algorithm were not generated by a computer, but drawn by Brassel's assistant\cite{brassel1979analgorithm}. In contrast, modern papers such as Barnes et al.'s recent paper on embedding 3D images in PDF documents\cite{barnes2013embeddding} can themselves be an interactive proof of concept. -Hayes' 2012 article ``Pixels or Perish'' discusses the recent history and current state of the art in documents for scientific publications\cite{hayes2012pixels}. Hayes argued that there are currently two different approaches to representing documents although the line between these two philosophies is being blurred. We shall restrict ourselves to considering the standards discussed by Hayes. +In this section we will consider various approaches to, and motivations for, specifying the structure and appearance of a document, including: early interpreted formats (PostScript, \TeX, DVI), the Document Object Model popular in standards for web based documents (HTML, SVG), and Adobe's ubiquitous Portable Document Format (PDF). Some of these formats were discussed in a recent paper ``Pixels Or Perish'' by Hayes\cite{hayes2012pixelsor} who argues for greater interactivity in the PDF standard. 
+ +\subsection{Interpreted Document Formats} +\input{chapters/Background_Interpreted} -\subsection{Interpreted Model} \begin{itemize} \item This model treats a document as the source code program which produces graphics @@ -70,28 +93,12 @@ Hayes' 2012 article ``Pixels or Perish'' discusses the recent history and curren \item Problems with security --- Turing complete, can be exploited easily \end{itemize} -\subsection{Crippled Interpreted Model} - -\rephrase{I'm pretty sure I made that one up} - -\begin{itemize} - \item PDF is PostScript but without the Turing Completeness - \item Solves security issues, more efficient -\end{itemize} - -\subsection{Document Object Model} - -The Document Object Model (DOM) represents a document as a tree like data structure with the document as a root node. The elements of the document are represented as children of either this root node or of a parent element. In addition, elements may have attributes which contain information about that particular element. - -The DOM is described by the W3C XML (extensible markup language) standard\cite{citationneeded}. XML itself is a general language which is intended for representing any tree-like structure using the DOM, whilst languages such as HTML\cite{citationneeded} and SVG\cite{citationneeded} are specific XML based languages for representing visual information. +\pagebreak +\subsection{Document Object Model}\label{Document Object Model} +\input{chapters/Background_DOM} -The HyperText Markup Language (HTML) was, as its name implies, originally intended mostly for text. When combined with Cascading Style Sheets (CSS) control over the positioning and style of the text can be acheived. Images stored in some other format can be rendered within a HTML document, but HTML does not include ways to specify graphics primitives or their coordinates. +\subsection{The Portable Document Format} -The Scalable Vector Graphics (SVG) standard was designed to represent a vector image. 
In the SVG standard, each graphics primitive is an element in the DOM, whilst attributes of the element give information about how the primitive is to be drawn, such as path coordinates, line thickness, mitre styles and fill colours. - -\subsubsection{Modifying the DOM --- Javascript} - -Javascript is now ubiquitous in web based documents, and is essentially used to make the DOM interactive. This can be done by altering the attributes of elements, or adding and removing elements from the DOM, in response to some event such as user input or communication with a HTTP server. In the HTML5 standard, it is also possible to draw directly to a region of the document defined by the \verb// tag; as Hayes points out, this is similar to the use of PostScript in specifying the appearance of a document using low level drawing operators which are read by an interpreter. \subsection{Scientific Computation Packages} @@ -109,14 +116,20 @@ The document and the code that produces it are one and the same. \section{Precision in Modern Document Formats} -We briefly summarise the requirements of standard document formats in regards to the precision of number representations: +We briefly summarise the requirements of the standards discussed so far in regards to the precision of mathematical operations: \begin{itemize} \item {\bf PostScript} predates the IEEE-754 standard and originally specified a floating point representation with ? bits of exponent and ? bits of mantissa. Version ? of the PostScript standard changed to specify IEEE-754 binary32 ``single precision'' floats. \item {\bf PDF} has also specified IEEE-754 binary32 since version ?. Importantly, the standard states that this is a \emph{maximum} precision; documents created with higher precision would not be viewable in Adobe Reader. 
\item {\bf SVG} specifies a minimum of IEEE-754 binary32 but recommends more bits be used internally \item {\bf Javascript} uses binary64 floats for all numeric values, and does not distinguish between integers and floats. + \item {\bf Python} uses binary64 floats + \item {\bf Matlab} uses binary64 floats + \item {\bf Mathematica} uses some kind of terrifying symbolic / arbitrary float combination + \item {\bf Maple} is similar but by many accounts horribly broken + \end{itemize} + \rephrase{4. Here is IEEE-754 which is what these standards use} \section{Real Number Representations} @@ -125,7 +138,13 @@ We have found that PostScript, PDF, and SVG document standards all restrict them In this section we will begin by investigating floating point numbers as defined in the IEEE standard and their limitations. We will then consider alternative number representations including fixed point numbers, arbitrary precision floats, rational numbers, p-adic numbers and symbolic representations. \rephrase{Oh god I am still writing about IEEE floats let alone all those other things} -\subsection{Floating Point} +\rephrase{Reorder to start with Integers, General Floats, then go to IEEE, then other things} + +\subsection{IEEE Floating Points} + +Although the concept of a floating point representation has been attributed to various early computer scientists including Charles Babbage\cite{citationneeded}, it is widely accepted that William Kahan and his colleagues working on the IEEE-754 standard in the 1980s are the ``fathers of modern floating point computation''\cite{citationneeded}. The original IEEE-754 standard specified the encoding, number of bits, rounding methods, and maximum acceptable errors for the basic floating point operations for base $B = 2$ floats. 
It also specifies ``exceptions'' --- mechanisms by which a program can detect an error such as division by zero\footnote{Kahan has argued that exceptions in IEEE-754 are conceptually different to Exceptions as defined in several programming languages including C++ and Java. An IEEE exception is intended to prevent an error by its detection, whilst an exception in those languages is used to indicate an error has already occurred\cite{}}. We will restrict ourselves to considering $B = 2$, since it was found that this base in general gives the smallest rounding errors\cite{HFP}, although it is worth noting that different choices of base had been used historically\cite{goldberg1991whatevery}, and IEEE-854, and later the revised IEEE-754 standard, specify a decimal representation $B = 10$ intended for use in financial applications. + +\subsection{Floating Point Definition} A floating point number $x$ is commonly represented by a tuple of integers $(s, e, m)$ in base $B$ as\cite{HFP, ieee2008-754}: @@ -136,31 +155,35 @@ A floating point number $x$ is commonly represented by a tuple of integers $(s, Where $s$ is the sign and may be zero or one, $m$ is commonly called the ``mantissa'' and $e$ is the exponent. The name ``floating point'' refers to the equivalence of the $\times B^e$ operation to a shifting of the radix point along the mantissa. This contrasts with a ``fixed point'' representation where $x$ is the sum of two fixed size numbers representing the integer and fractional part. - - -\subsection{The IEEE-754 Standard} - -Although the concept of a floating point representation has been attributed to various early computer scientists including Charles Babbage\cite{citationneeded}, it is widely accepted that William Kahan and his colleagues working on the IEEE-754 standard in the 1980s are the ``fathers of modern floating point computation''\cite{citationneeded}. 
The IEEE standard specifies the encoding, number of bits, rounding methods, and maximum acceptable errors for the basic floating point operations. It also specifies ``exceptions'' --- mechanisms by which a program can detect an error such as division by zero. - -In the IEEE-754 standard, for a base of $B = 2$, numbers are encoded in continuous memory by a fixed number of bits, with $s$ occupying 1 bit, followed by $e$ and $m$ occupying a number of bits specified by the precision; 5 and 10 for a binary16 or ``half precision'' float, 8 and 23 for a binary32 or ``single precision'' and 15 and 52 for a binary64 or ``double precision'' float\cite{HFP, ieee2008-754}. The IEEE-754 standard also specifies a base 10 encoding (useful in financial software\cite{citationneeded}), but since this is subject to similar limitations, we will restrict ourselves to the simpler base 2 encodings. +In the IEEE-754 standard, for a base of $B = 2$, numbers are encoded in contiguous memory by a fixed number of bits, with $s$ occupying 1 bit, followed by $e$ and $m$ occupying a number of bits specified by the precision; 5 and 10 for a binary16 or ``half precision'' float, 8 and 23 for a binary32 or ``single precision'' float, and 11 and 52 for a binary64 or ``double precision'' float\cite{HFP, ieee2008-754}. \subsection{Precision and Rounding} -Real values which cannot be represented exactly in a floating point representation must be rounded. The results of a floating point operation may be such values and thus there is a rounding error possible in any floating point operation. Goldberg's assertively titled 1991 paper ``What Every Computer Scientist Needs to Know about Floating Point Arithmetic'' provides a comprehensive overview of issues in floating point arithmetic and relates these to the 1984 version of the IEEE-754 standard\cite{goldberg1991whatevery}. 
More recently, after the release of the revised IEEE-754 standard, +Real values which cannot be represented exactly in a floating point representation must be rounded. The results of a floating point operation will in general be such values and thus there is a rounding error possible in any floating point operation. Goldberg's assertively titled 1991 paper ``What Every Computer Scientist Should Know About Floating-Point Arithmetic'' provides a comprehensive overview of issues in floating point arithmetic and relates these to the 1985 version of the IEEE-754 standard\cite{goldberg1991whatevery}. More recently, after the release of the revised IEEE-754 standard in 2008, a textbook ``Handbook Of Floating Point Arithmetic'' has been published which provides a thorough review of literature relating to floating point arithmetic in both software and hardware\cite{HFP}. -Figure \ref{minifloat.pdf} shows the real numbers which can be represented exactly by an 8 bit base $B = 2$ floating point number; and illustrates that a set of fixed precision floating point numbers forms a discrete approximation of the reals. There are only $2^8 = 256$ numbers in this set, which means it is easier to see some of the properties of floats that would be unclear using one of the IEEE-754 encodings. The first set of points corresponds to using $(1,2,5)$ to encode $(s,e,m)$ whilst the second set of points corresponds to a $(1,3,4)$ encoding. This allows us to see the trade off between the precision and range of real values represented. +Figure \ref{minifloat.pdf} shows the positive real numbers which can be represented exactly by an 8 bit base $B = 2$ floating point number, and illustrates that a set of fixed precision floating point numbers forms a discrete approximation of the reals. There are only $2^7 = 128$ numbers in this set, which means it is easier to see some of the properties of floats that would be unclear using one of the IEEE-754 encodings. 
The first set of points corresponds to using 2 and 5 bits to encode $e$ and $m$ whilst the second set of points corresponds to a 3 and 4 bit encoding. This allows us to see the trade-off between the precision and range of real values represented. + +\begin{figure}[H] + \centering + \includegraphics[width=0.8\textwidth]{figures/minifloat.pdf} \\ + \includegraphics[width=0.8\textwidth]{figures/minifloat_diff.pdf} + \caption{The mapping of 8 bit floats to reals} +\end{figure} \subsection{Floating Point Operations} Floating point operations can in principle be performed using integer operations, but specialised Floating Point Units (FPUs) are an almost universal component of modern processors\cite{citationneeded}. The improvement of FPUs remains an active area of research in several respects including: efficiency\cite{seidel2001onthe}; accuracy of operations\cite{dieter2007lowcost}; and even the adaptation of algorithms originally used in software for reducing the overall error of a sequence of operations\cite{kadric2013accurate}. In this section we will consider the algorithms for floating point operations without focusing on the hardware implementation of these algorithms. -\subsection{Limitations Imposed By CPU} +\subsection{Some sort of Example(s) or Floating Point Mayhem} + +\rephrase{Eg: $f(x) = |x|$ calculated from sqrt and squaring} -CPU's are restricted in their representation of floating point numbers by the IEEE standard. 
+\rephrase{Eg: Massive rounding errors from calculatepi} +\rephrase{Eg: Actual graphics things :S} \subsection{Limitations Imposed By Graphics APIs and/or GPUs} @@ -174,6 +197,7 @@ Traditionally algorithms for drawing vector graphics are performed on the CPU; t \item OpenGL standards specify: binary16, binary32, binary64 \item OpenVG aims to become a standard API for SVG viewers but the API only uses binary32 and hardware implementations may use less than this internally\cite{rice2008openvg} \item It seems that IEEE has not been entirely successful; although all modern CPUs and GPUs are able to read and write IEEE floating point types, many do not conform to the IEEE standard in how they represent floating point numbers internally. + \item \rephrase{Blog post alert} \url{https://dolphin-emu.org/blog/2014/03/15/pixel-processing-problems/} \end{itemize}