From: Sam Moore Date: Sun, 18 May 2014 14:07:35 +0000 (+0800) Subject: Remove dot points add waffles X-Git-Url: https://git.ucc.asn.au/?a=commitdiff_plain;h=5e4b9d22c1d1077d1179b5ee20c55e8662ea723a;p=ipdf%2Fsam.git Remove dot points add waffles :-( There is still so much missing it's frankly starting to get a bit terrifying. --- diff --git a/Makefile b/Makefile index 74caf83..788633c 100644 --- a/Makefile +++ b/Makefile @@ -18,7 +18,7 @@ $(NAME).pdf : $(NAME).tex $(TEX) --shell-escape $(NAME) $(TEX) --shell-escape $(NAME) - silent atril $(NAME).pdf + silent evince $(NAME).pdf rm -f *.bbl *.log *.toc *.lof *.blg *.lot *.aux *.out diff --git a/chapters/Background.tex b/chapters/Background.tex index dbb6f26..e5720ae 100644 --- a/chapters/Background.tex +++ b/chapters/Background.tex @@ -1,20 +1,13 @@ \chapter{Literature Review}\label{Background} -\rephrase{0. Here is a brilliant summary of the sections below} +The first half of this chapter will be devoted to documents themselves, including: the representation and displaying of graphics primitives\cite{computergraphics2}, and how collections of these primitives are represented in document formats, focusing on widely used standards\cite{plrm, pdfref17, svg2011-1.1}. -This chapter provides an overview of relevant literature. The areas of interest can be broadly grouped into two largely separate categories; Documents and Number Representations. +We will find that although there has been a great deal of research into the rendering, storing, editing, manipulation, and extension of document formats, modern standards are content to specify at best single precision IEEE-754 floating point arithmetic. -The first half of this chapter will be devoted to documents themselves, including: the representation and displaying of graphics primitives\cite{computergraphics2}, and how collections of these primitives are represented in document formats, focusing on well known standards currently in use\cite{plrm, pdfref17, svg2011-1.1}. 
- -We will find that although there has been a great deal of research into the rendering, storing, editing, manipulation, and extension of document formats, these widely used document standards are content to specify at best a single precision IEEE-754 floating point number representations. - -The research on arbitrary precision arithmetic applied to documents is very sparse; however arbitrary precision arithmetic itself is a very active field of research. Therefore, the second half of this chapter will be devoted to considering the IEEE-754 standard, its advantages and limitations, and possible alternative number representations to allow for arbitrary precision arithmetic. +The research on arbitrary precision arithmetic applied to documents is very sparse; however arbitrary precision arithmetic itself is a very active field of research. Therefore, the second half of this chapter will be devoted to considering fixed precision floating point numbers as specified by the IEEE-754 standard, possible limitations in precision, and alternative number representations for increased or arbitrary precision arithmetic. In Chapter \ref{Progress}, we will discuss our findings so far with regard to arbitrary precision arithmetic applied to document formats, and expand upon the goals outlined in Chapter \ref{Proposal}. - -\pagebreak - \section{Raster and Vector Images}\label{Raster and Vector Images} \input{chapters/Background_Raster-vs-Vector} @@ -22,7 +15,7 @@ In Chapter \ref{Progress}, we will discuss our findings so far with regards to a Throughout Section \ref{vector-vs-raster-graphics} we were careful to refer to ``modern'' display devices, which are raster based. It is of some historical significance that vector display devices were popular during the 70s and 80s, and papers oriented towards drawing on these devices can be found\cite{brassel1979analgorithm}.
Whilst curves can be drawn at high resolution on vector displays, a major disadvantage was shading; by the early 90s the vast majority of computer displays were raster based\cite{computergraphics2}. -Hearn and Baker's textbook ``Computer Graphics''\cite{computergraphics2} gives a comprehensive overview of graphics from physical display technologies through fundamental drawing algorithms to popular graphics APIs. This section will examine algorithms for drawing two dimensional geometric primitives on raster displays as discussed in ``Computer Graphics'' and the relevant literature. Informal tutorials are abundant on the internet\cite{elias2000graphics}. +Hearn and Baker's textbook ``Computer Graphics''\cite{computergraphics2} gives a comprehensive overview of graphics from physical display technologies through fundamental drawing algorithms to popular graphics APIs. This section will examine algorithms for drawing two dimensional geometric primitives on raster displays as discussed in ``Computer Graphics'' and the relevant literature. Informal tutorials are abundant on the internet\cite{elias2000graphics}. This section is by no means a comprehensive survey of the literature but intends to provide some idea of the computations which are required to render a document. \subsection{Straight Lines}\label{Straight Lines} \input{chapters/Background_Lines} @@ -37,47 +30,43 @@ Splines are continuous curves formed from piecewise polynomial segments. A polyn A straight line is simply a polynomial of $1$st degree. Splines may be rasterised by sampling $y(x)$ at a number of points $x_i$ and rendering straight lines between $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$ as discussed in Section \ref{Straight Lines}. More direct algorithms for drawing splines based upon Bresenham and Wu's algorithms also exist\cite{citationneeded}. -There are many different ways to define a spline.
One approach is to specify ``knots'' on the spline and solve for the cooefficients to generate a cubic spline ($n = 3$) passing through the points. Beziers are a popular spline which can be created in GUI based graphics editors using several ``control points'' which themselves are not part of the curve. +There are many different ways to define a spline. One approach is to specify ``knots'' on the spline and solve for the coefficients to generate a cubic spline ($n = 3$) passing through the points. Alternatively, there are many ways to specify a spline using ``control'' points which themselves are not part of the curve; these are convenient for graphical editors. \subsubsection{Bezier Curves} \input{chapters/Background_Bezier} \subsection{Shading} -Algorithms for shading on vector displays involved drawing equally spaced lines within a region; this is limited both in the complexity of shading and the performance required to compute the lines\cite{brassel1979analgorithm}. +Algorithms for shading on vector displays involved drawing equally spaced lines in the region with endpoints defined by the boundaries of the region\cite{brassel1979analgorithm}. Apart from being unrealistic, these techniques required a computationally expensive sorting of vertices\cite{lane1983analgorithm}. -On raster displays, shading is typically based upon Lane's algorithm of 1983\cite{lane1983analgorithm} which is implemented in the GPU \cite{kilgard2012gpu} +On raster displays, shading is typically based upon Lane's algorithm of 1983\cite{lane1983analgorithm}. Lane's algorithm relies on the ability to ``subtract'' fill from a region. This algorithm is now implemented in the GPU \rephrase{stencil buffer-y and... stuff} \cite{kilgard2012gpu} -\rephrase{6. Sort of starts here...
or at least background does} +\subsection{Compositing} +\input{chapters/Background_Compositing} -\subsection{Rendering Vector Graphics on the GPU} - -Traditionally, vector graphics have been rasterized by the CPU before being sent to the GPU for drawing\cite{kilgard2012gpu}. Lots of people would like to change this \cite{worth2003xr, loop2007rendering, rice2008openvg, kilgard2012gpu, green2007improved} ... \rephrase{All of these are things David found except kilgard which I thought I found and then realised David already had it :S} +Traditionally, vector graphics have been rasterized by the CPU before being sent to the GPU for drawing\cite{kilgard2012gpu}. Several authors have instead proposed techniques for rasterising vector graphics directly on the GPU \cite{worth2003xr, loop2007rendering, rice2008openvg, kilgard2012gpu, green2007improved}. \rephrase{2. Here are the ways documents are structured ... we got here eventually} \section{Document Representations} -\rephrase{The file format can be either human readable\footnote{For some definition of human and some definition of readable} or binary\footnote{So, our viewer is basically a DOM style but stored in a binary format}. Can also be compressed or not. Here we are interested in how the document is interpreted or traversed in order to produce graphics output.} +The representation of information, particularly for scientific purposes, has changed dramatically over the last few decades. For example, Brassel's 1979 paper referenced earlier was produced on a mechanical typewriter. Although the paper discusses an algorithm for shading on computer displays, the figures illustrating this algorithm were not generated by a computer, but drawn by Brassel's assistant\cite{brassel1979analgorithm}. In contrast, modern papers such as Barnes et al.'s recent paper on embedding 3D images in PDF documents\cite{barnes2013embeddding} can themselves be an interactive proof of concept.
-\subsection{Interpreted Model} +Hayes' 2012 article ``Pixels or Perish'' discusses the recent history and current state of the art in documents for scientific publications\cite{hayes2012pixels}. Hayes argued that there are currently two different approaches to representing documents although the line between these two philosophies is being blurred. We shall restrict ourselves to considering the standards discussed by Hayes. -\rephrase{Did I just invent that terminology or did I read it in a paper? Is there actually existing terminology for this that sounds similar enough to ``Document Object Model'' for me to compare them side by side?} +\subsection{Interpreted Model} \begin{itemize} \item This model treats a document as the source code program which produces graphics \item Arose from the desire to produce printed documents using computers (which were still limited to text only displays). \item Typed by hand or (later) generated by a GUI program \item PostScript --- largely supersceded by PDF on the desktop but still used by printers\footnote{Desktop pdf viewers can still cope with PS, but I wonder if a smartphone pdf viewer would implement it?} - \item \TeX --- Predates PostScript! {\LaTeX } is being used to create this very document and until now I didn't even have it here! 
+ \item \TeX --- Predates PostScript, similar idea \begin{itemize} - \item I don't really want to go down the path of investigating the billion steps involved in getting \LaTeX into an actually viewable format - \item There are interpreters (usually WYSIWYG editors) for \LaTeX though \item Maybe if \LaTeX were more popular there would be desktop viewers that converted \LaTeX directly into graphics \end{itemize} \item Potential for dynamic content, interactivity; dynamic PostScript, enhanced Postscript - \item Scientific Computing --- Mathematica, Matlab, IPython Notebook --- The document and the code that produces it are stored together \item Problems with security --- Turing complete, can be exploited easily \end{itemize} @@ -92,104 +81,81 @@ Traditionally, vector graphics have been rasterized by the CPU before being sent \subsection{Document Object Model} -\begin{itemize} - \item DOM = Tree of nodes; node may have attributes, children, data - \item XML (SGML) is the standard language used to represent documents in the DOM - \item XML is plain text - \item SVG is a standard for a vector graphics language conforming to XML (ie: a DOM format) - \item CSS style sheets represent more complicated styling on objects in the DOM -\end{itemize} - -\subsection{Blurring the Line --- Javascript} - -\begin{itemize} - \item The document is expressed in DOM format using XML/HTML/SVG - \item A Javascript program is run which can modify the DOM - \item At a high level this may be simply changing attributes of elements dynamically - \item For low level control there is canvas2D and even WebGL which gives direct access to OpenGL functions - \item Javascript can be used to make a HTML/SVG interactive - \begin{itemize} - \item Overlooking the fact that the SVG standard already allows for interactive elements... 
- \end{itemize} - \item Javascript is now becoming used even in desktop environments and programs (Windows 8, GNOME 3, Cinnamon, Game Maker Studio) ({\bf shudder}) - \item There are also a range of papers about including Javascript in PDF ``Pixels or Perish'' being the only one we have actually read\cite{hayes2012pixels} - \begin{itemize} - \item I have no idea how this works; PDF is based on PostScript... it seems very circular to be using a programming language to modify a document that is modelled on being a (non turing complete) program - \item This is yet more proof that people will converge towards solutions that ``work'' rather than those that are optimal or elegant - \item I guess it's too much effort to make HTML look like PDF (or vice versa) so we could phase one out - \end{itemize} -\end{itemize} +The Document Object Model (DOM) represents a document as a tree like data structure with the document as a root node. The elements of the document are represented as children of either this root node or of a parent element. In addition, elements may have attributes which contain information about that particular element. -\subsection{Why do we still use static PDFs} +The DOM is described by the W3C XML (extensible markup language) standard\cite{citationneeded}. XML itself is a general language which is intended for representing any tree-like structure using the DOM, whilst languages such as HTML\cite{citationneeded} and SVG\cite{citationneeded} are specific XML based languages for representing visual information. -Despite their limitations, we still use static, boring old PDFs. Particularly in scientific communication. -\begin{itemize} - \item They are portable; you can write an amazing document in Mathematica/Matlab but it - \item Scientific journals would need to adapt to other formats and this is not worth the effort - \item No network connection is required to view a PDF (although DRM might change this?) 
- \item All rescources are stored in a single file; a website is stored accross many separate files (call this a ``distributed'' document format?) - \item You can create PDFs easily using desktop processing WYSIWYG editors; WYSIWYG editors for web based documents are worthless due to the more complex content - \item Until Javascript becomes part of the PDF standard, including Javascript in PDF documents will not become widespread - \item Once you complicate a PDF by adding Javascript, it becomes more complicated to create; it is simply easier to use a series of static figures than to embed a shader in your document. Even for people that know WebGL. -\end{itemize} +The HyperText Markup Language (HTML) was, as its name implies, originally intended mostly for text. When combined with Cascading Style Sheets (CSS), control over the positioning and style of the text can be achieved. Images stored in some other format can be rendered within a HTML document, but HTML does not include ways to specify graphics primitives or their coordinates. -\rephrase{3. Here are the ways document standards specify precision (or don't)} +The Scalable Vector Graphics (SVG) standard was designed to represent a vector image. In the SVG standard, each graphics primitive is an element in the DOM, whilst attributes of the element give information about how the primitive is to be drawn, such as path coordinates, line thickness, mitre styles and fill colours. -\section{Precision in Modern Document Formats} +\subsubsection{Modifying the DOM --- Javascript} -\rephrase{All the above is very interesting and provides important context, but it is not actually directly related to the problem of infinite precision which we are going to try and solve. Sorry to make you read it all.} +Javascript is now ubiquitous in web based documents, and is essentially used to make the DOM interactive.
This can be done by altering the attributes of elements, or adding and removing elements from the DOM, in response to some event such as user input or communication with a HTTP server. In the HTML5 standard, it is also possible to draw directly to a region of the document defined by the \verb/<canvas>/ tag; as Hayes points out, this is similar to the use of PostScript in specifying the appearance of a document using low level drawing operators which are read by an interpreter. +\subsection{Scientific Computation Packages} +The document and the code that produces it are one and the same. \begin{itemize} - \item Implementations of PostScript and PDF must by definition restrict themselves to IEEE binary32 ``single precision''\footnote{The original IEEE-754 defined single, double and extended precisions; in the revision these were renamed to binary32, binary64 and binary128 to explicitly state the base and number of bits} - floating point number representations in order to conform to the standards\cite{plrm, pdfref17}. - \item Implementations of SVG are by definition required to use IEEE binary32 as a {\bf minimum}. ``High Quality'' SVG viewers are required to use at least IEEE binary64.\cite{svg2011-1.1} \item Numerical computation packages such as Mathematica and Maple use arbitrary precision floats \begin{itemize} \item Mathematica is not open source which is an issue when publishing scientific research (because people who do not fork out money for Mathematica cannot verify results) \item What about Maple? \cite{HFP} and \cite{fousse2007mpfr} both mention it being buggy. \item Octave and Matlab use fixed precision doubles \end{itemize} + \item IPython is pretty cool guys \end{itemize} -The use of IEEE binary32 floats in the PostScript and PDF standards is not surprising if we consider that these documents are oriented towards representing static pages. They don't actually need higher precision to do this; 32 bits is more than sufficient for A4 paper.
+\section{Precision in Modern Document Formats} + +We briefly summarise the requirements of standard document formats with regard to the precision of number representations: +\begin{itemize} + \item {\bf PostScript} predates the IEEE-754 standard and originally specified a floating point representation with ? bits of exponent and ? bits of mantissa. Version ? of the PostScript standard changed to specify IEEE-754 binary32 ``single precision'' floats. + \item {\bf PDF} has also specified IEEE-754 binary32 since version ?. Importantly, the standard states that this is a \emph{maximum} precision; documents created with higher precision would not be viewable in Adobe Reader. + \item {\bf SVG} specifies a minimum of IEEE-754 binary32 but recommends more bits be used internally. + \item {\bf Javascript} uses binary64 floats for all operations, and does not distinguish between integers and floats. +\end{itemize} \rephrase{4. Here is IEEE-754 which is what these standards use} -\section{Representation of Numbers} +\section{Real Number Representations} -Although this project has been motivated by a desire for more flexible document formats, the fundamental source of limited precision in vector document formats is the restriction to IEEE floating point numbers for representation of coordinates. +We have found that PostScript, PDF, and SVG document standards all restrict themselves to IEEE floating point number representations of coordinates. This is unsurprising as the IEEE standard has been successfully adopted almost universally by hardware manufacturers and programming language standards since the early 1990s. In the traditional view of a document as a static, finite sheet of paper, there is little motivation for enhanced precision.
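As a rough illustration of why binary32 suffices for a static page, the following estimate assumes that coordinates are stored in PostScript points ($1/72$ inch) and that an A4 page is $595 \times 842$ points; both are assumptions made for this sketch rather than statements taken from the standards cited above. A binary32 float has a 24 bit significand (23 stored bits plus an implicit leading bit), so for a coordinate $y$ with $512 \leq y < 1024$ the spacing between adjacent representable values is:
\begin{align*}
 \Delta y &= 2^{9 - 23} = 2^{-14} \approx 6.1 \times 10^{-5} \text{ points} \\
 &\approx 6.1 \times 10^{-5} \times 0.353 \text{ mm} \approx 2.2 \times 10^{-5} \text{ mm}
\end{align*}
This is roughly 22 nanometres, well below what a printer or display can resolve on a single page; under these assumptions the limited precision would only be expected to matter once a view is scaled by many orders of magnitude.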
-Whilst David Gow will be focusing on structures \rephrase{and the use of multiple coordinate systems} to represent a document so as to avoid or reduce these limitations\cite{proposalGow}, the focus of our own research will be \rephrase{increased precision in the representation of real numbers so as to get away with using a single global coordinate system}. +In this section we will begin by investigating floating point numbers as defined in the IEEE standard and their limitations. We will then consider alternative number representations including fixed point numbers, arbitrary precision floats, rational numbers, p-adic numbers and symbolic representations. \rephrase{Oh god I am still writing about IEEE floats let alone all those other things} -\subsection{The IEEE Standard} +\subsection{Floating Point} -\subsection{Floating Point Number Representations} +A floating point number $x$ is commonly represented by a tuple of integers $(s, e, m)$ in base $B$ as\cite{HFP, ieee2008-754}: \begin{align*} x &= (-1)^{s} \times m \times B^{e} \end{align*} -$B = 2$, although IEEE also defines decimal representations for $B = 10$ --- these are useful in financial software\cite{ieee2008-754}. +where $s$ is the sign and may be zero or one, $m$ is commonly called the ``mantissa'' and $e$ is the exponent. +The name ``floating point'' refers to the equivalence of the $\times B^e$ operation to a shift of the radix point along the mantissa. This contrasts with a ``fixed point'' representation where $x$ is the sum of two fixed size numbers representing the integer and fractional part.
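As a small worked example of this representation (the value $-6.5$ and the choice of $B = 2$ are arbitrary and used only for illustration), take $s = 1$, $m = 13$ and $e = -1$:
\begin{align*}
 x &= (-1)^{1} \times 13 \times 2^{-1} = -6.5
\end{align*}
In binary, $m = 1101_2$, and multiplying by $2^{-1}$ moves the point one place to the left to give $110.1_2 = 4 + 2 + 0.5 = 6.5$. The same value is also given by $m = 26$, $e = -2$, so the tuple is not unique unless $m$ is normalised in some way.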
+ -\rephrase{Aside: Are decimal representations for a document format eg: CAD also useful because you can then use metric coordinate systems?} +\subsection{The IEEE-754 Standard} -\subsubsection{Precision} +Although the concept of a floating point representation has been attributed to various early computer scientists including Charles Babbage\cite{citationneeded}, it is widely accepted that William Kahan and his colleagues working on the IEEE-754 standard in the 1980s are the ``fathers of modern floating point computation''\cite{citationneeded}. The IEEE standard specifies the encoding, number of bits, rounding methods, and maximum acceptable errors for the basic floating point operations. It also specifies ``exceptions'' --- mechanisms by which a program can detect an error such as division by zero. -The floats map an infinite set of real numbers onto a discrete set of representations. +In the IEEE-754 standard, for a base of $B = 2$, numbers are encoded in contiguous memory by a fixed number of bits, with $s$ occupying 1 bit, followed by $e$ and $m$ occupying a number of bits specified by the precision; 5 and 10 for a binary16 or ``half precision'' float, 8 and 23 for a binary32 or ``single precision'' float, and 11 and 52 for a binary64 or ``double precision'' float\cite{HFP, ieee2008-754}. The IEEE-754 standard also specifies a base 10 encoding (useful in financial software\cite{citationneeded}), but since this is subject to similar limitations, we will restrict ourselves to the simpler base 2 encodings. +\subsection{Precision and Rounding} -\rephrase{Figure: 8 bit ``minifloats'' (all 255 of them) clearly showing the ``precision vs range'' issue} +Real values which cannot be represented exactly in a floating point representation must be rounded. The results of a floating point operation may be such values and thus there is a rounding error possible in any floating point operation.
Goldberg's assertively titled 1991 paper ``What Every Computer Scientist Should Know About Floating-Point Arithmetic'' provides a comprehensive overview of issues in floating point arithmetic and relates these to the 1985 version of the IEEE-754 standard\cite{goldberg1991whatevery}. More recently, after the release of the revised IEEE-754 standard, -The most a result can be rounded in conversion to a floating point number is the units in last place; $m_{N} \times B^{e}$. +Figure \ref{minifloat.pdf} shows the real numbers which can be represented exactly by an 8 bit base $B = 2$ floating point number; and illustrates that a set of fixed precision floating point numbers forms a discrete approximation of the reals. There are only $2^8 = 256$ numbers in this set, which means it is easier to see some of the properties of floats that would be unclear using one of the IEEE-754 encodings. The first set of points corresponds to using $(1,2,5)$ to encode $(s,e,m)$ whilst the second set of points corresponds to a $(1,3,4)$ encoding. This allows us to see the trade off between the precision and range of real values represented. -\rephrase{Even though that paper that claims double is the best you will ever need because the error can be as much as the size of a bacterium relative to the distance to the moon}\cite{} \rephrase{there are many cases where increased number of bits will not save you}.\cite{HFP} +\subsection{Floating Point Operations} + +Floating point operations can in principle be performed using integer operations, but specialised Floating Point Units (FPUs) are an almost universal component of modern processors\cite{citationneeded}. The improvement of FPUs remains highly active in several areas including: efficiency\cite{seidel2001onthe}; accuracy of operations\cite{dieter2007lowcost}; and even the adaptation of algorithms originally used in software for reducing the overall error of a sequence of operations\cite{kadric2013accurate}.
In this section we will consider the algorithms for floating point operations without focusing on the hardware implementation of these algorithms. -\rephrase{5. Here are limitations of IEEE-754 floating point numbers on compatible hardware} \subsection{Limitations Imposed By CPU} @@ -214,11 +180,9 @@ Traditionally algorithms for drawing vector graphics are performed on the CPU; t \rephrase{7. Sod all that, let's just use an arbitrary precision library (AND THUS WE FINALLY GET TO THE POINT)} -\subsection{Alternate Number Representations} - -\rephrase{They exist\cite{HFP}}. +\subsection{Arbitrary Precision Floating Point Numbers} -Do it all using MFPR\cite{}, she'll be right. +An arbitrary precision floating point number simply uses extra bits to store extra precision. Do it all using MPFR\cite{fousse2007mpfr}, she'll be right. \rephrase{8. Here is a brilliant summary of sections 7- above} @@ -226,3 +190,5 @@ Dear reader, thankyou for your persistance in reading this mangled excuse for a Hopefully we have brought the radically different areas of interest together in some sort of coherent fashion. In the next chapter we will talk about how we have succeeded in rendering a rectangle. It will be fun. I am looking forward to it. +\rephrase{Oh dear this is not going well} + diff --git a/chapters/Background_Bezier.tex b/chapters/Background_Bezier.tex index 974f064..3652358 100644 --- a/chapters/Background_Bezier.tex +++ b/chapters/Background_Bezier.tex @@ -10,5 +10,5 @@ Figure \ref{bezier_3} shows a Bezier Curve defined by the points $\left\{(0,0), A straightforward algorithm for rendering Beziers is to simply sample $P(t)$ for some number of values of $t$ and connect the resulting points with straight lines using Bresenham or Wu's algorithm (See Section \ref{Straight Lines}). Whilst the performance of this algorithm is linear, in ???? De Casteljau derived a more efficient means of subdividing Beziers into line segments.
-Recently, Goldman presented an argument that Bezier's could be considered as fractal in nature, a fractal being the fixed point of an iterated function system\cite{goldman_thefractal}. Goldman's proof depends upon a modification to the De Casteljau Subdivision algorithm which expresses the subdivisions as an iterated function system. The cost of this modification is that the algorithm is no longer $O(n)$ but $O(n^2)$; although it is not explicitly stated by Goldman it seems clear that the modified algorithm is mainly of theoretical interest. +Recently, Goldman presented an argument that Beziers could be considered as fractal in nature, a fractal being the fixed point of an iterated function system\cite{goldman_thefractal}. Goldman's proof depends upon a modification to the De Casteljau Subdivision algorithm which expresses the subdivisions as an iterated function system. diff --git a/chapters/Background_Compositing.tex b/chapters/Background_Compositing.tex new file mode 100644 index 0000000..453fc7a --- /dev/null +++ b/chapters/Background_Compositing.tex @@ -0,0 +1 @@ +In 1984, Porter and Duff introduced Digital Compositing for raster images\cite{porter1984compositing}. diff --git a/chapters/Background_Lines.tex b/chapters/Background_Lines.tex index 4752a21..581a653 100644 --- a/chapters/Background_Lines.tex +++ b/chapters/Background_Lines.tex @@ -1,8 +1,8 @@ It is well known that in Cartesian coordinates, a line between points $(x_1, y_1)$ and $(x_2, y_2)$, can be described by: \begin{align} - y(x) &= m x + b\label{eqn_line} \quad \text{ on $x \in [x_1, x_2]$} \\ - \text{ for } & m = (y_2 - y_1)/(x_2 - x_1) \\ - \text{ and } & b + y(x) &= m x + c\label{eqn_line} \quad \text{ on $x \in [x_1, x_2]$} \\ + \text{ for } & m = \frac{(y_2 - y_1)}{(x_2 - x_1)} \\ + \text{ and } & c = y_1 - m x_1 \end{align} On a raster display, only points $(x,y)$ with integer coordinates can be displayed; however $m$ will generally not be an integer.
Thus a straightforward use of Equation \ref{eqn_line} will require costly floating point operations and rounding (see Section \ref{}). Modifications based on computing steps $\delta x$ and $\delta y$ eliminate the multiplication but are still less than ideal in terms of performance\cite{computergraphics2}. diff --git a/chapters/Background_Raster-vs-Vector.tex b/chapters/Background_Raster-vs-Vector.tex index d6018a9..c60b0ba 100644 --- a/chapters/Background_Raster-vs-Vector.tex +++ b/chapters/Background_Raster-vs-Vector.tex @@ -4,11 +4,12 @@ A raster image's structure closely matches its representation as shown on moder The drawback of raster images is that by their very nature there can only be one level of detail. Figures \ref{vector-vs-raster} and \ref{vector-vs-raster-scaled} attempt to illustrate this by comparing raster images to vector images in a similar way to Worth and Packard\cite{worth2003xr}. -Consider the right side of Figure \ref{vector-vs-raster}. This is a raster image which should be recognisable as an animal defined by fairly sharp edges. Figure \ref{vector-vs-raster-scaled} shows that zooming on the animal's face causes these edges to appear jagged. There is no information in the original image as to what should be displayed at a larger size, so each square shaped pixel is simply increased in size. A blurring effect will probably be visible in most PDF viewers; the software has attempted to make the ``edge'' appear more realistic using a technique called ``antialiasing'' (See Section \ref{Straight Lines}).\footnote{The exact appearance of the images at different zoom levels will depend greatly on the PDF viewer or printer used to display this report. On the author's display using the Atril (1.6.0) Document viewer, the top images appear to be pixel perfect mirror images at a 100\% scale.
In the bottom raster image, antialiasing is not applied at zoom levels above $125\%$ and the effect of scaling is quite noticeable.} +The right side of Figure \ref{vector-vs-raster} is a raster image which should be recognisable as an animal defined by fairly sharp edges. Figure \ref{vector-vs-raster-scaled} shows how these edges appear jagged when scaled. There is no information in the original image as to what should be displayed at a larger size, so each square shaped pixel is simply increased in size. A blurring effect will probably be visible in most PDF viewers; the software has attempted to make the ``edge'' appear more realistic using a technique called ``antialiasing''. +%(See Section \ref{Straight Lines}).\footnote{The exact appearance of the images at different zoom levels will depend greatly on the PDF viewer or printer used to display this report. On the author's display using the Atril (1.6.0) Document viewer, the top images appear to be pixel perfect mirror images at a 100\% scale. In the bottom raster image, antialiasing is not applied at zoom levels above $125\%$ and the effect of scaling is quite noticeable.} %\footnote{\noindent This behaviour may be configured in some PDF viewers (Adobe Reader) whilst others (Evince, Atril, Okular) will choose whether or not to bother with antialiasing based on the zoom level. For best results experiment with changing the zoom level in your PDF viewer.\footnotemark}\footnotetext{On the author's hardware, the animals in the vector and raster images should appear mirrored pixel for pixel; but they may vary slightly on other PDF viewers or display devices.} -In contrast, the left sides of Figures \ref{vector-vs-raster} and \ref{vector-vs-raster-scaled} are a vector image. A vector image contains information about a number of geometric shapes. 
To display this image on modern display hardware, the coordinates are transformed according to the view and \emph{then} the image is converted into a raster like representation. Whilst the raster image merely appears to contain edges, the vector image actually contains information about these edges, meaning they can be displayed ``infinitely sharply'' at any level of detail\cite{citationneeded} --- or they could be if the coordinates are stored with enough precision (see Section \ref{}). Thus, vector images are well suited to high quality digital art\footnote{Figure \ref{vector-vs-raster} is not to be taken as an example of this.} and text. +In contrast, the left sides of Figures \ref{vector-vs-raster} and \ref{vector-vs-raster-scaled} show a vector image. A vector image contains information about the positioning and shading of geometric shapes. To display this image on modern display hardware, coordinates are transformed according to the view and then the image is converted into a raster-like representation. Whilst the raster image merely appears to contain edges, the vector image actually contains information about these edges, meaning they can be displayed ``infinitely sharply'' at any level of detail\cite{citationneeded} --- or they could be if the coordinates were stored with enough precision (see Section \ref{}). Vector images are well suited to high quality digital art\footnote{Figure \ref{vector-vs-raster} is not to be taken as an example of this.} and text. \newlength\imageheight diff --git a/chapters/Progress.tex b/chapters/Progress.tex index 808e7c0..1aa5f1b 100644 --- a/chapters/Progress.tex +++ b/chapters/Progress.tex @@ -1,47 +1,89 @@ \chapter{Progress Report}\label{Progress Report} This chapter outlines the current state of our research in relation to the aims outlined in Chapter \ref{Introduction}. -\rephrase{It will serve as an explanation for where the Figures in Chapter \ref{Background} came from.
It will just be a short summary of the implementation details}. -\section{Development of Testbed Software} +\section{Literature Review} -We wrote a very simple OpenGL 1.1 program to experiment with, and then David Gow converted it to OpenGL 3.1 and I have no idea how it works anymore. +We have examined a range of literature that can be broadly classed into three different areas: +\begin{enumerate} + \item Rendering Vector Graphics + \item Representations of Vector Documents + \item Floating Point number representations +\end{enumerate} -\section{Design and Implementation of ``Tests''} +In summary, we have found: \begin{itemize} - \item Compile by swapping out \verb/main()/ for a tester - \item There are tests for doing some of the things in Chapter \ref{Introduction} but most still aren't written yet. - + \item Rasterisation of Vector Graphics is non-trivial but well understood + \item Traditionally rasterisation has been performed on the CPU and rendering on a dedicated GPU; current interest is in techniques for utilising the GPU directly to rasterise vector graphics. + \item The popular standards for document formats including PostScript, PDF, HTML, SVG require IEEE-754 binary32 precision + \item Fixed precision floating point numbers make a trade off between precision and range + \item IEEE-754 is widely used although there are instances of languages or processors which do not conform exactly to the standard + \item GPUs in particular may not conform to IEEE-754, trading some accuracy of operations for performance \end{itemize} -\section{Document Format} +\section{Development of Testbed Software} -Currently we effectively have a DOM format but with the following non-features: -\begin{itemize} - \item Binary file format (non standard; not XML) - \item Only rectangles. -\end{itemize} +We have produced a basic Document Viewer capable of rendering simple primitives under translation and scaling. OpenGL 3.1 is used to interface with graphics hardware. 
This software has the following features: +\begin{enumerate} + \item A type name \verb/Real/ is used in place of the standard floating point types \verb/float/, \verb/double/ or \verb/long double/. This type name can be redefined to refer to one of the standard types or a custom real number representation, allowing us to easily recompile and test our software for different representations. + \item Screenshots can be overlaid on top of each other to get a pixel comparison of the graphical output of different versions of the program + \item Test documents can be loaded and saved so that we can compare different versions of the program on identical inputs + \item Transformations can be performed on either the GPU or CPU + \item Performance of rendering can be measured +\end{enumerate} -\section{Floating Point Number Representations} +We have found the performance of coordinate transforms on the GPU to be far superior to the CPU. However, at large enough scales it becomes apparent that the GPU is performing operations at a lower precision than the CPU. See Figure \ref{}. -\rephrase{I have\footnote{Ok... ``will have''} some figures that I would prefer to include in Chapter \ref{Background} when I am talking about the papers that inspired them.} -\rephrase{This section will probably briefly talk about how they were created and just refer back to them}. -\begin{itemize} - \item \verb/calculatepi.test/ - \item \verb/typedef/ of \verb/Real/ -\end{itemize} -\section{Virtual FPU} +\section{Floating Point Precision} + +Algorithms for floating point arithmetic may be implemented in software (CPU) or on dedicated hardware (FPU). We have made progress towards both approaches. + +An open source Virtual FPU implemented in the VHDL language has been successfully compiled and can be substituted into our testbed software in place of native arithmetic running on the CPU. The timing diagram for this FPU throughout the execution of test programs can be extracted. 
Currently the virtual FPU is restricted to 32 bit floats and the square root operation is unimplemented. + +Mainly motivated by producing Figure \ref{minifloat.pdf}, we have also implemented functions to convert arbitrary real numbers (which may themselves be IEEE-754 floats) to and from a fixed size floating point representation of our choosing. We have not implemented any operations for floating point arithmetic using these representations. + +By using the functions to convert real numbers to variable precision floats as an interface for the virtual FPU, we hope to illustrate the limitations of floating point arithmetic more clearly than would be possible using IEEE-754 binary32, the format typically used for the native \verb/float/ type in C and C++. + +\subsection{Prototype Document Formats} -Techniques for dealing with FP numbers can be implemented in software (CPU) or on dedicated hardware (FPU). We are able to run FP arithmetic on arbitrary simulations of FPUs created using VHDL. \rephrase{Hopefully explore this a bit in Chapter \ref{Background}}. +Our testbed software is capable of reading primitive attributes from either a binary file or an XML plain text file. Our format is closest to the Document Object Model, although there is currently only one generation in the tree, since no primitive can yet contain other elements. +If time permits, we plan to extend our XML format to cover a subset of the SVG standard. This may allow us to compare the rasterisation of an SVG using our own software and traditional software relying on IEEE-754 floats. -\section{Version Control} +\section{Version Control and Backup of Work} -Git is a distributed version control system widely used in the development of open source software\cite{}. All rescources created for or used by this project have been placed in git repositories on several servers.
The repositories are publically accessable at \url{http://git.ucc.asn.au} +Git is a distributed version control system widely used in the development of open source software\cite{}. All resources created for or used by this project have been placed in git repositories on several servers. The repositories are publicly accessible at \url{http://git.ucc.asn.au}, \url{http://szmoore.net/ipdf} and \url{david's website probably I guess}\footnote{These are all actually on the same filesystem but it sounds impressive anyway} +\section{Timeline} +Deadlines enforced by the Faculty of Engineering, Computing and Mathematics are \emph{italicised}. Tasks completed as of the submission of this report are struck through.\footnote{David Gow is being assessed under the 2014 rules for a BEng (Software) Final Year Project, whilst the author is being assessed under the 2014 rules for a BEng (Mechatronics) Final Year Project; deadlines and requirements as shown in Gow's proposal\cite{proposalGow} may differ}. +\begin{center} +\begin{tabular}{l|p{0.5\textwidth}} + {\bf Date} & {\bf Milestone}\\ + \hline + $1^{\text{st}}$ May & Testbed Software (basic document format and viewer) completed and approaches for extending to allow infinite precision identified. \\ + \hline + ? May & Draft Progress Report and Literature Review \\ + \hline + $26^{\text{th}}$ May & \emph{Progress Report and Literature Review due.}\\ + \hline + $9^{\text{th}}$ June & Demonstrations of limitations of floating point precision in the Testbed software. \\ + $1^{\text{st}}$ July & At least one implementation of infinite precision for basic primitives (lines, polygons, curves) completed. Other implementations, advanced features, and areas for more detailed research identified. \\ + \hline + $1^{\text{st}}$ August & Experiments and comparison of various infinite precision implementations completed. \\ + \hline + $1^{\text{st}}$ September & Advanced features implemented and tested, work underway on Final Report.
\\ + \hline + TBA & \emph{Conference Abstract and Presentation due.} \\ + \hline + $10^{\text{th}}$ October & \emph{Draft of Final Report due.} \\ + \hline + $27^{\text{th}}$ October & \emph{Final Report due.}\\ + \hline +\end{tabular} +\end{center} diff --git a/chapters/Proposal.tex b/chapters/Proposal.tex index dd2f8e6..f8e1ed0 100644 --- a/chapters/Proposal.tex +++ b/chapters/Proposal.tex @@ -6,13 +6,13 @@ \section{Aim} In this project, we will explore the state of the art of current document formats including PDF, PostScript, SVG, HTML, and the limitations of each in terms of precision. -We will consider designs for a document format allowing graphics primitives at an arbitrary level of zoom with no loss of detail. We will refer to such a document format as ``infinite precision''. A viewer and editor will be implemented as a proof of concept; we adopt a low level, ground up approach to designing this viewer so as to not become restricted by any single existing document format. +We will consider designs for a document format allowing graphics primitives at an arbitrary level of zoom with no loss of detail. A viewer and editor will be implemented as a proof of concept; we adopt a low level, ground up approach to designing this viewer so as to not become restricted by any single existing document format. There are many possible applications for documents in which precision is unlimited. Several areas of use include: visualisation of extremely large or infinite data sets; visualisation of high precision numerical computations; digital artwork; computer aided design; and maps. \subsection{Clarification of Terms} -It may be necessary to clarify what we mean by the terms ``infinite precision'' and ``document formats''. Regarding the latter, we consider a document format to be any representation of visual information which is capable of being stored indefinitely. 
Regarding the former, we do not propose to be able to contain an infinite amount of information within such a document. The goal is to be able to render a primitive at the same level of detail it is specified by a document format, regardless of how precise this level is. For example, the precision of coordinates of primitives drawn in a graphical document editor will always be limited by the resolution of the display on which they are drawn, but not by the viewer. +It may be necessary to clarify what we mean by the terms ``arbitrary precision'' and ``document formats''. Regarding the latter, we consider a document format to be any representation of visual information which is capable of being stored indefinitely. Regarding the former, we do not propose to be able to contain an infinite amount of information within such a document. The goal is to be able to render a primitive at the same level of detail at which it is specified by a document format, regardless of how precise this level is. For example, the precision of coordinates of primitives drawn in a graphical document editor will always be limited by the resolution of the display on which they are drawn, but not by the viewer. \section{Methods} @@ -25,17 +25,15 @@ At this stage we have identified two possible areas for individual research: \item {\bf Arbitrary Precision real valued numbers} --- Sam Moore - We plan to investigate the representation of real values to a high or arbitary degree of precision. Such representations would allow for a document to be implemented - using a single global coordinate system. However, we would expect a decrease in performance with increased complexity of the data structure used to represent a real value. \rephrase{Both software and hardware techniques will be explored.} We will also consider the limitations imposed by performing calculations on the GPU or CPU. + We plan to investigate the representation of real values to a high or arbitrary degree of precision.
Such representations would allow the coordinates of primitives to be expressed relative to a single global coordinate system. We would expect a decrease in performance with increased complexity of the data structure used to represent a real value. \rephrase{Both software and hardware techniques will be explored.} We will also consider the limitations imposed by performing calculations on the GPU or CPU. - Starting points for research in this area are Priest's 1991 paper, ``Algorithms for Arbitrary Precision Floating Point Arithmetic''\cite{priest1991algorithms}, and Goldberg's 1992 paper ``The design of floating point data types''\cite{goldberg1992thedesign}. A more recent and comprehensive text book, ``Handbook of Floating Point Arithmetic''\cite{HFP}, published in 2010, has also been identified as highly relevant. +Starting points for research in this area are Priest's 1991 paper, ``Algorithms for Arbitrary Precision Floating Point Arithmetic''\cite{priest1991algorithms}, and Goldberg's 1992 paper, ``The design of floating point data types''\cite{goldberg1992thedesign}. A more recent and comprehensive textbook, ``Handbook of Floating Point Arithmetic''\cite{HFP}, published in 2010, has also been identified as highly relevant. \item {\bf Local coordinate systems} --- David Gow \cite{proposalGow} An alternative approach involves segmenting the document into different regions using fixed precision floats to define primitives within each region.
A quadtree or similar data structure could be employed to identify and render those regions currently visible in the document viewer.\rephrase{Say more here?} \end{enumerate} -\pagebreak We aim to compare these and any additional implementations considered using the following metrics: \begin{enumerate} @@ -70,36 +68,6 @@ We aim to compare these and any additional implementations considered using the Due to the relative immaturity and inconsistency of graphics drivers on mobile devices, our proof of concept will be developed for a conventional GNU/Linux desktop or laptop computer using OpenGL. However, the techniques explored could easily be extended to other platforms and libraries. -\pagebreak - -\section{Timeline} - -Deadlines enforced by the faculty of Engineering Computing and Mathematics are \emph{italicised}.\footnote{David Gow is being assessed under the 2014 rules for a BEng (Software) Final Year Project, whilst the author is being assessed under the 2014 rules for a BEng (Mechatronics) Final Year Project; deadlines and requirements as shown in Gow's proposal\cite{proposalGow} may differ}. - -\begin{center} -\begin{tabular}{l|p{0.5\textwidth}} - {\bf Date} & {\bf Milestone}\\ - \hline - $1^{\text{st}}$ May & Testbed Software (basic document format and viewer) completed and approaches for extending to allow infinite precision identified. \\ - \hline - ? May & Draft Progress Report and Literature Review \\ - \hline - $26^{\text{th}}$ May & \emph{Progress Report and Literature Review due.}\\ - \hline - $9^{\text{th}}$ June & Demonstrations of limitations of floating point precision in the Testbed software. \\ - $1^{\text{st}}$ July & At least one implementation of infinite precision for basic primitives (lines, polygons, curves) completed. Other implementations, advanced features, and areas for more detailed research identified. \\ - \hline - $1^{\text{st}}$ August & Experiments and comparison of various infinite precision implementations completed. 
\\ - \hline - $1^{\text{st}}$ September & Advanced features implemented and tested, work underway on Final Report. \\ - \hline - TBA & \emph{Conference Abstract and Presentation due.} \\ - \hline - $10^{\text{th}}$ October & \emph{Draft of Final Report due.} \\ - \hline - $27^{\text{th}}$ October & \emph{Final Report due.}\\ - \hline -\end{tabular} -\end{center} + diff --git a/meta/Titlepage.tex b/meta/Titlepage.tex index 4977697..cfbd463 100644 --- a/meta/Titlepage.tex +++ b/meta/Titlepage.tex @@ -1,6 +1,6 @@ % Suitably pretty title page is required. \begin{titlepage} -\title{Infinite Precision Document Formats} +\title{Precision In Document Formats} %From ipdf -> pidf (-_-) \author{{\it Author:} Samuel Moore\cite{proposalMoore} \\ {{\it Partners:} David Gow\cite{proposalGow}} \\ {{\it Supervisor:} Prof Tim French}\\ diff --git a/thesis.pdf b/thesis.pdf index 5f9881c..4766a9e 100644 Binary files a/thesis.pdf and b/thesis.pdf differ diff --git a/thesis.tex b/thesis.tex index f943eac..6808cbd 100644 --- a/thesis.tex +++ b/thesis.tex @@ -1,4 +1,4 @@ -\documentclass[a4paper,11pt,titlepage]{report} +\documentclass[a4paper,10pt,titlepage]{report} \linespread{1.3} \usepackage{setspace} \onehalfspacing @@ -117,11 +117,11 @@ \include{meta/Titlepage} % This is who you are -\include{meta/Abstract} % This is your thesis abstract +\input{meta/Abstract} % This is your thesis abstract -\newpage +%\newpage -\include{meta/Acknowledgments} % This is who you thank +%\include{meta/Acknowledgments} % This is who you thank \pagenumbering{roman}