From: David Gow Date: Wed, 21 May 2014 19:36:00 +0000 (+0800) Subject: I really need to sleep. Also SVG and rendering. X-Git-Url: https://git.ucc.asn.au/?a=commitdiff_plain;h=3a63f1e6c12b50b670100545fe21c58d7af479b4;p=ipdf%2Fdocuments.git I really need to sleep. Also SVG and rendering. --- diff --git a/LitReviewDavid.pdf b/LitReviewDavid.pdf index d819fc3..5274454 100644 Binary files a/LitReviewDavid.pdf and b/LitReviewDavid.pdf differ diff --git a/LitReviewDavid.tex b/LitReviewDavid.tex index 5605cef..da4d5d8 100644 --- a/LitReviewDavid.tex +++ b/LitReviewDavid.tex @@ -23,6 +23,171 @@ And thus the document was born. Traditionally, documents have been static: just marks on paper, but with the advent of computers many more possibilities open up. + +\section{Rendering} + +Computer graphics comes in two forms: bit-mapped (or raster) graphics, which are defined by an array of pixel colours; +and \emph{vector} graphics, defined by mathematical descriptions of objects. Bit-mapped graphics are well suited to photographs +and match how cameras, printers and monitors work. + + +\begin{figure}[h] + \centering \includegraphics[width=0.8\linewidth]{figures/vectorraster_example} + \caption{A circle as a vector image and a $32 \times 32$ pixel raster image} +\end{figure} + + +However, bitmap devices do not handle zooming beyond their ``native'' resolution (the resolution where one document pixel maps +to one display pixel), exhibiting an artefact called pixelation where the pixel structure becomes evident. Attempts to use +interpolation to hide this effect are never entirely successful, and sharp edges, such as those found in text and diagrams, are particularly affected. + +Vector graphics avoid many of these problems: the representation is independent of the output resolution, and is instead +an abstract description of what is being rendered, typically as a combination of simple geometric shapes like lines, +arcs and glyphs. 
+ +As existing displays (and printers) are bit-mapped devices, vector documents must be \emph{rasterized} into a bitmap at +a given resolution. This bitmap is then displayed or printed. The resulting bitmap is then an approximation of the vector image +at that resolution. + +% Project specific line +This project will be based around vector graphics, as these properties make them better suited to experimenting with zoom +quality. + +\subsection{Rasterizing Vector Graphics} + +Before a vector document can be rasterized, the co-ordinates of any shapes must +be transformed into \emph{screen space} or \emph{viewport space}\cite{blinn1992trip}. +On a typical display, many of these screen-space co-ordinates require very little precision or range. +However, care must be taken to ensure that precision is preserved during the co-ordinate transform itself. + +After this transformation, the image is decomposed into its separate shapes, which are rasterized +and then composited together. +Most graphics formats support Porter-Duff compositing\cite{porter1984compositing}. +Porter-Duff compositing gives each element (typically a pixel) a ``coverage'' value, +denoted $\alpha$, which represents the contribution of that element to the final scene. +Completely transparent elements have an $\alpha$ value of $0$, and completely opaque +elements have an $\alpha$ of $1$. This permits arbitrary shapes to be layered over one another +in the raster domain, while retaining soft edges. + +The rasterization process may then proceed on one object (or shape) at a time. There are special algorithms for rasterizing +different shapes. + +\begin{description} + \item[Line Segment] + Straight lines between two points are easily rasterized using Bresenham's algorithm\cite{bresenham1965algorithm}. + Bresenham's algorithm draws a pixel for every point along the \emph{long} axis of the line, moving along the short + axis when the error exceeds $\frac{1}{2}$ of a pixel. 
+ + Bresenham's algorithm only operates on lines whose endpoints lie on integer pixel coordinates. Due to this, line ``clipping'' + may be performed to find endpoints of the line segment such that the entire line will be on-screen. However, if line clipping is + performed na\"ively without also setting the error accumulator correctly, the line's slope will be altered slightly, becoming dependent + on the viewport. + + \item[B\'ezier Curve] + A B\'ezier curve is a smooth (i.e.\ infinitely differentiable) curve between two points, represented by a Bernstein polynomial. + The coefficients of this Bernstein polynomial are known as the ``control points.'' + + Line Segments are a first-order B\'ezier curve. + + \item[B\'ezier Spline] + A spline of order $n$ is a $C^{n-1}$ smooth continuous piecewise function composed of polynomials of degree $\leq n$. + In a B\'ezier spline, these polynomials are Bernstein polynomials, hence the spline is a curve made by joining B\'ezier curves + end-to-end (in a manner which preserves some level of smoothness). + + Many vector graphics formats call B\'ezier splines of a given order (typically quadratic or cubic) ``paths'' and treat them as the + fundamental type from which shapes are formed. +\end{description} + + +%There are special algorithms +%for rendering lines\cite{bresenham1965algorithm}, triangles\cite{giesen2013triangle}, polygons\cite{pineda1988parallel} and B\'ezier +%Curves\cite{goldman_thefractal}. + +While traditionally, rasterization was done entirely in software, modern computers and mobile devices have hardware support for rasterizing +lines and triangles designed for use rendering 3D scenes. This hardware is usually programmed with an +API like \texttt{OpenGL}\cite{openglspec}. 
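As a rough illustration of the line-segment case described above, the following Python sketch implements a common integer-only variant of Bresenham's algorithm. The function name \texttt{bresenham\_line} and the choice of returning a list of pixel co-ordinates are assumptions made for this example rather than part of any cited implementation. The error term is kept in scaled integer form, so the half-pixel comparison needs no fractions.

```python
def bresenham_line(x0, y0, x1, y1):
    """Return the integer pixels approximating the segment (x0, y0)-(x1, y1).

    Illustrative sketch only: endpoints are assumed to lie on integer
    pixel co-ordinates, as the algorithm requires.
    """
    dx = abs(x1 - x0)
    dy = abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1

    # Scaled error accumulator: positive values favour a step in x,
    # negative values favour a step in y.
    err = dx - dy

    pixels = []
    x, y = x0, y0
    while True:
        pixels.append((x, y))
        if x == x1 and y == y1:
            break
        doubled = 2 * err
        if doubled > -dy:   # error says we should advance along x
            err -= dy
            x += step_x
        if doubled < dx:    # error says we should advance along y
            err += dx
            y += step_y
    return pixels


if __name__ == "__main__":
    print(bresenham_line(0, 0, 5, 2))
```

For example, \texttt{bresenham\_line(0, 0, 5, 2)} yields six pixels, one for each column along the long ($x$) axis, with the $y$ co-ordinate stepping only when the accumulated error requires it.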
+ +More complex shapes like B\'ezier curves can be rendered by combining the use of bitmapped textures (possibly using signed-distance +fields\cite{leymarie1992fast}\cite{frisken2000adaptively}\cite{green2007improved}) with polygons approximating the curve's shape\cite{loop2005resolution}\cite{loop2007rendering}. + +Indeed, there are several implementations of entire vector graphics systems using OpenGL: OpenVG\cite{robart2009openvg} on top of OpenGL ES\cite{oh2007implementation}; +the Cairo\cite{worth2003xr} library, based around the PostScript/PDF rendering model, has the ``Glitz'' OpenGL backend\cite{nilsson2004glitz} and the SVG/PostScript GPU +renderer by nVidia\cite{kilgard2012gpu} as an OpenGL extension\cite{kilgard300programming}. + + +\section{Numeric formats} + +On modern computer architectures, there are two basic number formats supported: +fixed-width integers and \emph{floating-point} numbers. Typically, computers +natively support integers of up to 64 bits, capable of representing all integers +between $0$ and $2^{64} - 1$, inclusive\footnote{Most machines also support \emph{signed} integers, +which have the same cardinality as their \emph{unsigned} counterparts, but which +represent integers in the range $[-(2^{63}), 2^{63} - 1]$}. + +By introducing a fractional component (analogous to a decimal point), we can convert +integers to \emph{fixed-point} numbers, which have a more limited range, but a fixed, greater +precision. 
For example, a number in 4.4 fixed-point format would have four bits representing the integer +component, and four bits representing the fractional component: +\begin{equation} + \underbrace{0101}_\text{integer component}.\underbrace{1100}_\text{fractional component} = 5.75 +\end{equation} + + +Floating-point numbers\cite{goldberg1992thedesign} are the binary equivalent of scientific notation: +each number consists of an exponent ($e$) and a mantissa ($m$) such that a number is given by +\begin{equation} + n = 2^{e} \times m +\end{equation} + +The IEEE 754 standard\cite{ieee754std1985} defines several floating-point data types +which are used\footnote{Many systems implement the IEEE 754 standard's storage formats, +but do not implement arithmetic operations in accordance with this standard.} by most +computer systems. The standard defines 32-bit (8-bit exponent, 23-bit mantissa, 1 sign bit) and +64-bit (11-bit exponent, 52-bit mantissa, 1 sign bit) formats\footnote{The 2008 +revision to this standard\cite{ieee754std2008} adds some additional formats, but is +less widely supported in hardware.}, which can store approximately 7 and 15 decimal digits +of precision respectively. + +Floating-point numbers behave quite differently to integers or fixed-point numbers, as +the representable numbers are not evenly distributed. Large numbers are stored with less absolute +precision than numbers close to zero. This can present problems in documents when zooming in +on objects far from the origin. + +IEEE floating-point has some interesting features as well, including values for negative zero, +positive and negative infinity, the ``Not a Number'' (NaN) value and \emph{denormal} values, which +trade precision for range when dealing with very small numbers. 
Because of these values (NaN in particular, which compares unequal to itself), +IEEE 754 floating-point equality does not form an equivalence relation, which can cause issues +when not considered carefully.\cite{goldberg1991whatevery} + +There also exist formats for storing numbers with arbitrary precision and/or range. +Some programming languages support ``big integer''\cite{java_bigint} types which can +represent any integer that can fit in the system's memory. Similarly, there are +arbitrary-precision floating-point data types\cite{java_bigdecimal}\cite{boost_multiprecision} +which can represent any number of the form +\begin{equation} + \frac{n}{2^d} \quad n,d \in \mathbb{Z} +\end{equation} +These types are typically built from several native data types such as integers and floats, +paired with custom routines implementing arithmetic primitives.\cite{priest1991algorithms} +These, therefore, are likely slower than the native types they are built on. + +While GPUs have traditionally supported some approximation of IEEE 754's 32-bit floats, +modern graphics processors also support 16-bit\cite{nv_half_float} and 64-bit\cite{arb_gpu_shader_fp64} +IEEE floats. Note, however, that some parts of the GPU are only able to use some formats, +so precision will likely be truncated at some point before display. +Higher precision numeric types can be implemented or used on the GPU, but are +slow.\cite{emmart2010high} + +Pairs of integers $(a \in \mathbb{Z},b \in \mathbb{Z}\setminus \{0\})$ can be used to represent rationals. This allows +values such as $\frac{1}{3}$ to be represented exactly, whereas in binary fixed-point or floating-point formats, +this would have a recurring representation: +\begin{equation} + \underbrace{0}_\text{integer part} . \underbrace{01}_\text{recurring part} 01 \; \; 01 \; \; 01 \dots +\end{equation} +With a rational type, this is simply $\frac{1}{3}$. 
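The difference between the two representations of $\frac{1}{3}$ can be sketched with Python's standard \texttt{fractions} module. This is only an illustration using a library rational type; the variable names are chosen for this example and nothing here is specific to the formats surveyed in this review.

```python
from fractions import Fraction

# A rational stored as a pair of integers holds 1/3 exactly.
third = Fraction(1, 3)
exact_sum_is_one = (third + third + third == 1)

# A 64-bit float can only hold the nearest value of the form n / 2^d.
# Converting the float back to a Fraction exposes that stored value.
stored = Fraction(1 / 3)
denominator = stored.denominator
denominator_is_power_of_two = (denominator & (denominator - 1)) == 0
float_is_exact = (stored == third)

# Unreduced pairs denote the same value; Fraction keeps the reduced form.
reduced = Fraction(2, 6)

if __name__ == "__main__":
    print(exact_sum_is_one, float_is_exact, denominator_is_power_of_two)
    print(reduced.numerator, reduced.denominator)
```

The float nearest to $\frac{1}{3}$ has a power-of-two denominator and so cannot equal $\frac{1}{3}$, while the integer pair does so exactly.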
+Rationals do not have a unique representation for each value, typically the reduced fraction is used +as a characteristic element. + \section{Document Formats} Most existing document formats --- such as the venerable PostScript and PDF --- are, however, designed to imitate @@ -107,6 +272,11 @@ which already have their elements placed. Later versions of PDF also extend the PostScript rendering model to support translucent regions via Porter-Duff compositing\cite{porter1984compositing}. PDF documents represent a particular layout, and must be rasterized before display. + + \item[SVG] + Scalable Vector Graphics (SVG) is a vector graphics document format\cite{svg2011} which uses the Document Object Model. It consists of a tree of matrix transforms, + with objects such as vector paths (made up of B\'ezier curves) and text at the leaves. + \end{description} \subsection{Precision in Document Formats} @@ -137,124 +307,11 @@ representable with the \emph{real} type have been specified differently: the lar $\pm1.175 \times 10^{-38}$ with approximately $5$ decimal digits of precision \emph{in the fractional part}. Adobe's implementation of PDF uses both IEEE 754 single precision floating-point numbers and (for some calculations, and in previous versions) 16.16 bit fixed-point values. - -\section{Rendering} - -Computer graphics comes in two forms: bit-mapped (or raster) graphics, which is defined by an array of pixel colours; -and \emph{vector} graphics, defined by mathematical descriptions of objects. Bit-mapped graphics are well suited to photographs -and are match how cameras, printers and monitors work. However, bitmap devices do not handle zooming beyond their -``native'' resolution --- the resolution where one document pixel maps to one display pixel ---, exhibiting an artefact -called pixelation where the pixel structure becomes evident. 
Attempts to use interpolation to hide this effect are -never entirely successful, and sharp edges, such as those found in text and diagrams, are particularly affected. - -\begin{figure}[h] - \centering \includegraphics[width=0.8\linewidth]{figures/vectorraster_example} - \caption{A circle as a vector image and a $32 \times 32$ pixel raster image} -\end{figure} - - -Vector graphics lack many of these problems: the representation is independent of the output resolution, and rather -an abstract description of what it is being rendered, typically as a combination of simple geometric shapes like lines, -arcs and ``B\'ezier curves''\cite{catmull1974asubdivision}. -As existing displays (and printers) are bit-mapped devices, vector documents must be \emph{rasterized} into a bitmap at -a given resolution. This bitmap is then displayed or printed. The resulting bitmap is then an approximation of the vector image -at that resolution. - -This project will be based around vector graphics, as these properties make it more suited to experimenting with zoom -quality. - - -The rasterization process typically operates on an individual ``object'' or ``shape'' at a time: there are special algorithms -for rendering lines\cite{bresenham1965algorithm}, triangles\cite{giesen2013triangle}, polygons\cite{pineda1988parallel} and B\'ezier -Curves\cite{goldman_thefractal}. Typically, these are rasterized independently and composited in the bitmap domain using Porter-Duff -compositing\cite{porter1984compositing} into a single image. This allows complex images to be formed from many simple pieces, as well -as allowing for layered translucent objects, which would otherwise require the solution of some very complex constructive geometry problems. - -While traditionally, rasterization was done entirely in software, modern computers and mobile devices have hardware support for rasterizing -some basic primitives --- typically lines and triangles ---, designed for use rendering 3D scenes. 
This hardware is usually programmed with an -API like \texttt{OpenGL}\cite{openglspec}. - -More complex shapes like B\'ezier curves can be rendered by combining the use of bitmapped textures (possibly using signed-distance -fields\cite{leymarie1992fast}\cite{frisken2000adaptively}\cite{green2007improved}) with polygons approximating the curve's shape\cite{loop2005resolution}\cite{loop2007rendering}. - -Indeed, there are several implementations of entire vector graphics systems using OpenGL: OpenVG\cite{robart2009openvg} on top of OpenGL ES\cite{oh2007implementation}; -the Cairo\cite{worth2003xr} library, based around the PostScript/PDF rendering model, has the ``Glitz'' OpenGL backend\cite{nilsson2004glitz} and the SVG/PostScript GPU -renderer by nVidia\cite{kilgard2012gpu} as an OpenGL extension\cite{kilgard300programming}. - - -\section{Numeric formats} - -On modern computer architectures, there are two basic number formats supported: -fixed-width integers and \emph{floating-point} numbers. Typically, computers -natively support integers of up to 64 bits, capable of representing all integers -between $0$ and $2^{64} - 1$, inclusive\footnote{Most machines also support \emph{signed} integers, -which have the same cardinality as their \emph{unsigned} counterparts, but which -represent integers in the range $[-(2^{63}), 2^{63} - 1]$}. - -By introducing a fractional component (analogous to a decimal point), we can convert -integers to \emph{fixed-point} numbers, which have a more limited range, but a fixed, greater -precision. 
For example, a number in 4.4 fixed-point format would have four bits representing the integer -component, and four bits representing the fractional component: -\begin{equation} - \underbrace{0101}_\text{integer component}.\underbrace{1100}_\text{fractional component} = 5.75 -\end{equation} - - -Floating-point numbers\cite{goldberg1992thedesign} are the binary equivalent of scientific notation: -each number consisting of an exponent ($e$) and a mantissa ($m$) such that a number is given by -\begin{equation} - n = 2^{e} \times m -\end{equation} - -The IEEE 754 standard\cite{ieee754std1985} defines several floating-point data types -which are used\footnote{Many systems' implement the IEEE 754 standard's storage formats, -but do not implement arithmetic operations in accordance with this standard.} by most -computer systems. The standard defines 32-bit (8-bit exponent, 23-bit mantissa, 1 sign bit) and -64-bit (11-bit exponent, 53-bit mantissa, 1 sign bit) formats\footnote{The 2008 -revision to this standard\cite{ieee754std2008} adds some additional formats, but is -less widely supported in hardware.}, which can store approximately 7 and 15 decimal digits -of precision respectively. - -Floating-point numbers behave quite differently to integers or fixed-point numbers, as -the representable numbers are not evenly distributed. Large numbers are stored to a lesser -precision than numbers close to zero. This can present problems in documents when zooming in -on objects far from the origin. - -IEEE floating-point has some interesting features as well, including values for negative zero, -positive and negative infinity, the ``Not a Number'' (NaN) value and \emph{denormal} values, which -trade precision for range when dealing with very small numbers. 
Indeed, with these values, -IEEE 754 floating-point equality does not form an equivalence relation, which can cause issues -when not considered carefully.\cite{goldberg1991whatevery} - -There also exist formats for storing numbers with arbitrary precising and/or range. -Some programming languages support ``big integer''\cite{java_bigint} types which can -represent any integer that can fit in the system's memory. Similarly, there are -arbitrary-precision floating-point data types\cite{java_bigdecimal}\cite{boost_multiprecision} -which can represent any number of the form -\begin{equation} - \frac{n}{2^d} \; \; \; \; n,d \in \mathbb{Z} % This spacing is horrible, and I should be ashamed. -\end{equation} -These types are typically built from several native data types such as integers and floats, -paired with custom routines implementing arithmetic primitives.\cite{priest1991algorithms} -These, therefore, are likely slower than the native types they are built on. - -While traditionally, GPUs have supported some approximation of IEEE 754's 32-bit floats, -modern graphics processors also support 16-bit\cite{nv_half_float} and 64-bit\cite{arb_gpu_shader_fp64} -IEEE floats. Note, however, that some parts of the GPU are only able to use some formats, -so precision will likely be truncated at some point before display. -Higher precision numeric types can be implemented or used on the GPU, but are -slow.\cite{emmart2010high} - -Pairs of integers $(a \in \mathbb{Z},b \in \mathbb{Z}\setminus 0)$ can be used to represent rationals. This allows -values such as $\frac{1}{3}$ to be represented exactly, whereas in fixed or floating-point formats, -this would have a recurring representation: -\begin{equation} - \underbrace{0}_\text{integer part} . \underbrace{01}_\text{recurring part} 01 \; \; 01 \; \; 01 \dots -\end{equation} -Whereas with a rational type, this is simply $\frac{1}{3}$. 
-Rationals do not have a unique representation for each value, typically the reduced fraction is used -as a characteristic element. - +The SVG specification\cite{svg2011} specifies numbers as strings with a decimal representation of the number. +It is stated that a ``Conforming SVG Viewer'' must have ``all visual rendering accurate to within one device pixel to the mathematically correct result at the initial 1:1 +zoom ratio'' and that ``it is suggested that viewers attempt to keep a high degree of accuracy when zooming.'' +A ``Conforming High-Quality SVG Viewer'' must use ``double-precision floating point\footnote{Presumably the 64-bit IEEE 754 ``double'' type.}'' for computations involving +coordinate system transformations. \section{Quadtrees} When viewing or processing a small part of a large document, it may be helpful to diff --git a/papers.bib b/papers.bib index 7ffb0c2..0ae7de8 100644 --- a/papers.bib +++ b/papers.bib @@ -361,6 +361,31 @@ Goldberg:1991:CSK:103162.103163, organization={IEEE} } +@article{blinn1992trip, + title={A Trip Down the Graphics Pipeline: Grandpa, What Does “Viewport” Mean?}, + author={Blinn, James}, + journal={Computer Graphics and Applications, IEEE}, + month={Jan}, + volume={12}, + number={1}, + pages={83--87}, + year={1992} +} + +@ARTICLE{blinn1991trip, + author={Blinn, James}, + journal={Computer Graphics and Applications, IEEE}, + title={A Trip Down the Graphics Pipeline: Line Clipping}, + year={1991}, + month={Jan}, + volume={11}, + number={1}, + pages={98-105}, + keywords={computer graphics;Z clipping;clipping function;computer graphics;global clipping;homogeneous clipping;line clipping;transform-clip-draw pipeline;Application software;Arithmetic;Assembly;Computer graphics;Displays;Education;Hardware;Pipelines;Standards publication}, + doi={10.1109/38.67707}, + ISSN={0272-1716}, +} + %%%%%%%%%%%%%%%%% % Quadtrees %%%%%%%%%%%%%%%%% diff --git a/references/blinn1992trip.pdf b/references/blinn1992trip.pdf new file mode 100644 
index 0000000..e234f23 Binary files /dev/null and b/references/blinn1992trip.pdf differ