From: Sam Moore Date: Wed, 21 May 2014 18:49:32 +0000 (+0800) Subject: Merge branch 'master' of git.ucc.asn.au:/ipdf/documents X-Git-Url: https://git.ucc.asn.au/?p=ipdf%2Fdocuments.git;a=commitdiff_plain;h=f4e49b995f08a0a016ec5cbb38451a9a00ccca27;hp=72b2011c7a1bdd5a3fec447a56e0b1ff6aa03d7b Merge branch 'master' of git.ucc.asn.au:/ipdf/documents Conflicts: papers.bib Just that paper on precision in TeX and METAFONT that David found --- diff --git a/LitReviewDavid.pdf b/LitReviewDavid.pdf index 1e0bb5e..d819fc3 100644 Binary files a/LitReviewDavid.pdf and b/LitReviewDavid.pdf differ diff --git a/LitReviewDavid.tex b/LitReviewDavid.tex index a8947e8..5605cef 100644 --- a/LitReviewDavid.tex +++ b/LitReviewDavid.tex @@ -34,12 +34,33 @@ the content of the document to be explored in ways that perhaps the author had n However, these data-driven formats typically do not support fixed layouts, and the display differs from renderer to renderer. -Ultimately, there are two fundamental stages by which all documents --- digital or otherwise --- are produced and displayed: -\emph{layout} and \emph{display}. The \emph{layout} stage is where the positions and sizes of text and other graphics are -determined, while the \emph{display} stage actually produces the final output, whether as ink on paper or pixels on a computer monitor. +\subsection{A Taxonomy of Document formats} -Different document formats approach these stages in different ways. Some treat the document as a program, written in -a turing complete document language with instructions which emit shapes to be displayed. These shapes are either displayed +The process of creating and displaying a document is a rather universal one (\ref{documenttimeline}), though +different document formats approach it slightly differently. A document often begins as raw content: text and images +(be they raster or vector) and it must end up as a set of photons flying towards the reader's eyes. 
+ +\begin{figure} + \centering \includegraphics[width=0.8\linewidth]{figures/documenttimeline} + \caption{The lifecycle of a document} + \label{documenttimeline} +\end{figure} + +There are two fundamental stages by which all documents --- digital or otherwise --- are produced and displayed: +\emph{layout} and \emph{rendering}. The \emph{layout} stage is where the positions and sizes of text and other graphics are +determined. The text will be \emph{flowed} around graphics and the individual glyphs will be positioned, ensuring +that there is no undesired overlap and that everything will fit on the page or screen. + +The \emph{rendering} stage actually produces the final output, whether as ink on paper or pixels on a computer monitor. +Each graphical element is rasterized and composited into a single image of the target resolution. + + +Different document formats cover documents in different stages of this process. Bitmapped images, +for example, would represent the output of the final stage of the process, whereas markup languages typically specify +a document which has not yet been processed, ready for the layout stage. + +Furthermore, some document formats treat the document as a program, written in +a (usually turing complete) document language with instructions which emit shapes to be displayed. These shapes are either displayed immediately, as in PostScript, or stored in another file, such as with \TeX or \LaTeX, which emit a \texttt{DVI} file. Most other forms of document use a \emph{Document Object Model}, being a list or tree of objects to be rendered. 
\texttt{DVI}, \texttt{PDF}, \texttt{HTML}\footnote{Some of these formats --- most notably \texttt{HTML} --- implement a scripting language such as JavaScript, @@ -47,12 +68,76 @@ which permit the DOM to be modified while the document is being viewed.} and SVG store documents in pre-layout stages, whereas even turing complete document formats such as PostScript typically encode documents which already have their elements placed. -Existing document formats, due to being designed to model paper, -have limited precision (8 decimal digits for PostScript\cite{plrm}, 5 decimal digits for PDF\cite{pdfref17}). -This matches the limited resolution of printers and ink, but is limited when compared to what aught to be possible -with ``zoom'' functionality, which is prevent from working beyond a limited scale factor, lest artefacts appear due +\begin{description} + \item[\TeX \, and \LaTeX] + Donald Knuth's typesetting language \TeX \, is one of the older computer typesetting systems, originally conceived in 1977\cite{texdraft}. + It implements a turing-complete language and is human-readable and writable, and is still popular + due to its excellent support for typesetting mathematics. + \TeX \, only implements the ``layout'' stage of document display, and produces a typeset file, + traditionally in \texttt{DVI} format, though modern implementations will often target \texttt{PDF} instead. + + This document was prepared in \LaTeXe. + + \item[DVI] + \TeX \, traditionally outputs to the \texttt{DVI} (``DeVice Independent'') format: a binary format which consists of a + simple stack machine with instructions for drawing glyphs and curves\cite{fuchs1982theformat}. + + A \texttt{DVI} file is a representation of a document which has been typeset, and \texttt{DVI} + viewers will rasterize this for display or printing, or convert it to another similar format like PostScript + to be rasterized. 
+ + \item[HTML] + The Hypertext Markup Language (HTML)\cite{html2rfc} is the widely used document format which underpins the + world wide web. In order for web pages to adapt appropriately to different devices, the HTML format simply + defined semantic parts of a document, such as headings, phrases requiring emphasis, references to images or links + to other pages, leaving the \emph{layout} up to the browser, which would also rasterize the final document. + + The HTML format has changed significantly since its introduction, and most of the layout and styling is now controlled + by a set of style sheets in the CSS\cite{css2spec} format. + + \item[PostScript] + Much like DVI, PostScript\cite{plrm} is a stack-based format for drawing vector graphics, though unlike DVI (but like \TeX), PostScript is + text-based and turing complete. PostScript was traditionally run on a control board in laser printers, rasterizing pages at high resolution + to be printed, though PostScript interpreters for desktop systems also exist, and are often used with printers which do not support PostScript natively.\cite{ghostscript} + + PostScript programs typically embody documents which have been typeset, though as a turing-complete language, some layout can be performed by the document. + + \item[PDF] + Adobe's Portable Document Format (PDF)\cite{pdfref17} takes the PostScript rendering model, but does not implement a turing-complete language. + Later versions of PDF also extend the PostScript rendering model to support translucent regions via Porter-Duff compositing\cite{porter1984compositing}. + + PDF documents represent a particular layout, and must be rasterized before display. +\end{description} + +\subsection{Precision in Document Formats} + +Existing document formats --- typically due to having been designed for documents printed on paper, which of course has +limited size and resolution --- use numeric types which can only represent a fixed range and precision. 
+While this works fine with printed pages, users reading documents on computer screens using programs +with ``zoom'' functionality are prevented from working beyond a limited scale factor, lest artefacts appear due to issues with numeric precision. +\TeX \, uses a $14.16$ bit fixed point type (implemented as a 32-bit integer type, with one sign bit and one bit used to detect overflow)\cite{beebe2007extending}. +This can represent values in the range $[-(2^{14}), 2^{14} - 1]$ with 16 binary digits of fractional precision. + +The DVI files \TeX \, produces may use ``up to'' 32-bit signed integers\cite{fuchs1982theformat} to specify the document, but there is no requirement that +implementations support the full 32-bit type. It would be permissible, for example, to have a DVI viewer support only 24-bit signed integers, though many +files which require greater range may fail to render correctly. + +PostScript\cite{plrm} supports two different numeric types: \emph{integers} and \emph{reals}, both of which are specified as strings. The interpreter's representation of numbers +is not exposed, though the representation of integers can be divined by a program by the use of bitwise operations. The PostScript specification lists some ``typical limits'' +of numeric types, though the exact limits may differ from implementation to implementation. Integers typically must fall in the range $[-2^{31}, 2^{31} - 1]$, +and reals are listed to have largest and smallest values of $\pm10^{38}$, values closest to $0$ of $\pm10^{-38}$ and approximately $8$ decimal digits of precision, +derived from the IEEE 754 single-precision floating-point specification. + +Similarly, the PDF specification\cite{pdfref17} stores \emph{integers} and \emph{reals} as strings, though in a more restricted format than PostScript. +The PDF specification gives limits for the internal representation of values. 
Integer limits have not changed from the PostScript specification, but numbers +representable with the \emph{real} type have been specified differently: the largest representable values are $\pm 3.403\times 10^{38}$, the smallest non-zero representable values are +\footnote{The PDF specification mistakenly leaves out the negative in the exponent here.} +$\pm1.175 \times 10^{-38}$ with approximately $5$ decimal digits of precision \emph{in the fractional part}. +Adobe's implementation of PDF uses both IEEE 754 single precision floating-point numbers and (for some calculations, and in previous versions) 16.16 bit fixed-point values. + + \section{Rendering} Computer graphics comes in two forms: bit-mapped (or raster) graphics, which is defined by an array of pixel colours; @@ -70,7 +155,7 @@ never entirely successful, and sharp edges, such as those found in text and diag Vector graphics lack many of these problems: the representation is independent of the output resolution, and rather an abstract description of what it is being rendered, typically as a combination of simple geometric shapes like lines, -arcs and ``B\'ezier curves''. +arcs and ``B\'ezier curves''\cite{catmull1974asubdivision}. As existing displays (and printers) are bit-mapped devices, vector documents must be \emph{rasterized} into a bitmap at a given resolution. This bitmap is then displayed or printed. The resulting bitmap is then an approximation of the vector image at that resolution. @@ -102,9 +187,9 @@ renderer by nVidia\cite{kilgard2012gpu} as an OpenGL extension\cite{kilgard300pr On modern computer architectures, there are two basic number formats supported: fixed-width integers and \emph{floating-point} numbers. 
Typically, computers natively support integers of up to 64 bits, capable of representing all integers -between $0$ and $2^{64} - 1$\footnote{Most machines also support \emph{signed} integers, +between $0$ and $2^{64} - 1$, inclusive\footnote{Most machines also support \emph{signed} integers, which have the same cardinality as their \emph{unsigned} counterparts, but which -represent integers between $-(2^{63})$ and $2^{63} - 1$}. +represent integers in the range $[-(2^{63}), 2^{63} - 1]$}. By introducing a fractional component (analogous to a decimal point), we can convert integers to \emph{fixed-point} numbers, which have a more limited range, but a fixed, greater @@ -153,14 +238,22 @@ These types are typically built from several native data types such as integers paired with custom routines implementing arithmetic primitives.\cite{priest1991algorithms} These, therefore, are likely slower than the native types they are built on. - While traditionally, GPUs have supported some approximation of IEEE 754's 32-bit floats, modern graphics processors also support 16-bit\cite{nv_half_float} and 64-bit\cite{arb_gpu_shader_fp64} -IEEE floats. +IEEE floats. Note, however, that some parts of the GPU are only able to use some formats, +so precision will likely be truncated at some point before display. Higher precision numeric types can be implemented or used on the GPU, but are -slow. -\cite{emmart2010high} +slow.\cite{emmart2010high} +Pairs of integers $(a \in \mathbb{Z},b \in \mathbb{Z}\setminus 0)$ can be used to represent rationals. This allows +values such as $\frac{1}{3}$ to be represented exactly, whereas in fixed or floating-point formats, +this would have a recurring representation: +\begin{equation} + \underbrace{0}_\text{integer part} . \underbrace{01}_\text{recurring part} 01 \; \; 01 \; \; 01 \dots +\end{equation} +Whereas with a rational type, this is simply $\frac{1}{3}$. 
+Rationals do not have a unique representation for each value, typically the reduced fraction is used +as a characteristic element. \section{Quadtrees} diff --git a/figures/documenttimeline.pdf b/figures/documenttimeline.pdf new file mode 100644 index 0000000..7277997 Binary files /dev/null and b/figures/documenttimeline.pdf differ diff --git a/irc/#ipdf.log b/irc/#ipdf.log index f2d49a0..70a40a6 100644 --- a/irc/#ipdf.log +++ b/irc/#ipdf.log @@ -1210,3 +1210,340 @@ 23:16 * matches goes back to floating point nubers 23:18 <@matches> So I wonder how much of PostScript is a shameless rip off of METAFONT 23:18 <@matches> It's not like you have to give credit when you are trying to sell something +--- Day changed Tue May 20 2014 +09:29 <@matches> So it's quite easy to do a +09:29 <@matches> "fractal" in SVG/Javascript +09:29 <@matches> Where easy does not necessarily mean it isn't horrifying +09:30 <@matches> Aaand there goes all my RAM +21:43 <@matches> Casually slipping in a footnote directing the reader to rabbitgame.net in my lit review... +21:43 <@matches> So my webby documents section is probably my least shitly written, or maybe that's just because it has pointless pretty pictures in it +21:44 <@matches> It is basically +21:44 <@matches> "Here are the w3c standards" +21:44 <@matches> "Here are some pretty pictures made with SVGs, can't you just see the DOM leaching out of them?" 
+21:44 <@matches> :S +21:45 <@matches> I'm not sure how well I am treading the line between actually reviewing literature and just giving examples of things +--- Day changed Wed May 21 2014 +12:09 <@matches> PDF is a mess of a "standard" +12:09 <@matches> As are all useful things I suppose +12:09 <@matches> As far as I can work out +12:09 <@matches> It is not a DOM but a graph +12:10 <@matches> However, it is also PostScript-y +12:10 <@matches> But they deal with "interactivity" +12:10 <@matches> By including XHTML +12:10 <@matches> And having an "action dictionary" which is literally just a string of javascript +12:11 <@matches> I just +12:11 <@matches> Can't even begin to understand how it all works +12:12 <@matches> But yeah, not really "Crippled Postscript" so much as "Everything including the kitchen sink except for a few bits of Postscript" +12:13 <@matches> So the Postscript part of it is no longer turing complete, but I don't think you can pretend something in which you can stick arbitrary Javascript isn't turing complete :S +12:13 <@matches> Oh and even though they have XHTML-ish stuff their Javascript API is totally different to W3Cs +12:13 <@matches> Hooray +12:14 <@matches> I suppose the fact that nothing except Adobe products seem to actually use Javascript/XHTML stuff is telling us something about this approach +12:15 <@matches> I reckon the ideal standard +12:15 <@matches> Would probably be the DOM but with the "we actually care about efficiency" parts of PDF +12:18 <@matches> The interactivity of web pages combined with the actually professional looking type setting of PDF +12:18 <@matches> Or just plain text files +12:19 <@matches> Plain text files are an underapreciated Document Format +12:23 <@matches> Ah, I think it sort of makes sense now +12:23 <@matches> PDF uses what is essentially PostScript to construct this graph thing +12:24 <@matches> And the graph thing can have elements in it that are just "Make this part of the graph the equivelant 
DOM from this XHTML" +12:25 <@matches> And it can also have elements that are "Execute this Javascript to dynamically change this graph" +12:25 <@matches> But the normal elements are just like PostScript as it would be sent to a printer to show the thing statically +12:26 <@matches> So when it's rendered it is interpreting the Postscripty bits and when its being interacted with it is updating the Postscripty bits +12:26 <@matches> I *think* +12:26 <@matches> This is different from the webby standards which don't really specify how things are actually drawn +12:27 <@matches> No wait it's not +12:27 <@matches> Argh I don't know +12:27 <@matches> You can't classify this shit +12:27 <@matches> Document Goes In -> Pixels Come Out +12:28 * matches despairs +12:49 < sulix> You will find (slightly) less despair if you relegate javascript to the footnote where it belongs. :P +13:07 < sulix> Hmm... the HTML 2 spec looks like it almost got properly IETF standardised. Might just reference that. +13:12 < sulix> Oh, they obsoleted it and replaced it with a "Just look at w3c" standard... +13:17 <@matches> But PDF isn't just flattened PostScript +13:17 <@matches> It is like, everything +13:17 <@matches> All merged into one horrifying standard +13:18 <@matches> Oh well +13:18 <@matches> I made my shape example in PostScript by removing the alpha +13:18 <@matches> I'm not sure whether there's any point in including it as a figure +13:18 <@matches> Most of the PostScript file is taken up by the header +13:19 < sulix> Holy balls, I just looked up the CSS spec. There are like 200 of them. +13:19 <@matches> Yeah I just used CSS2 +13:19 <@matches> The others are like +13:19 <@matches> Colours +13:19 <@matches> Or something +13:20 < sulix> That's what I'm using, too. +13:20 < sulix> The tired and tested "what gets top result on google" method of paper selection. 
+13:20 <@matches> Closer examination reveals that most of the PostScript header is defining commands to be shorter :P +13:20 <@matches> Amazing +13:20 <@matches> Cairo probably needs to get referenced somewhere +13:21 <@matches> If only so I have a way out of my Javascript in PDF section by saying that Cairo doesn't support it +13:21 <@matches> I desperately need to escape the Javascript +13:25 < sulix> I think the secret is to use the phrase "rendering model" wherever possible. +13:51 <@matches> Dammit +13:51 <@matches> So I have that wierd shape in both SVG and PostScript now +13:51 <@matches> The SVG version fits beautifully and is wonderfully concise and you can see how SVG works +13:51 <@matches> The PostScript version is just like, BLARGH WALL OF TEXT +13:51 <@matches> ALSO WE DON'T HAVE ALPHA +13:52 <@matches> So I'm not sure whether to cut just the PostScript one or both of them now :S +13:53 <@matches> PDF looks distinctly not like it is just PostScript the more I think about it +13:53 <@matches> It's like "We are using the same model as PostScript in that commands go in and pixels come out" +13:54 <@matches> By that logic SVG is also the same +13:54 <@matches> I think what I should do is just make an appendix +13:54 <@matches> "A Shape in 20 Document Formats" +13:56 <@matches> SVG really is the most concise compared to PS and PDF +14:53 <@matches> Right I can simplify the god awful mess of PS a bit +14:54 <@matches> I'm hoping I can just say "Here is the PS reference and here is some PostScript as you can see it is interpreted-ish" +14:54 <@matches> Cairo appeared to draw each element backwards and reverse it after drawing it +14:54 <@matches> It is stupid +14:57 <@matches> Like, why bother doing definitions like m == moveto etc +14:57 <@matches> If you're just going to stick pointless crap in +14:57 <@matches> My document is half the size without using single letter definitions +15:31 < sulix> Welp. The wrath of Tim is upon us... 
+15:41 <@matches> I'm choosing to latch onto the "quite good" rather than "some way to go" +15:45 <@matches> It sort of sounds like "Well at least you gave me a pdf file" :P +15:48 < sulix> One day, all anyone will use are ipdf files... +15:50 <@matches> Right, TeX is very different from PostScript I think +15:50 <@matches> At least, pure tex +15:50 < sulix> Also, holy mackerel, I might have just found a paper on precision in document formats... +15:50 <@matches> :O +15:50 < sulix> It even quotes Kahan +15:50 <@matches> :OOO +15:50 < sulix> https://www.tug.org/TUGboat/tb28-3/tb90beebe.pdf +15:50 <@matches> What is it +15:50 <@matches> Emergency rewrite of entire lit review +15:51 < sulix> It's a bit TeX specific, but still. +15:51 <@matches> That's alright +15:51 <@matches> It ties in amazingly with my decision to hamfist TeX and Metafont into the lit review +15:52 <@matches> Although I'm not sure it is wise because it means I have to talk about fonts and things +15:53 <@matches> I wonder if "Fonts are just bezier curves" is sufficient +15:53 <@matches> They are always treated seperately to curved paths +15:53 <@matches> Which is understandable because it's a bit inconvenient if you want text in a document to have to define the paths for each glyph +15:54 <@matches> Anyway I'm glad my assertion that Beziers are the only curves we care about is proving true +16:33 <@matches> Are you in a position to retrieve this "envelope" +16:57 < sulix> Not tonight: I'm going to pick it up tomorrow morning. +16:58 < sulix> And hopefully replace it with a sparkling, glorious review of literature. +17:00 <@matches> :( +17:00 <@matches> I cannot concentrate now +17:00 <@matches> Because I haven't read the comments, I could be doing everything wrong! +17:02 <@matches> Admitedly I'm technically "working" right now +17:27 < sulix> My "Document Format Taxonomy" is almost complete... Just need to add SVG. 
+17:28 < sulix> (And close my eyes and assert that Microsoft Word documents are not actually documents or something) +17:28 <@matches> I am jealous +17:29 <@matches> I just added PostScript it's not particularly well written +17:29 < sulix> (I don't have any pretty pictures or code, though) +17:29 < sulix> I've discovered that, despite having totally different numbers for "implementation limits", the PostScript and PDF specs are (a) talking about the same data types and (b) lying. +17:32 <@matches> Bahaha +17:32 < sulix> Do you know where the SVG spec mentions precision? +17:33 <@matches> Ah, I regret not noting the page number +17:33 <@matches> But a text search should find it +17:33 <@matches> It specifically says things +17:33 <@matches> I am interested in whether or not Javascript is subject to the same requirements +17:34 < sulix> All I've found is "must be correct within 1px at 1:1 zoom", and "It is suggested that viewers attempt to keep a high degree of accuracy when zooming". +17:35 <@matches> There's something that is about IEEE floats +17:35 < sulix> Aaah... and a "High-Quality Viewer" must support at least double precision on coordinate system transforms. +17:35 < sulix> But "IEEE" does not show up in a search of the spec. +17:36 <@matches> Ah right +17:36 <@matches> My brain just inserts IEEE whenever I hear "single" or "double" now +17:36 <@matches> "An IEEE Double Episode of MasterChef!" +17:36 <@matches> (Which would probably be infinitely more exciting) +17:37 < sulix> (Or would it be NaNly more exciting...? :P) +17:38 <@matches> Speaking of "where things are" are we meant to reference page numbers in standards? +17:38 <@matches> I guess I'll find out when I read Tim's comments +17:39 <@matches> Excellent my lab finished 20 minutes early +17:39 <@matches> And also 40 minutes later than the other demonstrators :S +17:40 <@matches> Do you want me to pick up your comments and scan them and email them to you? :P +17:40 < sulix> That'd be great. 
+17:40 < sulix> Also probably depressing. +17:41 < sulix> But great. +17:41 <@matches> Alright, ETA Transperth + Scanner is probably broken O'clock +17:42 < sulix> I'll savour the blissful ignorance. +19:50 <@matches> I don't think scanning is worth it, I'll just spam the feedback into this channel +19:50 <@matches> First up, David's Lit Review +19:50 <@matches> There is either "Gool" or "Cool" or possibly "Good" written and underlined on the first page +19:51 <@matches> The opening paragraph is "A little overdramatic?" +19:51 <@matches> (Since it's a question, I'd like to voice a "No" opinion here) +19:51 <@matches> The DOM in a footnote is not defined +19:52 <@matches> Page 2 +19:52 <@matches> There is a tick +19:52 <@matches> A question mark in regards to the hyphenated bit in the rendering paragraph +19:53 <@matches> Say "avoid" instead of lack +19:53 <@matches> Add what the "basic primitives" actually are +19:53 <@matches> There appears to be an issue with hyphenated phrases the hyphens are circled +19:53 <@matches> Another tick! +19:54 <@matches> Oh, you have a $2^64 - 1$\footnote{} which is unfortunate because it looks like $2^64 - 1^2$ +19:54 <@matches> That footnote (probably others?) would work in the paragraph without being footnote +19:54 <@matches> Fullstops go after \cite{} +19:55 <@matches> A tick (in regards to the quadtree diagram) +19:55 <@matches> The concluding comment +19:56 <@matches> "OK, Much to do (underline) There doesn't seem to be much scholarly references used. You have enough, but you seem to cite them in the context of their contributions to standards as opposed to how they addressed a research question or open problem" +19:57 <@matches> And (not even our references lists are safe!) +19:57 < sulix> Oh dear. 
+19:57 <@matches> Where referencing web pages, include the date retrieved +19:57 <@matches> That's it +19:57 <@matches> I shall move on to my own for completeness although you might not need to care +19:58 < sulix> Phew, that's not quite as horrible as it could have been, I guess. +19:58 <@matches> I also have "Good" +19:58 <@matches> There are some "I didn't read this bit but it had words that seemed vaguely relevant" ticks in Chapter 1 (Introduction) and 2 (Proposal) +19:59 <@matches> Sorry Tim if you read this +19:59 <@matches> But when I mark lab reports for Physics that's usually where I put the ticks :P +19:59 < sulix> (The secret comes out) +20:00 <@matches> (In my defence I did spend two hours marking the reports this morning and I am paid for none of them, so...) +20:01 <@matches> (It's the bits that I scribble all over that are where the marking gets done) +20:01 <@matches> (I think I've covered myself in case the lawyers of any of my students read this channel now, so I will resume my story...) +20:02 <@matches> Attention is called to the many glaring instances of [?] and "Refer to Section ?" +20:02 <@matches> :S +20:02 <@matches> I should probably define a vector image before comparing it to a raster image +20:02 <@matches> Incidentally my Fox looks amazing +20:02 <@matches> On printed paper +20:03 <@matches> (Tim didn't say that, that's just my modest opinion) +20:03 <@matches> Ahem. +20:03 < sulix> Can you see the difference between the vector and bitmap versions easily? +20:04 <@matches> At the original scale there is, alas, a very slight fuzziness +20:04 <@matches> But I reckon the markers will be old and blind +20:04 <@matches> Hmm, I should either be more careful about what I say here or stop logging this channel... +20:04 * sulix hopes they don't read that. 
+20:04 <@matches> Sorry markers +20:04 <@matches> I worship your power +20:04 <@matches> Please do not smite me +20:05 <@matches> The scaled up version is interesting +20:06 <@matches> It looks a bit like your circle with the blocky non-anti-aliased bit but actually anti-aliased by the pdf viewer +20:06 < sulix> I guess the scaling would be done by the printer's postscript RIP. +20:06 <@matches> Yeah I guess +20:07 < sulix> (Side note: I find the whole idea of Postscript interpreters being called RIPs somewhat fitting) +20:07 <@matches> The PDF decides not to antialias it and converts it to Postscript and then the postscript interpreter adds its own antialiasing? +20:07 <@matches> I don't know +20:07 * sulix joins the "SVG is the least broken format" club. +20:07 <@matches> It's very tempting to descend into footnote madness with this lit review +20:07 <@matches> "By the way, this very document is an example of this thing!" +20:07 <@matches> Etc +20:08 <@matches> Moving on +20:08 <@matches> The point of talking about vector displays at all is questioned (at least I think that's what the "Why?" refers to here) +20:08 <@matches> Or it could be "Why is there yet another ?? in this paragraph" I guess +20:09 <@matches> But probably the former +20:09 <@matches> I do not have space to include Bresenham's algorithm +20:09 <@matches> Oh boy, he's going to love what I did with the SVG and Postscript images... +20:09 <@matches> But I am glad I do not have to actually explain Bresenham's algorithm because it's actually annoyingly detailed +20:11 < sulix> All sane descriptions of Bresenham's algorithm end up being cascades of "By symmetry" anyway. 
+20:11 <@matches> I need to actually find a reference that applied Wu/Bresenham directly to a non-straight line +20:11 <@matches> You said Bresenham adapted his algorithm to circles but I don't think I'll bother unless someone adapted them to beziers +20:12 <@matches> Bresenham's paper on rasterisation techniques basically says "Compute some points close enough together and then just connect them with straight lines" +20:12 <@matches> But I think things might have advanced since the 1980s +20:12 < sulix> Well, we can compute points that are closer together and draw more lines, I guess. +20:13 <@matches> Next, Tim wants an example of a spline +20:13 <@matches> (Oh boy have I got that covered) +20:13 <@matches> My mathematics terminology on Beziers is not really great +20:14 <@matches> Well it's right but confusing maybe +20:14 <@matches> Or I just need to say "t is a trajectory parameter" +20:14 <@matches> Haha +20:15 <@matches> He found one of my "????" that is actually just me typing question marks and not a broken reference +20:15 <@matches> The *entire* section on shading and compositing has a big question mark +20:15 <@matches> Oh dear +20:15 <@matches> I just finished writing the compositing bit +20:16 <@matches> I hope the question mark means "Why isn't this written" and not "Why is this in here" +20:16 <@matches> Because it is quite useful for an excuse to say PostScript can't do alpha +20:17 <@matches> I need to refer to the IM (I really don't think that's a thing) and DOM when citing Hayes +20:17 <@matches> "I don't think Turing Completeness is essential" (Big cross through the Crippled Interpreted Model) +20:17 <@matches> Fair enough +20:17 <@matches> A tick appears +20:18 <@matches> Predictably in the web based documents part +20:18 <@matches> I need to explain why Ipython is cool if I want to talk about it +20:18 <@matches> My entire section on Precision as defined in the various formats is ? 
+20:20 <@matches> My still to be completed/started section on Graphics APIs, GPUs and Arbitrary Precision is three question marks and "How's all this going" +20:20 <@matches> The progress report gets a single tick +20:20 <@matches> And the references have similar issues +20:20 <@matches> Well +20:21 <@matches> I'll take a few minutes to quiver in terror +20:21 <@matches> But I think if I can just find a way to not sleep and still maintain productivity, I might be able to pull this off +20:22 <@matches> Interestingly he didn't call me out for just talking about standards +20:22 <@matches> But now I realise that's because I didn't have all the crap I've just written on standards in there +20:23 < sulix> It's going to be a long night, but I think we'll manage it. +20:23 <@matches> Mine will be too long but I don't care +20:24 <@matches> I'll ask for an extension to prepare a condensed version if I must :P +20:28 <@matches> It's kind of funny I've been spending more time making my vector image in SVG and PostScript nice than actually writing about either of those standards +20:50 <@matches> Argh the idea of making my koch snowflake example for PS just got in my head +20:50 <@matches> Which would be brilliant I guess if the topic was still "Fractal Document Formats" +20:50 <@matches> It probably would be useful if I could demonstrate precision issues.. +20:50 <@matches> NO +20:50 <@matches> MUST WRITE +20:50 <@matches> WORDS +20:50 <@matches> NOT PICTURES +20:51 <@matches> But still it would make the PostScript and SVG sections consistent with each other... 
+20:51 <@matches> NO
+20:51 <@matches> Must control urge to put pointless pictures in
+20:51 <@matches> No matter how much it seems like a good idea
+20:51 <@matches> And not pointless
+20:52 <@matches> Help I'm losing this battle
+20:53 <@matches> It is probably actually a better way of making a Koch curve than the hideous Javascript parsing of strings version
+22:50 <@matches> By the way, you can totally have "pre layout" stages in PostScript since you can define your own operators
+22:50 <@matches> Or do I misunderstand your sentence
+22:50 <@matches> Oh well it sounds smart anyway
+22:51 <@matches> In fact it's a lot more concise than my DOM-y section
+22:52 <@matches> I should sign my Lit Review as Captain Obvious
+22:57 < sulix> My current version does have a "PostScript programs typically embody documents which have been type-
+22:57 < sulix> set, though as a turing-complete language, some layout can be performed
+22:57 < sulix> " sentence.
+22:57 < sulix> by the document.
+23:01  * sulix is still a little bit concerned about how he should reference things for their solutions to open problems rather than their contributions to standards.
+23:03 <@matches> I think I am managing to do it
+23:03 <@matches> I will commit something at some point
+23:03 < sulix> I'm hoping that rewriting most of the rendering section with painful discussions of algorithms will do it.
+23:03 <@matches> An example is Porter and Duff Compositing
+23:04 <@matches> Because PostScript doesn't have alpha and I am really hoping that's just because Adobe had moved on to PDF by the time alpha was a thing
+23:04 <@matches> And not because they thought alpha was dumb :P
+23:05 <@matches> So I can relate Porter and Duff's model to the standards that do use it and say how it solves a problem that the standards that don't use it still have
+23:05 <@matches> And then I can sit back in satisfaction
+23:05 <@matches> And realise this says fuck all about precision
+23:06 <@matches> But at least by talking about it, I have eliminated it from the set of things we need to worry about when talking about precision :P
+23:06 < sulix> I've got a section which basically goes through all of the different document formats and looks at what their specs say about precision now.
+23:06 <@matches> Yeah I have that, but it was dot-pointed
+23:06 <@matches> I thought that would be OK actually but it has a question mark here... :S
+23:06 < sulix> Basically most of them say "implementation-defined" anyway.
+23:06 <@matches> Oh right because I was saying random stuff about how Postscript *used* to not have IEEE
+23:07 <@matches> Yeah it is odd that the standards don't actually reference IEEE
+23:07 <@matches> You'd think, since it's a standard...
+23:07 <@matches> Instead they just say "single" or "double" or "it might be single if you're lucky but we don't care really"
+23:08 <@matches> I assume "single" is widely accepted to mean IEEE single
+23:08 < sulix> From my reading of the Postscript spec, it says basically "We've put IEEE here, but ask your printer manufacturer because they could be using anything for all we care."
+23:08 <@matches> Ah I will check that more carefully
+23:08 <@matches> But it sounds about right
+23:08 < sulix> They give "typical limits" for their data types, but specifically do not specify what they are to be implemented as.
+23:08 <@matches> I don't think I have the time to look at what PostScript did historically before IEEE-754 although it would be kind of interesting to know
+23:09 <@matches> PostScript also does a bunch of silly maths because of units
+23:09 < sulix> The idea being that each postscript interpreter could do whatever they liked.
+23:09 <@matches> Cool
+23:09 <@matches> I should know this already :S
+23:09 <@matches> I just included a single character as a figure
+23:10 <@matches> But I want to actually work out how to do it in LaTeX by setting the size of the font appropriately
+23:10 < sulix> The PDF spec says pretty much the same thing, but notes that Adobe's implementation uses "Mostly IEEE singles" but "used to use 16.16 fixed point" and "still uses it for some things"
+23:10 <@matches> I did see that
+23:10 < sulix> TeX using 14.16 fixed point.
+23:10 < sulix> DVI uses "up-to 32bit" signed integers.
+23:11 <@matches> So basically no one actually uses IEEE for anything :P
+23:11 <@matches> Good work
+23:11 <@matches> I shall panic a bit and then try and actually do that work myself
+23:11 < sulix> SVG uses "implementation defined" or "double-precision floating point" "for coordinate transforms" if you want to be certified "High Quality"
+23:12 <@matches> I saw that one
+23:12 <@matches> But I'm skeptical about how this plays with Javascript
+23:12 <@matches> Not for High Quality even, just in general
+23:12 < sulix> Javascript numbers are always IEEE 754 doubles.
+23:13 <@matches> Ah thanks
+23:13 < sulix> (Even their integers are IEEE 754 doubles, which just happen to be integers)
+23:13 <@matches> Yes I have heard this before
+23:13 <@matches> From you probably :P
+23:14 < sulix> I don't have a source for that, and I'm not going to read the ECMAscript spec to find one, though.
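The exchange above mentions two number representations: IEEE 754 doubles that "just happen to be integers", and the 16.16 fixed point that the PDF spec attributes to older Adobe implementations. A minimal sketch of what each implies for precision follows. It assumes Python floats are IEEE 754 binary64 (true for CPython on common platforms), and the helper names `is_exact_in_double`, `to_fixed_16_16` and `from_fixed_16_16` are hypothetical, not taken from any of the specs discussed.

```python
def is_exact_in_double(n: int) -> bool:
    """Return True if integer n survives a round trip through a double.

    Assumption: float is IEEE 754 binary64, which has a 53-bit significand.
    """
    return int(float(n)) == n


def to_fixed_16_16(x: float) -> int:
    """Quantise x to signed 16.16 fixed point, stored as a 32-bit integer."""
    raw = round(x * 65536)  # 16 fractional bits, so scale by 2**16
    if not -(1 << 31) <= raw < (1 << 31):
        raise OverflowError("value does not fit in signed 16.16 fixed point")
    return raw


def from_fixed_16_16(raw: int) -> float:
    """Convert a 16.16 fixed point integer back to a float."""
    return raw / 65536


if __name__ == "__main__":
    limit = 2 ** 53
    print(is_exact_in_double(limit))      # True
    print(is_exact_in_double(limit + 1))  # False
    print(from_fixed_16_16(to_fixed_16_16(0.1)))
```

Every integer of magnitude up to 2^53 is exact in a double, which is why doubles can stand in for integers over that range. The 16.16 format trades this for uniform spacing: its step is always 1/65536, but it can only hold values in [-32768, 32768).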
+23:16 <@matches> Oh right, Javascript is actually ECMAscript
+23:16 <@matches> I forgot that
+23:17 <@matches> Dammit I am struggling to stay awake here
+23:17 <@matches> I'm not sure whether it's healthier to try to not sleep and give Tim a draft tomorrow and demand he read it in enough time to make last minute changes
+23:18 <@matches> Or sleep and then be more coherent tomorrow
+23:18 <@matches> I guess I'll try and finish a couple more sections
+23:19 < sulix> I'm going to try to finish this tonight.
+23:20  * sulix has another assignment due Friday that needs significant work.
+--- Day changed Thu May 22 2014
+00:47 <@matches> So X just managed to totally shit itself
+00:47 <@matches> Time to see when I last pressed Ctrl-S
+00:48 <@matches> Oh good (I typically press it once per sentence)
+00:48 <@matches> I hope it wasn't one of my SVGs that broke everything
+00:49 <@matches> Making all my figures in SVG
+00:49 <@matches> Lovingly hand written
+00:49 <@matches> I'm not sure that was a good idea
diff --git a/papers.bib b/papers.bib
index 1610bd9..a913361 100644
--- a/papers.bib
+++ b/papers.bib
@@ -16,6 +16,46 @@
     year={2006}
 }
 
+@misc{texdraft,
+    title={Preliminary preliminary description of {\TeX}},
+    author={Knuth, Donald},
+    year={1977},
+    howpublished={\url{http://www.saildart.org/TEXDR.AFT[1,DEK]1}}
+}
+
+@article{fuchs1982theformat,
+    title={The Format of {\TeX}'s {DVI} files},
+    author={Fuchs, David},
+    year={1982},
+    journal={{TUGboat}},
+    volume={3},
+    number={2},
+    howpublished={\url{http://www.tug.org/TUGboat/Articles/tb03-2/tb06software.pdf}}
+}
+
+% HTML 2 spec
+@article{html2rfc,
+    title={Hypertext Markup Language -- 2.0},
+    author={Berners-Lee, Tim and Connolly, Daniel},
+    year={1995},
+    journal={Internet RFC 1866}
+}
+
+% CSS 2 spec
+@misc{css2spec,
+    title={Cascading Style Sheets, Level 2, {CSS2} Specification},
+    author={Bos, Bert and Wium Lie, Håkon and Lilley, Chris and Jacobs, Ian},
+    year={1998},
+    howpublished={\url{http://www.w3.org/TR/1998/REC-CSS2-19980512/}}
+}
+
+@misc{ghostscript,
+    title={GhostScript, an interpreter for the PostScript language and PDF},
+    author={Artifex Software},
+    year={1988},
+    howpublished={\url{http://www.ghostscript.com/}}
+}
+
 %%%%%%%%%%%%%%%%%%%%%%%%
 % Basic Rendering Theory
 %%%%%%%%%%%%%%%%%%%%%%%%
@@ -72,6 +112,16 @@
     publisher={ACM}
 }
 
+% Bézier curves and friends.
+@phdthesis{catmull1974asubdivision,
+    author = {Catmull, Edwin Earl},
+    title = {A Subdivision Algorithm for Computer Display of Curved Surfaces},
+    year = {1974},
+    note = {AAI7504786},
+    school = {The University of Utah},
+}
+
+
 %%%%%%%%%%%%%%%%%%%%%%%
 % Floating-pt Precision
 %%%%%%%%%%%%%%%%%%%%%%%
@@ -956,3 +1006,12 @@ ISSN={0272-1716},}
     month = "April",
     journal = "Adobe Acrobat Reader SDK"
 }
+% Holy mackerel, a paper on precision in document formats!
+@article{beebe2007extending,
+    author={Beebe, Nelson},
+    title={Extending {\TeX} and {METAFONT} With Floating-Point Arithmetic},
+    year={2007},
+    journal={{TUGboat}},
+    volume={28},
+    number={3},
+}
diff --git a/references/beebe2007extending.pdf b/references/beebe2007extending.pdf
new file mode 100644
index 0000000..833b49c
Binary files /dev/null and b/references/beebe2007extending.pdf differ
diff --git a/references/fuchs1982theformat.pdf b/references/fuchs1982theformat.pdf
new file mode 100644
index 0000000..f635937
Binary files /dev/null and b/references/fuchs1982theformat.pdf differ