From: Sam Moore
Date: Wed, 21 May 2014 03:28:51 +0000 (+0800)
Subject: Horrible notes on PDF
X-Git-Url: https://git.ucc.asn.au/?a=commitdiff_plain;h=439a25f18a7a9ef114a01b11ead914d94d088ef4;p=ipdf%2Fsam.git

Horrible notes on PDF

At bus port now
---

diff --git a/chapters/Background.tex b/chapters/Background.tex
index 87d18ce..f11c1be 100644
--- a/chapters/Background.tex
+++ b/chapters/Background.tex
@@ -99,6 +99,30 @@ In this section we will consider various approaches and motivations to specifyin
 \subsection{The Portable Document Format}
+``A PDF file should be thought of as a flattened representation of a data structure
+consisting of a collection of objects that can refer to each other in any arbitrary
+way.''
+
+The PDF 1.7 standard describes a format which is \rephrase{essentially PostScript plus everything in the kitchen sink}.
+\begin{itemize}
+	\item PDF is not just crippled PostScript
+	\item Objects - PDF has a type system, like a programming language, unlike the DOM, where all objects are fundamentally the same - this is similar to PostScript
+	\item File structure - header, body, cross-reference table (byte offsets of objects in the file), trailer (location of the cross-reference table and of special objects)
+	\begin{itemize}
+		\item Read the file from the end
+		\item File can be updated incrementally as long as the trailer is at the end
+	\end{itemize}
+	\item Document structure - this is basically a graph, whereas the DOM is a tree
+	\item Content streams - these are objects, but conceptually different - they hold operators or instructions
+	\item Interactivity --- At this point, PDF suddenly changes from being PostScript to being XML
+	\begin{itemize}
+		\item
+	\end{itemize}
+\end{itemize}
+
+The biggest difference between the PDF design philosophy and the HTML5 philosophy is the emphasis in PDF on the actual file format.
+This means PDF is more complicated but also more efficient (at least, we would hope so).
+
 \subsection{Scientific Computation Packages}