From: Sam Moore
Date: Mon, 7 Apr 2014 13:52:26 +0000 (+0800)
Subject: Add some references related to FPUs (hardware)
X-Git-Url: https://git.ucc.asn.au/?a=commitdiff_plain;h=c6ca9623378dc9e4517a041ca309a38f36de9e04;p=ipdf%2Fdocuments.git

Add some references related to FPUs (hardware)

- Most talk about VHDL, which in theory I am supposed to know
- Not summarised in great detail; adopting a bit of a shotgun approach at the moment
---

diff --git a/LiteratureNotes.pdf b/LiteratureNotes.pdf
index 426e0a4..9adb421 100644
Binary files a/LiteratureNotes.pdf and b/LiteratureNotes.pdf differ
diff --git a/LiteratureNotes.tex b/LiteratureNotes.tex
index 95b2997..08b23e1 100644
--- a/LiteratureNotes.tex
+++ b/LiteratureNotes.tex
@@ -379,6 +379,8 @@ Performance was much improved over the software rasterization and over XRender a
 on all except nVidia hardware. However, nVidia's XRender implementation did slow down significantly when
 some transformations were applied.
 
+%% Sam again
+
 \section{Boost Multiprecision Library\cite{boost_multiprecision}}
 
 \begin{itemize}
@@ -387,6 +389,72 @@ some transformations were applied.
 	\item Precision is fixed... {\bf possible approach to project:} Use \verb/boost::mpf_float/ and increase \verb/N/ as more precision is required?
 \end{itemize}
 
+
+% Some hardware related sounding stuff...
+
+\section{A CMOS Floating Point Unit\cite{kelley1997acmos}}
+
+The paper describes the implementation of an FPU for PowerPC using a particular Hewlett-Packard process (HP14B 0.5$\mu$m, 3M, 3.3V).
+It implements a ``subset of the most commonly used double precision floating point instructions''. The unimplemented operations are compiled for the CPU.
+
+The paper gives a description of the architecture and design methods.
+This appears to be an entry in a student design competition.
+
+The standard is IEEE 754, but the multiplier tree is a 64-bit tree instead of a 54-bit tree.
+``The primary reason for implementing a larger tree is for future additions of SIMD [Single Instruction Multiple Data] instructions similar to Intel's MMX and Sun's VIS instructions''.
+
+HSPICE simulations were used to determine transistor sizing.
+
+The paper has a block diagram that sort of vaguely makes sense to me.
+The rest requires more background knowledge.
+
+\section{Simply FPU\cite{filiatreault2003simply}}
+
+This is a webpage at one degree of separation from Wikipedia.
+
+It talks about FPU internals, but mostly focuses on the instruction sets.
+It includes FPU assembly code examples (!)
+
+It is probably not that useful; I don't think we'll end up writing FPU assembly.
+
+FPUs typically have 80-bit registers so they can support REAL4, REAL8 and REAL10 (single, double and extended precision).
+
+
+\section{Floating Point Package User's Guide\cite{bishop2008floating}}
+
+This is a technical report describing the floating point VHDL packages at \url{http://www.vhdl.org/fphdl/vhdl.html}.
+
+In theory I know VHDL (cough), so I am interested in looking at this further to see how FPU hardware works.
+It might be getting a bit sidetracked from the ``document formats'' scope though.
+
+The report also talks briefly about the IEEE standard and normalised/denormalised numbers.
+
+See also: Java Optimized Processor\cite{jop} (it has a VHDL implementation of an FPU).
+
+\section{Low-Cost Microarchitectural Support for Improved Floating-Point Accuracy\cite{dieter2007lowcost}}
+
+Mentions how GPUs offer very good floating point performance, but only for single precision floats.
+
+Has a diagram of a floating point adder.
+
+Talks about a technique called ``Native-pair Arithmetic'' that makes 32-bit floating point accuracy ``competitive'' with 64-bit floating point numbers.
+
+\section{Accurate Floating Point Arithmetic through Hardware Error-Free Transformations\cite{kadric2013accurate}}
+
+From the abstract: ``This paper presents a hardware approach to performing accurate
+floating point addition and multiplication using the idea of error-free
+transformations. Specialized iterative algorithms are implemented
+for computing arbitrarily accurate sums and dot products.''
+
+The references for this look useful.
+
+It also mentions VHDL.
+
+So whenever hardware papers come up, VHDL gets involved...
+I guess it's time to try and work out how to use the open-source VHDL implementations.
+
+
+
 \pagebreak
 \bibliographystyle{unsrt}
 \bibliography{papers}
diff --git a/ProjectProposalDavid.pdf b/ProjectProposalDavid.pdf
index 6d9c246..5860fbb 100644
Binary files a/ProjectProposalDavid.pdf and b/ProjectProposalDavid.pdf differ
diff --git a/ProjectProposalSam.pdf b/ProjectProposalSam.pdf
index 64c2426..f12fdcb 100644
Binary files a/ProjectProposalSam.pdf and b/ProjectProposalSam.pdf differ
diff --git a/papers.bib b/papers.bib
index 52350bc..7cea9c9 100644
--- a/papers.bib
+++ b/papers.bib
@@ -424,4 +424,61 @@ doi={10.1109/ARITH.1991.145549},}
   howpublished = {\url{http://www.boost.org/doc/libs/1_53_0/libs/multiprecision/doc/html/boost_multiprecision/}}
 }
 
+% A CMOS Floating Point Unit
+@misc{kelley1997acmos,
+  author = {Michael J. Kelley and Matthew A. Postiff and Richard B. Brown},
+  title = {A CMOS Floating Point Unit},
+  year = {1997}
+}
+
+@misc{filiatreault2003simply,
+  author = {Raymond Filiatreault},
+  title = {Simply FPU},
+  year = {2003},
+  howpublished = {\url{http://www.website.masmforum.com/tutorials/fptute/index.html}}
+}
+
+@techreport{bishop2008floating,
+  author = {David Bishop},
+  title = {Floating Point Package User's Guide},
+  institution = {EDA Industry Working Groups},
+  year = {2008},
+  note = {\url{http://www.vhdl.org/fphdl/Float_ug.pdf}}
+}
+
+@article{dieter2007lowcost,
+  author = {Dieter, William R. and Kaveti, Akil and Dietz, Henry G.},
+  title = {Low-Cost Microarchitectural Support for Improved Floating-Point Accuracy},
+  journal = {IEEE Comput. Archit. Lett.},
+  issue_date = {January 2007},
+  volume = {6},
+  number = {1},
+  month = jan,
+  year = {2007},
+  issn = {1556-6056},
+  pages = {13--16},
+  numpages = {4},
+  url = {http://dx.doi.org/10.1109/L-CA.2007.1},
+  doi = {10.1109/L-CA.2007.1},
+  acmid = {1271937},
+  publisher = {IEEE Computer Society},
+  address = {Washington, DC, USA},
+  keywords = {B Hardware, B.2 Arithmetic and Logic Structures, B.2.4 High-Speed Arithmetic, B.2.4.b Cost/performance, C Computer Systems Organization, C.0 General, C.0.b Hardware/software interfaces, C.1 Processor Architectures, C.1.5 Micro-architecture implementation considerations, G Mathematics of Computing, G.1 Numerical Analysis, G.1.0 General, G.1.0.e Multiple precision arithmetic, I Computing Methodologies, I.3 Computer Graphics, I.3.1 Hardware Architecture, I.3.1.a Graphics processors},
+}
+
+@misc{jop,
+  author = {jop-devel},
+  title = {Java Optimized Processor},
+  howpublished = {\url{https://github.com/jop-devel/jop}}
+}
+
+@inproceedings{kadric2013accurate,
+  title = {Accurate Parallel Floating-Point Accumulation},
+  author = {Kadric, Edin and Gurniak, Paul and DeHon, Andr{\'e}},
+  booktitle = {Computer Arithmetic (ARITH), 2013 21st IEEE Symposium on},
+  pages = {153--162},
+  year = {2013},
+  organization = {IEEE}
+}
diff --git a/references/bishop2008floating.pdf b/references/bishop2008floating.pdf
new file mode 100644
index 0000000..4cafc30
Binary files /dev/null and b/references/bishop2008floating.pdf differ
diff --git a/references/dieter2007lowcost.pdf b/references/dieter2007lowcost.pdf
new file mode 100644
index 0000000..d3552b8
Binary files /dev/null and b/references/dieter2007lowcost.pdf differ
diff --git a/references/isbn9789514296598.pdf b/references/isbn9789514296598.pdf
new file mode 100644
index 0000000..a3a9b43
Binary files /dev/null and b/references/isbn9789514296598.pdf differ
diff --git a/references/kadric2013accurate.pdf b/references/kadric2013accurate.pdf
new file mode 100644
index 0000000..dc8f18b
Binary files /dev/null and b/references/kadric2013accurate.pdf differ
diff --git a/references/kelley1997acmos.pdf b/references/kelley1997acmos.pdf
new file mode 100644
index 0000000..3900c29
Binary files /dev/null and b/references/kelley1997acmos.pdf differ