= How are things laid out? =

1 scan page contains 2 physical pages.
Each physical page may contain either 1 or 2 logical pages (future: or 4 slides!).

= What do we do? =

0. init, process args, etc.
1. determine page count
2. determine depth
3. determine dpi
4. foreach double-page spread (scan page)
  4.1. extract scan page from pdf, save as png
  4.2. run a mask over it to pull off large black areas
  4.3. run unpaper over it, creating 2 pages (physical pages)
  4.4. foreach physical page
    4.4.1. remask and retrim
    4.4.2. attempt to detect if the physical page contains 2 logical pages
      4.4.2.1. if so, split with unpaper
    4.4.3. do any final processing (resize for bebook)
5. move all the final pictures into a final picture directory

In the accidentally deleted code we used ocropus's binarise stuff to do some extra cleaning.

= What options do we need? =

Anything we attempt to detect automatically should have the option to set manually:
- depth
- dpi
- probably which pages we want to process
- how many logical pages a physical page has
  * an option to set a default and certain exceptions would be ace.
- options for final output
- options to ignore partial products
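
As a rough sketch of how the options above and the scan/physical/logical page layout might fit together, here is a small Python example. Nothing in it comes from the original code: the flag names (`--depth`, `--dpi`, `--pages`, `--logical-per-physical`, `--logical-exceptions`, `--output-dir`, `--ignore-partial`), the `PAGE=COUNT` exception syntax, and the helpers `parse_exceptions`, `build_parser` and `plan_pages` are all hypothetical. It assumes physical pages are numbered from 1 in scan order, and that leaving an option unset means "detect automatically".

```python
import argparse


def parse_exceptions(spec):
    """Parse a hypothetical 'PAGE=COUNT,PAGE=COUNT' string into a dict.

    Example: '1=1,4=1' means physical pages 1 and 4 each hold 1 logical page.
    """
    exceptions = {}
    if not spec:
        return exceptions
    for item in spec.split(","):
        page, count = item.split("=")
        exceptions[int(page)] = int(count)
    return exceptions


def build_parser():
    """Build a parser where None means 'detect this automatically'."""
    parser = argparse.ArgumentParser(description="Split scanned spreads into pages.")
    parser.add_argument("pdf", help="input PDF of double-page scans")
    parser.add_argument("--depth", type=int, default=None,
                        help="bit depth; detected if omitted")
    parser.add_argument("--dpi", type=int, default=None,
                        help="resolution; detected if omitted")
    parser.add_argument("--pages", default=None,
                        help="which scan pages to process, e.g. '1-10'")
    parser.add_argument("--logical-per-physical", type=int, choices=(1, 2),
                        default=2, help="default logical pages per physical page")
    parser.add_argument("--logical-exceptions", type=parse_exceptions, default="",
                        help="per-page overrides, e.g. '1=1,4=1'")
    parser.add_argument("--output-dir", default="final",
                        help="directory for the final pictures")
    parser.add_argument("--ignore-partial", action="store_true",
                        help="ignore partial products from earlier runs")
    return parser


def plan_pages(scan_pages, default_logical, exceptions):
    """Return (scan_page, physical_page, logical_count) for every physical page.

    Each scan page contributes exactly 2 physical pages; the logical count
    comes from the exceptions dict, falling back to the default.
    """
    plan = []
    physical = 0
    for scan in range(1, scan_pages + 1):
        for _side in range(2):
            physical += 1
            plan.append((scan, physical, exceptions.get(physical, default_logical)))
    return plan


if __name__ == "__main__":
    args = build_parser().parse_args()
    # The scan page count is hard-coded here; a real run would read it from the PDF.
    for scan, physical, logical in plan_pages(2, args.logical_per_physical,
                                              args.logical_exceptions):
        print(f"scan {scan} -> physical {physical}: {logical} logical page(s)")
```

Keeping the default and the exceptions separate means the common case needs a single flag, and only odd pages such as a cover or a fold-out have to be listed explicitly.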