a dependency list: working towards a simpler install process

[dja/scandal.git] / architecture.txt
diff --git a/architecture.txt b/architecture.txt
index 95251d3..981d872 100644
--- a/architecture.txt
+++ b/architecture.txt
@@ -12,17 +12,19 @@ each physical page may contain either 2 or 1 logical pages
  3. determine dpi
  4. foreach double-page-spread (scan page)
         4.1. extract scan page from pdf, save as png
-       4.2. run a mask over it to pull off large black areas
-       4.3. run unpaper over it, creating 2 pages (physical page)
-       4.4. foreach physical page
-               4.4.1. remask and retrim
-               4.4.2. attempt to detect if a physical page contains 2 logical pages, 
-                       4.4.2.1. if so split with unpaper
-               4.4.3. do any final processing (resize for bebook)
-5. move all the final pictures into a final picture directory
  
-In the accidentally deleted code we used ocropus's binarise stuff to do some
-extra cleaning.
+5. run ocropus's binarise over all the pngs
+
+6. foreach binarised scan page
+       6.1. create a mask from the original (unbinarised) page
+       6.2. use the mask to trim the binarised page (cutting off the large black areas improves unpaper's accuracy)
+       6.3. run unpaper over the clean binarised page, creating 2 pages (physical page)
+       6.4. foreach physical page
+               6.4.1. remask and retrim
+               6.4.2. attempt to detect if a physical page contains 2 logical pages, 
+                       6.4.2.1. if so split with unpaper
+               6.4.3. do any final processing (resize for bebook)
+7. move all the final pictures into a final picture directory
  
  = What options do we need? =
  Anything we attempt to detect automatically should have the option to set manually
@@ -33,3 +35,4 @@ Anything we attempt to detect automatically should have the option to set manual
         * an option to set a default and certain exceptions would be ace.
   - options for final output
   - options to ignore partial products
+ - more debug options
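
The revised flow (steps 5 to 7) and the "set manually" rule could be sketched roughly as below. This is only an illustration, not the project's code: every name (Options, Stages, process, demo_stages, the option fields) is hypothetical, and each stage is a plain callable standing in for the real tool (ocropus, unpaper, the masking step), so no actual command lines or flags are assumed. Pages are represented by path-like strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Options:
    # Hypothetical option names; the notes only say such options should exist.
    default_logical_pages: Optional[int] = None  # None means "auto-detect"
    logical_page_exceptions: Dict[int, int] = field(default_factory=dict)


@dataclass
class Stages:
    # Each field stands in for an external tool or image operation.
    binarise: Callable[[str], str]              # step 5
    make_mask: Callable[[str], str]             # step 6.1
    trim: Callable[[str, str], str]             # step 6.2
    split_spread: Callable[[str], List[str]]    # step 6.3
    retrim: Callable[[str], str]                # step 6.4.1
    detect: Callable[[str], int]                # step 6.4.2
    split_physical: Callable[[str], List[str]]  # step 6.4.2.1
    finalise: Callable[[str], str]              # step 6.4.3


def logical_pages_for(page_no: int, image: str, stages: Stages, opts: Options) -> int:
    """A per-page exception wins, then the manual default, then auto-detection."""
    if page_no in opts.logical_page_exceptions:
        return opts.logical_page_exceptions[page_no]
    if opts.default_logical_pages is not None:
        return opts.default_logical_pages
    return stages.detect(image)


def process(scan_pages: List[str], stages: Stages, opts: Options) -> List[str]:
    # Step 5: binarise every scan page first, keeping the original for masking.
    binarised = [(page, stages.binarise(page)) for page in scan_pages]
    final: List[str] = []
    page_no = 0  # running physical page number, starting at 1
    for original, binary in binarised:               # step 6
        mask = stages.make_mask(original)            # 6.1: mask from the unbinarised page
        clean = stages.trim(binary, mask)            # 6.2
        for physical in stages.split_spread(clean):  # 6.3
            page_no += 1
            physical = stages.retrim(physical)       # 6.4.1
            if logical_pages_for(page_no, physical, stages, opts) == 2:
                pieces = stages.split_physical(physical)  # 6.4.2.1
            else:
                pieces = [physical]
            final.extend(stages.finalise(piece) for piece in pieces)  # 6.4.3
    return final  # step 7: the caller moves these into the final directory


def demo_stages() -> Stages:
    """String-based stand-ins so the control flow can be exercised without any tools."""
    return Stages(
        binarise=lambda p: p + ".bin",
        make_mask=lambda p: p + ".mask",
        trim=lambda b, m: b + ".trim",
        split_spread=lambda p: [p + ".L", p + ".R"],
        retrim=lambda p: p,
        detect=lambda p: 1,
        split_physical=lambda p: [p + ".a", p + ".b"],
        finalise=lambda p: p + ".final",
    )
```

With the stand-ins, process(["s1"], demo_stages(), Options(logical_page_exceptions={2: 2})) splits only the second physical page, which is the "default plus certain exceptions" behaviour asked for under the options heading.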