<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://wiki.calafou.org//api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Jxxx</id>
	<title>Wiki-Fou - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="http://wiki.calafou.org//api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Jxxx"/>
	<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php/Special:Contributions/Jxxx"/>
	<updated>2026-05-25T22:01:26Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.43.5</generator>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3444</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3444"/>
		<updated>2019-01-25T15:35:16Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
[[File:CameraSettings.png| 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;each camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;. &lt;br /&gt;
* If you want to use the button to make the cameras trigger automatically, you must lock the SD cards before putting them in. Also, plug the many USB cables in.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press with a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
* TURN OFF THE CAMERAS BEFORE REMOVING THE SD CARDS! (otherwise, pics will be erased!)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* upload the two folders now to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back in the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt; marron_scanner 😎 &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
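&lt;br /&gt;
To check that the tools are actually available after installation, here is a minimal sketch in bash. The function name &amp;lt;code&amp;gt;missing_tools&amp;lt;/code&amp;gt; is hypothetical (it is not part of any package), and the command names you pass to it are assumptions about what each package installs, so adjust them if your system differs.&lt;br /&gt;

```shell
# missing_tools: hypothetical helper, prints every given command name
# that cannot be found on the PATH (prints nothing if all are present).
missing_tools() {
    for tool in "$@"; do
        # "command -v" looks a command up without running it
        if ! command -v "$tool" >/dev/null; then
            echo "$tool"
        fi
    done
}
```

For example, &amp;lt;code&amp;gt;missing_tools scantailor gprename pdftk tesseract calibre&amp;lt;/code&amp;gt; should print nothing when everything is installed.&lt;br /&gt;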
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
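&lt;br /&gt;
Step 1 of the workflow above (merging the pictures from the two cameras) can probably also be scripted instead of done by hand. The following is only a sketch under stated assumptions: the function name &amp;lt;code&amp;gt;interleave_pages&amp;lt;/code&amp;gt; and the output folder are hypothetical, the pictures are assumed to end in &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt; as in IMG_1234.JPG, and the alphabetical order of the file names is assumed to match the order in which the pictures were taken (this breaks if the camera counter wraps around during a session).&lt;br /&gt;

```shell
# interleave_pages ODD_DIR EVEN_DIR OUT_DIR  (hypothetical helper)
# Copies the pictures into OUT_DIR as 1.jpg, 3.jpg, ... (odd pages)
# and 2.jpg, 4.jpg, ... (even pages). The originals are left untouched.
interleave_pages() {
    odd_dir=$1
    even_dir=$2
    out_dir=$3
    mkdir -p "$out_dir"

    n=1                                # odd pages start at 1
    for f in "$odd_dir"/*.JPG; do
        [ -e "$f" ] || continue        # skip if the folder has no .JPG files
        cp "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done

    n=2                                # even pages start at 2
    for f in "$even_dir"/*.JPG; do
        [ -e "$f" ] || continue
        cp "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done
}
```

Called as &amp;lt;code&amp;gt;interleave_pages odd even merged&amp;lt;/code&amp;gt;, this fills a new folder called merged. It copies rather than renames, so a mistake does not destroy the original pictures.&lt;br /&gt;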
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* enter &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt;  using a Terminal&lt;br /&gt;
* go to the Directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* to rename the files properly, place &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt; in the &amp;quot;After the number&amp;quot; section above. &amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt;: we can probably write a script to rename the files properly…&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
[[File:Rotate.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
[[File:SplitPages.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be turned so that the text and images are properly horizontal.&lt;br /&gt;
&lt;br /&gt;
[[File:Deskew main tab.jpeg | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
[[File:Content main tab.jpeg | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it sits centered on the page.&lt;br /&gt;
&lt;br /&gt;
[[File:Margins to zero.png | 300px |none | left ]]&lt;br /&gt;
[[File:Center.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the readability of pages with images or mixed images and text, adjusting the thickness slider as needed. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.
    - Dewarping: adjust the grid to straighten your page&#039;s content.
    - Despeckling: remove dust and dots from the page.
&lt;br /&gt;
[[File:Dewarping.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
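&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (here &amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;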
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
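&lt;br /&gt;
One thing to watch in the merge step: the shell expands &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; in alphabetical order, so 10.pdf comes before 2.pdf, and a book.pdf left over from an earlier run would be merged in as well. A minimal sketch of one way around this follows, assuming the pages are named 1.pdf, 2.pdf, … as in the workflow above. The function names &amp;lt;code&amp;gt;pages_in_order&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;merge_pages&amp;lt;/code&amp;gt; are hypothetical.&lt;br /&gt;

```shell
# pages_in_order DIR  (hypothetical helper)
# Prints the per-page PDFs found in DIR in page order: 1.pdf, 2.pdf, ..., 10.pdf.
# Only names made of digits followed by ".pdf" are listed, so book.pdf is skipped.
pages_in_order() {
    ls "$1" | grep -E '^[0-9]+\.pdf$' | sort -n
}

# merge_pages  (hypothetical helper)
# Run it inside the folder that holds the per-page PDFs.
merge_pages() {
    # Unquoted on purpose: the names contain only digits and ".pdf"
    pdftk $(pages_in_order .) cat output book.pdf
}
```

Running &amp;lt;code&amp;gt;pages_in_order .&amp;lt;/code&amp;gt; on its own first is a quick way to check the page order before merging.&lt;br /&gt;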
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We recommend using &#039;&#039;&#039;Calibre&#039;&#039;&#039; for your e-book management tasks: https://calibre-ebook.com/&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Bibliography =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3443</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3443"/>
		<updated>2019-01-25T15:31:17Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
[[File:CameraSettings.png| 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;each camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;. &lt;br /&gt;
* If you want to use the button to make the cameras trigger automatically, you must lock the SD cards before putting them in. Also, plug the many USB cables in.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press with a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* upload the two folders now to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back in the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt; marron_scanner 😎 &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* enter &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt;  using a Terminal&lt;br /&gt;
* go to the Directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* to rename the files properly, place &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt; in the &amp;quot;After the number&amp;quot; section above. &amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt;: we can probably write a script to rename the files properly…&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
[[File:Rotate.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
[[File:SplitPages.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be turned so that the text and images are properly horizontal.&lt;br /&gt;
&lt;br /&gt;
[[File:Deskew main tab.jpeg | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
[[File:Content main tab.jpeg | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it sits centered on the page.&lt;br /&gt;
&lt;br /&gt;
[[File:Margins to zero.png | 300px |none | left ]]&lt;br /&gt;
[[File:Center.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the readability of pages with images or mixed images and text, adjusting the thickness slider as needed. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.
    - Dewarping: adjust the grid to straighten your page&#039;s content.
    - Despeckling: remove dust and dots from the page.
&lt;br /&gt;
[[File:Dewarping.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
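&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (here &amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;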
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We recommend using &#039;&#039;&#039;Calibre&#039;&#039;&#039; for your e-book management tasks: https://calibre-ebook.com/&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Bibliography =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanner&amp;diff=3442</id>
		<title>Bookscanner</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanner&amp;diff=3442"/>
		<updated>2019-01-25T15:30:07Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Bookscanning in Calafou:&lt;br /&gt;
&lt;br /&gt;
* We made an event called Hack the Biblio: Let&#039;s make Digital Public Libraries in 2014 as a collaboration with memoryoftheworld, when Voja Antonic donated a bookscanner to Calafou: https://calafou.org/en/content/hack-biblio-construir-bibliotecas-publicas https://calafou.org/en/content/voja-antonic-calafou&lt;br /&gt;
* We scan out of print books published by Virus: http://www.viruseditorial.net/&lt;br /&gt;
* &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;marron_scanner 😎 &amp;lt;/span&amp;gt; was built in 2018 during the Kunlabora event for supporting other collectives that want to scan: https://calafou.org/en/content/kunlabora-ephimeral-projects-kooperative&lt;br /&gt;
&lt;br /&gt;
This page documents the design and construction of bookscanners.  For the tutorial that explains &amp;quot;How to use the bookscanner&amp;quot;, go to http://wiki.calafou.org/index.php/Bookscanning&lt;br /&gt;
&lt;br /&gt;
= Building a &amp;quot;New Standard Scanner&amp;quot; =&lt;br /&gt;
&lt;br /&gt;
We built this: https://forum.diybookscanner.org/viewtopic.php?f=1&amp;amp;t=333 during the Kunlabora event in Calafou: https://calafou.org/en/content/kunlabora-ephimeral-projects-kooperative It is a more reproducible, portable and simple scanner.&lt;br /&gt;
&lt;br /&gt;
In theory for the simple scanner it should be possible to get all the ingredients from a hardware shop (&amp;quot;ferreteria&amp;quot;).  We got most of the materials from Bauhaus in Barcelona (Zona Franca).  Our experience was that most of the screws, nuts and bolts are easy to substitute with others that have more or less the same dimensions.&lt;br /&gt;
&lt;br /&gt;
[[File:bookscanner_pseudoready2.jpg|600px]] &lt;br /&gt;
&lt;br /&gt;
== Budget ==&lt;br /&gt;
&lt;br /&gt;
The Kunlabora event was four days of collaborative construction.  About six people worked on building the scanner.  We had a budget of 500 EUR for raw materials, taken from the participation fee of the event; these are the receipts for what we spent (amounts in EUR).&lt;br /&gt;
&lt;br /&gt;
Receipts:&lt;br /&gt;
&lt;br /&gt;
    25    2x plexiglass [J] @ Expocryl, Igualada&lt;br /&gt;
    18    2x SD cards @ Life Informatica, Barcelona&lt;br /&gt;
    50    2x magic want camera stand @ FotoK, Barcelona&lt;br /&gt;
    42.2  Nuts/bolts/etc. @ Bauhaus, Zona Franca, Barcelona&lt;br /&gt;
    108.8 Nuts/bolts/etc. @ Bauhaus, Zona Franca, Barcelona&lt;br /&gt;
    10    2x USB cable male/female @ Worten San Antoni, Barcelona&lt;br /&gt;
    5     2x mini USB (USB-B) cable @ Tienda Cables, Barcelona&lt;br /&gt;
    178   2x Canon IXUS 175 compact digital camera @ Amazon.es&lt;br /&gt;
    60    4x Wooden beams (listones) @ Fustes Fargas, Barcelona&lt;br /&gt;
    ============================================&lt;br /&gt;
    497&lt;br /&gt;
&lt;br /&gt;
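The total can be re-checked with a short shell sketch (the amounts are copied from the receipts above; the function name receipt_total is only an illustration):&lt;br /&gt;
&lt;br /&gt;
```shell
# Sum the receipt amounts listed above (EUR) and print the total.
# receipt_total is a hypothetical helper name, not part of any tool.
receipt_total() {
  printf '%s\n' 25 18 50 42.2 108.8 10 5 178 60 | awk '{ s += $1 } END { printf "%.1f\n", s }'
}

receipt_total
```
&lt;br /&gt;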
Donations:&lt;br /&gt;
&lt;br /&gt;
    70    Raspberry Pi 3 (Model B), 2x SD cards, 5V charger, rulers [M] @ Barcelona&lt;br /&gt;
&lt;br /&gt;
We also used nuts and bolts, etc. found in the workshop of Calafou (in Catalunya, near Barcelona).  The conclusion is that it should be possible to build this scanner for about 500-600 EUR anywhere in (Western) Europe.  We did not use all the materials we bought, and if we did it again we would try to use real wood instead of MDF, because the latter proved very fragile.  However, this could make the platen heavier, and it is already quite heavy.&lt;br /&gt;
&lt;br /&gt;
== Parts / componentes ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[File:01 SHOPPING 001.JPG|500px|none|left]]&lt;br /&gt;
&amp;lt;i&amp;gt;Going shopping&amp;lt;/i&amp;gt;&lt;br /&gt;
[[File:01 SHOPPING 003.JPG|500px|none|left]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;From Bauhaus, Barcelona (Zona Franca):&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:01 SHOPPING 027.JPG|500px|none|left]]&lt;br /&gt;
 - 1x Varilla roscada&lt;br /&gt;
 - 4x Listones de madera  ~800x40x90mm&lt;br /&gt;
 - 2x (Tablero) MDF sheet 600x400x10mm&lt;br /&gt;
 - 2x (Tablero) MDF sheet 600x400x16mm&lt;br /&gt;
[[File:01 SHOPPING 023.JPG|500px|none|left]]&lt;br /&gt;
 - Cola blanca rápida para madera&lt;br /&gt;
[[File:01 SHOPPING 024.JPG|500px|none|left]]&lt;br /&gt;
 - Base 4 tomas 1.5 mts&lt;br /&gt;
[[File:01 SHOPPING 007.JPG|500px|none|left]]&lt;br /&gt;
 - Tuerca hexagonal&lt;br /&gt;
[[File:01 SHOPPING 020.JPG|500px|none|left]]&lt;br /&gt;
 - Guia corredera&lt;br /&gt;
[[File:01 SHOPPING 004.JPG|500px|none|left]]&lt;br /&gt;
 - Arandelas ancha&lt;br /&gt;
[[File:01 SHOPPING 005.JPG|500px|none|left]]&lt;br /&gt;
 - Aldabon BPF 185 M&lt;br /&gt;
[[File:01 SHOPPING 006.JPG|500px|none|left]]&lt;br /&gt;
 - Tornillo mad.cab.pl.&lt;br /&gt;
[[File:01 SHOPPING 010.JPG|500px|none|left]]&lt;br /&gt;
 - Tornillo cab. avell.&lt;br /&gt;
[[File:01 SHOPPING 011.JPG|500px|none|left]]&lt;br /&gt;
 - Tornillo p/metal&lt;br /&gt;
[[File:01 SHOPPING 008.JPG|500px|none|left]]&lt;br /&gt;
 - Tuerca palomilla&lt;br /&gt;
[[File:01 SHOPPING 021.JPG|500px|none|left]]&lt;br /&gt;
 - Guia cajones 10kg&lt;br /&gt;
[[File:01 SHOPPING 022.JPG|500px|none|left]]&lt;br /&gt;
 - Spax cab. red zinc&lt;br /&gt;
[[File:01 SHOPPING 012.JPG|500px|none|left]]&lt;br /&gt;
 - Tirador dorado&lt;br /&gt;
[[File:01 SHOPPING 014.JPG|500px|none|left]]&lt;br /&gt;
 - 4x Abrazaderas&lt;br /&gt;
[[File:01 SHOPPING 025.JPG|500px|none|left]]&lt;br /&gt;
 - Mosqueton-fw&lt;br /&gt;
&lt;br /&gt;
 - 2x plexiglass 380x280x3mm from a company in Igualada that J. found.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Other things from other shops:&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:01 SHOPPING 013.JPG|500px|none|left]]&lt;br /&gt;
 - 4x Cable USB 2.0 male-female @ Worten Sant Antoni&lt;br /&gt;
 - 2x USB-A -&amp;gt; USB-B (mini) cables @ Tienda de Cables&lt;br /&gt;
 - 2x SD Cards 16 GB&lt;br /&gt;
[[File:01 SHOPPING 017.JPG|500px|none|left]]&lt;br /&gt;
 - 2x Manfrotto brazo flexible MF237 (Nr. referencia AFP018036) @ FotoK, Ronda Universitat&lt;br /&gt;
[[File:01 SHOPPING 015.JPG|500px|none|left]]&lt;br /&gt;
 - 2x Tira LED 1 nice&lt;br /&gt;
&lt;br /&gt;
 - 2 x digital cameras (see below for background information): Canon IXUS 175 (Powershot / ELPH 180)&lt;br /&gt;
   - https://www.amazon.es/Canon-IXUS-175-compacta-estabilizador/dp/B01A8QU70I/&lt;br /&gt;
   - http://chdk.wikia.com/wiki/ELPH180&lt;br /&gt;
   - https://www.canon-europe.com/for_home/product_finder/cameras/digital_camera/ixus/ixus_175/specification.aspx&lt;br /&gt;
&lt;br /&gt;
For electronic parts a good shop in Barcelona is Diotronic: https://diotronic.com/&lt;br /&gt;
&lt;br /&gt;
 - USB charger to power the external trigger through a USB cable&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Things to take into consideration from the building experience&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* the MDF pieces are fragile and gave us problems when driving in screws&lt;br /&gt;
* we considered varnishing the wood pieces to limit the impact of humidity&lt;br /&gt;
* the plexiglass we bought generates many reflections&lt;br /&gt;
* converting from inches to cm was inconvenient&lt;br /&gt;
* the position and height of the light should be tested before being fixed&lt;br /&gt;
* the upper part of the light stand should be easier to mount and unmount for transportation&lt;br /&gt;
* the wood used for the platen should be without texture or painted with matte dark paint&lt;br /&gt;
* the gooseneck (Manfrotto) option for the camera stand is not a good choice: it is very easy to put the camera in the way of the cradle; calibration is difficult because the cameras can move along all three axes; the arms are springy and difficult to set in an exact position&lt;br /&gt;
&lt;br /&gt;
== Cameras ==&lt;br /&gt;
&lt;br /&gt;
The bookscanner is basically a glorified camera stand that allows us to take good pictures of book pages.  So the cameras are an important part of the bookscanner.  Summary of research about cameras for book scanners:&lt;br /&gt;
&lt;br /&gt;
There are three categories of cameras that can be used for book scanners (from cheapest to most expensive), followed by a note on connecting the cameras to a monitor.&lt;br /&gt;
&lt;br /&gt;
1. Remote control support&lt;br /&gt;
&lt;br /&gt;
The cheapest option is any camera with remote trigger support, so we can take pictures without pushing the button on the camera.  This is important because pressing the button can knock the camera out of position through the physical pressure.&lt;br /&gt;
&lt;br /&gt;
2. CHDK firmware&lt;br /&gt;
&lt;br /&gt;
The middle category is cameras compatible with the CHDK firmware.  CHDK is a third-party open source firmware that allows the customisation of cameras.  It is made for Canon PowerShot cameras, Canon&#039;s cheaper compact digital camera product line.  We had 200 euros in the budget for cameras, so we went with this option.&lt;br /&gt;
&lt;br /&gt;
3. Magic Lantern support&lt;br /&gt;
&lt;br /&gt;
Magic Lantern is a more advanced third-party open source firmware.  However, it only works with Canon DSLR cameras (cameras with a reflex mirror that lets you look at the shot through a small viewfinder before you take the picture; they usually have big lenses). The scanner we have now uses Canon 1100D cameras, which are the cheapest type supported by Magic Lantern, but they still cost a few hundred euros.&lt;br /&gt;
&lt;br /&gt;
4. Connecting to a monitor&lt;br /&gt;
&lt;br /&gt;
It is very useful to be able to connect the cameras to a monitor during the calibration process. The cameras we bought use the same output for the remote control and for the monitor, which makes using a monitor impractical. Furthermore, connecting to a computer monitor over USB was not an option; we had to look for RC connectors or adapters.&lt;br /&gt;
&lt;br /&gt;
== Building process ==&lt;br /&gt;
&lt;br /&gt;
=== Base / Base ===&lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 035.JPG|500px|none|left]]&lt;br /&gt;
[[File:02 BASE 043.JPG|500px|none|left]]&lt;br /&gt;
[[File:02 BASE 040.JPG|500px|none|left]]&lt;br /&gt;
We cut the wood beams to have 2 pieces of 56x4cm and 2 of 31.5x4cm&lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 044.JPG|500px|none|left]]&lt;br /&gt;
[[File:02 BASE 041.JPG|500px|none|left]]&lt;br /&gt;
we measured with precision tools and marked the position of the drawer sliders on the 54x4cm wood beams &lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 045.JPG|500px|none|left]]&lt;br /&gt;
we drilled holes on the marked position&lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 049.JPG|500px|none|left]]&lt;br /&gt;
[[File:02 BASE 050.JPG|500px|none|left]]&lt;br /&gt;
[[File:02 BASE 051.JPG|500px|none|left]]&lt;br /&gt;
and then we fixed the sliders&lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 033.JPG|500px|none|left]]&lt;br /&gt;
We used the thicker MDF material to cut a 37x29cm sheet&lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 056.JPG|500px|none|left]]&lt;br /&gt;
The last steps were fixing the other half of the sliders to the MDF sheet and attaching the different parts of the base with screws&lt;br /&gt;
&lt;br /&gt;
=== Column / Columna ===&lt;br /&gt;
&lt;br /&gt;
=== Cradle / Cuna ===&lt;br /&gt;
&lt;br /&gt;
=== Platen / Platina ===&lt;br /&gt;
&lt;br /&gt;
=== CHDK ===&lt;br /&gt;
&lt;br /&gt;
= Photos = &lt;br /&gt;
&lt;br /&gt;
== Shopping ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:01_SHOPPING_001.JPG&lt;br /&gt;
File:01_SHOPPING_002.JPG&lt;br /&gt;
File:01_SHOPPING_003.JPG&lt;br /&gt;
File:01_SHOPPING_004.JPG&lt;br /&gt;
File:01_SHOPPING_005.JPG&lt;br /&gt;
File:01_SHOPPING_006.JPG&lt;br /&gt;
File:01_SHOPPING_007.JPG&lt;br /&gt;
File:01_SHOPPING_008.JPG&lt;br /&gt;
File:01_SHOPPING_009.JPG&lt;br /&gt;
File:01_SHOPPING_010.JPG&lt;br /&gt;
File:01_SHOPPING_011.JPG&lt;br /&gt;
File:01_SHOPPING_012.JPG&lt;br /&gt;
File:01_SHOPPING_013.JPG&lt;br /&gt;
File:01_SHOPPING_014.JPG&lt;br /&gt;
File:01_SHOPPING_015.JPG&lt;br /&gt;
File:01_SHOPPING_016.JPG&lt;br /&gt;
File:01_SHOPPING_017.JPG&lt;br /&gt;
File:01_SHOPPING_018.JPG&lt;br /&gt;
File:01_SHOPPING_019.JPG&lt;br /&gt;
File:01_SHOPPING_020.JPG&lt;br /&gt;
File:01_SHOPPING_021.JPG&lt;br /&gt;
File:01_SHOPPING_022.JPG&lt;br /&gt;
File:01_SHOPPING_023.JPG&lt;br /&gt;
File:01_SHOPPING_024.JPG&lt;br /&gt;
File:01_SHOPPING_025.JPG&lt;br /&gt;
File:01_SHOPPING_026.JPG&lt;br /&gt;
File:01_SHOPPING_027.JPG&lt;br /&gt;
File:01_SHOPPING_028.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Base ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:02_BASE_029.JPG&lt;br /&gt;
File:02_BASE_032.JPG&lt;br /&gt;
File:02_BASE_033.JPG&lt;br /&gt;
File:02_BASE_035.JPG&lt;br /&gt;
File:02_BASE_040.JPG&lt;br /&gt;
File:02_BASE_041.JPG&lt;br /&gt;
File:02_BASE_042.JPG&lt;br /&gt;
File:02_BASE_043.JPG&lt;br /&gt;
File:02_BASE_044.JPG&lt;br /&gt;
File:02_BASE_045.JPG&lt;br /&gt;
File:02_BASE_048.JPG&lt;br /&gt;
File:02_BASE_049.JPG&lt;br /&gt;
File:02_BASE_050.JPG&lt;br /&gt;
File:02_BASE_051.JPG&lt;br /&gt;
File:02_BASE_056.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Cradle ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:03_CRADDLE_030.JPG&lt;br /&gt;
File:03_CRADDLE_037.JPG&lt;br /&gt;
File:03_CRADDLE_046.JPG&lt;br /&gt;
File:03_CRADDLE_057.JPG&lt;br /&gt;
File:03_CRADDLE_058.JPG&lt;br /&gt;
File:03_CRADDLE_059.JPG&lt;br /&gt;
File:03_CRADDLE_060.JPG&lt;br /&gt;
File:03_CRADDLE_061.JPG&lt;br /&gt;
File:03_CRADDLE_062.JPG&lt;br /&gt;
File:03_CRADDLE_071.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Platen ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:04_PLATEN_038.JPG&lt;br /&gt;
File:04_PLATEN_039.JPG&lt;br /&gt;
File:04_PLATEN_047.JPG&lt;br /&gt;
File:04_PLATEN_053.JPG&lt;br /&gt;
File:04_PLATEN_054.JPG&lt;br /&gt;
File:04_PLATEN_055.JPG&lt;br /&gt;
File:04_PLATEN_063.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Column ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:05_COLUMN_065.JPG&lt;br /&gt;
File:05_COLUMN_066.JPG&lt;br /&gt;
File:05_COLUMN_067.JPG&lt;br /&gt;
File:05_COLUMN_068.JPG&lt;br /&gt;
File:05_COLUMN_070.JPG&lt;br /&gt;
File:05_COLUMN_072.JPG&lt;br /&gt;
File:05_COLUMN_073.JPG&lt;br /&gt;
File:05_COLUMN_074.JPG&lt;br /&gt;
File:05_COLUMN_075.JPG&lt;br /&gt;
File:05_COLUMN_076.JPG&lt;br /&gt;
File:05_COLUMN_077.JPG&lt;br /&gt;
File:05_COLUMN_078.JPG&lt;br /&gt;
File:05_COLUMN_079.JPG&lt;br /&gt;
File:05_COLUMN_080.JPG&lt;br /&gt;
File:05_COLUMN_083.JPG&lt;br /&gt;
File:05_COLUMN_084.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Mounting ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:06_MOUNTED_064.JPG&lt;br /&gt;
File:06_MOUNTED_086.JPG&lt;br /&gt;
File:06_MOUNTED_088.JPG&lt;br /&gt;
File:06_MOUNTED_089.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Miscellaneous ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:07_GEN_031.JPG&lt;br /&gt;
File:07_GEN_034.JPG&lt;br /&gt;
File:07_GEN_036.JPG&lt;br /&gt;
File:07_GEN_052.JPG&lt;br /&gt;
File:07_GEN_085.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Triggering mechanism ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:08_TRIGERING_069.JPG&lt;br /&gt;
File:08_TRIGERING_081.JPG&lt;br /&gt;
File:08_TRIGERING_082.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Camera holder ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:09_CAM-HOLDERS_087.JPG&lt;br /&gt;
File:09_CAM-HOLDERS_090.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Day 1 (temporary) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:IMG_20181012_161540990.jpg&lt;br /&gt;
File:IMG_20181012_161552147.jpg&lt;br /&gt;
File:IMG_20181012_161615376.jpg&lt;br /&gt;
File:IMG_20181012_161626209.jpg&lt;br /&gt;
File:IMG_20181012_161639871.jpg&lt;br /&gt;
File:IMG_20181012_162253725.jpg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Raspberry Pi for the bookscanner =&lt;br /&gt;
&lt;br /&gt;
At the moment we use an electromechanical button documented on the CHDK website that works well but does not do much more than triggering the two cameras to shoot at the same time: http://chdk.wikia.com/wiki/USB_Remote&lt;br /&gt;
&lt;br /&gt;
There are many more ways to optimise the bookscanning process.  Many ideas start with connecting the cameras to a small computer such as the Raspberry Pi.  We made some experiments and tests with this setup, which are documented below:&lt;br /&gt;
&lt;br /&gt;
The Raspberry Pi runs [https://www.raspbian.org/ Raspbian].&lt;br /&gt;
&lt;br /&gt;
== CHDKPTP ==&lt;br /&gt;
&lt;br /&gt;
In /home/pi/chdkptp there are precompiled binaries of [https://app.assembla.com/spaces/chdkptp/wiki CHDKPTP] downloaded from [https://app.assembla.com/spaces/chdkptp/documents here].&lt;br /&gt;
&lt;br /&gt;
CHDKPTP is used for remote control of cameras running the CHDK firmware.&lt;br /&gt;
&lt;br /&gt;
Our setup has two modes of work:&lt;br /&gt;
* a mechanical &amp;quot;button&amp;quot; which triggers both cameras to capture a photo at the same time, based on this tutorial: http://chdk.wikia.com/wiki/USB_Remote&lt;br /&gt;
* (in progress) a Raspberry Pi mode where a bookscanner operator uses the Raspberry Pi to capture photos, preview them in real time and transfer them already renamed for the next step in Scantailor.&lt;br /&gt;
&lt;br /&gt;
When a camera is connected, this line lists information about it:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;sudo ./chdkptp.sh -elist&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
and for one of the cameras this is what is listed:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;-1:Canon IXUS 175 b=001 d=030 v=0x4a9 p=0x32c1 s=8B20D62641B041BAA3E1D597D560D110&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
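&lt;br /&gt;
The d= field of that listing is the device number passed to connect -d= in the capture example that follows.  As a small sketch (device_number is a hypothetical helper, and the pattern assumes the listing keeps the format shown above), it can be extracted with sed:&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: read one line of "-elist" output on stdin
# and print the value of its d= field (the device number).
# Assumes the field layout shown in the listing above.
device_number() {
  sed -n 's/.* d=\([0-9][0-9]*\) .*/\1/p'
}

# Example with the listing line shown above:
printf '%s\n' '-1:Canon IXUS 175 b=001 d=030 v=0x4a9 p=0x32c1 s=8B20D62641B041BAA3E1D597D560D110' | device_number
```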
&lt;br /&gt;
An example of capturing a picture from the command line (once in /home/pi/chdkptp/):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;sudo ./chdkptp.sh -e&amp;quot;connect -d=030&amp;quot; -erec -eshoot&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The line above connects to the camera on device 030 (-d=030), puts it into rec mode (if it is not already) and captures a photo, saving it to the SD card in the camera. To bypass the SD card altogether, replace -eshoot with -eremoteshoot; ./chdkptp.sh will then save the photos in the directory from which it was called.&lt;br /&gt;
&lt;br /&gt;
Note: check that the SD card is locked.&lt;br /&gt;
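&lt;br /&gt;
The two capture variants described above differ only in the last option.  A minimal sketch (capture_cmd is a hypothetical helper, not part of chdkptp; it only prints the command line, built from the options shown above, so it can be checked before running it with a camera attached):&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print (do not run) the capture command for one camera.
# $1 = device number as shown by -elist (for example 030)
# $2 = "pi" to save photos in the current directory (remoteshoot);
#      anything else saves to the SD card in the camera (shoot)
capture_cmd() {
  dev="$1"
  case "$2" in
    pi) action="remoteshoot" ;;
    *)  action="shoot" ;;
  esac
  printf 'sudo ./chdkptp.sh -e"connect -d=%s" -erec -e%s\n' "$dev" "$action"
}

capture_cmd 030 sd
capture_cmd 030 pi
```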
&lt;br /&gt;
== ZeroTier ==&lt;br /&gt;
&lt;br /&gt;
The Raspberry Pi was added to the public [https://zerotier.com ZeroTier] network 565799d8f6ebf1a8 with this command:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;sudo zerotier-cli join 565799d8f6ebf1a8&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
and it got a static IP address (in the 565799d8f6ebf1a8 network):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;192.168.192.171/24&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The public keys of maxigas and marcell are added to /home/pi/.ssh/authorized_keys.&lt;br /&gt;
&lt;br /&gt;
= Ideas for building Voja&#039;s scanner =&lt;br /&gt;
&lt;br /&gt;
Our first idea was to reproduce Voja&#039;s scanner, but we had to realise that we are not Voja: the scanner is a very beautiful and unique piece of engineering and consequently hard to reproduce.&lt;br /&gt;
&lt;br /&gt;
Here are two links to the public documentation of our scanner, built by Voja Antonic:&lt;br /&gt;
&lt;br /&gt;
https://www.memoryoftheworld.org/blog/2012/10/28/our-beloved-bookscanner-2/&lt;br /&gt;
&lt;br /&gt;
https://hackaday.io/project/5604-diy-book-scanner&lt;br /&gt;
&lt;br /&gt;
The electronics are not really documented (which means that they are hard to reproduce) and they are built from basic parts (which means that it takes a lot of time to put them together).  So we brainstormed about an Arduino-based solution instead.  Arduino is a general-purpose programmable microcontroller that already has many of the functions/parts we need built in.  The idea is that this makes it easier for us to build the scanner and for others to reproduce it.  We also have more experience working with Arduino than with only basic electronic components.&lt;br /&gt;
&lt;br /&gt;
We will also try to use cheaper cameras in order to bring down the budget.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
== About our first bookscanner ==&lt;br /&gt;
&lt;br /&gt;
[https://www.memoryoftheworld.org/blog/2012/10/28/our-beloved-bookscanner/ English]&lt;br /&gt;
&lt;br /&gt;
[https://www.memoryoftheworld.org/es/blog/2012/10/28/our-beloved-bookscanner/ Spanish]&lt;br /&gt;
&lt;br /&gt;
== Principal sources ==&lt;br /&gt;
&lt;br /&gt;
[http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]&lt;br /&gt;
&lt;br /&gt;
== Future activities ==&lt;br /&gt;
&lt;br /&gt;
[[next steps]]&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanner&amp;diff=3441</id>
		<title>Bookscanner</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanner&amp;diff=3441"/>
		<updated>2019-01-25T15:27:36Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Bookscanning in Calafou:&lt;br /&gt;
&lt;br /&gt;
* We made an event called Hack the Biblio: Let&#039;s make Digital Public Libraries in 2014 as a collaboration with memoryoftheworld, when Voja Antonic donated a bookscanner to Calafou: https://calafou.org/en/content/hack-biblio-construir-bibliotecas-publicas https://calafou.org/en/content/voja-antonic-calafou&lt;br /&gt;
* We scan out of print books published by Virus: http://www.viruseditorial.net/&lt;br /&gt;
* We built a book scanner (the marron_scanner 😎) in 2018 during the Kunlabora event for supporting other collectives that want to scan: https://calafou.org/en/content/kunlabora-ephimeral-projects-kooperative&lt;br /&gt;
&lt;br /&gt;
This page documents the design and construction of bookscanners.  For the tutorial that explains &amp;quot;How to use the bookscanner&amp;quot;, go to http://wiki.calafou.org/index.php/Bookscanning&lt;br /&gt;
&lt;br /&gt;
= Building a &amp;quot;New Standard Scanner&amp;quot; =&lt;br /&gt;
&lt;br /&gt;
We built this scanner: https://forum.diybookscanner.org/viewtopic.php?f=1&amp;amp;t=333 during the Kunlabora event in Calafou: https://calafou.org/en/content/kunlabora-ephimeral-projects-kooperative It is a more reproducible, portable and simple scanner.&lt;br /&gt;
&lt;br /&gt;
In theory, all the parts for the simple scanner can be bought in a hardware shop (&amp;quot;ferreteria&amp;quot;).  We got most of the materials from Bauhaus in Barcelona (Zona Franca).  Our experience was that most of the screws, nuts and bolts are easy to substitute with others that have more or less the same dimensions.&lt;br /&gt;
&lt;br /&gt;
[[File:bookscanner_pseudoready2.jpg|600px]] &lt;br /&gt;
&lt;br /&gt;
== Budget ==&lt;br /&gt;
&lt;br /&gt;
The Kunlabora event was four days of collaborative construction.  About six people worked on building the scanner.  We had a budget of 500 EUR for raw materials, taken from the participation fee of the event; these are the receipts for what we spent (amounts in EUR).&lt;br /&gt;
&lt;br /&gt;
Receipts:&lt;br /&gt;
&lt;br /&gt;
    25    2x plexiglass [J] @ Expocryl, Igualada&lt;br /&gt;
    18    2x SD cards @ Life Informatica, Barcelona&lt;br /&gt;
    50    2x magic want camera stand @ FotoK, Barcelona&lt;br /&gt;
    42.2  Nuts/bolts/etc. @ Bauhaus, Zona Franca, Barcelona&lt;br /&gt;
    108.8 Nuts/bolts/etc. @ Bauhaus, Zona Franca, Barcelona&lt;br /&gt;
    10    2x USB cable male/female @ Worten San Antoni, Barcelona&lt;br /&gt;
    5     2x mini USB (USB-B) cable @ Tienda Cables, Barcelona&lt;br /&gt;
    178   2x Canon IXUS 175 compact digital camera @ Amazon.es&lt;br /&gt;
    60    4x Wooden beams (listones) @ Fustes Fargas, Barcelona&lt;br /&gt;
    ============================================&lt;br /&gt;
    497&lt;br /&gt;
&lt;br /&gt;
Donations:&lt;br /&gt;
&lt;br /&gt;
    70    Raspberry Pi 3 (Model B), 2x SD cards, 5V charger, rulers [M] @ Barcelona&lt;br /&gt;
&lt;br /&gt;
We also used nuts and bolts, etc. found in the workshop of Calafou (in Catalunya, near Barcelona).  The conclusion is that it should be possible to build this scanner for about 500-600 EUR anywhere in (Western) Europe.  We did not use all the materials we bought, and if we did it again we would try to use real wood instead of MDF, because the latter proved very fragile.  However, this could make the platen heavier, and it is already quite heavy.&lt;br /&gt;
&lt;br /&gt;
== Parts / componentes ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[File:01 SHOPPING 001.JPG|500px|none|left]]&lt;br /&gt;
&amp;lt;i&amp;gt;Going shopping&amp;lt;/i&amp;gt;&lt;br /&gt;
[[File:01 SHOPPING 003.JPG|500px|none|left]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;From Bauhaus, Barcelona (Zona Franca):&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:01 SHOPPING 027.JPG|500px|none|left]]&lt;br /&gt;
 - 1x Varilla roscada&lt;br /&gt;
 - 4x Listones de madera  ~800x40x90mm&lt;br /&gt;
 - 2x (Tablero) MDF sheet 600x400x10mm&lt;br /&gt;
 - 2x (Tablero) MDF sheet 600x400x16mm&lt;br /&gt;
[[File:01 SHOPPING 023.JPG|500px|none|left]]&lt;br /&gt;
 - Cola blanca rápida para madera&lt;br /&gt;
[[File:01 SHOPPING 024.JPG|500px|none|left]]&lt;br /&gt;
 - Base 4 tomas 1.5 mts&lt;br /&gt;
[[File:01 SHOPPING 007.JPG|500px|none|left]]&lt;br /&gt;
 - Tuerca hexagonal&lt;br /&gt;
[[File:01 SHOPPING 020.JPG|500px|none|left]]&lt;br /&gt;
 - Guia corredera&lt;br /&gt;
[[File:01 SHOPPING 004.JPG|500px|none|left]]&lt;br /&gt;
 - Arandelas ancha&lt;br /&gt;
[[File:01 SHOPPING 005.JPG|500px|none|left]]&lt;br /&gt;
 - Aldabon BPF 185 M&lt;br /&gt;
[[File:01 SHOPPING 006.JPG|500px|none|left]]&lt;br /&gt;
 - Tornillo mad.cab.pl.&lt;br /&gt;
[[File:01 SHOPPING 010.JPG|500px|none|left]]&lt;br /&gt;
 - Tornillo cab. avell.&lt;br /&gt;
[[File:01 SHOPPING 011.JPG|500px|none|left]]&lt;br /&gt;
 - Tornillo p/metal&lt;br /&gt;
[[File:01 SHOPPING 008.JPG|500px|none|left]]&lt;br /&gt;
 - Tuerca palomilla&lt;br /&gt;
[[File:01 SHOPPING 021.JPG|500px|none|left]]&lt;br /&gt;
 - Guia cajones 10kg&lt;br /&gt;
[[File:01 SHOPPING 022.JPG|500px|none|left]]&lt;br /&gt;
 - Spax cab. red zinc&lt;br /&gt;
[[File:01 SHOPPING 012.JPG|500px|none|left]]&lt;br /&gt;
 - Tirador dorado&lt;br /&gt;
[[File:01 SHOPPING 014.JPG|500px|none|left]]&lt;br /&gt;
 - 4x Abrazaderas&lt;br /&gt;
[[File:01 SHOPPING 025.JPG|500px|none|left]]&lt;br /&gt;
 - Mosqueton-fw&lt;br /&gt;
&lt;br /&gt;
 - 2x plexiglass 380x280x3mm from a company in Igualada that J. found.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Other things from other shops:&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:01 SHOPPING 013.JPG|500px|none|left]]&lt;br /&gt;
 - 4x Cable USB 2.0 male-female @ Worten Sant Antoni&lt;br /&gt;
 - 2x USB-A -&amp;gt; USB-B (mini) cables @ Tienda de Cables&lt;br /&gt;
 - 2x SD Cards 16 GB&lt;br /&gt;
[[File:01 SHOPPING 017.JPG|500px|none|left]]&lt;br /&gt;
 - 2x Manfrotto brazo flexible MF237 (Nr. referencia AFP018036) @ FotoK, Ronda Universitat&lt;br /&gt;
[[File:01 SHOPPING 015.JPG|500px|none|left]]&lt;br /&gt;
 - 2x Tira LED 1 nice&lt;br /&gt;
&lt;br /&gt;
 - 2 x digital cameras (see below for background information): Canon IXUS 175 (Powershot / ELPH 180)&lt;br /&gt;
   - https://www.amazon.es/Canon-IXUS-175-compacta-estabilizador/dp/B01A8QU70I/&lt;br /&gt;
   - http://chdk.wikia.com/wiki/ELPH180&lt;br /&gt;
   - https://www.canon-europe.com/for_home/product_finder/cameras/digital_camera/ixus/ixus_175/specification.aspx&lt;br /&gt;
&lt;br /&gt;
For electronic parts a good shop in Barcelona is Diotronic: https://diotronic.com/&lt;br /&gt;
&lt;br /&gt;
 - USB charger to power the external trigger through a USB cable&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Things to take into consideration from the building experience&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* the MDF pieces are fragile and gave us problems when driving in screws&lt;br /&gt;
* we considered varnishing the wood pieces to limit the impact of humidity&lt;br /&gt;
* the plexiglass we bought generates many reflections&lt;br /&gt;
* converting from inches to cm was inconvenient&lt;br /&gt;
* the position and height of the light should be tested before being fixed&lt;br /&gt;
* the upper part of the light stand should be easier to mount and unmount for transportation&lt;br /&gt;
* the wood used for the platen should be without texture or painted with matte dark paint&lt;br /&gt;
* the gooseneck (Manfrotto) option for the camera stand is not a good choice: it is very easy to put the camera in the way of the cradle; calibration is difficult because the cameras can move along all three axes; the arms are springy and difficult to set in an exact position&lt;br /&gt;
&lt;br /&gt;
== Cameras ==&lt;br /&gt;
&lt;br /&gt;
The bookscanner is basically a glorified camera stand that allows us to take good pictures of book pages.  So the cameras are an important part of the bookscanner.  Summary of research about cameras for book scanners:&lt;br /&gt;
&lt;br /&gt;
There are three categories of cameras that can be used for book scanners (from cheapest to most expensive), followed by a note on connecting the cameras to a monitor.&lt;br /&gt;
&lt;br /&gt;
1. Remote control support&lt;br /&gt;
&lt;br /&gt;
The cheapest option is any camera with remote trigger support, so we can take pictures without pushing the button on the camera.  This is important because pressing the button can knock the camera out of position through the physical pressure.&lt;br /&gt;
&lt;br /&gt;
2. CHDK firmware&lt;br /&gt;
&lt;br /&gt;
The middle category is cameras compatible with the CHDK firmware.  CHDK is a third-party open source firmware that allows the customisation of cameras.  It is made for Canon PowerShot cameras, Canon&#039;s cheaper compact digital camera product line.  We had 200 euros in the budget for cameras, so we went with this option.&lt;br /&gt;
&lt;br /&gt;
3. Magic Lantern support&lt;br /&gt;
&lt;br /&gt;
Magic Lantern is a more advanced third-party open source firmware.  However, it only works with Canon DSLR cameras (cameras with a reflex mirror that lets you look at the shot through a small viewfinder before you take the picture; they usually have big lenses). The scanner we have now uses Canon 1100D cameras, which are the cheapest type supported by Magic Lantern, but they still cost a few hundred euros.&lt;br /&gt;
&lt;br /&gt;
4. Connecting to a monitor&lt;br /&gt;
&lt;br /&gt;
It is very useful to be able to connect the cameras to a monitor during the calibration process. The cameras we bought use the same output for the remote control and for the monitor, which makes using a monitor impractical. Furthermore, connecting to a computer monitor over USB was not an option; we had to look for RC connectors or adapters.&lt;br /&gt;
&lt;br /&gt;
== Building process ==&lt;br /&gt;
&lt;br /&gt;
=== Base / Base ===&lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 035.JPG|500px|none|left]]&lt;br /&gt;
[[File:02 BASE 043.JPG|500px|none|left]]&lt;br /&gt;
[[File:02 BASE 040.JPG|500px|none|left]]&lt;br /&gt;
We cut the wood beams to have 2 pieces of 56x4cm and 2 of 31.5x4cm&lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 044.JPG|500px|none|left]]&lt;br /&gt;
[[File:02 BASE 041.JPG|500px|none|left]]&lt;br /&gt;
we measured with precision tools and marked the position of the drawer sliders on the 54x4cm wood beams &lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 045.JPG|500px|none|left]]&lt;br /&gt;
we drilled holes on the marked position&lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 049.JPG|500px|none|left]]&lt;br /&gt;
[[File:02 BASE 050.JPG|500px|none|left]]&lt;br /&gt;
[[File:02 BASE 051.JPG|500px|none|left]]&lt;br /&gt;
and then we fixed the sliders&lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 033.JPG|500px|none|left]]&lt;br /&gt;
We used the thicker MDF material to cut a 37x29cm sheet&lt;br /&gt;
&lt;br /&gt;
[[File:02 BASE 056.JPG|500px|none|left]]&lt;br /&gt;
The last steps were fixing the other half of the sliders to the MDF sheet and attaching the different parts of the base with screws&lt;br /&gt;
&lt;br /&gt;
=== Column / Columna ===&lt;br /&gt;
&lt;br /&gt;
=== Cradle / Cuna ===&lt;br /&gt;
&lt;br /&gt;
=== Platen / Platina ===&lt;br /&gt;
&lt;br /&gt;
=== CHDK ===&lt;br /&gt;
&lt;br /&gt;
= Photos = &lt;br /&gt;
&lt;br /&gt;
== Shopping ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:01_SHOPPING_001.JPG&lt;br /&gt;
File:01_SHOPPING_002.JPG&lt;br /&gt;
File:01_SHOPPING_003.JPG&lt;br /&gt;
File:01_SHOPPING_004.JPG&lt;br /&gt;
File:01_SHOPPING_005.JPG&lt;br /&gt;
File:01_SHOPPING_006.JPG&lt;br /&gt;
File:01_SHOPPING_007.JPG&lt;br /&gt;
File:01_SHOPPING_008.JPG&lt;br /&gt;
File:01_SHOPPING_009.JPG&lt;br /&gt;
File:01_SHOPPING_010.JPG&lt;br /&gt;
File:01_SHOPPING_011.JPG&lt;br /&gt;
File:01_SHOPPING_012.JPG&lt;br /&gt;
File:01_SHOPPING_013.JPG&lt;br /&gt;
File:01_SHOPPING_014.JPG&lt;br /&gt;
File:01_SHOPPING_015.JPG&lt;br /&gt;
File:01_SHOPPING_016.JPG&lt;br /&gt;
File:01_SHOPPING_017.JPG&lt;br /&gt;
File:01_SHOPPING_018.JPG&lt;br /&gt;
File:01_SHOPPING_019.JPG&lt;br /&gt;
File:01_SHOPPING_020.JPG&lt;br /&gt;
File:01_SHOPPING_021.JPG&lt;br /&gt;
File:01_SHOPPING_022.JPG&lt;br /&gt;
File:01_SHOPPING_023.JPG&lt;br /&gt;
File:01_SHOPPING_024.JPG&lt;br /&gt;
File:01_SHOPPING_025.JPG&lt;br /&gt;
File:01_SHOPPING_026.JPG&lt;br /&gt;
File:01_SHOPPING_027.JPG&lt;br /&gt;
File:01_SHOPPING_028.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Base ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:02_BASE_029.JPG&lt;br /&gt;
File:02_BASE_032.JPG&lt;br /&gt;
File:02_BASE_033.JPG&lt;br /&gt;
File:02_BASE_035.JPG&lt;br /&gt;
File:02_BASE_040.JPG&lt;br /&gt;
File:02_BASE_041.JPG&lt;br /&gt;
File:02_BASE_042.JPG&lt;br /&gt;
File:02_BASE_043.JPG&lt;br /&gt;
File:02_BASE_044.JPG&lt;br /&gt;
File:02_BASE_045.JPG&lt;br /&gt;
File:02_BASE_048.JPG&lt;br /&gt;
File:02_BASE_049.JPG&lt;br /&gt;
File:02_BASE_050.JPG&lt;br /&gt;
File:02_BASE_051.JPG&lt;br /&gt;
File:02_BASE_056.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Cradle ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:03_CRADDLE_030.JPG&lt;br /&gt;
File:03_CRADDLE_037.JPG&lt;br /&gt;
File:03_CRADDLE_046.JPG&lt;br /&gt;
File:03_CRADDLE_057.JPG&lt;br /&gt;
File:03_CRADDLE_058.JPG&lt;br /&gt;
File:03_CRADDLE_059.JPG&lt;br /&gt;
File:03_CRADDLE_060.JPG&lt;br /&gt;
File:03_CRADDLE_061.JPG&lt;br /&gt;
File:03_CRADDLE_062.JPG&lt;br /&gt;
File:03_CRADDLE_071.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Platen ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:04_PLATEN_038.JPG&lt;br /&gt;
File:04_PLATEN_039.JPG&lt;br /&gt;
File:04_PLATEN_047.JPG&lt;br /&gt;
File:04_PLATEN_053.JPG&lt;br /&gt;
File:04_PLATEN_054.JPG&lt;br /&gt;
File:04_PLATEN_055.JPG&lt;br /&gt;
File:04_PLATEN_063.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Column ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:05_COLUMN_065.JPG&lt;br /&gt;
File:05_COLUMN_066.JPG&lt;br /&gt;
File:05_COLUMN_067.JPG&lt;br /&gt;
File:05_COLUMN_068.JPG&lt;br /&gt;
File:05_COLUMN_070.JPG&lt;br /&gt;
File:05_COLUMN_072.JPG&lt;br /&gt;
File:05_COLUMN_073.JPG&lt;br /&gt;
File:05_COLUMN_074.JPG&lt;br /&gt;
File:05_COLUMN_075.JPG&lt;br /&gt;
File:05_COLUMN_076.JPG&lt;br /&gt;
File:05_COLUMN_077.JPG&lt;br /&gt;
File:05_COLUMN_078.JPG&lt;br /&gt;
File:05_COLUMN_079.JPG&lt;br /&gt;
File:05_COLUMN_080.JPG&lt;br /&gt;
File:05_COLUMN_083.JPG&lt;br /&gt;
File:05_COLUMN_084.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Mounting ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:06_MOUNTED_064.JPG&lt;br /&gt;
File:06_MOUNTED_086.JPG&lt;br /&gt;
File:06_MOUNTED_088.JPG&lt;br /&gt;
File:06_MOUNTED_089.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Miscellaneous ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:07_GEN_031.JPG&lt;br /&gt;
File:07_GEN_034.JPG&lt;br /&gt;
File:07_GEN_036.JPG&lt;br /&gt;
File:07_GEN_052.JPG&lt;br /&gt;
File:07_GEN_085.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Triggering mechanism ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:08_TRIGERING_069.JPG&lt;br /&gt;
File:08_TRIGERING_081.JPG&lt;br /&gt;
File:08_TRIGERING_082.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Camera holder ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:09_CAM-HOLDERS_087.JPG&lt;br /&gt;
File:09_CAM-HOLDERS_090.JPG&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Day 1 (temporary) ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:IMG_20181012_161540990.jpg&lt;br /&gt;
File:IMG_20181012_161552147.jpg&lt;br /&gt;
File:IMG_20181012_161615376.jpg&lt;br /&gt;
File:IMG_20181012_161626209.jpg&lt;br /&gt;
File:IMG_20181012_161639871.jpg&lt;br /&gt;
File:IMG_20181012_162253725.jpg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Raspberry Pi for the bookscanner =&lt;br /&gt;
&lt;br /&gt;
At the moment we use an electromechanical button documented on the CHDK website that works well but does not do much more than triggering the two cameras to shoot at the same time: http://chdk.wikia.com/wiki/USB_Remote&lt;br /&gt;
&lt;br /&gt;
There are many more ways to optimise the bookscanning process.  Many ideas start with connecting the cameras to a small computer such as the Raspberry Pi.  We made some experiments and tests with this setup, which are documented below:&lt;br /&gt;
&lt;br /&gt;
The Raspberry Pi runs [https://www.raspbian.org/ Raspbian].&lt;br /&gt;
&lt;br /&gt;
== CHDKPTP ==&lt;br /&gt;
&lt;br /&gt;
In /home/pi/chdkptp there are precompiled binaries of [https://app.assembla.com/spaces/chdkptp/wiki CHDKPTP] downloaded from [https://app.assembla.com/spaces/chdkptp/documents here].&lt;br /&gt;
&lt;br /&gt;
CHDKPTP is used for remote control of cameras running the CHDK firmware.&lt;br /&gt;
&lt;br /&gt;
Our setup has two modes of work:&lt;br /&gt;
* a mechanical &amp;quot;button&amp;quot; that triggers both cameras to capture a photo, based on this tutorial: http://chdk.wikia.com/wiki/USB_Remote&lt;br /&gt;
* (in progress) a Raspberry Pi mode where the bookscanner operator uses the Raspberry Pi to capture photos, preview them in real time and transfer them already renamed for the next step in Scantailor.&lt;br /&gt;
&lt;br /&gt;
When a camera is connected, this command lists information about it:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;sudo ./chdkptp.sh -elist&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
For one of our cameras, this is the output:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;-1:Canon IXUS 175 b=001 d=030 v=0x4a9 p=0x32c1 s=8B20D62641B041BAA3E1D597D560D110&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
An example of capturing a picture from the command line (run from /home/pi/chdkptp/):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;sudo ./chdkptp.sh -e&amp;quot;connect -d=030&amp;quot; -erec -eshoot&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
The line above connects to the camera at -d=030, puts it into rec mode (if it is not already) and captures a photo, saving it to the SD card in the camera. To bypass the SD card altogether, replace -eshoot with -eremoteshoot. In that case ./chdkptp.sh saves the photos into the directory from which it was called.&lt;br /&gt;
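&lt;br /&gt;
As a rough sketch of how both cameras could be handled from one script, the helper below only prints the command from the example above for each camera instead of running it. The function name shoot_cmd and the second device number 031 are made up for illustration; check the real numbers with -elist first, since they may change when a camera is reconnected.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Sketch: print the chdkptp invocation from the text for one camera.
# $1 is the -d value reported by -elist, $2 is shoot (default) or remoteshoot.
shoot_cmd() {
    local dev="$1" mode="${2:-shoot}"
    printf 'sudo ./chdkptp.sh -e"connect -d=%s" -erec -e%s\n' "$dev" "$mode"
}

# Dry run for two cameras: 030 is taken from the listing above,
# 031 is a hypothetical number for the second camera.
for dev in 030 031; do
    shoot_cmd "$dev" shoot
done
```
&lt;br /&gt;
Once the printed lines look right, they can be run from /home/pi/chdkptp/ one after the other.&lt;br /&gt;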
&lt;br /&gt;
Note: check that the SD card is locked.&lt;br /&gt;
&lt;br /&gt;
== ZeroTier ==&lt;br /&gt;
&lt;br /&gt;
The Raspberry Pi was added to the public [https://zerotier.com ZeroTier] network 565799d8f6ebf1a8 with this command:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;sudo zerotier-cli join 565799d8f6ebf1a8&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
and it got a static IP address (in the 565799d8f6ebf1a8 network):&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;192.168.192.171/24&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The public keys of maxigas and marcell are added to /home/pi/.ssh/authorized_keys.&lt;br /&gt;
&lt;br /&gt;
= Ideas for building Voja&#039;s scanner =&lt;br /&gt;
&lt;br /&gt;
Our first idea was to reproduce Voja&#039;s scanner, but we had to realise that we are not Voja: the scanner is a beautiful and unique piece of engineering, and consequently hard to reproduce. &lt;br /&gt;
&lt;br /&gt;
Here are two links to the public documentation of our scanner, built by Voja Antonic:&lt;br /&gt;
&lt;br /&gt;
https://www.memoryoftheworld.org/blog/2012/10/28/our-beloved-bookscanner-2/&lt;br /&gt;
&lt;br /&gt;
https://hackaday.io/project/5604-diy-book-scanner&lt;br /&gt;
&lt;br /&gt;
The electronics are not really documented (which means that they are hard to reproduce) and they are built from basic parts (which means that it takes a lot of time to put them together).  So we are brainstorming about an Arduino-based solution instead.  Arduino is a general-purpose programmable microcontroller that already has many of the functions/parts we need built in.  The idea is that this makes it easier for us to build the scanner and for others to reproduce it.  We also have more experience working with Arduino than with only basic electronic components. &lt;br /&gt;
&lt;br /&gt;
We will also try to use cheaper cameras in order to bring down the budget.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
== About our first bookscanner ==&lt;br /&gt;
&lt;br /&gt;
[https://www.memoryoftheworld.org/blog/2012/10/28/our-beloved-bookscanner/ English]&lt;br /&gt;
&lt;br /&gt;
[https://www.memoryoftheworld.org/es/blog/2012/10/28/our-beloved-bookscanner/ Spanish]&lt;br /&gt;
&lt;br /&gt;
== Principal sources ==&lt;br /&gt;
&lt;br /&gt;
[http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]&lt;br /&gt;
&lt;br /&gt;
== Future activities ==&lt;br /&gt;
&lt;br /&gt;
[[next steps]]&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3440</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3440"/>
		<updated>2019-01-25T15:25:39Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
[[File:CameraSettings.png| 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;. &lt;br /&gt;
* If you want to use the button to make the cameras trigger automatically, you must lock the SD cards before putting them in. Also, plug in all the USB cables.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put your finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* upload the two folders now to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner 😎 &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
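&lt;br /&gt;
Before starting the postproduction it can help to check which of these tools are actually available. The following is a minimal sketch: missing_tools is a name we made up, and the command names tesseract and ebook-convert are assumptions about what the tesseract-ocr and calibre packages install.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Print every command from the argument list that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null; then
            echo "$tool"
        fi
    done
}

# Assumed command names: tesseract for the tesseract-ocr package,
# ebook-convert for calibre. An empty output means nothing is missing.
missing_tools scantailor gprename pdftk tesseract ebook-convert
```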
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* to rename the files properly, place &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt; in the &amp;quot;After the number&amp;quot; section above.&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
[[File:Rotate.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import the renamed files, odd and even pages are recognised as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
[[File:SplitPages.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be turned for the text and images to be properly horizontal.&lt;br /&gt;
&lt;br /&gt;
[[File:Deskew main tab.jpeg | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
[[File:Content main tab.jpeg | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it is centred on the page.&lt;br /&gt;
&lt;br /&gt;
[[File:Margins to zero.png | 300px |none | left ]]&lt;br /&gt;
[[File:Center.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed image and text, adjusting the thickness slider. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
[[File:Dewarping.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
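&lt;br /&gt;
The renaming done by hand in gprename could probably be scripted along the following lines. This is only a sketch under assumptions: the function name number_pages and the folder name merged are made up, the images are assumed to end in .JPG, and the camera file names are assumed to sort in shooting order. It copies instead of renaming, so the original folders stay untouched.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# Copy the images of one camera into a merged folder under page numbers.
# Usage: number_pages SOURCE_DIR FIRST_PAGE DEST_DIR
# FIRST_PAGE is 1 for the odd folder and 2 for the even folder.
number_pages() {
    local src="$1" n="$2" dest="$3" f
    mkdir -p "$dest"
    for f in "$src"/*.JPG; do
        [ -e "$f" ] || continue      # skip if the folder has no .JPG files
        cp -- "$f" "$dest/$n.jpg"
        n=$((n + 2))                 # odd and even pages alternate
    done
}

# Example calls, assuming the odd and even folders from the scanning step:
# number_pages odd 1 merged
# number_pages even 2 merged
```
&lt;br /&gt;
After both calls the merged folder would contain 1.jpg, 2.jpg, 3.jpg and so on, which matches the first step of the workflow table above.&lt;br /&gt;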
&lt;br /&gt;
&lt;br /&gt;
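One thing to watch when merging: a plain *pdf pattern expands in alphabetical order, so 10.pdf comes before 2.pdf. A possible way around this, assuming the per-page files are named 1.pdf, 2.pdf and so on without spaces, is to list them in numeric order first. The function name pages_in_order is made up for this sketch.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/bash
# List the per-page PDFs of a folder in numeric page order, one per line.
# Only names made of digits plus .pdf are kept, so book.pdf is left out.
pages_in_order() {
    local dir="${1:-.}"
    ls "$dir" | grep -E '^[0-9]+\.pdf$' | sort -n
}

# Possible use from inside the folder (not run here):
# pdftk $(pages_in_order) cat output book.pdf
```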
&lt;br /&gt;
We recommend using &#039;&#039;&#039;Calibre&#039;&#039;&#039; for your e-book management tasks: https://calibre-ebook.com/&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3436</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3436"/>
		<updated>2019-01-25T15:10:47Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
[[File:CameraSettings.png| 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;. &lt;br /&gt;
* If you want to use the button to make the cameras trigger automatically, you must lock the SD cards before putting them in. Also, plug in all the USB cables.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put your finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* upload the two folders now to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
[[File:Rotate.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[Fix Orientation Manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Fix-Orientation]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
[[File:SplitPages.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[Split pages manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Split-Pages]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag and determine the angle which the page needs to be turned for the text and images to be properly horizontal.&lt;br /&gt;
&lt;br /&gt;
[[File:Deskew main tab.jpeg | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[Deskew manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Deskew]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
[[File:Content main tab.jpeg | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[Select content manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Select-Content]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it appears centered on the page.&lt;br /&gt;
&lt;br /&gt;
[[File:Margins to zero.png | 300px |none | left ]]&lt;br /&gt;
[[File:Center.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[Margins management manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Page-Layout]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed img-txt, managing the thickness slider. Check every tab on the column that emerged on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: manage the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
[[File:Dewarping.png | 300px |none | left ]]&lt;br /&gt;
&lt;br /&gt;
[Output manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Output]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
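&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all TIFF images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;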
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
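The gprename steps above carry a FIXME about scripting the renaming. A minimal sketch of such a script follows. It assumes that the two folders are called &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt;, that the camera files match &amp;lt;code&amp;gt;IMG_*.JPG&amp;lt;/code&amp;gt;, and that the camera counter does not wrap around during one scan, so that the file names sort in shooting order. The function name &amp;lt;code&amp;gt;interleave&amp;lt;/code&amp;gt;, the &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; folder and the four-digit zero padding are hypothetical choices, not part of the original workflow; the padding is there so that shell globs list the files in page order.&lt;br /&gt;

```shell
# Sketch: copy camera images into one folder with interleaved page numbers.
# Assumptions (not from the original page): the IMG_*.JPG pattern, the
# function name, and the four-digit zero padding of the new file names.
interleave() {
    # $1 = source folder, $2 = first page number, $3 = destination folder
    src=$1
    n=$2
    dest=$3
    mkdir -p "$dest"
    for f in "$src"/IMG_*.JPG; do
        [ -e "$f" ] || continue      # skip when the glob matches nothing
        cp -- "$f" "$(printf '%s/%04d.jpg' "$dest" "$n")"
        n=$((n + 2))                 # step by 2: 1, 3, 5 or 2, 4, 6
    done
}
```

For example, &amp;lt;code&amp;gt;interleave odd 1 merged&amp;lt;/code&amp;gt; followed by &amp;lt;code&amp;gt;interleave even 2 merged&amp;lt;/code&amp;gt; would fill &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; with 0001.jpg, 0002.jpg, 0003.jpg and so on. The originals are copied, not moved, so a wrong guess about the order can be redone.&lt;br /&gt;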
&lt;br /&gt;
&lt;br /&gt;
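One caveat about the merge step: if the per-page files are named 1.pdf, 2.pdf, …, 10.pdf without zero padding, the glob &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; expands in alphabetical order (1.pdf, 10.pdf, 2.pdf, …), and it also picks up an already existing book.pdf. The sketch below lists the pages in numeric order instead. The function name &amp;lt;code&amp;gt;pages_in_order&amp;lt;/code&amp;gt; is hypothetical, and the sketch assumes page file names that start with a digit and contain no spaces.&lt;br /&gt;

```shell
# Sketch: print the per-page PDFs of a folder in numeric page order.
# Assumption: pages are named 1.pdf, 2.pdf, ... as in the workflow above;
# files that do not start with a digit (such as book.pdf) are ignored.
pages_in_order() {
    # $1 = folder holding the per-page PDFs
    (
        cd "$1" || exit 1
        for f in [0-9]*.pdf; do
            [ -e "$f" ] || continue  # skip when the glob matches nothing
            printf '%s\n' "$f"
        done | sort -n               # numeric sort on the leading page number
    )
}
```

Inside the folder with the pages, the merge could then be run as &amp;lt;code&amp;gt;pdftk $(pages_in_order .) cat output book.pdf&amp;lt;/code&amp;gt;, which relies on the file names containing no spaces.&lt;br /&gt;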
&lt;br /&gt;
We recommend using &#039;&#039;&#039;Calibre&#039;&#039;&#039; for your e-book management tasks -&amp;gt; https://calibre-ebook.com/&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3435</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3435"/>
		<updated>2019-01-25T15:03:53Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan, this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
[[File:CameraSettings.png|frameless|none|set the functions to &amp;quot;AUTO&amp;quot;]]&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;camera should look directly on the middle of the page, parallel to the cradle, at 45 degrees compared to horizontal&#039;&#039;&#039;. &lt;br /&gt;
* If you want to use the button to make the cameras trigger automatically, you must lock the SD cards before putting them in. Also, plug the many USB cables in.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* all the page should be in the image, but it is not a problem if more things outside of the book are visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* maybe you have to put your finger to the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put them back to the cameras, and maybe put the camera batteries to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* enter &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt;  using a Terminal&lt;br /&gt;
* go to the Directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
[[File:Rotate.png]]&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[Fix Orientation Manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Fix-Orientation]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
[[File:SplitPages.png]]&lt;br /&gt;
&lt;br /&gt;
[Split pages manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Split-Pages]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag and determine the angle which the page needs to be turned for the text and images to be properly horizontal.&lt;br /&gt;
&lt;br /&gt;
[[File:Deskew main tab.jpeg]]&lt;br /&gt;
&lt;br /&gt;
[Deskew manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Deskew]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
[[File:Content main tab.jpeg]]&lt;br /&gt;
&lt;br /&gt;
[Select content manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Select-Content]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it appears centered on the page.&lt;br /&gt;
&lt;br /&gt;
[[File:Margins to zero.png]]&lt;br /&gt;
[[File:Center.png]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[Margins management manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Page-Layout]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed img-txt, managing the thickness slider. Check every tab on the column that emerged on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: manage the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
[[File:Dewarping.png]]&lt;br /&gt;
&lt;br /&gt;
[Output manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Output]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
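&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all TIFF images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;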
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We recommend using &#039;&#039;&#039;Calibre&#039;&#039;&#039; for your e-book management tasks -&amp;gt; https://calibre-ebook.com/&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3434</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3434"/>
		<updated>2019-01-25T14:55:12Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan, this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
[[File:CameraSettings.png|frameless|none|set the functions to &amp;quot;AUTO&amp;quot;]]&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;camera should look directly on the middle of the page, parallel to the cradle, at 45 degrees compared to horizontal&#039;&#039;&#039;. &lt;br /&gt;
* If you want to use the button to make the cameras trigger automatically, you must lock the SD cards before putting them in. Also, plug the many USB cables in.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* all the page should be in the image, but it is not a problem if more things outside of the book are visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* maybe you have to put your finger to the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put them back to the cameras, and maybe put the camera batteries to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* enter &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt;  using a Terminal&lt;br /&gt;
* go to the Directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[Fix Orientation Manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Fix-Orientation]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Split pages manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Split-Pages]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag and determine the angle which the page needs to be turned for the text and images to be properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Deskew manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Deskew]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Select content manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Select-Content]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it appears centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Margins management manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Page-Layout]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed img-txt, managing the thickness slider. Check every tab on the column that emerged on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: manage the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Output manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Output]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
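&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all TIFF images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;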
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
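One caveat with the merge step: when the pages are named 1.pdf, 2.pdf, …, 10.pdf, the shell expands the file pattern in lexical order, so 10.pdf comes before 2.pdf. A minimal sketch of one way around this, assuming GNU find and sort (as on Debian) and file names without spaces; the function name &amp;lt;code&amp;gt;pages_in_order&amp;lt;/code&amp;gt; is hypothetical:&lt;br /&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the per-page PDFs of a folder in numeric order.
# Assumes pages are named 1.pdf, 2.pdf, ... and that GNU find/sort are used.
pages_in_order() {
    local dir="$1"
    # -name '[0-9]*.pdf' keeps the page files and leaves out book.pdf;
    # sort -V compares the numbers, so 2.pdf is listed before 10.pdf.
    find "$dir" -maxdepth 1 -name '[0-9]*.pdf' -printf '%f\n' | sort -V
}

# Possible use from inside the folder with the page PDFs:
# pdftk $(pages_in_order .) cat output book.pdf
```

Zero-padded page names (0001.pdf, 0002.pdf, …) would avoid the problem as well, at the cost of changing the naming used in the earlier steps.&lt;br /&gt;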
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We recommend using &#039;&#039;&#039;Calibre&#039;&#039;&#039; for your e-book management tasks: https://calibre-ebook.com/&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who might be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3433</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3433"/>
		<updated>2019-01-25T14:49:36Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Scanning */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:CameraSettings.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;each camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;.&lt;br /&gt;
* If you want to use the button to make the cameras trigger automatically, you must lock the SD cards before putting them in. Also, plug the many USB cables in.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to hold a finger against the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and perhaps put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
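The first step can also be scripted instead of using the gprename window. A minimal sketch, assuming both folders contain files ending in &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt; whose names sort in shooting order; the function name &amp;lt;code&amp;gt;merge_pages&amp;lt;/code&amp;gt; and the &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; folder are hypothetical:&lt;br /&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy odd/ and even/ into one folder as 1.jpg, 2.jpg, ...
# Assumes file names sort in shooting order (IMG_0001.JPG, IMG_0002.JPG, ...).
merge_pages() {
    local odd_dir="$1" even_dir="$2" out_dir="$3"
    local n f
    mkdir -p "$out_dir"

    n=1                                   # odd pages: 1, 3, 5, ...
    for f in "$odd_dir"/*.JPG; do
        [ -e "$f" ] || continue           # skip if the pattern matched nothing
        cp -- "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done

    n=2                                   # even pages: 2, 4, 6, ...
    for f in "$even_dir"/*.JPG; do
        [ -e "$f" ] || continue
        cp -- "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done
}

# Example call (folder names are assumptions):
# merge_pages odd even merged
```

The sketch copies rather than moves the files, so the original camera folders stay untouched if the numbering turns out to be off by one.&lt;br /&gt;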
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image no. 1 until it is upright and click on &amp;quot;apply to every other page&amp;quot;. Then select image no. 2, rotate it in the opposite direction until it is upright too, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[Fix Orientation Manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Fix-Orientation]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import the renamed files, odd and even pages are recognized as single pages, so this step only confirms that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Split pages manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Split-Pages]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page must be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Deskew manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Deskew]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Select content manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Select-Content]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Make sure all margins are set to zero. Then set the alignment so that the content is centered on the page, which makes it easier to read.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Margins management manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Page-Layout]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages that contain images or mixed images and text, and adjust the thickness slider where needed. Then go through every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Output manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Output]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects the Spanish language data):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We recommend using &#039;&#039;&#039;Calibre&#039;&#039;&#039; for your e-book management tasks: https://calibre-ebook.com/&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who might be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3432</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3432"/>
		<updated>2019-01-25T14:48:16Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Scanning */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
File:CameraSettings.png&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;each camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;.&lt;br /&gt;
* If you want to use the button to make the cameras trigger automatically, you must lock the SD cards before putting them in. Also, plug the many USB cables in.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to hold a finger against the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and perhaps put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image no. 1 until it is upright and click on &amp;quot;apply to every other page&amp;quot;. Then select image no. 2, rotate it in the opposite direction until it is upright too, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[Fix Orientation Manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Fix-Orientation]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import the renamed files, odd and even pages are recognized as single pages, so this step only confirms that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Split pages manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Split-Pages]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page must be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Deskew manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Deskew]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Select content manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Select-Content]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Make sure all margins are set to zero. Then set the alignment so that the content is centered on the page, which makes it easier to read.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Margins management manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Page-Layout]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages that contain images or mixed images and text, and adjust the thickness slider where needed. Then go through every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[Output manual -&amp;gt; https://github.com/scantailor/scantailor/wiki/Output]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects the Spanish language data):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
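The report written by &amp;lt;code&amp;gt;dump_data&amp;lt;/code&amp;gt; is plain text in which each metadata field appears as an &amp;lt;code&amp;gt;InfoKey:&amp;lt;/code&amp;gt; line followed by an &amp;lt;code&amp;gt;InfoValue:&amp;lt;/code&amp;gt; line; these are the lines to edit before running &amp;lt;code&amp;gt;update_info&amp;lt;/code&amp;gt;. A small sketch for checking one field before and after editing, assuming that layout; the function name &amp;lt;code&amp;gt;get_info&amp;lt;/code&amp;gt; is hypothetical:&lt;br /&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value stored for one Info key in a
# pdftk dump_data report.
# Assumes each "InfoKey: NAME" line is directly followed by its
# "InfoValue: ..." line.
get_info() {
    local report="$1" key="$2"
    # grep -A1 prints the matching key line plus the line after it;
    # sed then keeps only the value part of the InfoValue line.
    grep -A1 "^InfoKey: $key\$" "$report" | sed -n 's/^InfoValue: //p'
}

# Example call (report.txt is the file from the export step):
# get_info report.txt Title
```

If the function prints nothing, the key is not present in the report and has to be added by hand in the same key/value layout.&lt;br /&gt;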
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We recommend using &#039;&#039;&#039;Calibre&#039;&#039;&#039; for your e-book management tasks: https://calibre-ebook.com/&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who might be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3431</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3431"/>
		<updated>2019-01-25T14:41:30Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Scanning */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
[[File:CameraSettings.png]]&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;each camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;.&lt;br /&gt;
* If you want to use the button to trigger the cameras automatically, you must lock the SD cards before putting them in. Also plug in all the USB cables.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check whether the pages fold or curve; if so, place something underneath (like a sponge, or another book…) to straighten them&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put your finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to &#039;&#039;&#039;automatic&#039;&#039;&#039;, with the menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, and long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
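&lt;br /&gt;
After installing, it can help to confirm that the programs are actually on the PATH. The following is a minimal sketch, not part of the original workflow: check_tools is a hypothetical helper name, and the command names in the usage line are assumptions about what the Debian packages install (for example, that tesseract-ocr provides a command called tesseract).&lt;br /&gt;

```shell
# Report which of the given commands are not on the PATH.
# Prints one "missing: NAME" line per absent command and
# returns 1 if any command is missing, 0 otherwise.
check_tools() {
    local tool status=0
    for tool in "$@"; do
        # command -v prints the path of the command, or nothing if absent
        if [ -z "$(command -v "$tool")" ]; then
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Usage (command names are assumptions, see above):
# check_tools scantailor gprename pdftk tesseract calibre
```

If a name is reported as missing, the package may install its program under a different command name, so check the package contents before reinstalling.&lt;br /&gt;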
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
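&lt;br /&gt;
Step 1 of the workflow above (merging the pictures from the two cameras) could also be scripted instead of done by hand. The following is only a sketch under stated assumptions: merge_pages is a hypothetical helper name, the pictures are assumed to end in .JPG, and the camera file names are assumed to sort in the order the pages were photographed (this fails if the camera counter rolls over in the middle of a book).&lt;br /&gt;

```shell
# Copy the pictures of two camera folders into one output folder,
# numbering the first folder 1, 3, 5, ... and the second 2, 4, 6, ...
# Usage: merge_pages ODD_DIR EVEN_DIR OUT_DIR
merge_pages() {
    local odd_dir=$1 even_dir=$2 out_dir=$3
    local n f
    mkdir -p "$out_dir"

    n=1                                   # odd pages: 1, 3, 5, ...
    for f in "$odd_dir"/*.JPG; do
        [ -e "$f" ] || continue           # skip if the folder has no .JPG
        cp -- "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done

    n=2                                   # even pages: 2, 4, 6, ...
    for f in "$even_dir"/*.JPG; do
        [ -e "$f" ] || continue
        cp -- "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done
}

# Example with hypothetical folder names:
# merge_pages odd even merged
```

The sketch copies rather than moves the files, so the original camera folders stay untouched if the numbering turns out to be wrong.&lt;br /&gt;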
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image no. 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image no. 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all the renamed files, odd and even pages will be recognized as single pages, so this step is just to confirm that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated for the text and images to be properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it sits centred on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages with images and/or mixed images and text, adjusting the thickness slider as needed. Go through every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
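&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Does Optical Character Recognition (OCR) on all images in the folder, here with the Spanish language data (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt;; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;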
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
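&lt;br /&gt;
One thing to watch in the merge step: the shell expands *pdf in alphabetical order, so with pages named 1.pdf, 2.pdf, … the file 10.pdf is listed before 2.pdf, and a book.pdf left over from an earlier run is matched as well. The following sketch lists the page files in numeric order instead. It assumes every page file is named NUMBER.pdf as in the workflow list; pages_in_order is a hypothetical helper name.&lt;br /&gt;

```shell
# Print the per-page PDFs of a folder (1.pdf, 2.pdf, ...) one per line,
# in numeric page order. Files not starting with a digit (book.pdf,
# bookcopy.pdf) are ignored.
# Usage: pages_in_order DIR
pages_in_order() {
    local dir=$1 f n
    for f in "$dir"/[0-9]*.pdf; do
        [ -e "$f" ] || continue           # skip if nothing matches
        basename "$f" .pdf                # keep only the page number
    done | sort -n | while read -r n; do
        printf '%s\n' "$dir/$n.pdf"       # rebuild the path, now sorted
    done
}

# Usage: merge the pages of the current folder in reading order.
# pdftk $(pages_in_order .) cat output book.pdf
```

The unquoted command substitution in the usage line relies on the page file names containing no spaces, which holds for purely numeric names.&lt;br /&gt;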
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We recommend using &#039;&#039;&#039;Calibre&#039;&#039;&#039; for your e-book management tasks: https://calibre-ebook.com/&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a fairly long PDF document&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=File:CameraSettings.png&amp;diff=3430</id>
		<title>File:CameraSettings.png</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=File:CameraSettings.png&amp;diff=3430"/>
		<updated>2019-01-25T14:40:50Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3429</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3429"/>
		<updated>2019-01-25T14:33:11Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. In some steps proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;each camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;.&lt;br /&gt;
* If you want to use the button to trigger the cameras automatically, you must lock the SD cards before putting them in. Also plug in all the USB cables.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check whether the pages fold or curve; if so, place something underneath (like a sponge, or another book…) to straighten them&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put your finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to &#039;&#039;&#039;automatic&#039;&#039;&#039;, with the menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, and long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image no. 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image no. 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all the renamed files, odd and even pages will be recognized as single pages, so this step is just to confirm that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated for the text and images to be properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it sits centred on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages with images and/or mixed images and text, adjusting the thickness slider as needed. Go through every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
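&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Does Optical Character Recognition (OCR) on all images in the folder, here with the Spanish language data (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt;; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;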
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
We recommend using &#039;&#039;&#039;Calibre&#039;&#039;&#039; for your e-book management tasks: https://calibre-ebook.com/&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a fairly long PDF document&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3428</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3428"/>
		<updated>2019-01-25T14:26:24Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. In some steps proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;each camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;.&lt;br /&gt;
* If you want to use the button to trigger the cameras automatically, you must lock the SD cards before putting them in. Also plug in all the USB cables.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check whether the pages fold or curve; if so, place something underneath (like a sponge, or another book…) to straighten them&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put your finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back in the cameras, and consider charging the camera batteries&lt;br /&gt;
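&lt;br /&gt;
Before uploading, it can help to check that both folders hold the same number of images, since every press of the button should produce one image per camera. A minimal sketch, assuming the folders are called &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; and contain nothing but the images; the function name &amp;lt;code&amp;gt;same_page_count&amp;lt;/code&amp;gt; is made up for this example:&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical check (not part of the original workflow):
# both camera folders should hold the same number of images.
# Prints "ok" when the counts match, otherwise prints both counts and returns 1.
same_page_count() {
    local odd_count even_count
    odd_count=$(ls "$1" | wc -l)
    even_count=$(ls "$2" | wc -l)
    if [ "$odd_count" -eq "$even_count" ]; then
        echo "ok"
    else
        echo "mismatch: $1 has $odd_count, $2 has $even_count"
        return 1
    fi
}
```
&lt;br /&gt;
Calling &amp;lt;code&amp;gt;same_page_count odd even&amp;lt;/code&amp;gt; prints ok when the counts match. A mismatch may mean that one camera missed a shot, which is easier to repair while the book is still on the cradle.&lt;br /&gt;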
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Before starting, check that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, and long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the optical zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
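&lt;br /&gt;
To check afterwards that the programs are available, a small sketch like the following can be used. It assumes that each package provides a command of the same name, except &amp;lt;code&amp;gt;tesseract-ocr&amp;lt;/code&amp;gt;, whose command is &amp;lt;code&amp;gt;tesseract&amp;lt;/code&amp;gt;. The language packs provide no command of their own and are not checked. The name &amp;lt;code&amp;gt;missing_tools&amp;lt;/code&amp;gt; is hypothetical:&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print the name of every given command
# that cannot be found on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v prints nothing when the command does not exist
        if [ -z "$(command -v "$tool")" ]; then
            echo "$tool"
        fi
    done
}
```
&lt;br /&gt;
Running &amp;lt;code&amp;gt;missing_tools scantailor gprename pdftk tesseract calibre&amp;lt;/code&amp;gt; should print nothing when everything is installed.&lt;br /&gt;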
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import the renamed files, odd and even pages are recognized as single pages, so this step only confirms that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then position the content so that it is easy to read (&amp;quot;centralized&amp;quot;).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the readability of pages with images or mixed image and text, and adjust the thickness slider accordingly. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Run Optical Character Recognition (OCR) on all images in the folder (the example uses the Spanish language data; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merge all the PDF files in the folder into a single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Export the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Import the metadata from report.txt back into the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
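&lt;br /&gt;
The renaming done with gprename could also be scripted. The following is only a sketch under some assumptions: the camera files end in &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt;, their alphabetical order is the order in which they were shot, and the output folder (called &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; in the usage note) is new. The function name &amp;lt;code&amp;gt;number_pages&amp;lt;/code&amp;gt; is hypothetical. It copies instead of renaming, so the original folders stay untouched:&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: copy the images of one camera folder into a merged
# folder, numbering them START, START+2, START+4, ...
# Usage: number_pages SOURCE_DIR DEST_DIR START
number_pages() {
    local src="$1" dest="$2" n="$3" f
    mkdir -p "$dest"
    # The glob expands in alphabetical order; this matches the shooting order
    # only if the camera file counter did not wrap around (assumption).
    for f in "$src"/*.JPG; do
        # Skip the unexpanded pattern when the folder has no .JPG files
        [ -e "$f" ] || continue
        cp "$f" "$dest/$n.jpg"
        n=$((n + 2))
    done
}
```
&lt;br /&gt;
With this, &amp;lt;code&amp;gt;number_pages odd merged 1&amp;lt;/code&amp;gt; followed by &amp;lt;code&amp;gt;number_pages even merged 2&amp;lt;/code&amp;gt; would give 1.jpg, 2.jpg, 3.jpg, … in the &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; folder.&lt;br /&gt;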
&lt;br /&gt;
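One thing to watch when merging the per-page PDF files: a plain &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; pattern expands in alphabetical order, so 10.pdf comes before 2.pdf, and it also matches an existing book.pdf. A possible workaround, sketched under the assumption that the page files are named with plain numbers (1.pdf, 2.pdf, …); &amp;lt;code&amp;gt;list_pages&amp;lt;/code&amp;gt; is a made-up name:&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print the per-page PDF files of a folder in numeric
# order (1.pdf, 2.pdf, ..., 10.pdf), ignoring other files such as book.pdf.
# Assumes the page files are named with plain numbers and no spaces.
list_pages() {
    ls "$1" | grep -E '^[0-9]+\.pdf$' | sort -n
}
```
&lt;br /&gt;
Inside the folder with the page files, the merge could then be written as &amp;lt;code&amp;gt;pdftk $(list_pages .) cat output book.pdf&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;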
Pdf shuffler&lt;br /&gt;
&lt;br /&gt;
Calibre&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3425</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3425"/>
		<updated>2019-01-25T12:24:56Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Scanning */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;each camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;.&lt;br /&gt;
* If you want to use the button to trigger the cameras automatically, you must lock the SD cards before putting them in. Also plug in all the USB cables.&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check whether the pages fold or curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* the most subtle mistake: one camera sees the letters bigger than the other (caused by a difference in zoom level or in the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back in the cameras, and consider charging the camera batteries&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Before starting, check that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, and long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the optical zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import the renamed files, odd and even pages are recognized as single pages, so this step only confirms that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then position the content so that it is easy to read (&amp;quot;centralized&amp;quot;).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the readability of pages with images or mixed image and text, and adjust the thickness slider accordingly. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Run Optical Character Recognition (OCR) on all images in the folder (the example uses the Spanish language data; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merge all the PDF files in the folder into a single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Export the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Import the metadata from report.txt back into the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3424</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3424"/>
		<updated>2019-01-25T12:04:42Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Scanning */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;each camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check whether the pages fold or curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* the most subtle mistake: one camera sees the letters bigger than the other (caused by a difference in zoom level or in the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back in the cameras, and consider charging the camera batteries&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
* Before starting, check that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, and long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the optical zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate it in the opposite direction so that it is upright too, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation Manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it appears centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed image and text, adjusting the thickness slider as needed. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
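&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in the folder into one single file, in page order (a plain &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; would put 10.pdf before 2.pdf; &amp;lt;code&amp;gt;ls -v&amp;lt;/code&amp;gt; sorts the numbers naturally):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk $(ls -v *.pdf) cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;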
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
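&lt;br /&gt;
The renaming and merging done by hand with gprename in step 1 could also be scripted, as the FIXME there suggests. The following is a minimal sketch under stated assumptions, not a finished tool: the function names &amp;lt;code&amp;gt;number_pages&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;merge_scans&amp;lt;/code&amp;gt; and the output folder &amp;lt;code&amp;gt;pages&amp;lt;/code&amp;gt; are invented for this example. It assumes that the camera file names end in &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt; and sort in the order the pictures were taken, and that the first picture in &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; is page 1. Unlike the workflow above, it zero-pads the numbers (0001.jpg instead of 1.jpg) so that shell wildcards list the pages in reading order.&lt;br /&gt;

```shell
# Hypothetical helper: copy the .JPG files of one folder to OUT_DIR,
# numbering them FIRST, FIRST+2, FIRST+4, ... with zero-padded names.
number_pages() {  # usage: number_pages SRC_DIR FIRST OUT_DIR
    local n=$2 f
    for f in "$1"/*.JPG; do
        [ -e "$f" ] || continue   # the folder has no .JPG files
        cp -- "$f" "$3/$(printf '%04d' "$n").jpg"
        n=$((n + 2))
    done
}

# Hypothetical helper: odd pages become 0001, 0003, ... and
# even pages become 0002, 0004, ... in the same output folder.
merge_scans() {  # usage: merge_scans ODD_DIR EVEN_DIR OUT_DIR
    mkdir -p "$3"
    number_pages "$1" 1 "$3"
    number_pages "$2" 2 "$3"
}
```

With the two folders from the scanning phase in the current directory, &amp;lt;code&amp;gt;merge_scans odd even pages&amp;lt;/code&amp;gt; copies the pictures to &amp;lt;code&amp;gt;pages&amp;lt;/code&amp;gt; and leaves the originals untouched.&lt;br /&gt;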
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3423</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3423"/>
		<updated>2019-01-25T11:55:15Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Scanning */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press a finger on the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the &amp;lt;span style=&#039;width:32em;font-size:18px;font-weight:light;font-family:Courier,Sans;display:block;padding:4px;background:#DAA520;color:black;&#039;&amp;gt;Marron_scanner &amp;lt;/span&amp;gt; &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, then long press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate it in the opposite direction so that it is upright too, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation Manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it appears centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed image and text, adjusting the thickness slider as needed. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
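&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in the folder into one single file, in page order (a plain &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; would put 10.pdf before 2.pdf; &amp;lt;code&amp;gt;ls -v&amp;lt;/code&amp;gt; sorts the numbers naturally):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk $(ls -v *.pdf) cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;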
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3422</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3422"/>
		<updated>2019-01-25T11:32:41Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Scanning */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press a finger on the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron_scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, then long press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate it in the opposite direction so that it is upright too, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation Manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it is centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed image and text, adjusting the thickness slider as needed. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (here with the Spanish language data, -l spa; use -l eng for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in the folder into one single file, in numeric page order (a plain *pdf glob would put 10.pdf before 2.pdf; ls -v is a GNU option):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk $(ls -v *pdf) cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
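&lt;br /&gt;
The &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; in the gprename step asks for a renaming script. The following is only a minimal sketch of one possible approach, not a tested part of this workflow: the function names &amp;lt;code&amp;gt;number_pages&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;interleave_pages&amp;lt;/code&amp;gt; are hypothetical, and it assumes that the camera files end in .JPG and that their names sort in shooting order (which breaks if the camera counter wraps around during a session).&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: copy every .JPG in directory $1 into directory $3,
# numbering the copies $2, $2+2, $2+4, ... in file name order.
number_pages() {
    n=$2
    for f in "$1"/*.JPG; do
        [ -e "$f" ] || continue   # skip the literal pattern if the folder is empty
        cp "$f" "$3/$n.jpg"
        n=$((n + 2))
    done
}

# Hypothetical helper: merge the odd and even folders into one folder
# containing 1.jpg, 2.jpg, 3.jpg, ...
# Usage: interleave_pages odd even merged
interleave_pages() {
    mkdir -p "$3"
    number_pages "$1" 1 "$3"   # odd pages: 1, 3, 5, ...
    number_pages "$2" 2 "$3"   # even pages: 2, 4, 6, ...
}
```
&lt;br /&gt;
The sketch copies rather than moves the files, so the original &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; folders stay untouched if something goes wrong.&lt;br /&gt;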
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3421</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3421"/>
		<updated>2019-01-25T11:22:10Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Scanning */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this page describes the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. In some parts proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* all the page should be in the image, but it is not a problem if more things outside of the book are visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press with a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put them back to the cameras, and maybe put the camera batteries to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron_scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
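&lt;br /&gt;
To check that the installation worked, a small shell function can report which commands are still missing. This is only a minimal sketch: &amp;lt;code&amp;gt;missing_tools&amp;lt;/code&amp;gt; is a hypothetical helper, and the command names in the usage comment are assumptions about what the Debian packages install (the tesseract-ocr package provides the &amp;lt;code&amp;gt;tesseract&amp;lt;/code&amp;gt; command, and calibre provides &amp;lt;code&amp;gt;ebook-convert&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print each given command name that is not on the PATH.
# Prints nothing when every command is available.
# Usage (command names are assumptions, see above):
#   missing_tools scantailor gprename pdftk tesseract ebook-convert
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null || echo "$tool"
    done
}
```
&lt;br /&gt;
If the function prints a name, the corresponding package is probably not installed or uses a different command name on your system.&lt;br /&gt;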
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* enter &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt;  using a Terminal&lt;br /&gt;
* go to the Directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 until it is upright and click on &quot;apply to every other page&quot;. Then select image nr 2, rotate it in the opposite direction until it is upright as well, and again click on &quot;apply to every other page&quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page must be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it is centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed image and text, adjusting the thickness slider as needed. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (here with the Spanish language data, -l spa; use -l eng for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in the folder into one single file, in numeric page order (a plain *pdf glob would put 10.pdf before 2.pdf; ls -v is a GNU option):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk $(ls -v *pdf) cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3420</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3420"/>
		<updated>2019-01-25T11:16:56Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this page describes the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. In some parts proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* all the page should be in the image, but it is not a problem if more things outside of the book are visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press with a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put them back to the cameras, and maybe put the camera batteries to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the {{font color | brown | Marron_scanner }} &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* enter &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt;  using a Terminal&lt;br /&gt;
* go to the Directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 until it is upright and click on &quot;apply to every other page&quot;. Then select image nr 2, rotate it in the opposite direction until it is upright as well, and again click on &quot;apply to every other page&quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page must be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it is centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed image and text, adjusting the thickness slider as needed. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (here with the Spanish language data, -l spa; use -l eng for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in the folder into one single file, in numeric page order (a plain *pdf glob would put 10.pdf before 2.pdf; ls -v is a GNU option):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk $(ls -v *pdf) cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
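&lt;br /&gt;
Instead of editing report.txt by hand, the title and author entries can also be generated. The following is only a sketch under assumptions: &amp;lt;code&amp;gt;info_block&amp;lt;/code&amp;gt; is a hypothetical helper, and it assumes the InfoBegin / InfoKey / InfoValue layout that recent pdftk versions write with dump_data (compare it with your own report.txt before relying on it; older versions may omit the InfoBegin lines).&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print Title and Author entries in the layout assumed
# above (one InfoBegin / InfoKey / InfoValue group per entry).
# Usage:
#   info_block "Book title" "Author name" > info.txt
#   pdftk book.pdf update_info info.txt output bookcopy.pdf
info_block() {
    printf 'InfoBegin\nInfoKey: Title\nInfoValue: %s\n' "$1"
    printf 'InfoBegin\nInfoKey: Author\nInfoValue: %s\n' "$2"
}
```
&lt;br /&gt;
Titles or names with accented characters may need extra care; check the pdftk manual page for how update_info expects them to be encoded.&lt;br /&gt;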
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3419</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3419"/>
		<updated>2019-01-25T11:14:48Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. For some steps proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees from horizontal&#039;&#039;&#039;&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may need to press a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and consider charging the camera batteries&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Before starting, check that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, then long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
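&lt;br /&gt;
Some package names differ from the commands they install. A small sketch for confirming that the commands are available before starting; the function name &amp;lt;code&amp;gt;check_tools&amp;lt;/code&amp;gt; is hypothetical, and the command names in the usage line (&amp;lt;code&amp;gt;tesseract&amp;lt;/code&amp;gt; for tesseract-ocr, &amp;lt;code&amp;gt;ebook-convert&amp;lt;/code&amp;gt; for calibre) are assumptions about what the Debian packages provide:&lt;br /&gt;

```shell
# check_tools NAME...
# Hypothetical helper: print "missing: NAME" for every command that is not
# found on the PATH, and return non-zero if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2> /dev/null; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Possible usage (command names are assumptions, see the note above):
#   check_tools scantailor gprename pdftk tesseract ebook-convert
```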
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &quot;apply to every other page&quot;. Then select image nr 2, rotate it in the opposite direction so that it is upright as well, and again click on &quot;apply to every other page&quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all the renamed files, odd and even pages will be recognized as single pages, so this step only confirms that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it is centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages with images and/or mixed images and text, adjusting the thickness slider as needed. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Run Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merge all the PDF files in the folder into a single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Export the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Import the edited metadata from report.txt into a new copy of the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
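&lt;br /&gt;
The renaming done by hand with gprename in step 1 could also be scripted. A minimal sketch, assuming the camera files end in &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt; and that their names sort in shooting order (a counter that wraps around from IMG_9999 to IMG_0001 in the middle of a book would break this); &amp;lt;code&amp;gt;interleave_pages&amp;lt;/code&amp;gt; is a hypothetical name:&lt;br /&gt;

```shell
# interleave_pages ODD_DIR EVEN_DIR OUT_DIR
# Hypothetical helper: copy the images from ODD_DIR to OUT_DIR as
# 1.jpg, 3.jpg, 5.jpg, ... and the images from EVEN_DIR as 2.jpg, 4.jpg, ...
# The original files are left untouched.
interleave_pages() {
    odd_dir="$1"
    even_dir="$2"
    out_dir="$3"
    mkdir -p "$out_dir"

    n=1
    for f in "$odd_dir"/*.JPG; do
        [ -e "$f" ] || continue      # the glob matched nothing
        cp -- "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done

    n=2
    for f in "$even_dir"/*.JPG; do
        [ -e "$f" ] || continue
        cp -- "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done
}

# Possible usage:
#   interleave_pages odd even merged
```

Copying instead of moving keeps the camera originals in &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; in case the numbering has to be redone.&lt;br /&gt;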
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3418</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3418"/>
		<updated>2019-01-25T11:11:53Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. For some steps proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees from horizontal&#039;&#039;&#039;&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may need to press a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and consider charging the camera batteries&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the {{ font color | bg=brown | Marron_scanner }} &#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Before starting, check that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, then long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &quot;apply to every other page&quot;. Then select image nr 2, rotate it in the opposite direction so that it is upright as well, and again click on &quot;apply to every other page&quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all the renamed files, odd and even pages will be recognized as single pages, so this step only confirms that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it is centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages with images and/or mixed images and text, adjusting the thickness slider as needed. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Run Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merge all the PDF files in the folder into a single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Export the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Import the edited metadata from report.txt into a new copy of the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
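&lt;br /&gt;
Instead of editing the dumped report by hand, a small info file can be written from the shell. A minimal sketch, assuming a pdftk version whose &amp;lt;code&amp;gt;dump_data&amp;lt;/code&amp;gt; output uses the InfoBegin / InfoKey / InfoValue record layout (compare with your own report.txt before relying on it); &amp;lt;code&amp;gt;write_info_file&amp;lt;/code&amp;gt; is a hypothetical name:&lt;br /&gt;

```shell
# write_info_file FILE TITLE AUTHOR
# Hypothetical helper: write a two-record info file (Title and Author)
# in the record layout assumed above. FILE is overwritten.
write_info_file() {
    printf 'InfoBegin\nInfoKey: Title\nInfoValue: %s\n' "$2" > "$1"
    printf 'InfoBegin\nInfoKey: Author\nInfoValue: %s\n' "$3" >> "$1"
}

# Possible usage, followed by the import step from the list above:
#   write_info_file info.txt "Book title" "Author name"
#   pdftk book.pdf update_info info.txt output bookcopy.pdf
```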
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3417</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3417"/>
		<updated>2019-01-25T11:04:55Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Scanning */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. For some steps proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees from horizontal&#039;&#039;&#039;&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may need to press a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and consider charging the camera batteries&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to &#039;&#039;&#039;automatic&#039;&#039;&#039; with menu/lamp setting set to &#039;&#039;&#039;&amp;quot;off&amp;quot;&#039;&#039;&#039; to avoid the use of the red light.&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
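&lt;br /&gt;
To check that the installation worked, a small helper can report which tools are still missing. This is only a sketch: the function name &amp;lt;code&amp;gt;missing_tools&amp;lt;/code&amp;gt; is made up for the example, and the command names you pass to it are assumed to match what the Debian packages install.&lt;br /&gt;

```shell
# Sketch: report which of the given tools are not on the PATH.
# "missing_tools" is a hypothetical name used only in this example.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v fails when the tool cannot be found; print its name then
        command -v "$tool" >/dev/null 2>/dev/null || echo "$tool"
    done
}
```

For example, &amp;lt;code&amp;gt;missing_tools scantailor gprename pdftk tesseract calibre&amp;lt;/code&amp;gt; prints nothing when everything is installed.&lt;br /&gt;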
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
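&lt;br /&gt;
Step 1 can probably also be done from the command line. The following is a minimal sketch, not the method used in Calafou: it assumes the camera files end in &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt; and that their alphabetical order matches the order in which the pages were shot. The function name &amp;lt;code&amp;gt;interleave_pages&amp;lt;/code&amp;gt; and the folder name &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; are invented for the example.&lt;br /&gt;

```shell
# Sketch: copy the captures of one camera into a shared folder, numbering
# them start, start+2, start+4, ... so that odd and even pages interleave.
# Assumption: alphabetical file order equals shooting order.
interleave_pages() {
    local src=$1 start=$2 dest=$3 n f
    mkdir -p "$dest"
    n=$start
    for f in "$src"/*.JPG; do
        [ -e "$f" ] || continue      # skip when the folder has no .JPG files
        cp -- "$f" "$dest/$n.jpg"    # copy, so the originals stay untouched
        n=$((n + 2))
    done
}
```

Called as &amp;lt;code&amp;gt;interleave_pages odd 1 merged&amp;lt;/code&amp;gt; and then &amp;lt;code&amp;gt;interleave_pages even 2 merged&amp;lt;/code&amp;gt;, it fills &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; with 1.jpg, 2.jpg, 3.jpg, … Check a few pages by hand afterwards, because a camera counter that wraps around would break the ordering assumption.&lt;br /&gt;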
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the Directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Make sure all margins are set to zero. Then set the alignment so that the content sits centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages with images or mixed image and text, and adjust the thickness slider where needed. Check every tab in the column that appears on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Does Optical Character Recognition (OCR) on all images in folder:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;quot;$i&amp;quot; .tif); tesseract -l spa &amp;quot;$i&amp;quot; &amp;quot;$b&amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
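&lt;br /&gt;
One thing to watch when merging: if the pages are named 1.pdf, 2.pdf, … without leading zeros, the shell expands &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; in alphabetical order, so 10.pdf is placed before 2.pdf, and on a second run book.pdf itself is matched too. A possible workaround is sketched below. It assumes the page files are named with plain numbers, and the function name &amp;lt;code&amp;gt;pages_in_order&amp;lt;/code&amp;gt; is made up for the example.&lt;br /&gt;

```shell
# Sketch: list the per-page PDFs of the current folder in numeric order.
# Assumption: page files are named with digits only (1.pdf, 2.pdf, ...),
# so other files such as book.pdf are filtered out by the pattern.
pages_in_order() {
    ls | grep -E '^[0-9]+\.pdf$' | sort -n
}

# Possible use (safe only because the names contain no spaces):
#   pdftk $(pages_in_order) cat output book.pdf
```

If the files already carry zero-padded names such as 001.pdf, the plain &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; pattern gives the same order and this helper is not needed.&lt;br /&gt;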
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3416</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3416"/>
		<updated>2019-01-25T11:01:43Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Scanning */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan, this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on how good quality images you can make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;camera should look directly on the middle of the page, parallel to the cradle, at 45 degrees compared to horizontal&#039;&#039;&#039;&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* all the page should be in the image, but it is not a problem if more things outside of the book are visible&lt;br /&gt;
* set the camera option in &#039;&#039;&#039;AUTO&#039;&#039;&#039;.&lt;br /&gt;
* set the camera&#039;s flash &#039;&#039;&#039;OFF&#039;&#039;&#039;.&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* maybe you have to put your finger to the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* upload the two folders now to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put them back to the cameras, and maybe put the camera batteries to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to automatic with menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* enter &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt;  using a Terminal&lt;br /&gt;
* go to the Directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag and determine the angle which the page needs to be turned for the text and images to be properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check out all margins so they are set to zero. Then, place the content in a manner that will help it being read &amp;quot;centralized&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed img-txt, managing the thickness slider. Check every tab on the column that emerged on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Does Optical Character Recognition (OCR) on all images in folder:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;quot;$i&amp;quot; .tif); tesseract -l spa &amp;quot;$i&amp;quot; &amp;quot;$b&amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3415</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3415"/>
		<updated>2019-01-25T10:56:16Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Scanning */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan, this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on how good quality images you can make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;camera should look directly on the middle of the page, parallel to the cradle, at 45 degrees compared to horizontal&#039;&#039;&#039;&lt;br /&gt;
* check all x, y and z angles.&lt;br /&gt;
* all the page should be in the image, but it is not a problem if more things outside of the book are visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* maybe you have to put your finger to the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* upload the two folders now to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put them back to the cameras, and maybe put the camera batteries to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to automatic with menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* enter &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt;  using a Terminal&lt;br /&gt;
* go to the Directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag and determine the angle which the page needs to be turned for the text and images to be properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it sits centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed img-txt, managing the thickness slider. Check every tab on the column that emerged on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Does Optical Character Recognition (OCR) on all images in folder:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
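&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in the folder into one single file, in numeric page order (a plain &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; glob would put 10.pdf before 2.pdf; &amp;lt;code&amp;gt;ls -v&amp;lt;/code&amp;gt; is the natural sort of GNU ls):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk $(ls -v *.pdf) cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;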
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
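&lt;br /&gt;
The renaming step marked &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; above could be scripted along the following lines. This is only a minimal sketch, not a tested replacement for gprename: the &amp;lt;code&amp;gt;renumber&amp;lt;/code&amp;gt; helper and the &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; folder are hypothetical names, and it assumes that both cameras took the same number of pictures and that the file names sort in shooting order (which breaks if the camera counter wraps around from 9999). The demo files only make the sketch runnable on its own.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/usr/bin/env bash
# Sketch: merge odd/ and even/ into merged/ as 1.jpg, 2.jpg, ...
set -eu

# Hypothetical helper: copy every JPG of folder $1 into merged/,
# numbering from $2 in steps of 2 (glob order = file name order).
renumber() {
    local dir=$1 n=$2 f
    for f in "$dir"/*.JPG; do
        [ -e "$f" ] || continue   # skip if the folder has no JPG files
        cp -- "$f" "merged/$n.jpg"
        n=$((n + 2))
    done
}

# Demo data so the sketch runs on its own; for a real book, cd into
# the folder that contains odd/ and even/ and drop these four lines.
cd "$(mktemp -d)"
mkdir odd even
printf 'odd-1'  > odd/IMG_0001.JPG;  printf 'odd-2'  > odd/IMG_0002.JPG
printf 'even-1' > even/IMG_0001.JPG; printf 'even-2' > even/IMG_0002.JPG

mkdir -p merged
renumber odd 1    # pages 1, 3, 5, ...
renumber even 2   # pages 2, 4, 6, ...
ls merged
```
&lt;br /&gt;
The script copies instead of moving, so the original camera files stay untouched if the numbering turns out wrong.&lt;br /&gt;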
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3414</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3414"/>
		<updated>2019-01-25T10:34:08Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan, this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on how good quality images you can make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;camera should look directly on the middle of the page, parallel to the cradle, at 45 degrees compared to horizontal&#039;&#039;&#039;&lt;br /&gt;
* all the page should be in the image, but it is not a problem if more things outside of the book are visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* maybe you have to put your finger to the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put them back to the cameras, and maybe put the camera batteries to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, then long press it again to go back to picture mode (a half press on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
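&lt;br /&gt;
Before starting the postproduction it can help to check that the programs are really on the PATH. A minimal sketch, assuming that each package provides a command with the name used below (&amp;lt;code&amp;gt;tesseract&amp;lt;/code&amp;gt; for tesseract-ocr); &amp;lt;code&amp;gt;missing_tools&amp;lt;/code&amp;gt; is a hypothetical helper name:&lt;br /&gt;
&lt;br /&gt;
```shell
#!/usr/bin/env bash
# Hypothetical helper: print every command name that is not found.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" > /dev/null || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools scantailor gprename pdftk tesseract calibre)
if [ -n "$missing" ]; then
    echo "Not installed yet:" $missing
else
    echo "All postproduction tools found."
fi
```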
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* enter &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt;  using a Terminal&lt;br /&gt;
* go to the Directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag and determine the angle which the page needs to be turned for the text and images to be properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it sits centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed img-txt, managing the thickness slider. Check every tab on the column that emerged on the right:&lt;br /&gt;
    - Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
    - Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
    - Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Does Optical Character Recognition (OCR) on all images in folder:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
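&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in the folder into one single file, in numeric page order (a plain &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; glob would put 10.pdf before 2.pdf; &amp;lt;code&amp;gt;ls -v&amp;lt;/code&amp;gt; is the natural sort of GNU ls):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk $(ls -v *.pdf) cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;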
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
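&lt;br /&gt;
The merge only gives a readable book if the page files are passed in numeric order, and a shell glob sorts names as text, so &amp;lt;code&amp;gt;10.pdf&amp;lt;/code&amp;gt; comes before &amp;lt;code&amp;gt;2.pdf&amp;lt;/code&amp;gt;. The sketch below shows the difference on empty placeholder files; it assumes GNU &amp;lt;code&amp;gt;sort&amp;lt;/code&amp;gt; with the &amp;lt;code&amp;gt;-V&amp;lt;/code&amp;gt; option, as shipped with Debian, and does not call pdftk itself.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/usr/bin/env bash
set -eu
cd "$(mktemp -d)"
touch 1.pdf 2.pdf 10.pdf 11.pdf   # empty placeholders, not real PDFs

# Glob order is textual, so 10.pdf and 11.pdf land before 2.pdf.
glob_order=$(echo *.pdf)

# sort -V (GNU) compares the numbers inside the names.
page_order=$(printf '%s\n' *.pdf | sort -V | xargs)

echo "glob order: $glob_order"
echo "page order: $page_order"

# With real pages and pdftk installed, the merge could then be written as
#   pdftk $(printf '%s\n' *.pdf | sort -V) cat output book.pdf
# (run it before book.pdf exists, or book.pdf is matched by *.pdf too).
```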
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3413</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3413"/>
		<updated>2019-01-25T10:30:56Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan, this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on how good quality images you can make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;camera should look directly on the middle of the page, parallel to the cradle, at 45 degrees compared to horizontal&#039;&#039;&#039;&lt;br /&gt;
* all the page should be in the image, but it is not a problem if more things outside of the book are visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* maybe you have to put your finger to the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put them back to the cameras, and maybe put the camera batteries to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, then long press it again to go back to picture mode (a half press on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* enter &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt;  using a Terminal&lt;br /&gt;
* go to the Directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag and determine the angle which the page needs to be turned for the text and images to be properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it sits centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
file:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Consider the visibility/readability of pages with images and/or mixed img-txt, managing the thickness slider. Check every tab on the column that emerged on the right:&lt;br /&gt;
* Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
* Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
* Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Dewarping.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (here with the Spanish language data, &amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt;; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the PDF files in the folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata from report.txt back into the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
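&lt;br /&gt;
The merge step above relies on the shell expanding &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; in alphabetical order, which would put 10.pdf before 2.pdf if the page files are not zero-padded. A minimal sketch of listing them in numeric order instead, assuming names like 1.pdf, 2.pdf, … without spaces and a &amp;lt;code&amp;gt;sort&amp;lt;/code&amp;gt; that supports &amp;lt;code&amp;gt;-n&amp;lt;/code&amp;gt; (the helper name &amp;lt;code&amp;gt;pages_in_order&amp;lt;/code&amp;gt; is hypothetical, not part of pdftk):&lt;br /&gt;

```shell
# List the numbered per-page PDFs of a directory in numeric page order.
# Assumes names like 1.pdf, 2.pdf, ... (no spaces).
# pages_in_order is a hypothetical helper name, not a pdftk feature.
pages_in_order() {
    dir=${1:-.}
    # Keep only names made of digits plus ".pdf", then sort by number.
    ls "$dir" | grep -E '^[0-9]+\.pdf$' | sort -n
}
```

From inside the folder it could be used as &amp;lt;code&amp;gt;pdftk $(pages_in_order .) cat output book.pdf&amp;lt;/code&amp;gt;; an existing book.pdf is left out of the list because its name is not a number.&lt;br /&gt;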
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who might be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Bibliography =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=File:Dewarping.png&amp;diff=3412</id>
		<title>File:Dewarping.png</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=File:Dewarping.png&amp;diff=3412"/>
		<updated>2019-01-25T10:29:54Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3411</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3411"/>
		<updated>2019-01-25T10:29:22Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software, and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any Unix-based system running the bash shell. For some parts, proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may need to press a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* upload the two folders now to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information for using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, then long press the green play button again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
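&lt;br /&gt;
To check afterwards that everything is available, a minimal sketch follows. The function name &amp;lt;code&amp;gt;missing_tools&amp;lt;/code&amp;gt; is hypothetical, and the command names are assumed to match the package names above, except that the tesseract-ocr package provides the &amp;lt;code&amp;gt;tesseract&amp;lt;/code&amp;gt; command:&lt;br /&gt;

```shell
# Print the name of every given command that is not found on the PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null || echo "$tool"
    done
}
```

Calling &amp;lt;code&amp;gt;missing_tools scantailor gprename pdftk tesseract calibre&amp;lt;/code&amp;gt; prints nothing when all of them are installed.&lt;br /&gt;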
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some steps and bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* run &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures with [http://scantailor.org/ scantailor]. It guides you through these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be rotated in one direction, while even pages need to be rotated in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image no. 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image no. 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all the renamed files, odd and even pages will be recognized as single pages, so this step is just to confirm that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content within one single area (beware of including, for example, page numbers). The outer limit of this area affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it is centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Center.png
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the visibility and readability of pages with images or mixed images and text, adjusting the thickness slider as needed. Go through every tab in the column that appears on the right:&lt;br /&gt;
* Picture Zones: use the tool to select areas with photographs, illustrations or special icons.&lt;br /&gt;
* Dewarping: adjust the grid to straighten your page&#039;s content.&lt;br /&gt;
* Despeckling: remove dust and dots from the page.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Output Output manual]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (here with the Spanish language data, &amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt;; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the PDF files in the folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata from report.txt back into the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
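&lt;br /&gt;
The FIXME in the gprename step could be addressed with a small script. A minimal sketch, assuming that the camera file names sort in shooting order within each folder (as names like IMG_1234.JPG do, unless the counter wraps around) and that the first folder holds pages 1, 3, 5, … and the second pages 2, 4, 6, …; the function name &amp;lt;code&amp;gt;merge_pages&amp;lt;/code&amp;gt; and the output folder name are hypothetical:&lt;br /&gt;

```shell
# Copy images from two folders into one, numbering the first folder
# 1, 3, 5, ... and the second folder 2, 4, 6, ...
# merge_pages and its arguments are illustrative names, not part of any tool.
merge_pages() {
    odd_dir=$1
    even_dir=$2
    out_dir=$3
    mkdir -p "$out_dir"

    n=1
    for f in "$odd_dir"/*.JPG; do
        [ -e "$f" ] || continue      # skip if the folder has no .JPG files
        cp "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done

    n=2
    for f in "$even_dir"/*.JPG; do
        [ -e "$f" ] || continue
        cp "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done
}
```

For example, &amp;lt;code&amp;gt;merge_pages odd even merged&amp;lt;/code&amp;gt; would leave 1.jpg, 2.jpg, … in a folder called merged. The originals are copied rather than moved, so a mistake in the folder order can be undone by deleting the output folder.&lt;br /&gt;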
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who might be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Bibliography =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3409</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3409"/>
		<updated>2019-01-25T09:59:38Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software, and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any Unix-based system running the bash shell. For some parts, proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may need to press a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* upload the two folders now to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information for using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, then long press the green play button again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some steps and bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* run &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures with [http://scantailor.org/ scantailor]. It guides you through these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be rotated in one direction, while even pages need to be rotated in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image no. 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image no. 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all the renamed files, odd and even pages will be recognized as single pages, so this step is just to confirm that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content within one single area (beware of including, for example, page numbers). The outer limit of this area affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it is centered on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Center.png
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Page-Layout Margins management manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the visibility and readability of pages with images or mixed images and text, adjusting the thickness slider as needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (here with the Spanish language data, &amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt;; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the PDF files in the folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata from report.txt back into the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
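&lt;br /&gt;
As an illustration of the metadata steps above: in recent pdftk versions, report.txt stores each document property as a three-line record (&amp;lt;code&amp;gt;InfoBegin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;InfoKey: …&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;InfoValue: …&amp;lt;/code&amp;gt;). A minimal sketch of producing such a record, assuming your pdftk uses this layout (compare with your own report.txt first); the helper name &amp;lt;code&amp;gt;info_record&amp;lt;/code&amp;gt; is hypothetical:&lt;br /&gt;

```shell
# Print one metadata record in the three-line layout described above.
# info_record is a hypothetical helper: first argument is the key
# (for example Title or Author), second argument is the value.
info_record() {
    printf 'InfoBegin\nInfoKey: %s\nInfoValue: %s\n' "$1" "$2"
}
```

For example, &amp;lt;code&amp;gt;info_record Title &amp;quot;Name of the book&amp;quot; &amp;gt;&amp;gt; report.txt&amp;lt;/code&amp;gt; appends a title record before the update_info step. If report.txt already contains a record with the same key, it is probably safer to edit its InfoValue line by hand instead.&lt;br /&gt;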
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who might be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Bibliography =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3408</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3408"/>
		<updated>2019-01-25T09:56:05Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software, and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any Unix-based system running the bash shell. For some parts, proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* upload the two folders now to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and consider charging the camera batteries&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to automatic with menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
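&lt;br /&gt;
To confirm that the command-line tools ended up on your PATH, a small check like the following can help. This is only a sketch: &amp;lt;code&amp;gt;missing_tools&amp;lt;/code&amp;gt; is a hypothetical helper, not part of any package, and the command names in the last line (including &amp;lt;code&amp;gt;ebook-convert&amp;lt;/code&amp;gt; for calibre) are assumptions about what the packages install.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/usr/bin/env bash
# Hypothetical helper: print every command given as an argument
# that cannot be found on the PATH.
missing_tools() {
    local tool path
    for tool in "$@"; do
        path=$(command -v "$tool" || true)   # stays empty when the command is not found
        if [ -z "$path" ]; then
            echo "missing: $tool"
        fi
    done
}

# Assumed command names for the packages listed above.
missing_tools scantailor gprename pdftk tesseract ebook-convert
```
&lt;br /&gt;
If the last line prints nothing, all five commands were found.&lt;br /&gt;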
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import the renamed files, odd and even pages will be recognized as single pages, so this step only confirms that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it appears centred on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages with images or with mixed images and text, and adjust the thickness slider as needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Run Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
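&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merge all the pdf files in the folder into one single file, in numerical page order (a plain &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; glob would place 10.pdf before 2.pdf):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk $(ls -v *.pdf) cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;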
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Export the pdf metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Import the edited metadata from report.txt into a new copy of the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
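&lt;br /&gt;
The renaming done with gprename in step 1 could also be scripted, as the FIXME there suggests. The following is a minimal sketch, not a tested part of this workflow. It assumes that the camera files end in .JPG and that their names sort in shooting order (this breaks if the camera counter wraps around from IMG_9999 to IMG_0001 within one book). The function name &amp;lt;code&amp;gt;merge_pages&amp;lt;/code&amp;gt; and the output folder &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/usr/bin/env bash
# Hypothetical helper: interleave the captures from odd/ and even/
# into one folder as 1.jpg, 2.jpg, 3.jpg, ...
set -eu

merge_pages() {
    # $1 = source folder, $2 = first page number, $3 = destination folder
    local src="$1" n="$2" dest="$3" f
    mkdir -p "$dest"
    for f in "$src"/*.JPG; do
        [ -e "$f" ] || continue       # skip when the folder has no matching files
        cp -- "$f" "$dest/$n.jpg"     # copy, so the originals stay untouched
        n=$((n + 2))                  # odd pages: 1, 3, 5, ...; even pages: 2, 4, 6, ...
    done
}

merge_pages odd 1 merged
merge_pages even 2 merged
```
&lt;br /&gt;
The files are copied rather than moved, so the &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; folders stay intact, and the &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; folder can then be opened in scantailor.&lt;br /&gt;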
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=File:Center.png&amp;diff=3407</id>
		<title>File:Center.png</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=File:Center.png&amp;diff=3407"/>
		<updated>2019-01-25T09:55:03Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3406</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3406"/>
		<updated>2019-01-25T09:54:26Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software, and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. For some steps, proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check whether the pages fold or curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* upload the two folders now to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and consider charging the camera batteries&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to automatic with menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import the renamed files, odd and even pages will be recognized as single pages, so this step only confirms that the page edges are set properly; drag the rectangles to fit the page area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then align the content so that it appears centred on the page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages with images or with mixed images and text, and adjust the thickness slider as needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Run Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
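&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merge all the pdf files in the folder into one single file, in numerical page order (a plain &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; glob would place 10.pdf before 2.pdf):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk $(ls -v *.pdf) cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;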
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Export the pdf metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Import the edited metadata from report.txt into a new copy of the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3405</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3405"/>
		<updated>2019-01-25T09:53:00Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software, and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. For some steps, proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check whether the pages fold or curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* upload the two folders now to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and consider charging the camera batteries&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to automatic with menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image no. 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image no. 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import the renamed files, odd and even pages will be recognized as single pages, so this step only confirms that the page edges are set properly; drag the rectangles to fit the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page must be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then choose the alignment that places the content in the centre of the page (&amp;quot;centralized&amp;quot;).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Center.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages that contain images or a mix of images and text, and adjust the thickness slider where needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Run Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merge all the PDF files in the folder into a single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Export the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Import the metadata from report.txt back into the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
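&lt;br /&gt;
The &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; in the gprename item asks for a renaming script. A minimal sketch of one follows; it is not part of the tested workflow. The function name &amp;lt;code&amp;gt;interleave_pages&amp;lt;/code&amp;gt; and the output folder &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; are hypothetical, and it assumes that the camera file names end in &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt; and sort alphabetically in the order the pages were shot (this breaks if the camera counter rolls over from 9999).&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper, not part of the original workflow: copy the photos from
# the odd and even folders into one folder as 1.jpg, 2.jpg, 3.jpg, ...
# Assumes the camera file names sort in the order the pages were shot.
interleave_pages() {
    local odd_dir="$1"
    local even_dir="$2"
    local out_dir="$3"
    local n f
    mkdir -p "$out_dir"
    n=1                               # odd pages: 1, 3, 5, ...
    for f in "$odd_dir"/*.JPG; do
        [ -e "$f" ] || continue       # skip when the folder has no matches
        cp -- "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done
    n=2                               # even pages: 2, 4, 6, ...
    for f in "$even_dir"/*.JPG; do
        [ -e "$f" ] || continue
        cp -- "$f" "$out_dir/$n.jpg"
        n=$((n + 2))
    done
}
```
&lt;br /&gt;
Calling &amp;lt;code&amp;gt;interleave_pages odd even merged&amp;lt;/code&amp;gt; copies rather than moves the files, so the originals stay untouched if something goes wrong.&lt;br /&gt;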
&lt;br /&gt;
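The merge one-liner above expands &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; in alphabetical order, so with names like 1.pdf, 2.pdf, … the shell places 10.pdf before 2.pdf unless the numbers are zero-padded, and a second run also picks up an existing book.pdf. A hedged sketch of a numerically ordered merge follows. It assumes the per-page files are named with a number only and contain no spaces; the function names &amp;lt;code&amp;gt;pages_in_order&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;merge_pages&amp;lt;/code&amp;gt; are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print the per-page PDF names in numeric page order.
# The pattern [0-9]*.pdf skips book.pdf; sort -n puts 2.pdf before 10.pdf.
pages_in_order() {
    printf '%s\n' [0-9]*.pdf | sort -n
}

# Merge the pages in that order. This relies on word splitting, so it assumes
# the file names contain no spaces (true for 1.pdf, 2.pdf, ...).
merge_pages() {
    pdftk $(pages_in_order) cat output book.pdf
}
```
&lt;br /&gt;
Run &amp;lt;code&amp;gt;pages_in_order&amp;lt;/code&amp;gt; on its own first to check the order before calling &amp;lt;code&amp;gt;merge_pages&amp;lt;/code&amp;gt; in the folder with the per-page PDF files.&lt;br /&gt;
&lt;br /&gt;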
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3404</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3404"/>
		<updated>2019-01-25T09:51:50Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. In some parts, proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you take in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check whether the pages fold or curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and perhaps put the camera batteries on to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, then long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image no. 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image no. 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import the renamed files, odd and even pages will be recognized as single pages, so this step only confirms that the page edges are set properly; drag the rectangles to fit the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page must be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check that all margins are set to zero. Then choose the alignment that places the content in the centre of the page (&amp;quot;centralized&amp;quot;).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Margins to zero.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages that contain images or a mix of images and text, and adjust the thickness slider where needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Run Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merge all the PDF files in the folder into a single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Export the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Import the metadata from report.txt back into the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could know about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=File:Margins_to_zero.png&amp;diff=3403</id>
		<title>File:Margins to zero.png</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=File:Margins_to_zero.png&amp;diff=3403"/>
		<updated>2019-01-25T09:50:15Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3402</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3402"/>
		<updated>2019-01-25T09:49:31Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. In some parts, proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you take in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check whether the pages fold or curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and perhaps put the camera batteries on to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, then long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image no. 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image no. 2, rotate it in the opposite direction so that it is upright as well, and again click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import the renamed files, odd and even pages will be recognized as single pages, so this step only confirms that the page edges are set properly; drag the rectangles to fit the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page must be rotated so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check all margins, then choose the alignment that places the content in the centre of the page (&amp;quot;centralized&amp;quot;).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages that contain images or a mix of images and text, and adjust the thickness slider where needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Run Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merge all the PDF files in the folder into a single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Export the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Import the metadata from report.txt back into the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3401</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3401"/>
		<updated>2019-01-25T09:43:06Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back in the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information on using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, then long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;1st step: Fix orientation&#039;&#039;&#039;. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;2nd step: Split pages&#039;&#039;&#039;. If you import all the renamed files, odd and even pages will be recognized as single pages, so this step only confirms that the edges of the pages are set properly; drag the rectangles to fit the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;3rd step: Deskew&#039;&#039;&#039;. Drag to set the angle by which the page needs to be turned so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;4th step: Select content&#039;&#039;&#039;. Frame all the elements to be shown as content within one single area (be careful with elements such as page numbers). The outer limit of this frame affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;5th step: Margins&#039;&#039;&#039;. Check the margins on all pages and place the content so that it appears centred on the page.&lt;br /&gt;
*&#039;&#039;&#039;6th step: Output&#039;&#039;&#039;. Check the readability of pages with images or mixed images and text, adjusting the thickness slider as needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the PDF files in the folder into a single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
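The renaming step marked FIXME above could be scripted. The following is only a sketch under assumptions: the helper name &amp;lt;code&amp;gt;merge_pages&amp;lt;/code&amp;gt;, the &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; output folder and the zero-padded names (0001.jpg instead of 1.jpg, chosen so that file names sort in page order) are not part of the original workflow, and it assumes both cameras name their files in shooting order with a .JPG extension.&lt;br /&gt;

```shell
#!/usr/bin/env bash
# Sketch: interleave the two camera folders into one numbered sequence.
# merge_pages is a hypothetical helper, not part of gprename.
merge_pages() {
    local odd_dir=$1 even_dir=$2 out_dir=$3
    local n f
    mkdir -p "$out_dir"
    # Odd pages get 1, 3, 5, ...; the glob sorts by file name, which matches
    # the shooting order as long as the camera counter did not wrap around.
    n=1
    for f in "$odd_dir"/*.JPG; do
        [ -e "$f" ] || continue   # skip the literal pattern if nothing matched
        cp -- "$f" "$out_dir/$(printf '%04d' "$n").jpg"
        n=$((n + 2))
    done
    # Even pages get 2, 4, 6, ...
    n=2
    for f in "$even_dir"/*.JPG; do
        [ -e "$f" ] || continue
        cp -- "$f" "$out_dir/$(printf '%04d' "$n").jpg"
        n=$((n + 2))
    done
}
```

Run it as &amp;lt;code&amp;gt;merge_pages odd even merged&amp;lt;/code&amp;gt;; the originals are copied, not moved, so the camera folders stay intact.&lt;br /&gt;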
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3400</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3400"/>
		<updated>2019-01-24T19:31:40Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back in the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information on using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, then long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*1st step: Fix orientation. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*2nd step: Split pages. If you import all the renamed files, odd and even pages will be recognized as single pages, so this step only confirms that the edges of the pages are set properly; drag the rectangles to fit the page&#039;s area.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*3rd step: Deskew. Drag to set the angle by which the page needs to be turned so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*4th step: Select content. Frame all the elements to be shown as content within one single area (be careful with elements such as page numbers). The outer limit of this frame affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*5th step: Margins. Check the margins on all pages and place the content so that it appears centred on the page.&lt;br /&gt;
*6th step: Output. Check the readability of pages with images or mixed images and text, adjusting the thickness slider as needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the PDF files in the folder into a single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3399</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3399"/>
		<updated>2019-01-24T19:30:44Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you capture in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten it (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press a finger on the side of the plexiglass that is closer to you when it is “down”, because the plexiglass does not always sit at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back in the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information on using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, then long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
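&lt;br /&gt;
A minimal sketch for checking, after the installation, that the programs can actually be found. The function name &amp;lt;code&amp;gt;missing_tools&amp;lt;/code&amp;gt; is hypothetical, and it is an assumption that each package provides a command of the same name (with &amp;lt;code&amp;gt;tesseract-ocr&amp;lt;/code&amp;gt; providing &amp;lt;code&amp;gt;tesseract&amp;lt;/code&amp;gt;):&lt;br /&gt;
&lt;br /&gt;
```shell
# Print the name of every given command that is not found on the PATH.
# The body runs in a subshell, so the loop variable does not leak out.
missing_tools() (
    for tool in "$@"; do
        if [ -z "$(command -v "$tool")" ]; then
            echo "$tool"
        fi
    done
)

# Example call (prints nothing when everything is installed):
# missing_tools scantailor gprename pdftk tesseract calibre
```
&lt;br /&gt;
Any name it prints probably belongs to a package that still needs to be installed.&lt;br /&gt;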
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*1st step: Fix orientation. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
Rotate image number 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image number 2, rotate it in the opposite direction so that it also ends up upright, and click on &amp;quot;apply to every other page&amp;quot; again.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*2nd step: Split pages. If you import the renamed files, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit the page&#039;s area.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*3rd step: Deskew. Drag to set the angle by which the page needs to be turned so that the text and images are properly horizontal.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Deskew main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*4th step: Select content. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Content main tab.jpeg&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*5th step: Margins. Check all margins and place the content so that it appears centered when read.&lt;br /&gt;
*6th step: Output. Check the visibility/readability of pages with images and/or mixed images and text, adjusting the thickness slider as needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
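&lt;br /&gt;
The gprename step above carries a FIXME about scripting the renaming. One possible sketch, not a tested replacement for gprename: the function name &amp;lt;code&amp;gt;renumber_pages&amp;lt;/code&amp;gt; and the target folder &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; are hypothetical, the &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt; extension is taken from the IMG_1234.JPG example, and the zero-padded names are an assumption made so that globs such as &amp;lt;code&amp;gt;*.tif&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; list the pages in reading order:&lt;br /&gt;
&lt;br /&gt;
```shell
# Copy the images of one camera folder into a target folder, numbering them
# from START in steps of 2. Names are zero-padded (0001.jpg, 0002.jpg, ...)
# so that shell globs list the pages in reading order; this padding is an
# assumption of the sketch, not part of the gprename steps.
# Usage: renumber_pages SOURCE_DIR START TARGET_DIR
renumber_pages() (
    src=$1
    n=$2
    dest=$3
    mkdir -p "$dest"
    # The glob expands alphabetically, which matches shooting order as long
    # as the camera's file counter did not wrap around.
    for f in "$src"/*.JPG; do
        [ -e "$f" ] || continue   # the folder contains no .JPG files
        cp -- "$f" "$dest/$(printf '%04d' "$n").jpg"
        n=$((n + 2))
    done
)

# Example: odd pages become 0001.jpg, 0003.jpg, ...; even pages 0002.jpg, ...
# renumber_pages odd 1 merged
# renumber_pages even 2 merged
```
&lt;br /&gt;
The files are copied rather than moved, so the original &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; folders stay untouched if something goes wrong.&lt;br /&gt;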
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3398</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3398"/>
		<updated>2019-01-24T19:26:57Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;each camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if other things outside of the book are visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put your finger on the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back in the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information for using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, then long press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*1st step: Fix orientation. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
Rotate image number 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image number 2, rotate it in the opposite direction so that it also ends up upright, and click on &amp;quot;apply to every other page&amp;quot; again.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*2nd step: Split pages. If you import the renamed files, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit the page&#039;s area.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*3rd step: Deskew. Drag to set the angle by which the page needs to be turned so that the text and images are properly horizontal.&lt;br /&gt;
[[File:Deskew main tab.jpeg|thumb]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*4th step: Select content. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
[[File:Content main tab.jpeg|thumb]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*5th step: Margins. Check all margins and place the content so that it appears centered when read.&lt;br /&gt;
*6th step: Output. Check the visibility/readability of pages with images and/or mixed images and text, adjusting the thickness slider as needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=File:Content_main_tab.jpeg&amp;diff=3397</id>
		<title>File:Content main tab.jpeg</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=File:Content_main_tab.jpeg&amp;diff=3397"/>
		<updated>2019-01-24T19:26:04Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3396</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3396"/>
		<updated>2019-01-24T19:24:50Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;each camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if other things outside of the book are visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put your finger on the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back in the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information for using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, then long press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* start &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*1st step: Fix orientation. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
Rotate image number 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image number 2, rotate it in the opposite direction so that it also ends up upright, and click on &amp;quot;apply to every other page&amp;quot; again.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*2nd step: Split pages. If you import the renamed files, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit the page&#039;s area.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
&lt;br /&gt;
*3rd step: Deskew. Drag to set the angle by which the page needs to be turned so that the text and images are properly horizontal.&lt;br /&gt;
[[File:Deskew main tab.jpeg|thumb]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*4th step: Select content. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Select-Content Select Content manual]&lt;br /&gt;
&lt;br /&gt;
*5th step: Margins. Check all margins and place the content so that it appears centered when read.&lt;br /&gt;
*6th step: Output. Check the visibility/readability of pages with images and/or mixed images and text, adjusting the thickness slider as needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3394</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3394"/>
		<updated>2019-01-24T19:22:30Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. For some parts, proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press a finger on the side of the plexiglass closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, and long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* launch &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*1st step: Fix orientation. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*2nd step: Split pages. If you import all the renamed files, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit each page&#039;s area.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*3rd step: Deskew. Drag to set the angle by which the page needs to be turned so that the text and images are properly horizontal.&lt;br /&gt;
[[File:Deskew main tab.jpeg|thumb]]&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*4th step: Select content. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
*5th step: Margins. Check all margins and place the content so that it reads as &amp;quot;centralized&amp;quot; on the page.&lt;br /&gt;
*6th step: Output. Check the visibility/readability of pages with images and/or mixed image and text, and adjust the thickness slider where needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
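&lt;br /&gt;
The renaming done with gprename in step 1 could also be scripted, as the FIXME there suggests. What follows is only a minimal sketch, not a tested part of this workflow: &amp;lt;code&amp;gt;merge_scans&amp;lt;/code&amp;gt; is a hypothetical helper name, and it assumes that the camera files end in &amp;lt;code&amp;gt;.JPG&amp;lt;/code&amp;gt; and that their alphabetical order is the order in which they were shot (this breaks if the camera counter wraps around from 9999 to 0001 in the middle of a book). It copies instead of moving, so the original folders stay untouched:&lt;br /&gt;
&lt;br /&gt;

```shell
#!/usr/bin/env bash
# Sketch only: interleave the two camera folders into 1.jpg, 2.jpg, 3.jpg, ...
# merge_scans is a hypothetical helper name, not part of gprename.
# Assumption: files end in .JPG and sort alphabetically in shooting order.
merge_scans() {
  local odd_dir="$1" even_dir="$2" out_dir="$3"
  local n f
  mkdir -p "$out_dir"

  # Odd pages become 1.jpg, 3.jpg, 5.jpg, ...
  n=1
  for f in "$odd_dir"/*.JPG; do
    [ -e "$f" ] || continue   # the glob matched nothing
    cp -- "$f" "$out_dir/$n.jpg"
    n=$((n + 2))
  done

  # Even pages become 2.jpg, 4.jpg, 6.jpg, ...
  n=2
  for f in "$even_dir"/*.JPG; do
    [ -e "$f" ] || continue
    cp -- "$f" "$out_dir/$n.jpg"
    n=$((n + 2))
  done
}
```

&lt;br /&gt;
For example, &amp;lt;code&amp;gt;merge_scans odd even merged&amp;lt;/code&amp;gt; would fill a new folder called &amp;lt;code&amp;gt;merged&amp;lt;/code&amp;gt; that can then be opened in scantailor.&lt;br /&gt;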
&lt;br /&gt;
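One thing to watch for in the pdftk merge step: the shell expands &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; in alphabetical order, so if the page files are named without zero padding (1.pdf, 2.pdf, …, 10.pdf), 10.pdf is placed before 2.pdf. A hedged sketch of one way around this follows; &amp;lt;code&amp;gt;pages_in_order&amp;lt;/code&amp;gt; is a hypothetical helper name, and it assumes GNU sort (whose -V option orders numbers inside names naturally) and file names without spaces:&lt;br /&gt;
&lt;br /&gt;

```shell
#!/usr/bin/env bash
# Sketch only: print the per-page PDFs of a folder in numeric page order.
# pages_in_order is a hypothetical helper name; it relies on GNU sort -V.
pages_in_order() {
  local dir="${1:-.}"
  (
    cd "$dir" || exit 1
    # Only names starting with a digit (1.pdf, 2.pdf, ...), so that an
    # already built book.pdf in the same folder is not listed.
    for f in [0-9]*.pdf; do
      [ -e "$f" ] || continue   # the glob matched nothing
      printf '%s\n' "$f"
    done | sort -V
  )
}
```

&lt;br /&gt;
Run from inside the folder with the page files, the merge could then be written as &amp;lt;code&amp;gt;pdftk $(pages_in_order .) cat output book.pdf&amp;lt;/code&amp;gt;. If the files were renamed with zero padding (001.pdf, 002.pdf, …), the plain glob already gives the right order.&lt;br /&gt;
&lt;br /&gt;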
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=File:Deskew_main_tab.jpeg&amp;diff=3393</id>
		<title>File:Deskew main tab.jpeg</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=File:Deskew_main_tab.jpeg&amp;diff=3393"/>
		<updated>2019-01-24T19:21:14Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3392</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3392"/>
		<updated>2019-01-24T19:20:40Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. For some parts, proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press a finger on the side of the plexiglass closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, and long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash oneliners which can be useful (on Debian based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* launch &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*1st step: Fix orientation. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*2nd step: Split pages. If you import all the renamed files, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit each page&#039;s area.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split pages manual]&lt;br /&gt;
&lt;br /&gt;
*3rd step: Deskew. Drag to set the angle by which the page needs to be turned so that the text and images are properly horizontal.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Deskew Deskew manual]&lt;br /&gt;
&lt;br /&gt;
*4th step: Select content. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
*5th step: Margins. Check all margins and place the content so that it reads as &amp;quot;centralized&amp;quot; on the page.&lt;br /&gt;
*6th step: Output. Check the visibility/readability of pages with images and/or mixed image and text, and adjust the thickness slider where needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Runs Optical Character Recognition (OCR) on all images in the folder (&amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt; selects Spanish; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the pdf files in folder into one single file:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the pdf metadata to a text file, to edit:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf  dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the metadata of report.txt back to the PDF:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3391</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3391"/>
		<updated>2019-01-24T19:17:05Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is written for Debian GNU/Linux, but it should work with small modifications on any UNIX-based system running the bash shell. For some parts, proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (it can exceed 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image; it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to press a finger on the side of the plexiglass closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on to charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 cameras set to automatic, with the menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long-press the green play button to enter slideshow mode, and long-press it again to go back to picture mode (half-pressing the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* run &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*1st step: Fix orientation. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
&lt;br /&gt;
*2nd step: Split pages. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
*3rd step: Deskew. Determine the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
*4th step: Select content. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
*5th step: Margins. Check that all margins place the content in a way that helps it read as &amp;quot;centralized&amp;quot;.&lt;br /&gt;
*6th step: Output. Consider the visibility/readability of pages with images and/or mixed image and text content, adjusting the thickness slider as needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Does Optical Character Recognition (OCR) on all images in the folder (here with Spanish, &amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt;; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the PDF files in the folder into one single file (the shell expands &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; in alphabetical order, so 10.pdf comes before 2.pdf unless the file names are zero-padded):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the edited metadata from report.txt back into the PDF, writing the result to bookcopy.pdf:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
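The renaming and merging done with gprename could also be scripted, as the FIXME in the list above suggests. The following is only a minimal sketch, not a tested part of this workflow. It assumes the two folders are called odd and even, that the camera files end in .JPG and sort in shooting order, and it uses a hypothetical function name (merge_pages), a hypothetical output folder (merged) and zero-padded names such as 0001.jpg so that later wildcards list the pages in page order.

```shell
#!/usr/bin/env bash
# Sketch: interleave the camera files from odd/ and even/ into merged/.
# Folder names, the .JPG suffix and the zero-padded output names are
# assumptions made for this example.
set -eu

merge_pages() {
    # $1 = source folder, $2 = first page number, $3 = destination folder
    local src="$1" start="$2" dest="$3"
    local n="$start" f
    mkdir -p "$dest"
    for f in "$src"/*.JPG; do
        [ -e "$f" ] || continue      # nothing to do if the folder is empty
        # Copy (not move), so the original camera files stay untouched.
        cp -- "$f" "$(printf '%s/%04d.jpg' "$dest" "$n")"
        n=$((n + 2))                 # odd pages: 1, 3, 5... even: 2, 4, 6...
    done
}

merge_pages odd 1 merged
merge_pages even 2 merged
```

If the camera counter rolls over during a scanning session, the files no longer sort in shooting order and the numbering should be checked by hand.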
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3390</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3390"/>
		<updated>2019-01-24T19:15:37Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put your finger on the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to automatic with menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* run &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*1st step: Fix orientation. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
*2nd step: Split pages. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
*3rd step: Deskew. Determine the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
*4th step: Select content. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
*5th step: Margins. Check that all margins place the content in a way that helps it read as &amp;quot;centralized&amp;quot;.&lt;br /&gt;
*6th step: Output. Consider the visibility/readability of pages with images and/or mixed image and text content, adjusting the thickness slider as needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Does Optical Character Recognition (OCR) on all images in the folder (here with Spanish, &amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt;; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the PDF files in the folder into one single file (the shell expands &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; in alphabetical order, so 10.pdf comes before 2.pdf unless the file names are zero-padded):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the edited metadata from report.txt back into the PDF, writing the result to bookcopy.pdf:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
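One possible way to merge pages named 1.pdf, 2.pdf, …, 10.pdf in page order is to sort the names numerically before handing them to pdftk, since a plain wildcard lists them alphabetically. The sketch below assumes GNU sort (with its -V version-sort option) and file names without spaces; the function name pages_in_order is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: list page files in numeric order (1, 2, ..., 10, 11)
# instead of the alphabetical order of a plain wildcard.
# Assumes GNU sort (-V) and file names without spaces or newlines.

pages_in_order() {
    # Print the given file names one per line, version-sorted.
    printf '%s\n' "$@" | sort -V
}

# Example with made-up names, no files needed:
pages_in_order 10.pdf 2.pdf 1.pdf 11.pdf
```

In the folder with the page PDFs this could be used as pdftk $(pages_in_order [0-9]*.pdf) cat output book.pdf; the [0-9]*.pdf pattern also keeps an already existing book.pdf out of the input list.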
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3388</id>
		<title>Bookscanning</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=Bookscanning&amp;diff=3388"/>
		<updated>2019-01-24T19:14:41Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: /* Postproduction */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are many ways to scan; this is the current state of the art in Calafou. We use only free software and the documentation is for Debian GNU/Linux, but it should work with some small modifications on any UNIX-based system running the bash shell. There are some parts where proprietary software such as ABBYY FineReader can be more effective. However, this workflow produces near-perfect books in PDF format that we are very happy with. One thing we could definitely improve is the size of the final PDF file, which is quite big (can be more than 100 megabytes).&lt;br /&gt;
&lt;br /&gt;
= Scanning =&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;The amount of work in the postproduction phase depends on the quality of the images you make in the scanning phase!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
# Setting up the cameras (calibration): the most important part.&lt;br /&gt;
&lt;br /&gt;
* open the book in the middle (at a central page with normal text on both sides)&lt;br /&gt;
* &#039;&#039;&#039;the camera should point directly at the middle of the page, parallel to the cradle, at 45 degrees to the horizontal&#039;&#039;&#039;&lt;br /&gt;
* the whole page should be in the image, but it is not a problem if things outside the book are also visible&lt;br /&gt;
* check if the pages fold/curve; if so, place something underneath to straighten them (like a sponge, or another book…)&lt;br /&gt;
* camera settings: fully automatic, perhaps with manual focus&lt;br /&gt;
* back up and empty the SD cards in the cameras&lt;br /&gt;
* most subtle mistake: one camera sees letters bigger than the other camera (this can be a difference in the zoom level or the distance between camera and page)&lt;br /&gt;
* use a post-it or similar to mark the exact position of the book in relation to the lower edge of the cradle, to ensure it remains in the same position throughout the scanning&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Push the big button on the scanner to scan.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* you may have to put your finger on the side of the plexiglass which is closer to you when it is “down”, because the plexiglass is not always at exactly the same angle as the book pages&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Download the images from the SD cards and put the scanner to sleep.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* from the camera on the left, copy the images to a folder called “odd”&lt;br /&gt;
* from the camera on the right, copy the images to a folder called “even”&lt;br /&gt;
* now upload the two folders to the &amp;lt;code&amp;gt;ftp://omnius.calafou/HackTheBiblio/scanning/$bookname--$yourname/&amp;lt;/code&amp;gt; folder&lt;br /&gt;
* remember to delete the pictures from the SD cards and put the cards back into the cameras, and maybe put the camera batteries on charge&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Additional information using the Marron scanner&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Check before starting that the SD cards are locked: the external trigger that controls the cameras requires the SD cards to be locked. If they are not locked, the pictures are not saved when using the external trigger.&lt;br /&gt;
* Camera settings: we use two IXUS 175 set to automatic with menu/lamp setting set to &amp;quot;off&amp;quot; to avoid the use of the red light&lt;br /&gt;
* While taking pictures, if you need to check the last picture taken: long press the green play button to enter slideshow mode, long press the green play button to go back to picture mode (half pressure on the camera trigger also works)&lt;br /&gt;
* If you decide to use the zoom of the camera (not the digital zoom), be careful not to turn off the camera or you will lose your zoom setting&lt;br /&gt;
&lt;br /&gt;
= Dependencies =&lt;br /&gt;
&lt;br /&gt;
Using an up-to-date Debian operating system, you can install the following programs for the postproduction steps:&lt;br /&gt;
&lt;br /&gt;
* scantailor&lt;br /&gt;
* gprename&lt;br /&gt;
* pdftk&lt;br /&gt;
* tesseract-ocr&lt;br /&gt;
* tesseract-ocr-eng&lt;br /&gt;
* tesseract-ocr-spa&lt;br /&gt;
* calibre&lt;br /&gt;
&lt;br /&gt;
You can install all these programs with the following invocation from the command line (also called the terminal):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;sudo apt install scantailor gprename pdftk tesseract-ocr tesseract-ocr-eng tesseract-ocr-spa calibre&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Postproduction =&lt;br /&gt;
&lt;br /&gt;
You start with two folders such as &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; with files like IMG_1234.JPG. It is not good to talk about &amp;lt;code&amp;gt;right&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;left&amp;lt;/code&amp;gt; because it can be very confusing: are you talking about the image from the right camera that takes pictures of the left page of the book, or the image of the left page of the book that is from the right camera? On the other hand, &amp;lt;code&amp;gt;odd&amp;lt;/code&amp;gt; (1, 3, 5, …) and &amp;lt;code&amp;gt;even&amp;lt;/code&amp;gt; (2, 4, 6, …) are good words for describing what is on the image without ambiguity!&lt;br /&gt;
&lt;br /&gt;
The basic workflow is like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol start=&amp;quot;0&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; [process] → [program] → [output]&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Merge pictures from the two cameras → gprename → 1.jpg, 2.jpg, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Edit the pictures to adjust contents → scantailor → 1.tif, 2.tif, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Character recognition → tesseract → 1.pdf, 2.pdf, …&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the pdf file → pdftk → book.pdf&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Create the ebook → calibre → book.epub&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt; Disseminate → libgen.org → http://libgen.org/book/index.php?md5=B6916395FDE00D91DB4F52DCB8F069BF&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;etc.&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are some bash one-liners which can be useful (on Debian-based systems):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; &lt;br /&gt;
* run &amp;lt;code&amp;gt;gprename&amp;lt;/code&amp;gt; from a terminal&lt;br /&gt;
* go to the directory with the odd files&lt;br /&gt;
* select all files&lt;br /&gt;
* go to the numerical tab&lt;br /&gt;
* set starting number to 1 and increment by 2&lt;br /&gt;
* set the naming pattern&lt;br /&gt;
&lt;br /&gt;
[[File:Gprename.png|gprename window for renaming files]]&lt;br /&gt;
&lt;br /&gt;
* repeat the operation for even files&lt;br /&gt;
* merge the two folders&lt;br /&gt;
* &amp;lt;code&amp;gt;FIXME&amp;lt;/code&amp;gt; we can probably write a script to rename the files properly…&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt; &amp;lt;code&amp;gt; scantailor &amp;lt;/code&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;p&amp;gt;You can edit the captures appropriately with [http://scantailor.org/ scantailor]. It invites you to follow these steps:&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
*1st step: Fix orientation. All odd pages need to be turned in one direction, while even pages need to be turned in the other direction.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:Rotate.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
Rotate image nr 1 and click on &amp;quot;apply to every other page&amp;quot;. Then select image nr 2, rotate in the opposite direction so it stays still, and also click on &amp;quot;apply to every other page&amp;quot;.&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Fix-Orientation Fix Orientation manual]&lt;br /&gt;
*2nd step: Split pages. If you import all files renamed, odd and even pages will be recognized as single pages, so this step is just to confirm that the edges of the pages are set properly; drag the rectangles to fit in the page&#039;s area.&lt;br /&gt;
&amp;lt;gallery&amp;gt;&lt;br /&gt;
File:SplitPages.png&lt;br /&gt;
&amp;lt;/gallery&amp;gt;&lt;br /&gt;
[https://github.com/scantailor/scantailor/wiki/Split-Pages Split Pages manual]&lt;br /&gt;
*3rd step: Deskew. Determine the angle by which the page needs to be rotated so that the text and images are properly horizontal.&lt;br /&gt;
*4th step: Select content. Frame all elements to be shown as content, within one single area (beware of including for example page numbers). The outer limit of these margins affects the size of the output file.&lt;br /&gt;
*5th step: Margins. Check that all margins place the content in a way that helps it read as &amp;quot;centralized&amp;quot;.&lt;br /&gt;
*6th step: Output. Consider the visibility/readability of pages with images and/or mixed image and text content, adjusting the thickness slider as needed.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Does Optical Character Recognition (OCR) on all images in the folder (here with Spanish, &amp;lt;code&amp;gt;-l spa&amp;lt;/code&amp;gt;; use &amp;lt;code&amp;gt;-l eng&amp;lt;/code&amp;gt; for English):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; time for i in *.tif; do b=$(basename &amp;amp;quot;$i&amp;amp;quot; .tif); tesseract -l spa &amp;amp;quot;$i&amp;amp;quot; &amp;amp;quot;$b&amp;amp;quot; pdf; done&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Merges all the PDF files in the folder into one single file (the shell expands &amp;lt;code&amp;gt;*pdf&amp;lt;/code&amp;gt; in alphabetical order, so 10.pdf comes before 2.pdf unless the file names are zero-padded):&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk *pdf cat output book.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Exports the PDF metadata to a text file for editing:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf dump_data output report.txt&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Imports the edited metadata from report.txt back into the PDF, writing the result to bookcopy.pdf:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt; pdftk book.pdf update_info report.txt output bookcopy.pdf&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
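The text file written by dump_data stores each metadata field as a small record (InfoBegin, InfoKey, InfoValue in recent pdftk versions). As a sketch of what such records look like when editing report.txt, the hypothetical helper below prints them; the function name info_record and the title and author values are placeholders invented for this example.

```shell
#!/usr/bin/env bash
# Sketch: print metadata records in the text format used by
# pdftk dump_data / update_info (recent pdftk versions).
# info_record is a hypothetical helper; the values are placeholders.

info_record() {
    # $1 = key (for example Title or Author), $2 = value
    printf 'InfoBegin\nInfoKey: %s\nInfoValue: %s\n' "$1" "$2"
}

info_record Title "Example Book Title"
info_record Author "Example Author"
```

Records like these can replace the corresponding entries in report.txt before running the update_info command above.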
&lt;br /&gt;
= Distribution =&lt;br /&gt;
&lt;br /&gt;
Think about how people who would be interested in this book could find out about it!&lt;br /&gt;
&lt;br /&gt;
Repositories:&lt;br /&gt;
&lt;br /&gt;
* General “educational materials”: [https://libgen.io/ Library Genesis]&lt;br /&gt;
* Academic radical: [https://aaaaarg.org/ Aaaaarg]&lt;br /&gt;
* Artist radical: [https://monoskop.org/ Monoskop]&lt;br /&gt;
* Anarchist (including fanzines): [https://theanarchistlibrary.org/special/index Anarchist Library]&lt;br /&gt;
* There are many Zine Libraries you can find on the Internet…&lt;br /&gt;
&lt;br /&gt;
You may consider spreading the word on relevant mailing lists, social media, etc.&lt;br /&gt;
&lt;br /&gt;
= Biblio-graphy =&lt;br /&gt;
&lt;br /&gt;
* [https://www.memoryoftheworld.org/wp-content/uploads/2014/12/scanning_manual_v1.2.pdf Scanning Manual from Memory of the World]: a quite long document in PDF&lt;br /&gt;
* [https://www.memoryoftheworld.org/ Memory of the World]: Digital Public Libraries&lt;br /&gt;
* [https://www.memoryoftheworld.org/es/ Spanish pages on Memory of the World]: Digital Public Libraries in Spanish&lt;br /&gt;
* [http://en.flossmanuals.net/e-book-enlightenment/ Reading And Leading With One Laptop Per Child]: Book digitalisation manual&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
	<entry>
		<id>http://wiki.calafou.org//index.php?title=File:SplitPages.png&amp;diff=3386</id>
		<title>File:SplitPages.png</title>
		<link rel="alternate" type="text/html" href="http://wiki.calafou.org//index.php?title=File:SplitPages.png&amp;diff=3386"/>
		<updated>2019-01-24T19:11:47Z</updated>

		<summary type="html">&lt;p&gt;Jxxx: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Jxxx</name></author>
	</entry>
</feed>