Open Access Manifesto

Information is power. But like all power, there are those who want to keep it
for themselves. The world's entire scientific and cultural heritage, published
over centuries in books and journals, is increasingly being digitized and locked
up by a handful of private corporations. Want to read the papers featuring the
most famous results of the sciences? You'll need to send enormous amounts to
publishers like Reed Elsevier.
There are those struggling to change this. The Open Access Movement has fought
valiantly to ensure that scientists do not sign their copyrights away but
instead ensure their work is published on the Internet, under terms that allow
anyone to access it. But even under the best scenarios, their work will only
apply to things published in the future.  Everything up until now will have been
lost.
That is too high a price to pay. Forcing academics to pay money to read the work
of their colleagues? Scanning entire libraries but only allowing the folks at
Google to read them?  Providing scientific articles to those at elite
universities in the First World, but not to children in the Global South? It's
outrageous and unacceptable.
"I agree," many say, "but what can we do? The companies hold the copyrights,
they make enormous amounts of money by charging for access, and it's perfectly
legal - there's nothing we can do to stop them." But there is something we can,
something that's already being done: we can fight back.
Those with access to these resources - students, librarians, scientists - you
have been given a privilege. You get to feed at this banquet of knowledge while
the rest of the world is locked out. But you need not - indeed, morally, you
cannot - keep this privilege for yourselves. You have a duty to share it with
the world. And you have: trading passwords with colleagues, filling download
requests for friends.
Meanwhile, those who have been locked out are not standing idly by. You have
been sneaking through holes and climbing over fences, liberating the information
locked up by the publishers and sharing them with your friends.
But all of this action goes on in the dark, hidden underground. It's called
stealing or piracy, as if sharing a wealth of knowledge were the moral
equivalent of plundering a ship and murdering its crew. But sharing isn't
immoral - it's a moral imperative. Only those blinded by greed would refuse to
let a friend make a copy.
Large corporations, of course, are blinded by greed. The laws under which they
operate require it - their shareholders would revolt at anything less. And the
politicians they have bought off back them, passing laws giving them the
exclusive power to decide who can make copies.
There is no justice in following unjust laws. It's time to come into the light
and, in the grand tradition of civil disobedience, declare our opposition to
this private theft of public culture.
We need to take information, wherever it is stored, make our copies and share
them with the world. We need to take stuff that's out of copyright and add it to
the archive. We need to buy secret databases and put them on the Web. We need to
download scientific journals and upload them to file sharing networks. We need
to fight for Guerilla Open Access.
With enough of us, around the world, we'll not just send a strong message
opposing the privatization of knowledge - we'll make it a thing of the past.
Will you join us?
Aaron Swartz
July 2008, Eremo, Italy
via | Open Access Manifesto

Reading in e-book era

Reading without surveillance, publishing without after-the-fact censorship, owning books without having to account for your ongoing use of them: these are rights that are older than copyright. They predate publishing. They are fundamentals that every bookseller, every publisher, every distributor, every reader, should desire. They are foundational to a free press and to a free society. If you sell an ebook reader that is designed to allow Kafkaesque repossessions, you are a fool if you expect anything but Kafkaesque repossessions in their future. We’ve been fighting over book-bans since the time of Martin Luther and before. There is no excuse for being surprised when your attractive nuisance attracts nuisances.
via Boing Boing.

I agree completely. Cases like these are going to become more common unless we switch to technology that we can inspect, technology that is Free as in Freedom. Governments and corporations are going to use this technology against the very people who use it. It will create profiles of “dangerous” people who read revolutionary material, for example. This will go unchecked if we simply use the technology without questioning it.
Also see RMS’s view on this topic.

On-line Education | RMS

Educators, and all those who wish to contribute to on-line educational works: please do not let your work be made non-free. Offer your assistance and text to educational works that carry free/libre licenses, preferably copyleft licenses so that all versions of the work must respect teachers’ and students’ freedom. Then invite educational activities to use and redistribute these works on that freedom-respecting basis, if they will. Together we can make education a domain of freedom.

via On-line Education|RMS
Most people don’t bother about what they get gratis on the Internet, but institutions cannot adopt the same approach. Licensing is as important as the actual content. But an archaic system will not go down until it is compelled to, and it will fight to the very end.

When Kings Rode To Delhi…

I recently read a book titled When Kings Rode to Delhi by Gabrielle Festing, which is available here. In the book there is a chapter on Sivaji called The Mountain Rat, a title supposedly given to Sivaji by Aurangzeb. After the killing of Afzal Khan, this is what the author has to say:

 In the eyes of a Maratha, who believed himself Bhavani’s chosen warrior, such treachery was meritorious, and the slaughter of the envoy was an act of devotion.

The author goes on to describe various exploits and acts of Sivaji, and concludes:

An attempt has been made to cast a glamour about him and his hordes, as patriots, deliverers of their country from foreign rule, devoted heroes who faced desperate odds. After a dispassionate survey no glamour remains. Sivaji was a typical Maratha of the best kind; that is to say, he was as unlike the Rajputs from whom he claimed descent as the South African Boer from the good Lord James of Douglas. Never, unless they were driven to it, did the Marathas fight a pitched battle in open field; the joy of fighting, which made the Rajput deck himself with the bridal coronet, the desperate valour which heaped the plain of Samugarh with yellow robes till it looked like a meadow of saffron, was incomprehensible to the wolves of the Deccan. They fought, not for a point of honour, or because they enjoyed fighting, but in a commercial spirit, for the sake of what they could get; their word for “to conquer in battle” means simply “to spoil an enemy.” The Rajput was indolent, when not roused by pride or the thirst for battle; the Maratha was untiringly energetic as long as he had anything to gain, but would sacrifice nothing for pride or scruple.
This must be said for Sivaji, that while he lived his followers were forbidden to plunder mosques or women; after his death his son pursued a different policy.

Free Software Tools for scanning and making e-books

How to give a new life to books which are out of copyright!
Here is a short summary of the Free Software tools that I have found useful for converting hard copies into readable/searchable formats in GNU/Linux!
Typically, making a soft copy from a hard copy involves the following steps:

Step 1:
Scan the hard copy using a scanner or camera. This step generates image files, typically .tiff, .png or .jpeg. Some scanning programs also have the option of generating a .pdf directly.
At this stage you basically have all the data; if you compress the folder into a comic book reader format (.cbr or .cbz), you are good to go. But for a more professional touch, read on. The main step is to scan the books properly. Some do’s and don’ts:
Align the pages to the sides of the scanner.
If the book is small, scan 2 pages at once.
If the book is too large, adjust the scan area in the image preview so that only one page is scanned.
If these steps are done properly, there is little left to do in the second step, and we can jump directly to Step 3.
Preferably scan in binary or grayscale mode, unless there are colored images in the text. This will help reduce the final size of the file.
Scan at a minimum of 300 dpi. This is the optimum that I arrived at after trial and error with different resolutions, their final results and the time taken for each scan. Of course, this can differ depending on what you are scanning. Many people scan at 600 dpi, but I am happy with 300 dpi. Note: The 300 dpi images can be upscaled to 600 dpi in Scan Tailor.
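To get a feel for the 300 versus 600 dpi trade-off, here is a small sketch of the arithmetic. It assumes an A4 page (210 x 297 mm, about 8.27 x 11.69 inches); the function name page_pixels is only illustrative. Sizes are passed in hundredths of an inch so that plain integer shell arithmetic is enough.

```shell
# Pixel dimensions of a scanned page: size in inches multiplied by dpi.
# Sizes are given in 1/100 inch to stay within integer arithmetic.
page_pixels() {
    # $1 = width in 1/100 inch, $2 = height in 1/100 inch, $3 = dpi
    w=$(( $1 * $3 / 100 ))
    h=$(( $2 * $3 / 100 ))
    echo "${w}x${h}"
}

page_pixels 827 1169 300   # prints 2481x3507
page_pixels 827 1169 600   # prints 4962x7014
```

Doubling the resolution doubles both dimensions, so a 600 dpi scan holds four times as many pixels as a 300 dpi one, which is roughly what you pay in scan time and file size.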
Now for the scanning itself. Most scanners come with an installation disk for M$-Windows or Mac-OSX, but for GNU/Linux there seems to be no ‘installation disk’. The XSane package supports quite a few scanners, which are detected and ready for use as soon as you plug them in.
The list of scanners supported by XSane can be found here:
http://www.sane-project.org/sane-mfgs.html
When we bought our scanner, we had to search this list to find a compatible one.
What is the problem with the manufacturers? Why do they not want to sell more, to people who are using Free Software?
If your scanner is not in the list, you might have to do some R&D before it is up and running, as I had to do for my old HP 2400 Scanjet at home.
Once your scanner is up and running, scan the images preferably in .tiff format, as they can be processed and compressed without much loss of quality. This again I found by trial and error.
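XSane is a graphical front end. The SANE project also ships a command-line tool, scanimage, which can be handy when scanning many pages. The following is only a sketch: the function name scan_page and the file naming scheme are my own assumptions, and --mode and --resolution are backend options, so the values your scanner accepts may differ (check scanimage --help with the scanner plugged in).

```shell
# Scan one page to a numbered TIFF using SANE's scanimage.
# SCANIMAGE can be overridden (for example SCANIMAGE=echo) for a dry run.
# "Gray" and "300" are assumed values; your backend may name them differently.
scan_page() {
    # $1 = page number; the image is written to page_NNN.tiff
    # in the current directory
    out=$(printf 'page_%03d.tiff' "$1")
    ${SCANIMAGE:-scanimage} --format=tiff --mode Gray --resolution 300 > "$out"
    echo "$out"
}
```

Calling scan_page 1, then scan_page 2, and so on after placing each page on the glass gives zero-padded file names that sort correctly later.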
Step 2:
Crop the files and rotate them to remove unwanted white space or accidental entries of adjoining pages from the images that were obtained. When the pages are scanned as 2 pages in one image, we may need to separate the pages.
Initially I did this manually; it was the second most boring part after the scanning. But I have found a wonderful tool for this work.
ImageMagick provides a set of tools which work like magic on images, hence the name I guess 🙂
This is one of the best tools for batch processing image files.
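As an illustration of the kind of batch job ImageMagick handles, here is a minimal sketch for separating a two-page scan into its left and right halves. It assumes the book's spine sits in the middle of the image; the function name split_spread is hypothetical.

```shell
# Split a two-page scan into left and right halves with ImageMagick.
# -crop 50%x100% cuts the image into two side-by-side tiles and
# +repage resets each tile's canvas offset.
split_spread() {
    # $1 = input image, $2 = output prefix, $3 = output extension (default tiff)
    convert "$1" -crop 50%x100% +repage "${2}_%d.${3:-tiff}"
}
```

For example, split_spread scan_012.tiff page_012 would write page_012_0.tiff (left half) and page_012_1.tiff (right half). Wrapping the call in a for loop over *.tiff processes a whole folder.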
Then I found the dream tool that I was looking for.
It is called Scan Tailor; as the name suggests, it is meant for processing scanned images.
Scan Tailor can be found at http://scantailor.sourceforge.net/ or directly from the Ubuntu Software Centre.
Step by step, Scan Tailor cleans up relatively unclean images and creates amazingly good output files.
There are a total of 6 steps in Scan Tailor which produce the desired output.
You have to choose the folder in which your scanned images are. Scan Tailor creates a directory called out in the same folder by default. The steps are as follows:

  1. Change the Orientation: This lets you change the orientation of all the files in the directory. It is a good option in case you have scanned the book in a different orientation.
  2. Split Pages: This step determines whether the scans are single-page scans, single-page scans with some marginal text from the adjoining page, or two-page scans. Most of the time the auto-detection works well with single-page and two-page scans. But it is a good idea to check manually whether all the pages have been divided correctly, so that it does not create problems later. If you find that a page has been divided incorrectly, you can slide the margin to correct it. In two-page scans, the two pages are shown with a semitransparent blue or red layer on top of them. After looking at all the pages, we commit the result.
  3. Deskew: After the pages have been split, we need to straighten them for better alignment of the text. In my experience the automatic deskewing mostly works fine. But it is still a good idea to check the pages manually, in case something is missed.
  4. Select Content: This is the step that I have found most useful in Scan Tailor. Here you can select the portion of the page that will appear in the final output, so you can say goodbye to all the dark lines that inevitably come as part of scanning. Some library marks can also be removed easily in this step. The auto option works well when the text is in a nice box shape, but it may also leave wide areas open. The box can be reshaped the way we want. If you want a blank page, remove the content box by right-clicking on it.
  5. Page Layout: Here one can set the dimensions of the output page and how the content will be placed on each page.
  6. Output: Produces the final output with all the above changes.

The output is stored in a directory called out in the same folder. The original images are not changed, so in case you want some changes or something goes wrong, you can always go back to the original files. The output images are also numbered.
So we now have cleaned pages of the same size from the scanned pages.
Update: The latest Scan Tailor has an image de-warping facility. See the amazing thing at work here:

Step 3:

Collate the files processed in Step 2 into one single PDF. For this I have used the convert command.
Typical syntax is like this:

convert *.tiff output.pdf
This command takes all the .tiff files in the current directory and collates them into a PDF named output.pdf.
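One thing to watch with the *.tiff glob: the shell expands it in plain alphabetical order, so if the file names are not zero-padded, page10.tiff lands before page2.tiff in the PDF. A minimal sketch of a workaround follows, assuming GNU sort (for the -V option) and file names without spaces; the function names list_pages and collate_pages are only illustrative.

```shell
# Print the .tiff files of a directory in natural (numeric-aware) order.
# sort -V ("version sort") is a GNU coreutils extension.
list_pages() {
    # $1 = directory holding the page images
    for f in "$1"/*.tiff; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done | sort -V
}

# Collate the pages into one PDF in that order.
# Assumes file names contain no spaces or newlines, because the
# command substitution below is split on whitespace.
collate_pages() {
    # $1 = directory of pages, $2 = output PDF name
    convert $(list_pages "$1") "$2"
}
```

With this, collate_pages out output.pdf would build the PDF from the Scan Tailor output folder with the pages in numeric order. If the files are already zero-padded (page_001.tiff, page_002.tiff, ...), the plain glob is fine.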

http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Alternative to Step 3
Another alternative is to use gscan2pdf for joining the image files into a PDF and doing the OCR, which can be done with tesseract or cuneiform. gscan2pdf is also able to scan files and stitch them into a PDF, but I would recommend that you use Scan Tailor as one of the intermediate steps.
Using gscan2pdf also gives you an option for editing the files if, for example, you want to remove some marks from the images. For this it opens the image in GIMP.

Step 4: 
OCR the PDF file.
Now this is again tricky. I could not find a good application which would OCR the PDF file and embed the resulting text in it. But I have found a hack at the following link which seems to work fine 🙂
http://blog.konradvoelkel.de/2010/01/linux-ocr-and-pdf-problem-solved/
The hack is a bash script which does the required work.
Alternative
gscan2pdf can do OCR for you using cuneiform or tesseract as backends. The end result is searchable text, but it does not sit on the image as it would in a vector PDF; instead it is embedded on the page as a “note” at the top-left-hand corner.
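Tesseract can also be run directly on the page images. Reasonably recent versions can write a searchable PDF themselves, with the recognized text placed as an invisible layer over the page image. The sketch below is not the script from the link above; the function name ocr_pages, the .tiff extension and the English language setting are my assumptions.

```shell
# OCR each page image into a one-page searchable PDF with tesseract.
# The trailing "pdf" selects tesseract's PDF output; "-l eng" assumes
# English text. OCR can be overridden (for example OCR=echo) for a dry run.
ocr_pages() {
    # $1 = directory containing the cleaned .tiff page images
    for f in "$1"/*.tiff; do
        [ -e "$f" ] || continue
        # Writes <name>.pdf next to <name>.tiff
        ${OCR:-tesseract} "$f" "${f%.tiff}" -l eng pdf
    done
}
```

The resulting one-page PDFs can then be joined with pdftk, for example pdftk page_*.pdf cat output book.pdf, again assuming zero-padded file names so that the glob keeps the pages in order.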
Step 5:
Optimize the PDF file generated in Step 4.
For this there is a Nautilus shell script, which I found at the link below, that does the optimization.
http://www.webupd8.org/2010/11/download-compress-pdf-12-nautilus.html
Step 6: 
In case you want to convert the .pdf to .djvu, there is a one-step solution for that as well:

pdf2djvu -o output.djvu input.pdf

 
The tips and tricks here are by no means complete or the best, but this is what I have found to be useful. Some professional and non-free software can do all of this, but the point of writing this article was to make a list of Free and Open Source Software for this purpose.
Comments and suggestions are welcome!

Of Bibliophilia…



Well, the other day while surfing the net I found something about myself. Something that I do has been so clearly defined; I never even imagined that there could be people who have defined and categorised terms like this one.

See this and you will understand. [Or is it this?]
This is one attribute that I certainly have. Collecting and reading books is a passion that I have nurtured since my childhood. The ones that I had and read in my childhood were comics. I read a whole lot of them, covering entire series. So when I went to collect `old’ comics from the old book sellers at Sita Bardi, I got interested in the other books they were selling, and I started buying those too. Initially the budgets were very low, so….
The major ones that I bought at this time were the Russian-published Mir titles. I collected a lot over the years, and they form one of the most prized collections that I have.
When I shifted to Pune, visiting the Deccan `bridge’ became almost a ritual. Almost all the books I acquired during my stay in Pune were bought from Mr. Prabhakar and co. I don’t even have a photo of these guys; maybe next time I go, I will get one…

Update: This is the photo of Mr. Prabhakar that I took on my last trip.
I became one of the regulars there. And so were others….
Another incident happened in Pune which really shaped my thinking in this regard. Samir and I went to a certain prestigious library, where we were told by the librarian, “We don’t need people like you in our library.” This really changed my attitude towards the possession of books. Books are the key to the code of knowledge; why should it not be open to all in a free society…
Ebooks are going to change this. You don’t lose an e-book when you give it to someone.
There is another, slightly strange thing that has happened to me many times. It certainly cannot be put in rational terms. The idea that I have is that books call me! Yes, you read that right. Many times I feel incredibly attracted to a book when I see it. I mean, I feel that I have to have this book, there is no compromise…. I don’t know how to explain this, but the books that I have got by this `intuition’ have proved to be immensely useful to me one way or the other. They have at times opened an entirely different world for me.
Some of the titles that I got by this `intuition’ are Larry Collins, Dominique Lapierre – Freedom at Midnight, Douglas Hofstadter, Daniel Dennett – The Mind’s I, Martin Gardner – The Whys of a Philosophical Scrivener, among others.
I did not know before that such books even existed, let alone their content. But when I saw these books I felt this very strong `urge’, as if the book were saying, “Take Me with You.” Maybe you are wondering whether this guy is nuts; maybe I am, but this is what I have experienced.
My life is taking me westwards, literally. Nagpur to Pune, now Pune to Mumbai. Further west is the sea, where do I go from there?
Among other things, leaving Pune brought the pain of leaving `the bridge’, because I had become addicted to going there. Even if I had little money and no other work, I had to go there. I dunno, maybe it had become an OCD.
Another thing that I want to describe is what I feel when I am going through a stack of books at a book seller. I have got used to the shops that I visit frequently, so I know where to look for what I want. At exhibitions it is often much messier, as the organizers themselves don’t know what the stock of books is. Also, when I scan a set of books I look for certain features that I cannot describe; maybe it is like the irrational Logic of Scientific Discovery which Karl Popper proposes. But here again I can find books which others cannot spot.
When I came to Bombay, I became a regular at the Fort and Matunga areas. It has been quite some months since I visited Matunga, but Fort I do frequent a lot.
Each time I go, another subject or theme is added to the books that I look for. The broader subjects include
Physics, Mathematics, Astronomy, Electronics, Chemistry, History, Philosophy, Art, Education, Science fiction, Psychology and so on…
Also, with ebooks, this collection has been taken to another dimension altogether; I now have about 7000 e-books [and counting]. In this case maybe the bibliomaniac definition is true for me.
Try these and I hope that you won’t be disappointed
ALL CREDITS TO THE ORIGINAL UPLOADERS!!
Space [both mental and physical] really becomes a problem when you have more books than you can handle. Anyway, it has been, and I guess will be, a problem for me throughout my life. But I am happy that I have this problem.
Till then wish me another book…

Old Ex Libris for me!