Remaking ebooks from existing pdfs, djvu

Suppose you have an ebook or an article in PDF format that is, unfortunately, an uncleaned scan. By uncleaned we mean one of the following:

  • Single-page scan: edge darkening, misaligned pages (the text is rotated differently from page to page), varying page sizes, library and usage marks, etc.
  • 2-in-1 scan: two pages scanned together, a dark band along the central spine, pages not rotated properly, edge and wear marks, library marks, etc.

In this case we cannot run a tool like scantailor on the file directly, because it works on images. We first need to extract the pages from the PDF as images and then process those images. One could extract and process the images one by one, but there is a better way.
First we split the PDF into single-page PDFs using the versatile pdftk.
In the terminal, type
$ pdftk file.pdf burst
It will create as many PDF files as there are pages, with names like pg_0001.pdf, pg_0002.pdf, etc.
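To confirm that the burst produced one file per page, a small check along these lines may help. This is only a sketch: check_burst is a hypothetical helper name, and it assumes that pdftk's dump_data operation prints a NumberOfPages line and that the pg_*.pdf files are in the current directory.

```shell
# check_burst (hypothetical helper): compare the page count reported by
# pdftk with the number of pg_*.pdf files in the current directory.
check_burst() {
    local src="$1" expected
    # dump_data prints metadata, including a line like "NumberOfPages: 12"
    expected=$(pdftk "$src" dump_data | awk '/^NumberOfPages:/ {print $2}')
    set -- pg_*.pdf                 # reuse positional parameters as a file list
    [ -e "$1" ] || set --           # the glob matched nothing
    if [ "$#" -eq "$expected" ]; then
        echo "ok: $# pages"
    else
        echo "mismatch: expected $expected, found $#" >&2
        return 1
    fi
}
```

After running the burst, call it as check_burst file.pdf.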
The next task is to convert these PDFs to images, for which we use the convert command from ImageMagick. We could convert the files one at a time:
$ convert pg_0001.pdf pg_0001.tiff
But this is impractical for a large number of files; we want to do it in one go. So we do the following:
$ for i in $(ls | grep pdf);
do
convert -density 600 "$i" "$i.tiff";
done
Let's see what these commands do:
ls
lists all the files in the current directory.
ls | grep pdf
filters that listing down to the names that contain pdf, giving us a list of files.
We can operate on this list as we would on any other list.
for i in $(ls | grep pdf)
takes each member of the list in turn and assigns it to the variable i,
and for each member we
do
the following:
convert -density 600 "$i" "$i.tiff"
and once every file has been processed the loop is
done
The -density option sets the resolution, in dpi, at which each PDF page is rendered; above it is set to 600. Each output image is named after its input PDF with .tiff appended, for example pg_0001.pdf.tiff.
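The loop above breaks on filenames containing spaces and also picks up any file with pdf anywhere in its name. A slightly more defensive variant is sketched below. It swaps in a shell glob for ls | grep and strips the .pdf extension from the output name; pdfs_to_tiff is a hypothetical helper name, not part of ImageMagick.

```shell
# pdfs_to_tiff (hypothetical helper): convert every *.pdf in the current
# directory to a TIFF rendered at the given dpi (default 600).
pdfs_to_tiff() {
    local dpi="${1:-600}"
    local f
    for f in ./*.pdf; do
        [ -e "$f" ] || continue     # the glob matched nothing
        # ${f%.pdf} drops the extension, so pg_0001.pdf becomes pg_0001.tiff
        convert -density "$dpi" "$f" "${f%.pdf}.tiff"
    done
}
```

Run it as pdfs_to_tiff 600 in the directory that holds the single-page PDFs.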
Now we can happily run scantailor on these images to clean them up!
PS:
If you have a djvu file instead of a PDF, there is another approach.
Step 1
Convert the djvu file into a multipage TIFF file using the ddjvu command.
$ ddjvu -format=tiff -verbose -quality=uncompressed input_file.djvu output_file.tif
This gives us a TIFF file with the same resolution as the original djvu file.
Step 2
Split the multipage TIFF file into its individual pages with the tiffsplit command.
$ tiffsplit output_file.tif
By default the pages are written as xaaa.tif, xaab.tif, and so on.
And we are done. Now we can happily run scantailor on these tiff files.
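The two djvu steps can also be wrapped into one function, roughly as sketched here. djvu_to_pages is a hypothetical helper name; it passes only -format=tiff to ddjvu (add other options as needed) and uses tiffsplit's optional prefix argument so that the pages are not all named x….

```shell
# djvu_to_pages (hypothetical helper): convert a djvu file to a multipage
# TIFF, then split that TIFF into one file per page.
djvu_to_pages() {
    local djvu="$1"
    local prefix="${2:-page_}"          # page files become <prefix>aaa.tif, ...
    local multi="${djvu%.djvu}.tif"     # book.djvu -> book.tif
    ddjvu -format=tiff "$djvu" "$multi" || return 1
    tiffsplit "$multi" "$prefix"
}
```

For example, djvu_to_pages input_file.djvu pg_ would leave pg_aaa.tif, pg_aab.tif, and so on in the current directory.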
 

On the Marathas…

Meanwhile the task of resisting Aurangzeb called less for a saint than for a man of action ; and such a man appeared in the person of Sivaji Bonsla, the son of a chief of no great property in the neighbourhood of the Western Ghauts to the east of Bombay. Born in 1627 – the year when George Villiers, Duke of Buckingham, led his abortive expedition to Rochelle – he was brought up at Puna, and early conceived the ambition of dispossessing the Mohammedans of the south, and setting up a Hindu kingdom in their stead. His men were hardy peasants from the mountains ; his horses, not less important than his men, were drawn from the valleys; and with these he sallied forth to capture hill-fortresses, and to use them as bases for raids upon the surrounding country. Being a great military genius he rapidly achieved success; and by 1664 had carried his incursions so far as to seize and sack the imperial city of Gujarat. This was a direct defiance to Aurangzeb, who sent an army to crush him, and succeeded in forcing him to surrender upon terms; but the wily chief soon contrived to escape, and returning to the Dekhan quickly reestablished and widened his ascendancy. He died in 1680, but he had already done his work in founding the power of the Marathas.
What the Marathas exactly were or are no one seems able accurately to define. They were not a caste, they were not a sect, they were not a nation; and, though some of them claim to be of Rajput origin, this pretension seems to be disposed of by anthropometric tests. Their name is taken from the territory of Maharashtra, and their language is called Marathi ; but they are not the only inhabitants of that territory nor the only speakers of that tongue. In 1901 they numbered only five millions; and yet in the seventeenth century they ruined the armies of Aurangzeb, shattered the might of the Moguls and bade fair to become the masters of India. It is difficult therefore to predicate anything certain of them except that they were and are emphatically a power, and that they rose to that eminence wholly by the sword. Yet, though they were valiant warriors, their military organisation was loose enough ; while their military tactics, if one may coin an expression, were of the offensive-elusive order. They swarmed out as great disorderly bodies of horse, devouring the country like locusts, carefully avoiding anything like a pitched battle, but hovering always about their enemy’s flanks and communications, swift to see and to make profit of the slightest advantage, equally swift to perceive and to avoid any danger. Thus they wore out the Mogul armies, and broke the hearts of their generals by remaining always near enough to inflict much mischief, but always remote enough to suffer no harm. If they were suddenly compelled to assume the defensive, they had a perfect genius for choosing and occupying a position where they could resist attack ; and woe to the army that retreated before them. Their leaders have always included some of the deepest and subtlest intellects in India ; and yet their genius, so long as their ascendancy lasted, revealed itself as mainly destructive, and their instincts as wholly predatory. 
They levied tribute remorselessly, under pain of pillage, upon vast districts, and on condition of payment suffered them to escape famine and desolation. They showed, indeed, remarkable administrative talent in the collection of that tribute; but there their constructive work came to an end. It is therefore hard to see how India could have improved – how indeed it could have failed to deteriorate – under their mastery. The history of the country, so far as we have traced it, has been a continuous record of wars, revolts and intestine divisions ; in the midst of which, at rare intervals of precarious repose, there had sprung up noble monuments of art and literature. There was nothing creative about the Marathas. Their reign, it is true, was short; but, even had it been prolonged, we can hardly conceive of the association of poetry or architecture with their name. For all their valiance and subtlety their rule was a blight rather than an influence. Once indeed, and in one particular, they imitated a foreign model in their own domain of war ; and we must now examine where they found this model, and how it was turned to their own ruin.
via text of “Narrative of the visit to India of their majesties, King George V. and Queen Mary, and of the coronation durbar held at Delhi, 12th December, 1911” by Fortescue, John, Sir, 1859-1933.

A parable on…

A Parable

Once upon a time, in a far away country, there was a community that had a wonderful machine. The machine had been built by the most inventive of their people … generation after generation of men and women toiling to construct its parts… experimenting with individual components until each was perfected… fitting them together until the whole mechanism ran smoothly. They had built its outer casing of burnished metal and on one side, they had attached a complex control panel. The name of the machine, KNOWLEDGE, was engraved on a plaque set in the centre of the control panel.
The community used the machine in their efforts to understand the world and to solve all kinds of problems. But the leaders of the community were not satisfied. It was a competitive world… they wanted more problems solved and they wanted them solved faster.
The main limitation for the use of the machine was the rate at which data could be prepared for input. Specialist machine operators called ‘predictors’ carried out this exacting and time-consuming task… naturally the number of problems solved each year depended directly on the number and skill of the predictors.
The community leaders focussed on the problem of training predictors. The traditional method, whereby promising girls and boys were taken into long-term apprenticeship, was deemed too slow and too expensive. Surely, they reasoned, we can find a more efficient approach. So saying, they called the elders together and asked them to think about the matter.
After a few months, the elders reported that they had devised an approach that showed promise. In summary, they suggested that the machine be disassembled. Then each component could be studied and understood with ease… the operation of the machine would become an open book to all who cared to look.
Their plan was greeted with enthusiasm. So, the burnished covers were pulled off, and the major mechanisms of the machine fell out… they had plaques with labels like HISTORY and GEOGRAPHY and PHYSICS and MATHEMATICS. These mechanisms were pulled apart in their turn… of course, care was taken to keep all the pieces in separate piles. Eventually, the technicians had reduced the machine to little heaps of metal plates and rods and nuts and bolts and springs and gear wheels. Each heap was put in a box, carefully labelled with the name of the mechanism whose part it contained, and the boxes were lined up for the community to inspect.
The members of the community were delighted. Their leaders were ecstatic. They ‘oohed’ and ‘aahed’ over the quality of components, the obvious skill that had gone in their construction, the beauty of designs. Here, displayed for all, were the inner workings of KNOWLEDGE.
In his exuberance, one man plunged his hand into a box and scooped up a handful of tiny, jewel-like gear wheels and springs. He held them out to his daughter and, glancing at the label on the box, said:
“Look, my child! Look! Mathematics!”
From: Turtle Speaks Mathematics by Barry Newell
You can get the book (and another nice little book Turtle Confusion) here.
 

Max Planck – on scientific truth

A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.

via Max Planck – Wikiquote.
I think this applies to the state of the educational system in general. The people who oppose the use of technology in classrooms will never see the light; they will simply die, and the newer generation will be induced to teach with the use of technology.