book scanning

Discussion in 'The Tablet PC Life' started by leaftye, Jun 7, 2009.

Thread Status:
Not open for further replies.
  1. leaftye

    leaftye Old timer Super Moderator

    Messages:
    4,127
    Likes Received:
    20
    Trophy Points:
    106
    The issue with programs like ReadIRIS is that they're not perfect, and sometimes VERY far from perfect. Pure images aren't going to have bungled characters, and hard drive space is plentiful and cheap, so I scan as images with OCR in the background.

    I've attached an example of what my scans look like. They're scanned in at 300 dpi. I've adjusted the black and white levels, then reduced the number of colors. As you can see if you zoom in all the way, these scans look very good.
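
    The two clean-up steps mentioned above (adjusting the black and white levels, then reducing the number of colors) can be roughly sketched as below. This is only an illustration on 8-bit greyscale values, not the actual tool or settings used for these scans; the function names, the black/white points of 40 and 215, and the choice of 4 shades are all assumptions made for the example.

```python
def adjust_levels(value, black_point, white_point):
    """Linearly stretch an 8-bit value so black_point maps to 0 and white_point to 255."""
    if white_point <= black_point:
        raise ValueError("white_point must be greater than black_point")
    scaled = (value - black_point) * 255 / (white_point - black_point)
    # Clamp anything darker than the black point or lighter than the white point.
    return max(0, min(255, round(scaled)))


def reduce_colors(value, levels):
    """Snap an 8-bit value to the nearest of `levels` evenly spaced shades."""
    if levels < 2:
        raise ValueError("need at least 2 levels")
    step = 255 / (levels - 1)
    return round(round(value / step) * step)


def clean_page(pixels, black_point=40, white_point=215, levels=4):
    """Apply the levels stretch, then the color reduction, to a list of pixels.

    The default thresholds are illustrative assumptions, not measured values.
    """
    return [
        reduce_colors(adjust_levels(p, black_point, white_point), levels)
        for p in pixels
    ]


if __name__ == "__main__":
    # Greyish paper (215 and up) becomes pure white, faint ink (40 and below) pure black.
    print(clean_page([10, 40, 128, 215, 250]))
```

    Fewer distinct shades generally compress better, which is one reason scans treated this way can stay sharp while taking up less space.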

    Sample pages

    Outside of the actual time to scan the pages, I probably spend about 30 minutes hands-on per book setting up the batch file to run, merging the images together into a PDF, performing OCR, splitting the book into chapters, adding bookmarks, adding links, and adding logical page numbering.

    If I tried to use OCR to convert the images into text and tables and such, my time spent per book would multiply by at least an order of magnitude due to all the extra checking I'd have to do.
     
    Last edited by a moderator: May 18, 2015
  2. CalebSchmerge

    CalebSchmerge Woof

    Messages:
    137
    Likes Received:
    0
    Trophy Points:
    31
    Well, with the X41, hard drive space isn't so plentiful. I have 40GB, it's slow, and it's very expensive to upgrade. I will play with the OCR. In my experience with this program (a lot of other book scanning), it's nearly perfect with minimal work on my part.
     
  3. Frank

    Frank Scribbler - Standard Member Senior Member

    Messages:
    3,847
    Likes Received:
    3
    Trophy Points:
    116
    It also depends on what type of book you scan.
    I scan books with a lot of mathematical formulas, words, charts, ... which don't get recognized properly by any OCR software.
    But if you scan plain text, like a science fiction book, novel, ..., then it should be possible to convert everything to text with only a little post-processing.

    To give you an idea how big a few of my books are when I keep the pages in the PDF as images (@300DPI):
    color, 1406 pages, 197x283mm (7.76x11.14in) = 740MB
    greyscale, 712 pages, 140x227mm (5.51x8.96in) = 186MB
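
    To put those numbers in perspective, here is a small sketch that works out the pixel dimensions of a page at 300 DPI, its uncompressed size, and the stored size per page from the totals above. The helper names are made up for this example, and it assumes 3 bytes per pixel for color, 1 byte per pixel for greyscale, and 1 MB = 1,000,000 bytes.

```python
MM_PER_INCH = 25.4


def pixel_dimensions(width_mm, height_mm, dpi=300):
    """Return (width, height) in pixels for a page scanned at `dpi`."""
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


def raw_page_megabytes(width_mm, height_mm, bytes_per_pixel, dpi=300):
    """Uncompressed size of one page in MB (assumes 1 MB = 1,000,000 bytes)."""
    width_px, height_px = pixel_dimensions(width_mm, height_mm, dpi)
    return width_px * height_px * bytes_per_pixel / 1_000_000


def stored_megabytes_per_page(total_mb, pages):
    """Average size per page as actually stored in the PDF."""
    return total_mb / pages


if __name__ == "__main__":
    # (label, pages, width_mm, height_mm, assumed bytes per pixel, total MB)
    books = [
        ("color", 1406, 197, 283, 3, 740),
        ("greyscale", 712, 140, 227, 1, 186),
    ]
    for label, pages, w_mm, h_mm, bpp, total_mb in books:
        raw = raw_page_megabytes(w_mm, h_mm, bpp)
        stored = stored_megabytes_per_page(total_mb, pages)
        print(f"{label}: raw {raw:.1f} MB/page, stored {stored:.2f} MB/page")
```

    The gap between the raw and stored figures is what the image compression inside the PDF is buying you.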
     
  4. CalebSchmerge

    CalebSchmerge Woof

    Messages:
    137
    Likes Received:
    0
    Trophy Points:
    31
    I suppose that those files would be manageable. My medical microbiology book would be a breeze to scan, while my organic chemistry book would take more work. I might end up picking and choosing. I really only intend to scan books (or chapters) as necessary so that when I go home I can leave the physical book behind. I might end up liking it enough to scan everything, but I find it doubtful right now.
     
  5. leaftye

    leaftye Old timer Super Moderator

    Messages:
    4,127
    Likes Received:
    20
    Trophy Points:
    106
    My books aren't much smaller than Frank's. My Organic Chemistry book is 1330 pages, scanned in color at 300 dpi, 7.42x10.31 inches, and comes out to 520 megs after OCR. I think OCR adds about 100 megs. The solutions manual for that Ochem book is 712 pages, scanned in greyscale at 300 dpi, 6.89x9.96 inches, and comes out to 166 megs after OCR.

    The publisher for my Ochem book published the first three chapters, and naturally those files would be as small as this data can get. Here's how my chapters compare with those of the publisher:
    Chapter 1: mine=15.7MB, publisher=9.07MB
    Chapter 2: mine=16.8MB, publisher=9.17MB
    Chapter 3: mine=19.0MB, publisher=14.5MB
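
    For a quick sense of the gap, this sketch divides each scanned chapter size by the publisher's size. The figures are the ones listed above; the function and variable names are just for illustration.

```python
def size_ratio(scan_mb, publisher_mb):
    """How many times larger the scanned chapter is than the publisher's file."""
    return scan_mb / publisher_mb


# Chapter number -> (scanned size in MB, publisher size in MB), from the list above.
chapters = {
    1: (15.7, 9.07),
    2: (16.8, 9.17),
    3: (19.0, 14.5),
}

if __name__ == "__main__":
    for number, (scan_mb, publisher_mb) in chapters.items():
        ratio = size_ratio(scan_mb, publisher_mb)
        print(f"Chapter {number}: scan is {ratio:.2f}x the publisher's file")
```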

    As you can see, it's not going to get much smaller. I know your hard drive is limited, but what about putting the books on a flash card in the card reader? I used to do that for a while, but I stopped because I kept accidentally ejecting the card.

    It's probably too late now, but the benefit to scanning it early is that you can sell the textbook to a classmate on the first day of class, often at a profit to you, and a savings to your classmate.
     
  6. CalebSchmerge

    CalebSchmerge Woof

    Messages:
    137
    Likes Received:
    0
    Trophy Points:
    31
    I'm not wild about selling the books - again, I don't plan to scan the whole books as of now. But, I will play with things in the coming weeks and see what I think.
     
  7. fgruber

    fgruber Pen Pal - Newbie

    Messages:
    83
    Likes Received:
    0
    Trophy Points:
    15
  8. rod88

    rod88 Pen Pal - Newbie

    Messages:
    34
    Likes Received:
    0
    Trophy Points:
    16
    I'll be getting an OpticBook 3600 by the end of this semester. Hopefully I will have time to scan all my textbooks before my summer classes start.

    Here is my experience with book scanning:

    I've tried to use my Fujifilm FinePix S1000fd together with a tripod and a heavy piece of nonglare glass.
    The process is very simple, yet very tedious. I first align the camera over the book so that it points straight down, at a 90-degree angle to the floor. Then I lay the piece of glass flat on top of a page, take a shot, lift the glass, turn the page, and take another shot.

    The main problem with this method is that it is EXTREMELY tedious. I have done that for 2 of my textbooks. I had to sit down on the floor all day and do it over 2000 times. By the end of the day, my back was in serious pain.

    Another problem is that the book usually moves when you turn the pages, even if you tape the back of it to the ground. This makes batch processing much harder, since every chapter or so you have to adjust the rotation and alignment of the pages.

    I'm very concerned with the quality of the pictures, so I've also spent a great amount of time processing them using Photoshop batch processing.

    So, if you don't care much about quality, then the camera method is not bad. But be prepared to spend at least one day scanning a whole book.
     
  9. Frank

    Frank Scribbler - Standard Member Senior Member

    Messages:
    3,847
    Likes Received:
    3
    Trophy Points:
    116
    But using the OpticBook is also tedious, because you'll have to flip the whole book after each page. In return, the quality is superb.

    So in my opinion:

    • Plustek OpticBook:
      • Pro: perfect quality, can scan anything (new/used/borrowed book/magazine and other stuff)
      • Con: slow, expensive compared with normal flatbed scanners
    • ADF scanner:
      • Pro: fast, good quality
      • Con: you'll have to 'destroy' the book, can't scan borrowed books or single pages from a book, limited use (only paper)
    • Photo:
      • Pro: cheap and easy
      • Con: poor quality, moderate speed


    I bought the OpticBook and am happy that I did it. The quality is superb and I can use it to scan other things, too, or only a few pages of a book.
    I also like the simplicity of the software; it isn't bloated like the software that comes with other scanners.
    It's just a great scanner, sadly a bit expensive.
     
  10. rod88

    rod88 Pen Pal - Newbie

    Messages:
    34
    Likes Received:
    0
    Trophy Points:
    16
    Frank, how much did you pay for yours?

    I think this question has been asked before, but how long does it take you to scan, let's say, a 1000-page book?
     