When your resources aren’t perfect

This post is a bit of a “what would YOU do?” request for input. Let me begin with the background:

I am, on behalf of my office, working on a project to digitize (*cough*scan*cough*) all of the D.C. Laws from Council periods 1 through 7. These are unofficial copies, and they are online alongside the unofficial D.C. Code, which, while unofficial, is the easiest version to use in most circumstances. The scanning has been done by two interns placed with my office through Urban Alliance (I wrote about Urban Alliance before): the first did a huge amount of work, and the second is going through and picking up where things were missed or scanned from poor copies.

And thus we have the phenomenon that leads me to my question. I now have in some cases two scanned versions of a law, each with its own problem. You know how you can have two of (a) good, (b) fast, and (c) cheap, but not all three? Well, in some cases I can have two of (a) legible margins (that is, from a flat original), (b) bottom lines of text not cut off, or (c) consistent appearance (that is, all of the pages from the same “original” and not combined from two separate scans).

In an ideal world, we would go and search out the original and scan from there. But that’s not happening here. It isn’t an ideal world, we don’t have perfect resources, and these aren’t intended to be archival quality. (There IS a risk that they could be used as “oh, someone’s already scanned these, yay we don’t have to.”)

Here is an example of this situation. What would you do?

  1. Use the scan with poor margins. (law 5-129)
  2. Use the scan with the bottom of the first page cut off. (L5-129)
  3. Use page 1 of the scan with poor margins and the rest of the pages from the other scan.
  4. Throw it all together in a single PDF because the scan with the problem on the first page also doesn’t have page numbers.

L5-129 law 5-129
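
For anyone who thinks in code, option 3 above can be written down as a small page-picking rule. This is only a minimal sketch under my own assumptions: the `PageCheck` fields, the ranking (cut-off text counts as worse than poor margins), and the three-page example are all made up for illustration, and someone would still have to eyeball each page to fill in the flags.

```python
from dataclasses import dataclass


@dataclass
class PageCheck:
    """Result of eyeballing one page of one scan (hypothetical fields)."""
    legible_margins: bool
    bottom_intact: bool

    def rank(self) -> int:
        """0 = clean, 1 = poor margins only, 2 = text cut off (worst)."""
        if not self.bottom_intact:
            return 2
        if not self.legible_margins:
            return 1
        return 0


def plan_pages(scan_a, scan_b):
    """For each page, pick "A", "B", or "RESCAN" when both copies lose text.

    Ties go to scan A, which is an arbitrary choice for this sketch.
    """
    if len(scan_a) != len(scan_b):
        raise ValueError("both scans must cover the same pages")
    plan = []
    for a, b in zip(scan_a, scan_b):
        best = min(a.rank(), b.rank())
        if best == 2:
            plan.append("RESCAN")  # neither copy keeps all of the text
        elif a.rank() == best:
            plan.append("A")
        else:
            plan.append("B")
    return plan


# Made-up three-page law: scan A has poor margins throughout,
# scan B is clean except that page 1 is cut off at the bottom.
margins_scan = [PageCheck(False, True) for _ in range(3)]
cutoff_scan = [PageCheck(True, False), PageCheck(True, True), PageCheck(True, True)]
print(plan_pages(margins_scan, cutoff_scan))  # ['A', 'B', 'B']
```

The resulting plan is just a list of labels; actually stitching the chosen pages into one PDF would still be done in whatever PDF tool is on hand.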


4 comments for “When your resources aren’t perfect”

  1. March 20, 2015 at 11:45 am

    especially if you’re willing to sacrifice consistent appearance for continuous legibility, if you have access to Acrobat Pro or similar, you might be able to interweave the two PDFs, selecting the better copy of each page…

    [this, of course, could get expensive time-wise, but may yet yield the best (i.e., most usable) result…]

    • dcdotnerd
      March 20, 2015 at 11:50 am

      Hi Buffalo, thanks for this suggestion. It’s actually what I’ve been doing in most cases, and my real problem is what to do when a single page has problems in both copies. I’m probably giving this project much too much of my attention!

      • infinitebuffalo
        March 20, 2015 at 1:34 pm

        Are there enough “good enough” pages between the two that you can just have an intern rescan the remainder from source?

        • dcdotnerd
          March 20, 2015 at 1:37 pm

          Maybe? Right now I’m leaving the re-scanning for the documents that are missing full pages.
