Wednesday, January 13, 2010

Data organization and record keeping, Part 2

OK, so the previous post firmly established the importance of good data organization and record keeping. But why discuss it further? If everyone did it well, and/or if it were obvious how to do it well, then this would be the end of the discussion. But it's not obvious how to do it well, and not everyone does it well. And more personally, I think I could learn to do it better.

Since data organization and record keeping are clearly such an integral part of doing good science, one would expect an emphasis on training scientists to do it well, and on ensuring that trainees are doing it well. In my experience, most scientists do get some training on the subject, though usually not at the post-graduate level. And as for anyone checking up on my record keeping, that has never explicitly occurred, though perhaps it would if I obviously struggled to find older results when they came up in discussions.

As for typical training on data organization and record keeping, I think most of it occurs before the graduate level. My first encounter with scientific record-keeping training came in elementary school, the first time I did a science fair project. That's when I learned about the "scientific process," which is essentially the stuff that's supposed to go in a lab notebook entry: the goal or hypothesis, the methods, the results acquired, and the conclusions and future directions. From there, it's been more about practicing how to do that efficiently, accurately, and appropriately for the experiment at hand.

Here's where I think the problem occurs: there is a major discontinuity between the scale of the pre-graduate projects that provide training in data organization and record keeping and the scale of graduate-level projects. It's like jumping from a Big Wheel to a fancy, multi-geared road bike, with no tricycle and no training-wheel sessions in between. Sure, you can do it, but there are bound to be a lot more scraped knees in the process. (See below for illustrative images.)

Anyone can ride on a Big Wheel

On a big-kid bike, undesired results may occur with inadequate training

What does it mean to have a discontinuity in scale between training and graduate level work, and why is it a problem? Stay tuned for more posts.

