Wednesday, January 13, 2010

Data organization and record keeping, Part 3

In the previous post, I stated that one of the problems with data organization and record keeping training is that it typically occurs on the scale of undergrad projects. Then we encounter a major discontinuity in the scale of projects between undergrad and grad work, but typically don't get extra training at the graduate level.

So what do I mean by the scale of projects? Well, in undergrad lab classes, I generally had a lab (= set of experiments) to complete once per week for the duration of the semester. Each lab was designed to take a few hours. Most of the labs were minimally related to each other, if related at all. And of course, like most pre-graduate work, labs were usually handed to us as a package deal: here's a lab related to this discrete set of information, here's the equipment and materials, here's the procedure, and here are some results and conclusions you should be able to draw. Writing that up was pretty easy-peasy.

Of course, I did encounter projects of larger scale. Notably, science fair projects from elementary, middle, and high school were a more relevant scale than my college-level labs. And I did undergraduate research. It was in undergrad research that I learned that even good scientists sometimes discover, after the fact, parameters that were affecting their results. But still, none of these experiences really prepared me for the scale of my graduate-level project.

Now let's think about the scale of graduate level work. Graduate level work is almost the opposite of the pre-packaged labs we're given in undergrad. We're given an unsolved problem to solve. It could take anywhere from a few hours to a few years. It may require knowledge from 1 discipline or 10. We likely design, build, and troubleshoot the methods and/or equipment. In fact, our problem may not even be a solvable problem with the current levels of knowledge and technology, and we may have to or want to change the direction of the work as we go. Now writing that up? Not so easy-peasy.

Just think about the numbers for graduate-level work. If a Ph.D. student takes an average of 6 years to graduate, and does relevant work 50 weeks per year, 5 days per week, that's 6 years * 250 days/year = 1500 working days. A daily record is easily more than 1 page per day, so that's more than 1500 pages of daily records. My lab notebooks are about 150 pages each, so 1500 pages / 150 pages per notebook means I should have more than 10 lab notebooks.
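If you want to plug in your own numbers, the estimate above can be sketched in a few lines of Python. This is just a rough sketch: the function name `estimate_records` and its parameter names are made up for illustration, and the defaults are the ballpark figures from this post, not measured data. Since a daily record is "easily more than 1 page per day," the results are lower bounds.

```python
import math


def estimate_records(years=6, weeks_per_year=50, days_per_week=5,
                     pages_per_day=1, pages_per_notebook=150):
    """Rough lower-bound estimate of daily-record volume over a Ph.D.

    The defaults are the ballpark assumptions from the post.
    Returns (working_days, pages, notebooks).
    """
    working_days = years * weeks_per_year * days_per_week
    pages = working_days * pages_per_day
    # Round up: a partly filled notebook still counts as a notebook.
    notebooks = math.ceil(pages / pages_per_notebook)
    return working_days, pages, notebooks


working_days, pages, notebooks = estimate_records()
print(f"{working_days} working days, at least {pages} pages, "
      f"at least {notebooks} notebooks")
```

Changing `years` or `pages_per_day` shows how quickly the pile grows if you take longer to graduate or write more than a page a day.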

So fine, for a graduate career, with a daily record we end up with more than 1500 pages of notes. That's not really a big deal. It's an indexing challenge for sure, and one that should be discussed. But beyond the indexing challenge, what I've seen is that most people don't keep daily records and don't end up with nearly this amount of notes. Why not?

And when it comes to checking up on trainees' data organization and record-keeping skills, by the time a PI would notice a problem, a lot of time has generally already been wasted. Most PIs at best have weekly discussions with trainees, so if a trainee seems on top of things week to week, everything seems fine. But what happens when it's time to write up a paper or make a talk that includes some older results? Or what happens when someone is supposed to return to some old results? This is the time when poor organization and record keeping come to light. But then it's too late. The damage is done, and the time will have to be taken to sort through poorly organized notes or redo large portions of poorly documented work.

As a sort-of aside, I have heard stories about PIs checking up on people's lab notebooks, usually in response to the PI's doubts that the trainee has been doing anything for a long period of time. This method of checking hardly seems fair, because typically the PI hasn't told the trainees that they should keep up with their daily progress in their lab notebook. While it may seem obvious to keep a daily record, many trainees tend to "save" their lab notebook for more final experiments, notes, and results. The day-to-day drivel of notes and troubleshooting and uninterpretable results is often not written up at all, or, as biochem belle mentions, jotted down on paper towels and gloves and post-it notes, which often get trashed somewhere along the way.

So, some problems with data organization and record keeping are established. Training is inadequate to really cover the change in scale between undergrad and grad level projects, and no one really checks up on how trainees are doing until it's too late and mucho amounts of time are wasted.
