How I name my samples

In Sample-naming algorithm on June 12, 2011 by gcbenison

In programming and in other things, sometimes half your job comes down to picking good names for things.

Working in a biochemistry lab means constantly generating new samples, sometimes dozens per day. Some of these samples become the source of multiple data points. Some end up being archived for months or even years. Projects are interleaved in time. All of this requires that every researcher adopt a system for attaching names to samples. There is no choice here – you have a system, even if you didn’t consciously adopt it!

I want my sample names to meet these four criteria:

  • Unique – the name must be obviously distinct from thousands of others, and remain unique over time
  • Terse – long, descriptive names won’t fit on 1 cm tubes
  • Typical – a sample identifier should stand out in my notes as such, amid all the other writing
  • Easily generated – at the time of sample generation I want to spend zero effort and time on picking the name and ensuring that it meets the other three criteria

I solve the problem by choosing sample identifiers from a namespace composed of: the letter ‘G’ + the date in ‘DDMMYY’ format + one letter [A-Z]. For example, G120611A, G120611B, etc. This provides 26 unique sample names per day. I print them on daily sample manifests, one unique name per line, with space to pencil in a more verbose description:

Daily sample manifest
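
The naming rule itself is small enough to sketch as a shell function. This is only an illustration, not part of the manifest script: the name `sample_names` is made up for this example, and it assumes GNU `date` (for the `--date` option).

```shell
# sample_names DATE: print the 26 identifiers (A through Z) for DATE.
# Hypothetical helper for illustration; assumes GNU date for --date.
sample_names() {
  # DDMMYY stamp for the requested day, e.g. 120611 for 12 June 2011
  stamp=$(date --date="$1" +%d%m%y) || return 1
  for letter in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z; do
    printf 'G%s%s\n' "$stamp" "$letter"
  done
}

sample_names 2011-06-12 | head -n 3
# G120611A
# G120611B
# G120611C
```

Each identifier is the fixed prefix, six date digits, and one letter, so it is always 8 characters long.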

To assign a name to a sample, I just pick the next empty line on the manifest, and I have confidence that the chosen name is unique and will remain so. Keeping these manifests in a folder provides a succinct running record of all samples I have generated. The names are terse (8 characters) and easily fit on all common lab containers (I can, and usually do, include descriptive information on containers in addition to the unique identifier). I include an extra copy of the identifier on each line so I can cut it out and paste it to the physical sample. Of course it can also be written with a Sharpie:

Eppie tube with unique label

Unique names on Falcon tubes

Having the unique identifier on the physical sample makes it easy to look up information about it in my notebook, and to be sure that the notes refer to that exact sample:

Notebook page
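
Because every identifier has the same fixed shape, it can also be picked out of typed text mechanically. The sketch below assumes notes kept as plain text, which is an assumption of this example (the notebook shown above is paper). The function name `find_sample_ids` is hypothetical.

```shell
# find_sample_ids: read text on stdin, print each distinct identifier once.
# Hypothetical helper; matches 'G' + six digits + one capital letter as a
# whole word. LC_ALL=C keeps [A-Z] from matching lowercase in some locales.
find_sample_ids() {
  LC_ALL=C grep -owE 'G[0-9]{6}[A-Z]' | sort -u
}

printf 'Ran gel on G120611A and G120611B; re-ran G120611A.\n' | find_sample_ids
# G120611A
# G120611B
```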

Here is a shell script that writes PostScript sample manifests for the next 30 days to standard output:

#!/bin/sh
# Generate pages with 26 unique sample names per page, one name per line,
# for the next 30 days.  Requires GNU date and seq.
# GCB 12jun2011


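# header section: PostScript prologue defining fonts, page geometry, and
# the title/label procedures (quoted 'EOF' so the shell expands nothing)
cat <<'EOF'
%!PS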
/title-font /Helvetica findfont 13 scalefont def
/mini-font  /Helvetica findfont 7  scalefont def
/std-font   /Helvetica findfont 10 scalefont def

/inch {72 mul} def

/page-height 9.5 inch def
/page-width 7.0 inch def
/n-rows 26 def
/v-delta page-height 20 sub n-rows div def

/title {
  1.1 inch dup translate

  0 page-height 0.5 inch sub translate
  0 0 moveto
  title-font setfont show

  2 inch 0 moveto
  std-font setfont
  (Mellies Lab, Reed College; unique sample labels) show
} def

/label {
  /txt exch def
  0 v-delta -1 mul translate

  0 0 moveto
  std-font setfont
  txt show

  0 -5 moveto
  0.6 setlinewidth
  page-width 0 rlineto stroke

  0.3 setgray
  (G666666A) stringwidth pop 10 add -5 moveto
  0 20 rlineto stroke

  mini-font setfont
  page-width (G666666A) stringwidth pop 2 mul sub 0 moveto
  txt show
  10 0 rmoveto
  txt show

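} def
EOF

# body section: one page per day, one label line per letter A-Z
n_days=30
for delta in $(seq 0 $((n_days - 1))); do
  date "+(%a, %b %d %Y) title" --date="+$delta days"
  for idx in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z; do
    date "+(G%d%m%y$idx) label" --date="+$delta days"
  done
  echo "showpage"
done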