How to create a sample annotation file before analysis

Step 1

Create a table in LibreOffice or Excel with one or more columns that uniquely identify each of your samples. For example:

  1. In a plate reader or qPCR experiment, you will want a plate number column and a well number column for each measured well.
  2. In a flow cytometry experiment, create a column with the unique specimen number or the unique filename for each sample.
  3. In a deep sequencing experiment, it will be the unique index for each library.
  4. In a Western blot, it will be a gel number column and a lane number column.
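Whichever identifier columns you choose, no two rows should share the same combination of values. The check below is a minimal sketch of that idea; the function name `duplicated_keys` and the small row lists are hypothetical, and the repeated index in `library_rows` is made up to show what a failure looks like.

```python
from collections import Counter

def duplicated_keys(rows, key_columns):
    """Return the identifier tuples that occur in more than one row."""
    counts = Counter(tuple(row[column] for column in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]

# Plate reader or qPCR: plate + well together identify a sample.
plate_rows = [
    {"plate": "1", "well": "A1"},
    {"plate": "1", "well": "A2"},
    {"plate": "2", "well": "A1"},  # same well on another plate is still unique
]

# Deep sequencing: the library index identifies a sample.
library_rows = [
    {"index": "GTAGCC"},
    {"index": "AAGCTA"},
    {"index": "GTAGCC"},  # deliberately duplicated for illustration
]

plate_duplicates = duplicated_keys(plate_rows, ["plate", "well"])
library_duplicates = duplicated_keys(library_rows, ["index"])
```

An empty result means the chosen columns identify every row; any tuple that is returned points to rows that need another identifier column or a correction.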

Step 2

Create one metadata column for each biological and technical variable that you are changing in your experiment.

For example:

  • you did your experiment with three different gene knockouts
  • each knockout had two different fluorescent reporters
  • each fluorescent reporter is induced at three different concentrations
  • there are three technical replicates for each strain.

So the metadata columns you will create are:

  1. knockoutgene
  2. reporter
  3. inductionlevel
  4. replicate

In addition, create a column called samplelabel with a succinct unique name for each sample (see the example below).
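For a fully crossed design like this one, the rows can also be generated rather than typed by hand. The sketch below assumes that every combination of the four variables was measured. The values are borrowed from the plate reader example on this page, and the `samplelabel` scheme (values joined together, with `r` before the replicate number) is only one hypothetical way to get a succinct unique name.

```python
from itertools import product

knockout_genes = ["sera", "serb", "serc"]   # three gene knockouts
reporters = ["tcg", "agc"]                  # two fluorescent reporters
induction_levels = [0, 10, 100]             # three inducer concentrations
replicates = [1, 2, 3]                      # three technical replicates

rows = []
for gene, reporter, level, replicate in product(
    knockout_genes, reporters, induction_levels, replicates
):
    rows.append({
        "knockoutgene": gene,
        "reporter": reporter,
        "inductionlevel": level,
        "replicate": replicate,
        # hypothetical label scheme: lowercase letters and digits only
        "samplelabel": f"{gene}{reporter}{level}r{replicate}",
    })

labels = [row["samplelabel"] for row in rows]
```

This gives 3 × 2 × 3 × 3 = 54 rows, one per sample. The identifier columns from Step 1 still need to be added to each row.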

Step 3

Save the file as sampleannotations.csv by choosing the CSV file type with a comma (,) as the field separator.
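If the table was built in a script instead of a spreadsheet, the same comma-separated file can be written with Python's standard `csv` module. This is a sketch, not a required workflow: the helper `write_annotations` and the two sample rows are hypothetical, and the text is written to an in-memory buffer so the result can be inspected.

```python
import csv
import io

def write_annotations(rows, handle):
    """Write a list of dicts as comma-separated text with one header row."""
    writer = csv.DictWriter(
        handle, fieldnames=list(rows[0]), delimiter=",", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)

rows = [
    {"plate": 1, "well": "A1", "knockoutgene": "NA", "replicate": "NA", "comment": "blank"},
    {"plate": 1, "well": "A2", "knockoutgene": "sera", "replicate": 1, "comment": ""},
]

buffer = io.StringIO()
write_annotations(rows, buffer)
csv_text = buffer.getvalue()
```

To write to disk, pass a file handle opened with `open("sampleannotations.csv", "w", newline="")` in place of the buffer.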

Important Pointers

  • Remember that each variable gets a column. For example, if you collected the same sample at different time points, then create a timepoint metadata column.
  • You can insert NA in columns for which the metadata is not applicable – for example, use NA for an empty well in a plate reader experiment.
  • Create a comment column for notes on anything atypical about a sample.
  • Do not insert spaces, capital letters or non-alphanumeric characters into your annotation file.
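The character rule can be checked automatically before analysis. The sketch below flags any value that is not made of lowercase letters and digits only, while accepting `NA` and empty cells. The function `find_problems` and its `exempt_columns` parameter are hypothetical; the exemption is an assumption added here for identifier columns such as well names, which contain capital letters in the examples on this page.

```python
import re

ALLOWED = re.compile(r"[a-z0-9]+")  # lowercase letters and digits only

def find_problems(rows, exempt_columns=()):
    """Return (line number, column, value) for every value that breaks the rule."""
    problems = []
    for line_number, row in enumerate(rows, start=2):  # line 1 is the header
        for column, value in row.items():
            text = str(value)
            if column in exempt_columns or text in ("", "NA"):
                continue
            if not ALLOWED.fullmatch(text):
                problems.append((line_number, column, text))
    return problems

rows = [
    {"well": "A2", "knockoutgene": "sera", "yfpreporter": "tcg"},
    {"well": "A3", "knockoutgene": "Ser A", "yfpreporter": "NA"},  # space and capitals
]

problems = find_problems(rows, exempt_columns=("well",))
```

Here `problems` holds one entry, for the value `Ser A` on line 3 of the file.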

Examples

sampleannotations.csv for a plate reader experiment

plate well knockoutgene yfpreporter iptginductionlevel replicate comment
1 A1 NA NA NA NA blank
1 A2 sera tcg 0 1
1 A3 sera tcg 0 2
1 A4 sera tcg 0 3
1 A5 sera tcg 10 1
1 A6 sera tcg 10 2
1 A7 sera tcg 10 3
1 A8 sera tcg 100 1
1 A9 sera tcg 100 2
1 A10 sera tcg 100 3
1 A11 NA NA NA NA blank
1 A12 NA NA NA NA blank
1 B1 NA NA NA NA blank
1 B2 sera agc 0 1
1 B3 sera agc 0 2
1 B4 sera agc 0 3
1 B5 sera agc 10 1
1 B6 sera agc 10 2
1 B7 sera agc 10 3
1 B8 sera agc 100 1
1 B9 sera agc 100 2
1 B10 sera agc 100 3
1 B11 NA NA NA NA blank
1 B12 NA NA NA NA blank
1 C1 NA NA NA NA blank
1 C2 serb tcg 0 1
1 C3 serb tcg 0 2
1 C4 serb tcg 0 3
1 C5 serb tcg 10 1
1 C6 serb tcg 10 2
1 C7 serb tcg 10 3
1 C8 serb tcg 100 1
1 C9 serb tcg 100 2
1 C10 serb tcg 100 3
1 C11 NA NA NA NA blank
1 C12 NA NA NA NA blank
1 D1 NA NA NA NA blank
1 D2 serb agc 0 1
1 D3 serb agc 0 2
1 D4 serb agc 0 3
1 D5 serb agc 10 1
1 D6 serb agc 10 2
1 D7 serb agc 10 3
1 D8 serb agc 100 1
1 D9 serb agc 100 2
1 D10 serb agc 100 3
1 D11 NA NA NA NA blank
1 D12 NA NA NA NA blank
2 A1 NA NA NA NA blank
2 A2 serc tcg 0 1
2 A3 serc tcg 0 2
2 A4 serc tcg 0 3
2 A5 serc tcg 10 1
2 A6 serc tcg 10 2
2 A7 serc tcg 10 3
2 A8 serc tcg 100 1
2 A9 serc tcg 100 2
2 A10 serc tcg 100 3
2 A11 NA NA NA NA blank
2 A12 NA NA NA NA blank
2 B1 NA NA NA NA blank
2 B2 serc agc 0 1
2 B3 serc agc 0 2
2 B4 serc agc 0 3
2 B5 serc agc 10 1
2 B6 serc agc 10 2
2 B7 serc agc 10 3
2 B8 serc agc 100 1
2 B9 serc agc 100 2
2 B10 serc agc 100 3
2 B11 NA NA NA NA blank
2 B12 NA NA NA NA blank
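Once saved, a file like this can be read back and used to group wells by condition. The sketch below is one possible way to do that with the standard library. It uses the first few rows of the table above, written out as they would appear in the comma-separated file, drops the blank wells, and counts the replicates for each condition.

```python
import csv
import io
from collections import Counter

# First rows of the plate reader example, as comma-separated text.
excerpt = """plate,well,knockoutgene,yfpreporter,iptginductionlevel,replicate,comment
1,A1,NA,NA,NA,NA,blank
1,A2,sera,tcg,0,1,
1,A3,sera,tcg,0,2,
1,A4,sera,tcg,0,3,
1,A5,sera,tcg,10,1,
"""

rows = list(csv.DictReader(io.StringIO(excerpt)))

# Blank wells carry NA in the metadata columns, so they are easy to exclude.
samples = [row for row in rows if row["knockoutgene"] != "NA"]

# Count wells per (knockout, reporter, induction level) condition.
replicate_counts = Counter(
    (row["knockoutgene"], row["yfpreporter"], row["iptginductionlevel"])
    for row in samples
)
```

Because each variable has its own column, selecting or grouping samples needs only column names and no parsing of sample labels.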

sampleannotations.csv for a qPCR experiment

well rt amplicon initiation codonmutation replicate template qpcrlabel strain
B03 yes gpd na na 1 84+1 84q11 scHP15-2
B04 yes gpd ctgc 5xcgg 1 84+4 84q21 scHP286-1
B05 yes gpd aaaa 5xcgg 1 84+5 84q31 scHP291-1
B06 yes gpd ctgc 5xaga 1 84+8 84q41 scHP314-1
B07 yes gpd aaaa 5xaga 1 84+9 84q51 scHP315-1
B08 no gpd aaaa 5xaga 1 84-9 84q61 scHP315-1
C03 yes yfp na na 1 84+1 84q12 scHP15-2
C04 yes yfp ctgc 5xcgg 1 84+4 84q22 scHP286-1
C05 yes yfp aaaa 5xcgg 1 84+5 84q32 scHP291-1
C06 yes yfp ctgc 5xaga 1 84+8 84q42 scHP314-1
C07 yes yfp aaaa 5xaga 1 84+9 84q52 scHP315-1
C08 no yfp aaaa 5xaga 1 84-9 84q62 scHP315-1
D03 yes gpd na na 2 84+1 84q11 scHP15-2
D04 yes gpd ctgc 5xcgg 2 84+4 84q21 scHP286-1
D05 yes gpd aaaa 5xcgg 2 84+5 84q31 scHP291-1
D06 yes gpd ctgc 5xaga 2 84+8 84q41 scHP314-1
D07 yes gpd aaaa 5xaga 2 84+9 84q51 scHP315-1
D08 no gpd aaaa 5xaga 2 84-9 84q61 scHP315-1
E03 yes yfp na na 2 84+1 84q12 scHP15-2
E04 yes yfp ctgc 5xcgg 2 84+4 84q22 scHP286-1
E05 yes yfp aaaa 5xcgg 2 84+5 84q32 scHP291-1
E06 yes yfp ctgc 5xaga 2 84+8 84q42 scHP314-1
E07 yes yfp aaaa 5xaga 2 84+9 84q52 scHP315-1
E08 no yfp aaaa 5xaga 2 84-9 84q62 scHP315-1
F03 yes gpd na na 3 84+1 84q11 scHP15-2
F04 yes gpd ctgc 5xcgg 3 84+4 84q21 scHP286-1
F05 yes gpd aaaa 5xcgg 3 84+5 84q31 scHP291-1
F06 yes gpd ctgc 5xaga 3 84+8 84q41 scHP314-1
F07 yes gpd aaaa 5xaga 3 84+9 84q51 scHP315-1
F08 no gpd aaaa 5xaga 3 84-9 84q61 scHP315-1
G03 yes yfp na na 3 84+1 84q12 scHP15-2
G04 yes yfp ctgc 5xcgg 3 84+4 84q22 scHP286-1
G05 yes yfp aaaa 5xcgg 3 84+5 84q32 scHP291-1
G06 yes yfp ctgc 5xaga 3 84+8 84q42 scHP314-1
G07 yes yfp aaaa 5xaga 3 84+9 84q52 scHP315-1
G08 no yfp aaaa 5xaga 3 84-9 84q62 scHP315-1

sampleannotations.csv for a flow cytometry experiment

plate file strain stallcodon ncodonrepeats stallsites initiation gene gpdmkate2 citrine replicate
1 Specimen001B2B02001 by4741 na na na na na no no 1
1 Specimen001B3B03002 schp15 na na na na na yes no 1
1 Specimen001B4B04003 schp19 cgg 6 1 caaa maxhis3 yes yes 1
1 Specimen001B5B05004 schp20 aga 6 na caaa maxhis3 yes yes 1
1 Specimen001B6B06005 schp76 aga 5 na caaa pgk1 yes yes 1
1 Specimen001B7E07006 schp91 cgg 5 5 caaa pgk1 yes yes 1
1 Specimen001B8B08007 schp617 cca 8 na caaa pgk1 yes yes 1
1 Specimen001B9B09008 schp618 cca 8 na ccgc pgk1 yes yes 1
1 Specimen001B10B10009 schp619 cca 8 na ccaa pgk1 yes yes 1
1 Specimen001B11B11010 schp620 cca 8 na ccac pgk1 yes yes 1
1 Specimen001C2C02011 schp621 cca 8 na ccga pgk1 yes yes 1
1 Specimen001C3C03012 schp622 cca 8 na ctgc pgk1 yes yes 1
1 Specimen001C4C04013 schp623 cca 8 na aaaa pgk1 yes yes 1
1 Specimen001C5C05014 schp624 cca 8 na acgc pgk1 yes yes 1
1 Specimen001C6C06015 schp625 cca 8 na ctg pgk1 yes yes 1
1 Specimen001C7C07016 schp626 ccg 8 1 caaa pgk1 yes yes 1
1 Specimen001C8C08017 schp627 ccg 8 1 ccgc pgk1 yes yes 1
1 Specimen001C9C09018 schp628 ccg 8 1 ccaa pgk1 yes yes 1
1 Specimen001C10C10019 schp629 ccg 8 1 ccac pgk1 yes yes 1
1 Specimen001C11C11020 schp630 ccg 8 1 ccga pgk1 yes yes 1
1 Specimen001D2D02021 schp631 ccg 8 1 ctgc pgk1 yes yes 1
1 Specimen001D3D03022 schp632 ccg 8 1 aaaa pgk1 yes yes 1
1 Specimen001D4D04023 schp633 ccg 8 1 acgc pgk1 yes yes 1
1 Specimen001D5D05024 schp634 ccg 8 1 ctg pgk1 yes yes 1
2 Specimen002E2E02001 by4741 na na na na na no no 2
2 Specimen002E3E03002 schp15 na na na na na yes no 2
2 Specimen002E4E04003 schp19 cgg 6 1 caaa maxhis3 yes yes 2
2 Specimen002E5E05004 schp20 aga 6 na caaa maxhis3 yes yes 2
2 Specimen002E6E06005 schp76 aga 5 na caaa pgk1 yes yes 2
2 Specimen002E7E07006 schp91 cgg 5 5 caaa pgk1 yes yes 2
2 Specimen002E8E08007 schp617 cca 8 na caaa pgk1 yes yes 2
2 Specimen002E9E09008 schp618 cca 8 na ccgc pgk1 yes yes 2
2 Specimen002E10E10009 schp619 cca 8 na ccaa pgk1 yes yes 2
2 Specimen002E11E11010 schp620 cca 8 na ccac pgk1 yes yes 2
2 Specimen002F2F02011 schp621 cca 8 na ccga pgk1 yes yes 2
2 Specimen002F3F03012 schp622 cca 8 na ctgc pgk1 yes yes 2
2 Specimen002F4F04013 schp623 cca 8 na aaaa pgk1 yes yes 2
2 Specimen002F5F05014 schp624 cca 8 na acgc pgk1 yes yes 2
2 Specimen002F6F06015 schp625 cca 8 na ctg pgk1 yes yes 2
2 Specimen002F7F07016 schp626 ccg 8 1 caaa pgk1 yes yes 2
2 Specimen002F8F08017 schp627 ccg 8 1 ccgc pgk1 yes yes 2
2 Specimen002F9F09018 schp628 ccg 8 1 ccaa pgk1 yes yes 2
2 Specimen002F10F10019 schp629 ccg 8 1 ccac pgk1 yes yes 2
2 Specimen002F11F11020 schp630 ccg 8 1 ccga pgk1 yes yes 2
2 Specimen002G2G02021 schp631 ccg 8 1 ctgc pgk1 yes yes 2
2 Specimen002G3G03022 schp632 ccg 8 1 aaaa pgk1 yes yes 2
2 Specimen002G4G04023 schp633 ccg 8 1 acgc pgk1 yes yes 2
2 Specimen002G5G05024 schp634 ccg 8 1 ctg pgk1 yes yes 2

sampleannotations.csv for a deep sequencing experiment

index samplename genotype treatment type replicate
GTAGCC wtuntreatedmono wt untreated mono 1
AAGCTA wtifnmono wt ifn mono 1
GCCTAA rack11untreatedmono rack1 untreated mono 1
CGTGAT rack11ifnmono rack1 ifn mono 1
GATCTG rack12untreatedmono rack1 untreated mono 2
ATTGGC rack12ifnmono rack1 ifn mono 2
CACGAT wtuntreatedtotal wt untreated total 1
CAACTA wtifntotal wt ifn total 1
GGTAGC rack11untreatedtotal rack1 untreated total 1
GTAGAG rack11ifntotal rack1 ifn total 1
CAAAAG rack12untreatedtotal rack1 untreated total 2
ATGAGC rack12ifntotal rack1 ifn total 2
