How to create a sample annotation file before analysis
Step 1
Create a table in LibreOffice or Excel with one or more columns that uniquely identify each of your samples. For example,
- In a plate reader or qPCR experiment, you will want a plate number column and a well number column for each measured well.
- In a flow cytometry experiment, create a column with the unique specimen number or the unique filename for each sample.
- In a deep sequencing experiment, it will be the unique index for each library.
- In a Western blot, it will be a gel number column and a lane number column.
💡 Tip
To avoid mistyping filenames, you can use simple shell commands to quickly set up your sample annotations file. For example, to annotate a flow cytometry experiment:
echo "file,sample" > sampleannotations.csv # create the annotations file with a header row
for file in *.fcs; do
  echo "${file}," >> sampleannotations.csv # append one row per file, leaving sample empty
done
This will produce a spreadsheet with two columns, file and sample,
where each file ending with .fcs appears once.
The spreadsheet can then be edited further using Excel or LibreOffice.
For experiments in which each file holds several samples (e.g. Western blots, where one image contains many lanes),
extra loops can be nested inside the for loop.
For example, to annotate a Western blot from a gel that had nine lanes:
echo "file,lane," > sampleannotations.csv # create the annotations file with a header row
for file in *.tif; do
  for lane in {1..9}; do
    echo "${file},${lane}," >> sampleannotations.csv # one row per lane of each image
  done
done
This will produce a spreadsheet with one row for each of the 9 lanes of every image file that ends in .tif.
The spreadsheet can then be edited further using Excel or LibreOffice.
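The same approach works when samples are identified by plate and well rather than by filename. The following is a minimal sketch, assuming two 96-well plates with rows A to H and columns 1 to 12; the number of plates and the layout are illustrative assumptions, so adjust the loops to match your own experiment.

```shell
# Sketch: one row per well for two 96-well plates (assumed layout).
outfile=sampleannotations.csv
echo "plate,well" > "$outfile"                     # header row
for plate in 1 2; do
  for row in A B C D E F G H; do
    for col in 1 2 3 4 5 6 7 8 9 10 11 12; do
      echo "${plate},${row}${col}" >> "$outfile"   # e.g. 1,A1
    done
  done
done
```

The explicit lists are used instead of brace expansion so that the sketch also runs in shells other than bash.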
Step 2
Create one metadata column for each biological and technical variable that you are changing in your experiment.
For example:
- you did your experiment with three different gene knockouts
- each knockout had two different fluorescent reporters
- each fluorescent reporter is induced at 3 different concentrations
- there are 3 technical replicates for each strain.
So the metadata columns you will create are:
- knockoutgene
- reporter
- inductionlevel
- replicate
In addition, create a column called samplelabel with a succinct unique name for each sample (see the example below).
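When the experiment covers every combination of the variables, the metadata columns can be generated with nested loops instead of being typed by hand. The sketch below assumes the values used in the plate reader example (knockouts sera, serb, serc; reporters tcg and agc; induction levels 0, 10 and 100; three replicates). The output filename metadatacolumns.csv and the way samplelabel is built are hypothetical choices for illustration.

```shell
# Sketch: enumerate every combination of the four experimental variables.
# The values below are assumptions borrowed from the plate reader example.
outfile=metadatacolumns.csv   # hypothetical file name
echo "knockoutgene,reporter,inductionlevel,replicate,samplelabel" > "$outfile"
for gene in sera serb serc; do
  for reporter in tcg agc; do
    for level in 0 10 100; do
      for rep in 1 2 3; do
        # samplelabel joins the values; "r" separates level from replicate
        echo "${gene},${reporter},${level},${rep},${gene}${reporter}${level}r${rep}" >> "$outfile"
      done
    done
  done
done
```

The resulting rows can be pasted next to the identifying columns from Step 1, provided their order matches how the samples were actually laid out.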
Step 3
Save the file as sampleannotations.csv by choosing the CSV file type and a comma (,) as the field separator.
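After saving, it can be worth confirming that every row has the same number of fields as the header, since a stray or missing comma shifts the columns. The following is a minimal sketch; check_columns is a hypothetical helper name, and the check assumes that no field contains a quoted comma.

```shell
# Sketch: report rows whose number of comma-separated fields differs from the header.
# check_columns is a hypothetical helper, not part of any existing tool.
# It does not handle quoted fields that contain commas.
check_columns() {
  awk -F, '
    NR == 1 { n = NF; next }     # remember the header width
    NF != n { printf "line %d has %d fields, expected %d\n", NR, NF, n; bad = 1 }
    END     { exit bad }         # non-zero exit status if any row is off
  ' "$1"
}

# Usage on a small illustrative file (contents are made up for the demo)
printf 'plate,well,comment\n1,a1,blank\n1,a2\n' > demo.csv
check_columns demo.csv || echo "fix the rows listed above"
```

To check the real file, run check_columns sampleannotations.csv in the folder where it was saved.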
Important Pointers
- Remember that each variable gets a column. For example, if you collected the same sample at different time points, then create a timepoint metadata column.
- You can insert NA in columns for which the metadata is not applicable; for example, use NA for an empty well in a plate reader experiment.
- Create a comment column that can include any atypical explanation.
- Do not insert spaces, capital letters or non-alphanumeric characters into your annotations file.
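The last pointer can be checked mechanically. The sketch below lists every field that contains anything other than lowercase letters and digits. It assumes that empty fields and the literal NA are acceptable, which is an interpretation of the pointers above rather than a stated rule, and check_fields is a hypothetical helper name.

```shell
# Sketch: list fields that break the naming pointers above.
# Assumption: a field is acceptable if it is empty, exactly NA,
# or made only of lowercase letters and digits.
check_fields() {   # hypothetical helper name
  LC_ALL=C awk -F, '
    {
      for (i = 1; i <= NF; i++)
        if ($i != "NA" && $i !~ /^[a-z0-9]*$/)
          printf "line %d, column %d: %s\n", NR, i, $i
    }
  ' "$1"
}

# Usage on a small illustrative file (contents are made up for the demo)
printf 'plate,well,knockoutgene\n1,a1,NA\n1,a2,Ser A\n' > demo.csv
check_fields demo.csv
```

Setting LC_ALL=C keeps the a-z range limited to ASCII lowercase letters regardless of the system locale.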
Examples
sampleannotations.csv for a plate reader experiment
| plate | well | knockoutgene | yfpreporter | iptginductionlevel | replicate | comment |
|---|---|---|---|---|---|---|
| 1 | A1 | NA | NA | NA | NA | blank |
| 1 | A2 | sera | tcg | 0 | 1 | |
| 1 | A3 | sera | tcg | 0 | 2 | |
| 1 | A4 | sera | tcg | 0 | 3 | |
| 1 | A5 | sera | tcg | 10 | 1 | |
| 1 | A6 | sera | tcg | 10 | 2 | |
| 1 | A7 | sera | tcg | 10 | 3 | |
| 1 | A8 | sera | tcg | 100 | 1 | |
| 1 | A9 | sera | tcg | 100 | 2 | |
| 1 | A10 | sera | tcg | 100 | 3 | |
| 1 | A11 | NA | NA | NA | NA | blank |
| 1 | A12 | NA | NA | NA | NA | blank |
| 1 | B1 | NA | NA | NA | NA | blank |
| 1 | B2 | sera | agc | 0 | 1 | |
| 1 | B3 | sera | agc | 0 | 2 | |
| 1 | B4 | sera | agc | 0 | 3 | |
| 1 | B5 | sera | agc | 10 | 1 | |
| 1 | B6 | sera | agc | 10 | 2 | |
| 1 | B7 | sera | agc | 10 | 3 | |
| 1 | B8 | sera | agc | 100 | 1 | |
| 1 | B9 | sera | agc | 100 | 2 | |
| 1 | B10 | sera | agc | 100 | 3 | |
| 1 | B11 | NA | NA | NA | NA | blank |
| 1 | B12 | NA | NA | NA | NA | blank |
| 1 | C1 | NA | NA | NA | NA | blank |
| 1 | C2 | serb | tcg | 0 | 1 | |
| 1 | C3 | serb | tcg | 0 | 2 | |
| 1 | C4 | serb | tcg | 0 | 3 | |
| 1 | C5 | serb | tcg | 10 | 1 | |
| 1 | C6 | serb | tcg | 10 | 2 | |
| 1 | C7 | serb | tcg | 10 | 3 | |
| 1 | C8 | serb | tcg | 100 | 1 | |
| 1 | C9 | serb | tcg | 100 | 2 | |
| 1 | C10 | serb | tcg | 100 | 3 | |
| 1 | C11 | NA | NA | NA | NA | blank |
| 1 | C12 | NA | NA | NA | NA | blank |
| 1 | D1 | NA | NA | NA | NA | blank |
| 1 | D2 | serb | agc | 0 | 1 | |
| 1 | D3 | serb | agc | 0 | 2 | |
| 1 | D4 | serb | agc | 0 | 3 | |
| 1 | D5 | serb | agc | 10 | 1 | |
| 1 | D6 | serb | agc | 10 | 2 | |
| 1 | D7 | serb | agc | 10 | 3 | |
| 1 | D8 | serb | agc | 100 | 1 | |
| 1 | D9 | serb | agc | 100 | 2 | |
| 1 | D10 | serb | agc | 100 | 3 | |
| 1 | D11 | NA | NA | NA | NA | blank |
| 1 | D12 | NA | NA | NA | NA | blank |
| 2 | A1 | NA | NA | NA | NA | blank |
| 2 | A2 | serc | tcg | 0 | 1 | |
| 2 | A3 | serc | tcg | 0 | 2 | |
| 2 | A4 | serc | tcg | 0 | 3 | |
| 2 | A5 | serc | tcg | 10 | 1 | |
| 2 | A6 | serc | tcg | 10 | 2 | |
| 2 | A7 | serc | tcg | 10 | 3 | |
| 2 | A8 | serc | tcg | 100 | 1 | |
| 2 | A9 | serc | tcg | 100 | 2 | |
| 2 | A10 | serc | tcg | 100 | 3 | |
| 2 | A11 | NA | NA | NA | NA | blank |
| 2 | A12 | NA | NA | NA | NA | blank |
| 2 | B1 | NA | NA | NA | NA | blank |
| 2 | B2 | serc | agc | 0 | 1 | |
| 2 | B3 | serc | agc | 0 | 2 | |
| 2 | B4 | serc | agc | 0 | 3 | |
| 2 | B5 | serc | agc | 10 | 1 | |
| 2 | B6 | serc | agc | 10 | 2 | |
| 2 | B7 | serc | agc | 10 | 3 | |
| 2 | B8 | serc | agc | 100 | 1 | |
| 2 | B9 | serc | agc | 100 | 2 | |
| 2 | B10 | serc | agc | 100 | 3 | |
| 2 | B11 | NA | NA | NA | NA | blank |
| 2 | B12 | NA | NA | NA | NA | blank |
sampleannotations.csv for a qPCR experiment
| well | rt | amplicon | initiation | codonmutation | replicate | template | qpcrlabel | strain |
|---|---|---|---|---|---|---|---|---|
| B03 | yes | gpd | na | na | 1 | 84+1 | 84q11 | scHP15-2 |
| B04 | yes | gpd | ctgc | 5xcgg | 1 | 84+4 | 84q21 | scHP286-1 |
| B05 | yes | gpd | aaaa | 5xcgg | 1 | 84+5 | 84q31 | scHP291-1 |
| B06 | yes | gpd | ctgc | 5xaga | 1 | 84+8 | 84q41 | scHP314-1 |
| B07 | yes | gpd | aaaa | 5xaga | 1 | 84+9 | 84q51 | scHP315-1 |
| B08 | no | gpd | aaaa | 5xaga | 1 | 84-9 | 84q61 | scHP315-1 |
| C03 | yes | yfp | na | na | 1 | 84+1 | 84q12 | scHP15-2 |
| C04 | yes | yfp | ctgc | 5xcgg | 1 | 84+4 | 84q22 | scHP286-1 |
| C05 | yes | yfp | aaaa | 5xcgg | 1 | 84+5 | 84q32 | scHP291-1 |
| C06 | yes | yfp | ctgc | 5xaga | 1 | 84+8 | 84q42 | scHP314-1 |
| C07 | yes | yfp | aaaa | 5xaga | 1 | 84+9 | 84q52 | scHP315-1 |
| C08 | no | yfp | aaaa | 5xaga | 1 | 84-9 | 84q62 | scHP315-1 |
| D03 | yes | gpd | na | na | 2 | 84+1 | 84q11 | scHP15-2 |
| D04 | yes | gpd | ctgc | 5xcgg | 2 | 84+4 | 84q21 | scHP286-1 |
| D05 | yes | gpd | aaaa | 5xcgg | 2 | 84+5 | 84q31 | scHP291-1 |
| D06 | yes | gpd | ctgc | 5xaga | 2 | 84+8 | 84q41 | scHP314-1 |
| D07 | yes | gpd | aaaa | 5xaga | 2 | 84+9 | 84q51 | scHP315-1 |
| D08 | no | gpd | aaaa | 5xaga | 2 | 84-9 | 84q61 | scHP315-1 |
| E03 | yes | yfp | na | na | 2 | 84+1 | 84q12 | scHP15-2 |
| E04 | yes | yfp | ctgc | 5xcgg | 2 | 84+4 | 84q22 | scHP286-1 |
| E05 | yes | yfp | aaaa | 5xcgg | 2 | 84+5 | 84q32 | scHP291-1 |
| E06 | yes | yfp | ctgc | 5xaga | 2 | 84+8 | 84q42 | scHP314-1 |
| E07 | yes | yfp | aaaa | 5xaga | 2 | 84+9 | 84q52 | scHP315-1 |
| E08 | no | yfp | aaaa | 5xaga | 2 | 84-9 | 84q62 | scHP315-1 |
| F03 | yes | gpd | na | na | 3 | 84+1 | 84q11 | scHP15-2 |
| F04 | yes | gpd | ctgc | 5xcgg | 3 | 84+4 | 84q21 | scHP286-1 |
| F05 | yes | gpd | aaaa | 5xcgg | 3 | 84+5 | 84q31 | scHP291-1 |
| F06 | yes | gpd | ctgc | 5xaga | 3 | 84+8 | 84q41 | scHP314-1 |
| F07 | yes | gpd | aaaa | 5xaga | 3 | 84+9 | 84q51 | scHP315-1 |
| F08 | no | gpd | aaaa | 5xaga | 3 | 84-9 | 84q61 | scHP315-1 |
| G03 | yes | yfp | na | na | 3 | 84+1 | 84q12 | scHP15-2 |
| G04 | yes | yfp | ctgc | 5xcgg | 3 | 84+4 | 84q22 | scHP286-1 |
| G05 | yes | yfp | aaaa | 5xcgg | 3 | 84+5 | 84q32 | scHP291-1 |
| G06 | yes | yfp | ctgc | 5xaga | 3 | 84+8 | 84q42 | scHP314-1 |
| G07 | yes | yfp | aaaa | 5xaga | 3 | 84+9 | 84q52 | scHP315-1 |
| G08 | no | yfp | aaaa | 5xaga | 3 | 84-9 | 84q62 | scHP315-1 |
sampleannotations.csv for a flow cytometry experiment
| plate | file | strain | stallcodon | ncodonrepeats | stallsites | initiation | gene | gpdmkate2 | citrine | replicate |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Specimen001B2B02001 | by4741 | na | na | na | na | na | no | no | 1 |
| 1 | Specimen001B3B03002 | schp15 | na | na | na | na | na | yes | no | 1 |
| 1 | Specimen001B4B04003 | schp19 | cgg | 6 | 1 | caaa | maxhis3 | yes | yes | 1 |
| 1 | Specimen001B5B05004 | schp20 | aga | 6 | na | caaa | maxhis3 | yes | yes | 1 |
| 1 | Specimen001B6B06005 | schp76 | aga | 5 | na | caaa | pgk1 | yes | yes | 1 |
| 1 | Specimen001B7E07006 | schp91 | cgg | 5 | 5 | caaa | pgk1 | yes | yes | 1 |
| 1 | Specimen001B8B08007 | schp617 | cca | 8 | na | caaa | pgk1 | yes | yes | 1 |
| 1 | Specimen001B9B09008 | schp618 | cca | 8 | na | ccgc | pgk1 | yes | yes | 1 |
| 1 | Specimen001B10B10009 | schp619 | cca | 8 | na | ccaa | pgk1 | yes | yes | 1 |
| 1 | Specimen001B11B11010 | schp620 | cca | 8 | na | ccac | pgk1 | yes | yes | 1 |
| 1 | Specimen001C2C02011 | schp621 | cca | 8 | na | ccga | pgk1 | yes | yes | 1 |
| 1 | Specimen001C3C03012 | schp622 | cca | 8 | na | ctgc | pgk1 | yes | yes | 1 |
| 1 | Specimen001C4C04013 | schp623 | cca | 8 | na | aaaa | pgk1 | yes | yes | 1 |
| 1 | Specimen001C5C05014 | schp624 | cca | 8 | na | acgc | pgk1 | yes | yes | 1 |
| 1 | Specimen001C6C06015 | schp625 | cca | 8 | na | ctg | pgk1 | yes | yes | 1 |
| 1 | Specimen001C7C07016 | schp626 | ccg | 8 | 1 | caaa | pgk1 | yes | yes | 1 |
| 1 | Specimen001C8C08017 | schp627 | ccg | 8 | 1 | ccgc | pgk1 | yes | yes | 1 |
| 1 | Specimen001C9C09018 | schp628 | ccg | 8 | 1 | ccaa | pgk1 | yes | yes | 1 |
| 1 | Specimen001C10C10019 | schp629 | ccg | 8 | 1 | ccac | pgk1 | yes | yes | 1 |
| 1 | Specimen001C11C11020 | schp630 | ccg | 8 | 1 | ccga | pgk1 | yes | yes | 1 |
| 1 | Specimen001D2D02021 | schp631 | ccg | 8 | 1 | ctgc | pgk1 | yes | yes | 1 |
| 1 | Specimen001D3D03022 | schp632 | ccg | 8 | 1 | aaaa | pgk1 | yes | yes | 1 |
| 1 | Specimen001D4D04023 | schp633 | ccg | 8 | 1 | acgc | pgk1 | yes | yes | 1 |
| 1 | Specimen001D5D05024 | schp634 | ccg | 8 | 1 | ctg | pgk1 | yes | yes | 1 |
| 2 | Specimen002E2E02001 | by4741 | na | na | na | na | na | no | no | 2 |
| 2 | Specimen002E3E03002 | schp15 | na | na | na | na | na | yes | no | 2 |
| 2 | Specimen002E4E04003 | schp19 | cgg | 6 | 1 | caaa | maxhis3 | yes | yes | 2 |
| 2 | Specimen002E5E05004 | schp20 | aga | 6 | na | caaa | maxhis3 | yes | yes | 2 |
| 2 | Specimen002E6E06005 | schp76 | aga | 5 | na | caaa | pgk1 | yes | yes | 2 |
| 2 | Specimen002E7E07006 | schp91 | cgg | 5 | 5 | caaa | pgk1 | yes | yes | 2 |
| 2 | Specimen002E8E08007 | schp617 | cca | 8 | na | caaa | pgk1 | yes | yes | 2 |
| 2 | Specimen002E9E09008 | schp618 | cca | 8 | na | ccgc | pgk1 | yes | yes | 2 |
| 2 | Specimen002E10E10009 | schp619 | cca | 8 | na | ccaa | pgk1 | yes | yes | 2 |
| 2 | Specimen002E11E11010 | schp620 | cca | 8 | na | ccac | pgk1 | yes | yes | 2 |
| 2 | Specimen002F2F02011 | schp621 | cca | 8 | na | ccga | pgk1 | yes | yes | 2 |
| 2 | Specimen002F3F03012 | schp622 | cca | 8 | na | ctgc | pgk1 | yes | yes | 2 |
| 2 | Specimen002F4F04013 | schp623 | cca | 8 | na | aaaa | pgk1 | yes | yes | 2 |
| 2 | Specimen002F5F05014 | schp624 | cca | 8 | na | acgc | pgk1 | yes | yes | 2 |
| 2 | Specimen002F6F06015 | schp625 | cca | 8 | na | ctg | pgk1 | yes | yes | 2 |
| 2 | Specimen002F7F07016 | schp626 | ccg | 8 | 1 | caaa | pgk1 | yes | yes | 2 |
| 2 | Specimen002F8F08017 | schp627 | ccg | 8 | 1 | ccgc | pgk1 | yes | yes | 2 |
| 2 | Specimen002F9F09018 | schp628 | ccg | 8 | 1 | ccaa | pgk1 | yes | yes | 2 |
| 2 | Specimen002F10F10019 | schp629 | ccg | 8 | 1 | ccac | pgk1 | yes | yes | 2 |
| 2 | Specimen002F11F11020 | schp630 | ccg | 8 | 1 | ccga | pgk1 | yes | yes | 2 |
| 2 | Specimen002G2G02021 | schp631 | ccg | 8 | 1 | ctgc | pgk1 | yes | yes | 2 |
| 2 | Specimen002G3G03022 | schp632 | ccg | 8 | 1 | aaaa | pgk1 | yes | yes | 2 |
| 2 | Specimen002G4G04023 | schp633 | ccg | 8 | 1 | acgc | pgk1 | yes | yes | 2 |
| 2 | Specimen002G5G05024 | schp634 | ccg | 8 | 1 | ctg | pgk1 | yes | yes | 2 |
sampleannotations.csv for a deep sequencing experiment
| index | samplename | genotype | treatment | type | replicate |
|---|---|---|---|---|---|
| GTAGCC | wtuntreatedmono | wt | untreated | mono | 1 |
| AAGCTA | wtifnmono | wt | ifn | mono | 1 |
| GCCTAA | rack11untreatedmono | rack1 | untreated | mono | 1 |
| CGTGAT | rack11ifnmono | rack1 | ifn | mono | 1 |
| GATCTG | rack12untreatedmono | rack1 | untreated | mono | 2 |
| ATTGGC | rack12ifnmono | rack1 | ifn | mono | 2 |
| CACGAT | wtuntreatedtotal | wt | untreated | total | 1 |
| CAACTA | wtifntotal | wt | ifn | total | 1 |
| GGTAGC | rack11untreatedtotal | rack1 | untreated | total | 1 |
| GTAGAG | rack11ifntotal | rack1 | ifn | total | 1 |
| CAAAAG | rack12untreatedtotal | rack1 | untreated | total | 2 |
| ATGAGC | rack12ifntotal | rack1 | ifn | total | 2 |
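Whichever experiment type you annotate, the identifying columns from Step 1 should contain no repeated values. The following is a minimal sketch that assumes the identifier is the first column of the CSV file, as the index is in the deep sequencing example. The demo file is made up, and its last row deliberately repeats an index to show what a failure looks like.

```shell
# Sketch: print any value that occurs more than once in the first column.
# Assumes the identifier (here the index) is the first column of the CSV.
# The last demo row deliberately repeats an index from the first row.
printf 'index,samplename\nGTAGCC,wtuntreatedmono\nAAGCTA,wtifnmono\nGTAGCC,duplicatedrow\n' > demo.csv
duplicates=$(tail -n +2 demo.csv | cut -d, -f1 | sort | uniq -d)
if [ -n "$duplicates" ]; then
  echo "duplicated identifiers: $duplicates"
else
  echo "all identifiers are unique"
fi
```

For identifiers that span two columns, such as plate and well, cut -d, -f1,2 selects both columns before sorting.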