How to create a sample annotation file before analysis
Step 1¶
Create a table in LibreOffice or Excel with one or more columns that uniquely identify each of your samples. For example,
- In a plate reader or qPCR experiment, you will want a plate number column and a well number column for each measured well.
- In a flow cytometry experiment, create a column with the unique specimen number or the unique filename for each sample.
- In a deep sequencing experiment, it will be the unique index for each library.
- In a Western blot, it will be a gel number column and a lane number column.
💡 Tip¶
To avoid mistyping filenames, you can use simple shell commands to quickly set up your sample annotations file. For example, to annotate a flow cytometry experiment:
echo file,sample, > sampleannotations.csv # to create the annotations file
for file in *.fcs; do
echo ${file},, >> sampleannotations.csv; # append, so earlier rows are kept
done
This will produce a spreadsheet with two columns, file and sample, where each file ending with .fcs appears once.
The spreadsheet can then be edited further using Excel or LibreOffice.
For experiments in which each file contains several samples (e.g. Western blots), extra lines can be added to the for loop.
For example, to annotate a Western blot from a gel that had nine lanes:
echo file,lane, > sampleannotations.csv # to create the annotations file
for file in *.tif; do
for lane in {1..9}; do
echo ${file},${lane}, >> sampleannotations.csv;
done;
done
This will produce a spreadsheet with each of the 9 lanes assigned to every image file that ends in .tif.
The spreadsheet can then be edited further using Excel or LibreOffice.
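The same idea can be sketched for a plate reader experiment, where the identifying columns are plate and well. This is only a minimal sketch under stated assumptions: 96-well plates with rows A to H and columns 1 to 12, and a hypothetical helper named plate_skeleton that is not part of any existing tool. The loops are written in plain POSIX sh so that they should also run outside bash.

```shell
# Hypothetical helper (an assumption, not from the original guide):
# print a plate,well skeleton for the given number of 96-well plates.
# Usage: plate_skeleton <number of plates>
plate_skeleton() {
    nplates=$1
    echo "plate,well"                      # header row
    plate=1
    while [ "$plate" -le "$nplates" ]; do
        for row in A B C D E F G H; do     # assumed 96-well layout
            col=1
            while [ "$col" -le 12 ]; do
                echo "${plate},${row}${col}"
                col=$((col + 1))
            done
        done
        plate=$((plate + 1))
    done
}

# Write a skeleton for two plates; metadata columns are added afterwards.
plate_skeleton 2 > sampleannotations.csv
```

As with the examples above, the resulting file can then be opened in Excel or LibreOffice to fill in the metadata columns.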
Step 2¶
Create one metadata column for each biological and technical variable that you are changing in your experiment.
For example:
- you did your experiment with three different gene knockouts
- each knockout had two different fluorescent reporters
- each fluorescent reporter is induced at 3 different concentrations
- there are 3 technical replicates for each strain.
So the metadata columns you will create are:
- knockoutgene
- reporter
- inductionlevel
- replicate
In addition, create a column called samplelabel with a succinct unique name for each sample (see the examples below).
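When every combination of the variables was measured, the four metadata columns and samplelabel described above could be generated with nested loops instead of typed by hand. The following is a minimal sketch, not a prescribed method: the values (sera, serb, serc, tcg, agc, 0, 10, 100) are borrowed from the plate reader example on this page, the output name designannotations.csv is hypothetical, and the label format is just one way to get a unique name without spaces or punctuation.

```shell
# Sketch: one row per combination of knockout, reporter, induction level
# and replicate. All values and the file name are illustrative assumptions.
out=designannotations.csv
echo "knockoutgene,reporter,inductionlevel,replicate,samplelabel" > "$out"

for gene in sera serb serc; do
    for reporter in tcg agc; do
        for level in 0 10 100; do
            for rep in 1 2 3; do
                # Label is the concatenation of all variables; "rep"
                # separates the induction level from the replicate number.
                label="${gene}${reporter}${level}rep${rep}"
                echo "${gene},${reporter},${level},${rep},${label}"
            done
        done
    done
done >> "$out"
```

The identifying columns from Step 1 (for example plate and well) still have to be added to these rows, since the loop only knows about the experimental design and not about where each sample was measured.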
Step 3¶
Save the file as sampleannotations.csv by choosing the CSV file type with a comma (,) as the field separator.
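After saving, it may be worth checking that the file has the shape the steps above describe: every row has the same number of fields as the header, and the identifying columns from Step 1 are unique. The function below is a minimal sketch of such a check. The name check_annotations is hypothetical, it assumes the identifying columns come first in the file, and it splits naively on commas, so it will miscount quoted fields that themselves contain commas.

```shell
# Hypothetical helper (an assumption, not part of any existing tool).
# Usage: check_annotations <file> <number of leading identifying columns>
check_annotations() {
    file=$1
    nkeys=$2
    status=0

    # 1. Every row should have as many comma-separated fields as the header.
    if ! awk -F, '
            NR == 1 { n = NF }
            NF != n { printf "line %d: %d fields, expected %d\n", NR, NF, n; bad = 1 }
            END     { exit bad + 0 }
        ' "$file"; then
        status=1
    fi

    # 2. The first nkeys columns, taken together, should not repeat.
    dups=$(tail -n +2 "$file" | cut -d, -f"1-${nkeys}" | sort | uniq -d)
    if [ -n "$dups" ]; then
        echo "duplicated identifiers:"
        echo "$dups"
        status=1
    fi

    return "$status"
}
```

For a plate reader file whose first two columns are plate and well, the call would be `check_annotations sampleannotations.csv 2`; a non-zero exit status means at least one of the two checks failed.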
Important Pointers¶
- Remember that each variable gets a column. For example, if you collected the same sample at different time points, then create a timepoint metadata column.
- You can insert NA in columns for which the metadata is not applicable. For example, use NA for an empty well in a plate reader experiment.
- Create a comment column for notes on anything atypical about a sample.
- Do not insert spaces, capital letters or non-alphanumeric characters into your annotations file.
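The last pointer can be checked mechanically. The sketch below takes a deliberately strict reading of it: every field that is not empty and not NA must consist only of lowercase letters and digits. The name lint_annotations is hypothetical, and the pattern is an assumption you may want to relax, since this strict reading also flags well names such as A1, decimal values such as 0.5, and filenames with an extension.

```shell
# Hypothetical helper (an assumption, not part of any existing tool).
# Prints each field that contains anything other than a-z and 0-9,
# ignoring empty fields and NA. Returns non-zero if any field is flagged.
# Usage: lint_annotations <file>
lint_annotations() {
    # LC_ALL=C keeps the a-z range limited to ASCII lowercase letters.
    LC_ALL=C awk -F, '
        {
            for (i = 1; i <= NF; i++) {
                if ($i != "" && $i != "NA" && $i ~ /[^a-z0-9]/) {
                    printf "line %d, column %d: \"%s\"\n", NR, i, $i
                    bad = 1
                }
            }
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Running `lint_annotations sampleannotations.csv` lists the offending fields by line and column, so they can be fixed in the spreadsheet before analysis.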
Examples¶
sampleannotations.csv for a plate reader experiment¶
plate | well | knockoutgene | yfpreporter | iptginductionlevel | replicate | comment |
---|---|---|---|---|---|---|
1 | A1 | NA | NA | NA | NA | blank |
1 | A2 | sera | tcg | 0 | 1 | |
1 | A3 | sera | tcg | 0 | 2 | |
1 | A4 | sera | tcg | 0 | 3 | |
1 | A5 | sera | tcg | 10 | 1 | |
1 | A6 | sera | tcg | 10 | 2 | |
1 | A7 | sera | tcg | 10 | 3 | |
1 | A8 | sera | tcg | 100 | 1 | |
1 | A9 | sera | tcg | 100 | 2 | |
1 | A10 | sera | tcg | 100 | 3 | |
1 | A11 | NA | NA | NA | NA | blank |
1 | A12 | NA | NA | NA | NA | blank |
1 | B1 | NA | NA | NA | NA | blank |
1 | B2 | sera | agc | 0 | 1 | |
1 | B3 | sera | agc | 0 | 2 | |
1 | B4 | sera | agc | 0 | 3 | |
1 | B5 | sera | agc | 10 | 1 | |
1 | B6 | sera | agc | 10 | 2 | |
1 | B7 | sera | agc | 10 | 3 | |
1 | B8 | sera | agc | 100 | 1 | |
1 | B9 | sera | agc | 100 | 2 | |
1 | B10 | sera | agc | 100 | 3 | |
1 | B11 | NA | NA | NA | NA | blank |
1 | B12 | NA | NA | NA | NA | blank |
1 | C1 | NA | NA | NA | NA | blank |
1 | C2 | serb | tcg | 0 | 1 | |
1 | C3 | serb | tcg | 0 | 2 | |
1 | C4 | serb | tcg | 0 | 3 | |
1 | C5 | serb | tcg | 10 | 1 | |
1 | C6 | serb | tcg | 10 | 2 | |
1 | C7 | serb | tcg | 10 | 3 | |
1 | C8 | serb | tcg | 100 | 1 | |
1 | C9 | serb | tcg | 100 | 2 | |
1 | C10 | serb | tcg | 100 | 3 | |
1 | C11 | NA | NA | NA | NA | blank |
1 | C12 | NA | NA | NA | NA | blank |
1 | D1 | NA | NA | NA | NA | blank |
1 | D2 | serb | agc | 0 | 1 | |
1 | D3 | serb | agc | 0 | 2 | |
1 | D4 | serb | agc | 0 | 3 | |
1 | D5 | serb | agc | 10 | 1 | |
1 | D6 | serb | agc | 10 | 2 | |
1 | D7 | serb | agc | 10 | 3 | |
1 | D8 | serb | agc | 100 | 1 | |
1 | D9 | serb | agc | 100 | 2 | |
1 | D10 | serb | agc | 100 | 3 | |
1 | D11 | NA | NA | NA | NA | blank |
1 | D12 | NA | NA | NA | NA | blank |
2 | A1 | NA | NA | NA | NA | blank |
2 | A2 | serc | tcg | 0 | 1 | |
2 | A3 | serc | tcg | 0 | 2 | |
2 | A4 | serc | tcg | 0 | 3 | |
2 | A5 | serc | tcg | 10 | 1 | |
2 | A6 | serc | tcg | 10 | 2 | |
2 | A7 | serc | tcg | 10 | 3 | |
2 | A8 | serc | tcg | 100 | 1 | |
2 | A9 | serc | tcg | 100 | 2 | |
2 | A10 | serc | tcg | 100 | 3 | |
2 | A11 | NA | NA | NA | NA | blank |
2 | A12 | NA | NA | NA | NA | blank |
2 | B1 | NA | NA | NA | NA | blank |
2 | B2 | serc | agc | 0 | 1 | |
2 | B3 | serc | agc | 0 | 2 | |
2 | B4 | serc | agc | 0 | 3 | |
2 | B5 | serc | agc | 10 | 1 | |
2 | B6 | serc | agc | 10 | 2 | |
2 | B7 | serc | agc | 10 | 3 | |
2 | B8 | serc | agc | 100 | 1 | |
2 | B9 | serc | agc | 100 | 2 | |
2 | B10 | serc | agc | 100 | 3 | |
2 | B11 | NA | NA | NA | NA | blank |
2 | B12 | NA | NA | NA | NA | blank |
sampleannotations.csv for a qPCR experiment¶
well | rt | amplicon | initiation | codonmutation | replicate | template | qpcrlabel | strain |
---|---|---|---|---|---|---|---|---|
B03 | yes | gpd | na | na | 1 | 84+1 | 84q11 | scHP15-2 |
B04 | yes | gpd | ctgc | 5xcgg | 1 | 84+4 | 84q21 | scHP286-1 |
B05 | yes | gpd | aaaa | 5xcgg | 1 | 84+5 | 84q31 | scHP291-1 |
B06 | yes | gpd | ctgc | 5xaga | 1 | 84+8 | 84q41 | scHP314-1 |
B07 | yes | gpd | aaaa | 5xaga | 1 | 84+9 | 84q51 | scHP315-1 |
B08 | no | gpd | aaaa | 5xaga | 1 | 84-9 | 84q61 | scHP315-1 |
C03 | yes | yfp | na | na | 1 | 84+1 | 84q12 | scHP15-2 |
C04 | yes | yfp | ctgc | 5xcgg | 1 | 84+4 | 84q22 | scHP286-1 |
C05 | yes | yfp | aaaa | 5xcgg | 1 | 84+5 | 84q32 | scHP291-1 |
C06 | yes | yfp | ctgc | 5xaga | 1 | 84+8 | 84q42 | scHP314-1 |
C07 | yes | yfp | aaaa | 5xaga | 1 | 84+9 | 84q52 | scHP315-1 |
C08 | no | yfp | aaaa | 5xaga | 1 | 84-9 | 84q62 | scHP315-1 |
D03 | yes | gpd | na | na | 2 | 84+1 | 84q11 | scHP15-2 |
D04 | yes | gpd | ctgc | 5xcgg | 2 | 84+4 | 84q21 | scHP286-1 |
D05 | yes | gpd | aaaa | 5xcgg | 2 | 84+5 | 84q31 | scHP291-1 |
D06 | yes | gpd | ctgc | 5xaga | 2 | 84+8 | 84q41 | scHP314-1 |
D07 | yes | gpd | aaaa | 5xaga | 2 | 84+9 | 84q51 | scHP315-1 |
D08 | no | gpd | aaaa | 5xaga | 2 | 84-9 | 84q61 | scHP315-1 |
E03 | yes | yfp | na | na | 2 | 84+1 | 84q12 | scHP15-2 |
E04 | yes | yfp | ctgc | 5xcgg | 2 | 84+4 | 84q22 | scHP286-1 |
E05 | yes | yfp | aaaa | 5xcgg | 2 | 84+5 | 84q32 | scHP291-1 |
E06 | yes | yfp | ctgc | 5xaga | 2 | 84+8 | 84q42 | scHP314-1 |
E07 | yes | yfp | aaaa | 5xaga | 2 | 84+9 | 84q52 | scHP315-1 |
E08 | no | yfp | aaaa | 5xaga | 2 | 84-9 | 84q62 | scHP315-1 |
F03 | yes | gpd | na | na | 3 | 84+1 | 84q11 | scHP15-2 |
F04 | yes | gpd | ctgc | 5xcgg | 3 | 84+4 | 84q21 | scHP286-1 |
F05 | yes | gpd | aaaa | 5xcgg | 3 | 84+5 | 84q31 | scHP291-1 |
F06 | yes | gpd | ctgc | 5xaga | 3 | 84+8 | 84q41 | scHP314-1 |
F07 | yes | gpd | aaaa | 5xaga | 3 | 84+9 | 84q51 | scHP315-1 |
F08 | no | gpd | aaaa | 5xaga | 3 | 84-9 | 84q61 | scHP315-1 |
G03 | yes | yfp | na | na | 3 | 84+1 | 84q12 | scHP15-2 |
G04 | yes | yfp | ctgc | 5xcgg | 3 | 84+4 | 84q22 | scHP286-1 |
G05 | yes | yfp | aaaa | 5xcgg | 3 | 84+5 | 84q32 | scHP291-1 |
G06 | yes | yfp | ctgc | 5xaga | 3 | 84+8 | 84q42 | scHP314-1 |
G07 | yes | yfp | aaaa | 5xaga | 3 | 84+9 | 84q52 | scHP315-1 |
G08 | no | yfp | aaaa | 5xaga | 3 | 84-9 | 84q62 | scHP315-1 |
sampleannotations.csv for a flow cytometry experiment¶
plate | file | strain | stallcodon | ncodonrepeats | stallsites | initiation | gene | gpdmkate2 | citrine | replicate |
---|---|---|---|---|---|---|---|---|---|---|
1 | Specimen001B2B02001 | by4741 | na | na | na | na | na | no | no | 1 |
1 | Specimen001B3B03002 | schp15 | na | na | na | na | na | yes | no | 1 |
1 | Specimen001B4B04003 | schp19 | cgg | 6 | 1 | caaa | maxhis3 | yes | yes | 1 |
1 | Specimen001B5B05004 | schp20 | aga | 6 | na | caaa | maxhis3 | yes | yes | 1 |
1 | Specimen001B6B06005 | schp76 | aga | 5 | na | caaa | pgk1 | yes | yes | 1 |
1 | Specimen001B7E07006 | schp91 | cgg | 5 | 5 | caaa | pgk1 | yes | yes | 1 |
1 | Specimen001B8B08007 | schp617 | cca | 8 | na | caaa | pgk1 | yes | yes | 1 |
1 | Specimen001B9B09008 | schp618 | cca | 8 | na | ccgc | pgk1 | yes | yes | 1 |
1 | Specimen001B10B10009 | schp619 | cca | 8 | na | ccaa | pgk1 | yes | yes | 1 |
1 | Specimen001B11B11010 | schp620 | cca | 8 | na | ccac | pgk1 | yes | yes | 1 |
1 | Specimen001C2C02011 | schp621 | cca | 8 | na | ccga | pgk1 | yes | yes | 1 |
1 | Specimen001C3C03012 | schp622 | cca | 8 | na | ctgc | pgk1 | yes | yes | 1 |
1 | Specimen001C4C04013 | schp623 | cca | 8 | na | aaaa | pgk1 | yes | yes | 1 |
1 | Specimen001C5C05014 | schp624 | cca | 8 | na | acgc | pgk1 | yes | yes | 1 |
1 | Specimen001C6C06015 | schp625 | cca | 8 | na | ctg | pgk1 | yes | yes | 1 |
1 | Specimen001C7C07016 | schp626 | ccg | 8 | 1 | caaa | pgk1 | yes | yes | 1 |
1 | Specimen001C8C08017 | schp627 | ccg | 8 | 1 | ccgc | pgk1 | yes | yes | 1 |
1 | Specimen001C9C09018 | schp628 | ccg | 8 | 1 | ccaa | pgk1 | yes | yes | 1 |
1 | Specimen001C10C10019 | schp629 | ccg | 8 | 1 | ccac | pgk1 | yes | yes | 1 |
1 | Specimen001C11C11020 | schp630 | ccg | 8 | 1 | ccga | pgk1 | yes | yes | 1 |
1 | Specimen001D2D02021 | schp631 | ccg | 8 | 1 | ctgc | pgk1 | yes | yes | 1 |
1 | Specimen001D3D03022 | schp632 | ccg | 8 | 1 | aaaa | pgk1 | yes | yes | 1 |
1 | Specimen001D4D04023 | schp633 | ccg | 8 | 1 | acgc | pgk1 | yes | yes | 1 |
1 | Specimen001D5D05024 | schp634 | ccg | 8 | 1 | ctg | pgk1 | yes | yes | 1 |
2 | Specimen002E2E02001 | by4741 | na | na | na | na | na | no | no | 2 |
2 | Specimen002E3E03002 | schp15 | na | na | na | na | na | yes | no | 2 |
2 | Specimen002E4E04003 | schp19 | cgg | 6 | 1 | caaa | maxhis3 | yes | yes | 2 |
2 | Specimen002E5E05004 | schp20 | aga | 6 | na | caaa | maxhis3 | yes | yes | 2 |
2 | Specimen002E6E06005 | schp76 | aga | 5 | na | caaa | pgk1 | yes | yes | 2 |
2 | Specimen002E7E07006 | schp91 | cgg | 5 | 5 | caaa | pgk1 | yes | yes | 2 |
2 | Specimen002E8E08007 | schp617 | cca | 8 | na | caaa | pgk1 | yes | yes | 2 |
2 | Specimen002E9E09008 | schp618 | cca | 8 | na | ccgc | pgk1 | yes | yes | 2 |
2 | Specimen002E10E10009 | schp619 | cca | 8 | na | ccaa | pgk1 | yes | yes | 2 |
2 | Specimen002E11E11010 | schp620 | cca | 8 | na | ccac | pgk1 | yes | yes | 2 |
2 | Specimen002F2F02011 | schp621 | cca | 8 | na | ccga | pgk1 | yes | yes | 2 |
2 | Specimen002F3F03012 | schp622 | cca | 8 | na | ctgc | pgk1 | yes | yes | 2 |
2 | Specimen002F4F04013 | schp623 | cca | 8 | na | aaaa | pgk1 | yes | yes | 2 |
2 | Specimen002F5F05014 | schp624 | cca | 8 | na | acgc | pgk1 | yes | yes | 2 |
2 | Specimen002F6F06015 | schp625 | cca | 8 | na | ctg | pgk1 | yes | yes | 2 |
2 | Specimen002F7F07016 | schp626 | ccg | 8 | 1 | caaa | pgk1 | yes | yes | 2 |
2 | Specimen002F8F08017 | schp627 | ccg | 8 | 1 | ccgc | pgk1 | yes | yes | 2 |
2 | Specimen002F9F09018 | schp628 | ccg | 8 | 1 | ccaa | pgk1 | yes | yes | 2 |
2 | Specimen002F10F10019 | schp629 | ccg | 8 | 1 | ccac | pgk1 | yes | yes | 2 |
2 | Specimen002F11F11020 | schp630 | ccg | 8 | 1 | ccga | pgk1 | yes | yes | 2 |
2 | Specimen002G2G02021 | schp631 | ccg | 8 | 1 | ctgc | pgk1 | yes | yes | 2 |
2 | Specimen002G3G03022 | schp632 | ccg | 8 | 1 | aaaa | pgk1 | yes | yes | 2 |
2 | Specimen002G4G04023 | schp633 | ccg | 8 | 1 | acgc | pgk1 | yes | yes | 2 |
2 | Specimen002G5G05024 | schp634 | ccg | 8 | 1 | ctg | pgk1 | yes | yes | 2 |
sampleannotations.csv for a deep sequencing experiment¶
index | samplename | genotype | treatment | type | replicate |
---|---|---|---|---|---|
GTAGCC | wtuntreatedmono | wt | untreated | mono | 1 |
AAGCTA | wtifnmono | wt | ifn | mono | 1 |
GCCTAA | rack11untreatedmono | rack1 | untreated | mono | 1 |
CGTGAT | rack11ifnmono | rack1 | ifn | mono | 1 |
GATCTG | rack12untreatedmono | rack1 | untreated | mono | 2 |
ATTGGC | rack12ifnmono | rack1 | ifn | mono | 2 |
CACGAT | wtuntreatedtotal | wt | untreated | total | 1 |
CAACTA | wtifntotal | wt | ifn | total | 1 |
GGTAGC | rack11untreatedtotal | rack1 | untreated | total | 1 |
GTAGAG | rack11ifntotal | rack1 | ifn | total | 1 |
CAAAAG | rack12untreatedtotal | rack1 | untreated | total | 2 |
ATGAGC | rack12ifntotal | rack1 | ifn | total | 2 |