Data preparation - QTL information files
The aim is to create a "target genotype" (ideotype) with all the favorable alleles at the QTL positions. So, before running OptiMAS it is necessary to define the parental alleles to assemble.
In addition, because a QTL is rarely located exactly at a marker, QTL alleles are unknown and must be inferred from flanking markers. It is therefore very important to select a subset of markers that is as informative as possible (especially in a multi-parental context) to follow the favorable parental alleles (based on haplotypes). The number of markers selected per QTL should be between 2 and 6 to avoid long computation times (more progress in this area will be made in the next version).
Two QTL files are supplied by the user to (i) specify the QTL positions and identify the favorable alleles, and (ii) define the QTL region, i.e. assign to each QTL the set of markers that will be used to compute the allele transmission.
First QTL file (.qtlpos)
In the first QTL file (see the table below), each QTL is characterized by its estimated position in cM (pos) on a chromosome (chr) and by the identification of the parent carrying the favorable allele (all+). The confidence interval, i.e. the interval likely to include the QTL position (CI min and CI max), will be taken into account in a future version and can therefore be left empty.
QTL | chr | pos | CI min | CI max | all+ |
---|---|---|---|---|---|
QTL1 | 1 | 70.0 | | | a |
QTL2 | 2 | 55.0 | | | b/c |
QTL: name of the QTL, without spaces.
chr: index (numerical value) of the chromosome where the QTL is located.
pos: estimated QTL position (in cM) taken from the QTL detection results.
all+: identification of the parental allele(s) considered favorable. For QTL1, the favorable allele "a" refers to the parental line named "IL1" (see columns "P1"/"P2" of the genotype/pedigree file). For QTL2, "b/c" refers to the parental lines IL2 and IL3, both of which are considered favorable relative to the other parental lines.
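As an illustration of the layout described above, the following sketch builds the example .qtlpos content as tab-delimited text and reads it back. It is not part of OptiMAS: the function name `read_qtlpos` is hypothetical, and the exact header spelling ("CI min", "CI max") is assumed to be the one shown in the table.

```python
import csv
import io

# Example content mirroring the table above. The header spelling is an
# assumption copied from the table; the CI columns are left empty.
QTLPOS_TEXT = (
    "QTL\tchr\tpos\tCI min\tCI max\tall+\n"
    "QTL1\t1\t70.0\t\t\ta\n"
    "QTL2\t2\t55.0\t\t\tb/c\n"
)


def read_qtlpos(text):
    """Parse tab-delimited .qtlpos content into a list of dicts (hypothetical helper)."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter="\t"):
        rows.append({
            "name": rec["QTL"],
            "chr": int(rec["chr"]),
            "pos": float(rec["pos"]),
            # "b/c" lists several favorable parental alleles
            "favorable": rec["all+"].split("/"),
        })
    return rows


qtls = read_qtlpos(QTLPOS_TEXT)
for qtl in qtls:
    print(qtl["name"], qtl["chr"], qtl["pos"], qtl["favorable"])
```

Splitting the all+ field on "/" is simply one convenient way to handle entries such as "b/c" in this sketch.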
Second QTL file
In the second file, which is either .qtll, .qtln or .qtlw (described in the three respective tabs below), the QTL region is defined as an explicit list of marker names, a number of flanking markers, or a window on either side of the QTL position, respectively.
A list of markers explicitly assigned to each QTL
QTL | mrk_list |
---|---|
QTL1 | Marker1 Marker2 Marker3 Marker4 |
QTL2 | Marker5 Marker6 Marker7 Marker8 Marker9 |
QTL: names of the QTLs, without spaces; they must appear in the same order as in the .qtlpos file.
mrk_list: list of marker names, which must match those in the map file.
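The two constraints on the .qtll file (same QTL order as the .qtlpos file, marker names present in the map file) can be checked before running OptiMAS. The sketch below works on in-memory data only; the function `check_qtll` and the marker names in the map set are hypothetical, since the map file format is not described in this section.

```python
def check_qtll(qtll, qtlpos_names, map_markers):
    """Return a list of inconsistencies found in a .qtll definition.

    qtll         -- list of (QTL name, list of marker names), in file order
    qtlpos_names -- QTL names in the order of the .qtlpos file
    map_markers  -- set of marker names found in the map file
    """
    errors = []
    if [name for name, _ in qtll] != list(qtlpos_names):
        errors.append("QTL order differs from the .qtlpos file")
    for name, markers in qtll:
        for marker in markers:
            if marker not in map_markers:
                errors.append(f"{name}: {marker} is not in the map file")
    return errors


qtll = [
    ("QTL1", ["Marker1", "Marker2", "Marker3", "Marker4"]),
    ("QTL2", ["Marker5", "Marker6", "Marker7", "Marker8", "Marker9"]),
]
# Hypothetical map content: Marker9 is deliberately missing.
map_markers = {f"Marker{i}" for i in range(1, 9)}

problems = check_qtll(qtll, ["QTL1", "QTL2"], map_markers)
print(problems)
```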
Number of flanking markers. The markers closest to the QTL position are taken from the map file. As a consequence, the resulting set of markers might not include markers on both sides of the QTL position.
QTL | mrk_nb |
---|---|
QTL1 | 4 |
QTL2 | 5 |
QTL: names of the QTLs, without spaces; they must appear in the same order as in the .qtlpos file.
mrk_nb: integer value indicating for each QTL the number of flanking markers.
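To see why a set selected by mrk_nb may lie entirely on one side of the QTL, consider the minimal sketch below. It assumes that "closest" means smallest absolute distance in cM on the same chromosome; how OptiMAS actually ranks markers or breaks ties is not stated here, and the marker names and positions are made up.

```python
def closest_markers(markers, qtl_pos, n):
    """Return the n markers nearest to qtl_pos, listed in map order.

    markers -- list of (name, position in cM) on the QTL's chromosome
    """
    nearest = sorted(markers, key=lambda m: abs(m[1] - qtl_pos))[:n]
    return sorted(nearest, key=lambda m: m[1])


# Hypothetical chromosome 1 map; the QTL is at 70.0 cM.
chr1 = [("M1", 10.0), ("M2", 62.0), ("M3", 66.0), ("M4", 69.0), ("M5", 95.0)]
picked = closest_markers(chr1, 70.0, 3)
print(picked)
```

In this made-up map the three nearest markers (M2, M3, M4) all lie below 70.0 cM, because the only marker above the QTL (M5) is 25 cM away.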
Genetic distance defined on either side of the QTL position given in the .qtlpos file. The marker set consists of all markers from the map file that fall within the resulting window.
QTL | window |
---|---|
QTL1 | 30 |
QTL2 | 20.5 |
QTL: names of the QTLs, without spaces; they must appear in the same order as in the .qtlpos file.
window: genetic distance in cM.
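The window selection can be sketched as follows, reading "on either side" as the interval from pos - window to pos + window. Whether the bounds are inclusive in OptiMAS is an assumption, and the marker positions are invented for the example.

```python
def markers_in_window(markers, qtl_pos, window):
    """Return names of markers within qtl_pos +/- window (bounds assumed inclusive)."""
    low, high = qtl_pos - window, qtl_pos + window
    return [name for name, pos in markers if low <= pos <= high]


# Hypothetical chromosome 2 map; QTL2 is at 55.0 cM with a 20.5 cM window,
# so the interval is 34.5 to 75.5 cM.
chr2 = [("M5", 30.0), ("M6", 40.0), ("M7", 55.0), ("M8", 70.0), ("M9", 80.0)]
selected = markers_in_window(chr2, 55.0, 20.5)
print(selected)
```

Unlike the mrk_nb option, the number of markers obtained this way depends on the local marker density, so it is worth checking that it stays within the recommended range of 2 to 6.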
Note:
- Never change the header field names.
- It is recommended to code each parental allele ("all+" column) with a single character (e.g. a, A, b, 1 ...).
- For numerical values, use a decimal point (e.g. 0.00) and not a decimal comma (e.g. 0,00).
- Every file must be in plain-text, tab-delimited format: use tabs, not spaces, between fields.
- The marker names in the .qtll file must match those in the map file.
- Both QTL information files must share the same base name as the map file (e.g. maize.map, maize.qtlpos and maize.qtll).
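Some of these rules are easy to check automatically before launching an analysis. The helpers below are a rough sketch, not an OptiMAS feature; the names `same_base_name` and `line_problems` are hypothetical, and the checks only cover the base-name, tab and decimal-point rules.

```python
import re
from pathlib import Path


def same_base_name(*filenames):
    """True when all files share one base name, e.g. maize.map and maize.qtlpos."""
    return len({Path(name).stem for name in filenames}) == 1


def line_problems(line):
    """Return the formatting problems found in one data line."""
    problems = []
    if "\t" not in line:
        problems.append("fields are not tab-separated")
    if re.search(r"\d,\d", line):
        problems.append("decimal comma found; use a decimal point")
    return problems


print(same_base_name("maize.map", "maize.qtlpos", "maize.qtll"))
print(line_problems("QTL1\t1\t70.0\t\t\ta"))
print(line_problems("QTL1 1 70,0 a"))
```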