A map in CARTHAGENE defines the parameters of a probabilistic model of the genetic/RH data available on the currently active set of markers. In the simplest cases (e.g., a single genetic population), a map is completely defined by two components: an order of the markers and the recombination probabilities between adjacent markers. As in most existing mapping software (e.g., MapMaker), all the probabilistic models used in CARTHAGENE assume a complete absence of interference on genetic data. In the case of RH data, recombination probabilities are replaced by breakage probabilities along with a retention probability parameter. The probabilistic model used is the so-called ``equal retention model'' [LBLC95], which seems to be an ideal compromise between simplicity, efficiency and realism. Since a map fixes all the parameters of the probabilistic model, it is possible to compute the probability of the data given the model, usually called the likelihood. The continuous parameters of the model (the probabilities) are always estimated in CARTHAGENE using a maximum-likelihood criterion. This criterion is rigorous and has several attractive theoretical properties. It is optimised in CARTHAGENE using variants of the EM algorithm [DLR77].
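The link between a map and its likelihood can be illustrated with a minimal sketch for the easiest special case. It assumes a backcross with fully observed, phase-known genotypes coded 0/1, which is an assumption made here for illustration only. In that case the absence of interference makes the likelihood factorise over adjacent intervals, and the maximum-likelihood recombination probability of each interval has a closed form, so no EM iteration is needed. The function name `backcross_map_loglik` and the data layout are hypothetical and are not part of CARTHAGENE.

```python
import math


def backcross_map_loglik(genotypes, order):
    """Return (thetas, log-likelihood) for a marker order.

    genotypes: list of dicts mapping marker name -> 0 or 1 (no missing data).
    order:     list of marker names defining the map order.
    Assumes a backcross and no interference (illustrative simplification).
    """
    n = len(genotypes)
    thetas = []
    # First marker: each of the two alleles has probability 1/2.
    loglik = n * math.log(0.5)
    for left, right in zip(order, order[1:]):
        # Count individuals that recombine between the two adjacent markers.
        rec = sum(1 for g in genotypes if g[left] != g[right])
        # Closed-form maximum-likelihood estimate, capped at 0.5.
        theta = min(rec / n, 0.5)
        thetas.append(theta)
        if rec > 0:
            loglik += rec * math.log(theta)
        loglik += (n - rec) * math.log(1.0 - theta)
    return thetas, loglik


# Toy data set: four individuals typed at three markers.
individuals = [
    {"A": 0, "B": 0, "C": 0},
    {"A": 1, "B": 1, "C": 1},
    {"A": 0, "B": 1, "C": 1},
    {"A": 1, "B": 1, "C": 0},
]
thetas_abc, loglik_abc = backcross_map_loglik(individuals, ["A", "B", "C"])
thetas_acb, loglik_acb = backcross_map_loglik(individuals, ["A", "C", "B"])
```

On this toy data the order A, B, C obtains a higher log-likelihood than A, C, B, which is how two candidate orders are compared. With missing data or other population types the counts are no longer directly observable, which is where an iterative method such as EM becomes necessary.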
The probabilistic parameters used in the model can be converted to
so-called distances using traditional mapping functions: the Haldane and
Kosambi functions, which convert recombination probabilities to Morgans for
genetic data, and the usual $d = -\ln(1-\theta)$ function for mapping breakage
probabilities $\theta$ to Rays. Note that the probabilistic model used in
CARTHAGENE is consistent with the Haldane mapping function. The Kosambi mapping
function, although usually the preferred one, has theoretical
weaknesses and we advise against using it.
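
The conversions can be sketched as follows, using the textbook forms of the Haldane and Kosambi functions and the conventional $-\ln(1-\theta)$ form for Rays (the latter is taken here as an assumption about which function is meant). The function names are hypothetical and do not correspond to CARTHAGENE commands. The last lines illustrate why the Haldane function matches a model without interference: when two adjacent intervals are combined under independence, Haldane distances add up exactly.

```python
import math


def haldane_morgans(theta):
    """Haldane map distance in Morgans for a recombination probability."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("recombination probability must lie in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * theta)


def kosambi_morgans(theta):
    """Kosambi map distance in Morgans for a recombination probability."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("recombination probability must lie in [0, 0.5)")
    return 0.25 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


def rays(breakage):
    """Distance in Rays for a breakage probability (assumed -ln(1 - theta))."""
    if not 0.0 <= breakage < 1.0:
        raise ValueError("breakage probability must lie in [0, 1)")
    return -math.log(1.0 - breakage)


# Without interference, recombination over two adjacent intervals combines as
# theta_ac = theta_ab + theta_bc - 2 * theta_ab * theta_bc.
theta_ab, theta_bc = 0.10, 0.20
theta_ac = theta_ab + theta_bc - 2.0 * theta_ab * theta_bc
additive = math.isclose(
    haldane_morgans(theta_ac),
    haldane_morgans(theta_ab) + haldane_morgans(theta_bc),
)
```

Here `additive` evaluates to true for the Haldane function, whereas the same check with `kosambi_morgans` fails, because Kosambi distances assume a degree of interference that the model does not include.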
When several datasets are merged using the dsmergen command, the underlying assumption is that all datasets share the same parameters, so a single ``consensus'' map suffices to define a probabilistic model for all the data. Such maps do not suffice for more complex models, e.g., when several datasets are merged using the dsmergor command. In that case, the assumption is that all datasets merged using dsmergor share a consistent order, but each has its own specific distances (i.e., probabilities).
A map is then defined by a collection of simple maps (as above) that share a consistent order.
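
The difference between the two kinds of map can be pictured with a small data-structure sketch. The class names `SimpleMap` and `CompositeMap` are hypothetical and do not reflect CARTHAGENE's internal representation. The sketch also assumes that two orders are ``consistent'' when the markers they have in common appear in the same relative order, and it checks this only pairwise, which is a simplification.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SimpleMap:
    """An order of markers plus one probability per adjacent marker pair."""
    order: List[str]
    probabilities: List[float]

    def __post_init__(self) -> None:
        if len(self.probabilities) != max(len(self.order) - 1, 0):
            raise ValueError("need one probability per adjacent marker pair")


def orders_consistent(first: List[str], second: List[str]) -> bool:
    """True if the shared markers appear in the same relative order."""
    shared = set(first) & set(second)
    return ([m for m in first if m in shared]
            == [m for m in second if m in shared])


@dataclass
class CompositeMap:
    """One simple map per data set, all sharing a consistent order."""
    parts: Dict[str, SimpleMap]

    def __post_init__(self) -> None:
        maps = list(self.parts.values())
        # Pairwise check only (illustrative simplification).
        for i, first in enumerate(maps):
            for second in maps[i + 1:]:
                if not orders_consistent(first.order, second.order):
                    raise ValueError("simple maps do not share a consistent order")


# A consensus map is a single SimpleMap used for all merged data sets.
consensus = SimpleMap(["M1", "M2", "M3"], [0.10, 0.20])

# A composite map keeps data-set-specific probabilities on a common order.
composite = CompositeMap({
    "population_1": SimpleMap(["M1", "M2", "M3"], [0.08, 0.22]),
    "population_2": SimpleMap(["M1", "M3"], [0.30]),
})
```

In the composite case each data set keeps its own probabilities, and only the marker order is constrained to agree across data sets.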