Classes

Host

class transmission_models.classes.host.host(id, index, genetic_data=[], t_inf=0, t_sample=None)[source]

Bases: object

Represents a host that has been infected with a virus.

A host object contains information about an infected individual, including their genetic data, infection time, sampling time, and other attributes.

Variables:
  • index (int) – The index of the host.

  • sampled (bool) – Indicates whether the host has been sampled or not.

  • genetic_data (list) – The genetic data of the host.

  • dict_attributes (dict) – A dictionary to store additional attributes.

  • t_inf (int) – Time of infection.

  • t_sample (int, optional) – The time the host was sampled.

  • id (str) – The identifier of the host.

t_inf : property

Getter and setter for the time of infection attribute.

get_genetic_str() : str

Returns the genetic data as a string.

__str__() : str

Returns a string with the id of the host.

__int__() : int

Returns the index of the host.

Examples

>>> h = host('host1', 1, ['A', 'T', 'C', 'G'], 10, t_sample=15)
>>> print(h.t_inf)
10
>>> h.t_inf = 20
>>> print(h.t_inf)
20
>>> print(h.get_genetic_str())
ATCG
>>> print(h)
host1

Notes

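The class name host is lowercase, so it does not follow the CapWords (PascalCase) convention that PEP 8 recommends for class names. Use the lowercase name when importing or instantiating it.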

__init__(id, index, genetic_data=[], t_inf=0, t_sample=None)[source]

Initialize a new instance of the Host class.

Parameters:
  • id (str) – The id of the host.

  • index (int) – The index of the host.

  • genetic_data (list, optional) – The genetic data of the host. Defaults to an empty list.

  • t_inf (int, optional) – Time of infection. Defaults to 0.

  • t_sample (int, optional) – The time the host was sampled. Defaults to None.

property t_inf

Getter for the time of infection attribute.

Returns:

The time of infection.

Return type:

int

get_genetic_str()[source]

Return the genetic data of the host as a string.

Returns:

The genetic data as a string.

Return type:

str

__str__()[source]

Return a string with the id of the host.

Returns:

The id of the host.

Return type:

str

__int__()[source]

Return the index of the host.

Returns:

The index of the host.

Return type:

int

transmission_models.classes.host.create_genome(chain_length)[source]

Create a random genome sequence of specified length.

Parameters:

chain_length (int) – The length of the genome sequence to create.

Returns:

A list of random nucleotides (A, G, C, T) of length chain_length.

Return type:

list

Examples

>>> genome = create_genome(10)
>>> print(genome)
['A', 'T', 'C', 'G', 'A', 'T', 'C', 'G', 'A', 'T']

transmission_models.classes.host.binom_mutation(chain_length, p, genome)[source]

Perform binomial mutation on a given genome.

This function generates changes in a genome by randomly selecting ‘k’ positions to mutate, where ‘k’ follows a binomial distribution with parameters ‘chain_length’ and ‘p’. The elements at the selected positions are replaced with new randomly chosen nucleotides.

Parameters:
  • chain_length (int) – The length of the genome chain.

  • p (float) – The probability of mutation for each element in the chain.

  • genome (str or list) – The original genome sequence.

Returns:

The mutated genome sequence.

Return type:

list

Notes

The function operates as follows:

  1. Calculates the number of positions to mutate, ‘k’, by sampling from a binomial distribution with ‘chain_length’ trials and success probability ‘p’.

  2. Randomly selects ‘k’ positions from the range [0, chain_length) without replacement.

  3. Creates a new list ‘new_genome’ from the original genome.

  4. Iterates over the selected positions and replaces the corresponding elements in ‘new_genome’ with randomly chosen nucleotides based on the original nucleotide at that position:

    • If the original nucleotide is ‘A’, it is replaced with a randomly chosen nucleotide from ‘CTG’.

    • If the original nucleotide is ‘C’, it is replaced with a randomly chosen nucleotide from ‘ATG’.

    • If the original nucleotide is ‘T’, it is replaced with a randomly chosen nucleotide from ‘ACG’.

    • If the original nucleotide is ‘G’, it is replaced with a randomly chosen nucleotide from ‘ACT’.

  5. Returns the mutated genome sequence as ‘new_genome’.
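The steps above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the name binom_mutation_sketch, the ALTERNATIVES table, and the optional rng argument are assumptions introduced here, and the binomial draw is written as a sum of Bernoulli trials rather than with whatever sampler the package uses.

```python
import random

# Substitution table from the notes: each nucleotide maps to the three others.
ALTERNATIVES = {"A": "CTG", "C": "ATG", "T": "ACG", "G": "ACT"}


def binom_mutation_sketch(chain_length, p, genome, rng=None):
    """Illustrative re-implementation of the documented steps (hypothetical name)."""
    rng = rng or random.Random()
    # Step 1: k ~ Binomial(chain_length, p), drawn as a sum of Bernoulli trials.
    k = sum(rng.random() < p for _ in range(chain_length))
    # Step 2: choose k distinct positions in [0, chain_length).
    positions = rng.sample(range(chain_length), k)
    # Step 3: copy the genome so the input is left untouched.
    new_genome = list(genome)
    # Step 4: replace each selected nucleotide with a different one.
    for pos in positions:
        new_genome[pos] = rng.choice(ALTERNATIVES[new_genome[pos]])
    # Step 5: return the mutated copy.
    return new_genome
```

Passing a seeded random.Random instance as rng makes a run reproducible, which is convenient when comparing mutated genomes in tests.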

Examples

>>> genome = ['A', 'T', 'C', 'G', 'G', 'A', 'T', 'C', 'G', 'A']
>>> mutated_genome = binom_mutation(len(genome), 0.2, genome)
>>> print(mutated_genome)
['A', 'T', 'C', 'A', 'G', 'A', 'T', 'C', 'G', 'A']

See also

one_mutation

Perform a single mutation on a genome

transmission_models.classes.host.one_mutation(chain_length, p, genome)[source]

Perform one mutation on a given genome.

This function generates a single mutation in a genome by randomly selecting one position to mutate. The selected position is replaced with a new randomly chosen nucleotide.

Parameters:
  • chain_length (int) – The length of the genome chain.

  • p (float) – The probability of mutation for each element in the chain.

  • genome (str or list) – The original genome sequence.

Returns:

The mutated genome sequence.

Return type:

list

Notes

The function operates as follows:

  1. Randomly selects one position from the range [0, chain_length) to mutate.

  2. Creates a new list ‘new_genome’ from the original genome.

  3. Checks the original nucleotide at the selected position and replaces it with a randomly chosen nucleotide based on the following rules:

    • If the original nucleotide is ‘A’, it is replaced with a randomly chosen nucleotide from ‘CTG’.

    • If the original nucleotide is ‘C’, it is replaced with a randomly chosen nucleotide from ‘ATG’.

    • If the original nucleotide is ‘T’, it is replaced with a randomly chosen nucleotide from ‘ACG’.

    • If the original nucleotide is ‘G’, it is replaced with a randomly chosen nucleotide from ‘ACT’.

  4. Returns the mutated genome sequence as ‘new_genome’.

Examples

>>> genome = ['A', 'T', 'C', 'G', 'G', 'A', 'T', 'C', 'G', 'A']
>>> mutated_genome = one_mutation(len(genome), 0.2, genome)
>>> print(mutated_genome)
['A', 'T', 'C', 'A', 'G', 'A', 'T', 'C', 'G', 'T']

See also

binom_mutation

Perform binomial mutation on a genome

transmission_models.classes.host.average_mutations(mu, P_mut, tau, Dt, host_genetic)[source]

Generate a list of mutations proportional to a time interval.

The number of mutations is proportional to a given time interval (Dt) where the proportion factor is the mutation rate (mu).

Parameters:
  • mu (float) – The mutation rate.

  • P_mut (float) – The probability of mutation.

  • tau (float) – The current time.

  • Dt (float) – The time interval.

  • host_genetic (list) – The genetic sequence of the host.

Returns:

A tuple containing:

  • mutations (list) – List of mutated genetic sequences.

  • t_mutations (list) – List of mutation times.

Return type:

tuple

Notes

The function calculates the number of mutations as int(mu * Dt / P_mut) and generates that many mutations using the one_mutation function.
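As a worked instance of the count in the note: with mu = 0.3, Dt = 7 and P_mut = 0.5, int(0.3 * 7 / 0.5) truncates 4.2 to 4 mutations. The sketch below is illustrative only. The helper names n_mutations, one_mutation_sketch and chain_of_mutations are hypothetical, applying each mutation to the previously mutated genome is an assumption, and mutation times are left out because this page does not say how they are assigned.

```python
import random

# Each nucleotide maps to the three alternatives it can mutate into.
ALTERNATIVES = {"A": "CTG", "C": "ATG", "T": "ACG", "G": "ACT"}


def n_mutations(mu, Dt, P_mut):
    """Number of mutations as stated in the note: int(mu * Dt / P_mut)."""
    return int(mu * Dt / P_mut)


def one_mutation_sketch(genome, rng):
    """Replace one randomly chosen position with a different nucleotide."""
    new_genome = list(genome)
    pos = rng.randrange(len(new_genome))
    new_genome[pos] = rng.choice(ALTERNATIVES[new_genome[pos]])
    return new_genome


def chain_of_mutations(mu, P_mut, Dt, genome, rng=None):
    """Return successive genomes, one per mutation (assumed to accumulate)."""
    rng = rng or random.Random()
    genomes = []
    current = list(genome)
    for _ in range(n_mutations(mu, Dt, P_mut)):
        current = one_mutation_sketch(current, rng)
        genomes.append(current)
    return genomes
```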

Didelot Unsampled

class transmission_models.classes.didelot_unsampled.didelot_unsampled(sampling_params, offspring_params, infection_params, T=None)[source]

Bases: object

Didelot unsampled transmission model.

This class implements the Didelot et al. (2017) framework for transmission tree inference with unsampled hosts. It provides methods for building transmission networks, computing likelihoods, and performing MCMC sampling.

The model incorporates three main components:

  1. Sampling model: Gamma distribution for sampling times.

  2. Offspring model: Negative binomial distribution for the number of offspring.

  3. Infection model: Gamma distribution for infection times.

Parameters:
  • sampling_params (dict) – Parameters for the sampling model, containing:

    • pi (float) – Sampling probability.

    • k_samp (float) – Shape parameter of the gamma distribution.

    • theta_samp (float) – Scale parameter of the gamma distribution.

  • offspring_params (dict) – Parameters for the offspring model, containing:

    • r (float) – Rate of infection.

    • p_inf (float) – Probability of infection.

  • infection_params (dict) – Parameters for the infection model, containing:

    • k_inf (float) – Shape parameter of the gamma distribution.

    • theta_inf (float) – Scale parameter of the gamma distribution.

Variables:
  • T (networkx.DiGraph) – The transmission tree.

  • host_dict (dict) – Dictionary mapping host IDs to host objects.

  • log_likelihood (float) – Current log likelihood of the model.

  • genetic_prior (genetic_prior_tree, optional) – Prior for genetic data.

  • same_location_prior (same_location_prior_tree, optional) – Prior for location data.

References

Didelot, X., Fraser, C., Gardy, J., & Colijn, C. (2017). Genomic infectious disease epidemiology in partially sampled and ongoing outbreaks. Molecular Biology and Evolution, 34(4), 997-1007.
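To make the three components concrete, the following sketch shows one way their log-densities could combine for a single host. It is a simplified illustration under stated assumptions, not the class's actual computation: the function names are hypothetical, the negative binomial convention (r successes with success probability p_inf) is assumed, and an unsampled host contributes only log(1 - pi), which ignores any observation-window correction the model may apply.

```python
import math


def gamma_logpdf(x, k, theta):
    """Log density of a Gamma(shape=k, scale=theta) distribution at x > 0."""
    return (k - 1) * math.log(x) - x / theta - math.lgamma(k) - k * math.log(theta)


def nbinom_logpmf(n, r, p):
    """Log pmf of n under the assumed convention C(n+r-1, n) * p**r * (1-p)**n."""
    return (math.lgamma(n + r) - math.lgamma(n + 1) - math.lgamma(r)
            + r * math.log(p) + n * math.log(1 - p))


def host_log_likelihood(t_inf, t_sample, child_t_infs,
                        sampling_params, offspring_params, infection_params):
    """Sum the three component log-likelihoods for one host (hypothetical helper)."""
    # Sampling model: sampled with probability pi after a gamma-distributed delay.
    if t_sample is None:
        ll = math.log(1 - sampling_params["pi"])
    else:
        ll = math.log(sampling_params["pi"]) + gamma_logpdf(
            t_sample - t_inf, sampling_params["k_samp"], sampling_params["theta_samp"])
    # Offspring model: negative binomial on the number of infected hosts.
    ll += nbinom_logpmf(len(child_t_infs), offspring_params["r"], offspring_params["p_inf"])
    # Infection model: gamma-distributed delay from infector to each infected host.
    for t_child in child_t_infs:
        ll += gamma_logpdf(t_child - t_inf,
                           infection_params["k_inf"], infection_params["theta_inf"])
    return ll
```

Working in log space keeps the sum numerically stable for large trees, which is presumably why the class exposes both likelihood and log-likelihood variants of each component.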

Core Methods

__init__(sampling_params, offspring_params, infection_params, T=None)[source]

Initialize the Didelot unsampled transmission model.

Parameters:
  • sampling_params (dict) – Parameters for the sampling model, containing:

    • pi (float) – Sampling probability.

    • k_samp (float) – Shape parameter of the gamma distribution.

    • theta_samp (float) – Scale parameter of the gamma distribution.

  • offspring_params (dict) – Parameters for the offspring model, containing:

    • r (float) – Rate of infection.

    • p_inf (float) – Probability of infection.

  • infection_params (dict) – Parameters for the infection model, containing:

    • k_inf (float) – Shape parameter of the gamma distribution.

    • theta_inf (float) – Scale parameter of the gamma distribution.

  • T (networkx.DiGraph, optional) – The transmission tree. If provided, the model will be initialized with this tree. Default is None.

Raises:

KeyError – If any required parameter is missing from the input dictionaries.

add_root(t_sampl, id='0', genetic_data=[], t_inf=0, t_sample=None)[source]

Add the root host to the transmission tree.

Parameters:
  • t_sampl (float) – Sampling time of the root host.

  • id (str, optional) – Identifier for the root host. Default is “0”.

  • genetic_data (list, optional) – Genetic data for the root host. Default is empty list.

  • t_inf (float, optional) – Infection time of the root host. Default is 0.

  • t_sample (float, optional) – Sampling time of the root host. Default is None.

Returns:

The root host object.

Return type:

host

successors(host)[source]

Get the successors (children) of a given host in the transmission tree.

Parameters:

host (host) – The host node whose successors are to be returned.

Returns:

An iterator over the successors of the host.

Return type:

iterator

parent(host)[source]

Get the parent (infector) of a given host in the transmission tree.

Parameters:

host (host) – The host node whose parent is to be returned.

Returns:

The parent host object.

Return type:

host

out_degree(host)[source]

Get the out-degree (number of children) of a host in the transmission tree.

Parameters:

host (host) – The host node whose out-degree is to be returned.

Returns:

The out-degree of the host.

Return type:

int

choose_successors(host, k=1)[source]

Choose k unique successors of a given host.

Parameters:
  • host (host) – Host whose successors will be chosen.

  • k (int, optional) – Number of successors to choose. Default is 1.

Returns:

List of k randomly chosen successors of the host.

Return type:

list
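The navigation methods above can be pictured on a small stand-in tree. The sketch below uses a plain dictionary of children instead of the networkx.DiGraph the class stores, so every name in it (children, parent_of, choose_successors_sketch) is hypothetical. Drawing without replacement is what makes the k chosen successors unique.

```python
import random


def parent_of(children, host):
    """Return the infector of host in a {infector: [infected, ...]} mapping, or None."""
    for infector, infected in children.items():
        if host in infected:
            return infector
    return None


def choose_successors_sketch(children, host, k=1, rng=None):
    """Pick k distinct successors of host uniformly at random."""
    rng = rng or random.Random()
    # random.sample draws without replacement and raises ValueError
    # if k exceeds the number of successors.
    return rng.sample(list(children.get(host, [])), k)


# Toy transmission tree: host "0" infected "a", "b" and "c"; "a" infected "d".
children = {"0": ["a", "b", "c"], "a": ["d"]}
picked = choose_successors_sketch(children, "0", k=2, rng=random.Random(3))
```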

Tree Structure Methods

get_root_subtrees()[source]

Retrieve the root subtrees of the transmission tree.

This method searches for the first sampled siblings of the root host in the transmission tree and stores them in the roots_subtrees attribute.

Returns:

A list of root subtrees.

Return type:

list

get_unsampled_hosts()[source]

Get the list of unsampled hosts in the transmission tree (excluding the root).

Returns:

List of unsampled host nodes.

Return type:

list

get_candidates_to_chain()[source]

Get the list of candidate hosts for chain moves in the transmission tree.

Returns:

List of candidate host nodes for chain moves.

Return type:

list

get_N_candidates_to_chain(recompute=False)[source]

Get the number of candidate hosts for chain moves, optionally recomputing the list.

Parameters:

recompute (bool, optional) – If True, recompute the list of candidates. Default is False.

Returns:

Number of candidate hosts for chain moves.

Return type:

int

Likelihood Methods

get_sampling_model_likelihood(hosts=None, T=None, update=False)[source]

Compute the likelihood of the sampling model.

Computes the likelihood of the sampling model given a list of hosts. If no list is given, the likelihood of the whole transmission tree is returned.

Parameters:

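hosts (list of host objects, optional) – Hosts to include in the computation. If None, all hosts in the transmission tree are used.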

Returns:

L – The likelihood of the sampling model given the list of hosts

Return type:

float

get_sampling_model_log_likelihood(hosts=None, T=None, update=False)[source]

Computes the log-likelihood of the sampling model given a list of hosts. If no list is given, the log-likelihood of the whole transmission tree is returned.

Parameters:

hosts (list of host objects, optional) – Hosts to include in the computation. If None, all hosts in the transmission tree are used.

Returns:

L – The log-likelihood of the sampling model given the list of hosts

Return type:

float

get_offspring_model_likelihood(hosts=None, T=None, update=False)[source]

Computes the likelihood of the offspring model given a list of hosts. If no list is given, the likelihood of the whole transmission tree is returned.

Parameters:

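hosts (list of host objects, optional) – Hosts to include in the computation. If None, all hosts in the transmission tree are used.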

Returns:

L – The likelihood of the offspring model given the list of hosts

Return type:

float

get_offspring_model_log_likelihood(hosts=None, T=None, update=False)[source]

Computes the log-likelihood of the offspring model given a list of hosts. If no list is given, the log-likelihood of the whole transmission tree is returned.

Parameters:

hosts (list of host objects, optional) – Hosts to include in the computation. If None, all hosts in the transmission tree are used.

Returns:

L – The log-likelihood of the offspring model given the list of hosts

Return type:

float

get_infection_model_likelihood(hosts=None, T=None, update=False)[source]

Computes the likelihood of the infection model given a list of hosts. If no list is given, the likelihood of the whole transmission tree is returned.

Parameters:
  • hosts (list of host objects, optional) – Hosts to include in the computation. If None, all hosts in the transmission tree are used.

  • T (DiGraph object, optional) – Transmission tree on which the likelihood of the hosts is computed. If None, the tree stored in the model is used.

  • update (bool, optional) – If True, the likelihood of the infection model stored in the model object is updated.

Returns:

L – The likelihood of the infection model given the list of hosts

Return type:

float

get_infection_model_log_likelihood(hosts=None, T=None, update=False)[source]

Computes the log-likelihood of the infection model given a list of hosts. If no list is given, the log-likelihood of the whole transmission tree is returned.

Parameters:
  • hosts (list of host objects, optional) – Hosts to include in the computation. If None, all hosts in the transmission tree are used.

  • T (DiGraph object, optional) – Transmission tree on which the log-likelihood of the hosts is computed. If None, the tree stored in the model is used.

  • update (bool, optional) – If True, the log-likelihood of the infection model stored in the model object is updated.

Returns:

L – The log-likelihood of the infection model given the list of hosts

Return type:

float

log_likelihood_host(host, T=None)[source]

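Computes the log-likelihood of a host given the transmission tree.

Parameters:
  • host (host object) – The host whose log-likelihood is computed.

  • T (DiGraph object, optional) – Transmission tree. If None, the tree stored in the model is used.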

Returns:

log_likelihood – The log likelihood of the host in the transmission network

Return type:

float

log_likelihood_hosts_list(hosts, T)[source]

log_likelihood_transmission_tree(T)[source]

get_log_likelihood_transmission()[source]

Delta Methods (for MCMC)

Delta_log_sampling(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for the sampling model.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the sampling model.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the sampling model at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.

Delta_log_offspring(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for the offspring model.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the offspring model.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the offspring model at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.

Delta_log_infection(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for the infection model.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the infection model.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the infection model at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.

Delta_log_likelihood_host(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for a host.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the host.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the host at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.
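The four Delta methods share the pattern spelled out in their notes, which can be sketched as a single helper. The function name delta_log_likelihood and the log_likelihood callable are hypothetical stand-ins introduced for illustration. The toy log-likelihood below exists only to make the arithmetic checkable and has no connection to the model's real densities.

```python
def delta_log_likelihood(log_likelihood, hosts, T_end, T_ini=None):
    """Log-likelihood at T_end, minus the log-likelihood at T_ini when it is given."""
    delta = log_likelihood(hosts, T_end)
    if T_ini is not None:
        delta -= log_likelihood(hosts, T_ini)
    return delta


def toy_log_likelihood(hosts, T):
    """Toy stand-in: a value that depends linearly on T and on the number of hosts."""
    return -0.5 * T * len(hosts)


# Two hosts: -0.5 * 4 * 2 = -4 at T_end, -0.5 * 1 * 2 = -1 at T_ini.
change = delta_log_likelihood(toy_log_likelihood, ["h1", "h2"], T_end=4.0, T_ini=1.0)
```

A difference of log-likelihoods is the log of the likelihood ratio, which is the quantity a Metropolis-Hastings acceptance test needs, so only the hosts affected by a proposal have to be re-evaluated.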

MCMC Step Methods

infection_time_from_sampling_step(selected_host=None, metHast=True, verbose=False)[source]

Propose and possibly accept a new infection time for a sampled host using the Metropolis-Hastings algorithm.

This method samples a new infection time for a selected host (or a random sampled host if not provided), computes the acceptance probability, and updates the host’s infection time if the proposal is accepted.

Parameters:
  • selected_host (host, optional) – The host whose infection time will be changed. If None, a random sampled host is selected.

  • metHast (bool, optional) – If True, use the Metropolis-Hastings algorithm to accept or reject the proposal. Default is True.

  • verbose (bool, optional) – If True, print detailed information about the proposal. Default is False.

Returns:

  • t_inf_new (float) – The proposed new infection time.

  • gg (float) – Proposal ratio for the Metropolis-Hastings step.

  • pp (float) – Likelihood ratio for the Metropolis-Hastings step.

  • P (float) – Acceptance probability for the Metropolis-Hastings step.

  • selected_host (host) – The host whose infection time was proposed to change.
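The returned quantities fit the textbook Metropolis-Hastings acceptance rule, sketched below. That the method combines them exactly as min(1, pp * gg) is an assumption based on the standard algorithm, not something this page states, and metropolis_hastings_accept is a hypothetical helper name.

```python
import random


def metropolis_hastings_accept(pp, gg, rng=None):
    """Return (P, accepted) for likelihood ratio pp and proposal ratio gg."""
    rng = rng or random.Random()
    # Standard acceptance probability: the ratio product, capped at 1.
    P = min(1.0, pp * gg)
    # Accept when a uniform draw on [0, 1) falls below P.
    accepted = rng.random() < P
    return P, accepted
```

A proposal that raises the likelihood (pp * gg >= 1) is always accepted, while a worse proposal is accepted only with probability pp * gg, which is what lets the chain escape local optima.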

infection_time_from_infection_model_step(selected_host=None, metHast=True, Dt_new=None, verbose=False)[source]

Propose a new infection time for a host and accept or reject the change using the Metropolis-Hastings algorithm.

Parameters:
  • selected_host (host, optional) – Host whose infection time will be changed. If None, a host is randomly selected.

  • metHast (bool, optional) – If True, the Metropolis-Hastings algorithm is used to accept or reject the change. Default is True.

  • Dt_new (float, optional) – New infection time for the host. If None, a new time is sampled.

  • verbose (bool, optional) – If True, prints the results of the step. Default is False.

add_unsampled_with_times(selected_host=None, P_add=0.5, P_rewiring=0.5, P_off=0.5, verbose=False, only_geometrical=False, detailed_probs=False)[source]

Propose the addition of an unsampled host to the transmission tree and compute the probability of the proposal.

Parameters:
  • selected_host (host object) – Host to which the unsampled host will be added. If None, a host is randomly selected.

  • P_add (float) – Probability of proposing to add a new host to the transmission tree.

  • P_rewiring (float) – Probability of rewiring the new host to another sibling host.

  • P_off (float) – Probability of rewiring the new host to be a leaf.

  • verbose (bool) – If True, prints the results of the step.

  • only_geometrical (bool) – If True, only the proposal of the new topological structure is considered.

  • detailed_probs (bool) – If True, the method returns both proposal probabilities, for adding and for removing a host.

Returns:

  • T_new (DiGraph object) – New transmission tree with the proposed changes.

  • gg (float) – Ratio of the probabilities of the proposals.

  • g_go (float) – Probability of the proposal of adding a host.

  • g_ret (float) – Probability of the proposal of removing a host.

  • prob_time (float) – Probability of the time of infection of the new host.

  • unsampled (host object) – Unsampled host to be added to the transmission tree.

  • added (bool) – If True, the host was added to the transmission tree.

remove_unsampled_with_times(selected_host=None, P_add=0.5, P_rewiring=0.5, P_off=0.5, only_geometrical=False, detailed_probs=False, verbose=False)[source]

Propose the removal of an unsampled host from the transmission tree and compute the probability of the proposal. If no unsampled hosts are available, the addition of a new host is proposed instead.

Parameters:
  • selected_host (host object) – Unsampled host to be removed from the transmission tree. If None, a host is randomly selected.

  • P_add (float) – Probability of proposing to add a new host to the transmission tree.

  • P_rewiring (float) – Probability of rewiring the new host to another sibling host.

  • P_off (float) – Probability of rewiring the new host to be a leaf.

  • verbose (bool) – If True, prints the results of the step.

  • only_geometrical (bool) – If True, only the proposal of the new topological structure is considered.

  • detailed_probs (bool) – If True, the method returns both proposal probabilities, for adding and for removing a host.

Returns:

  • T_new (DiGraph object) – New transmission tree with the proposed changes.

  • gg (float) – Ratio of the probabilities of the proposals.

  • g_go (float) – Probability of the proposal of adding a host.

  • g_ret (float) – Probability of the proposal of removing a host.

  • prob_time (float) – Probability of proposing the time of the selected_host.

  • added (bool) – If True, the host was added to the transmission tree. Otherwise, the node was removed.
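The choice between the two proposals can be sketched as follows, combining the role of P_add with the fallback described for remove_unsampled_with_times when the tree has no unsampled hosts. The helper name choose_move and its n_unsampled argument are hypothetical, and the real methods also rewire the tree and compute proposal probabilities, which this sketch leaves out.

```python
import random


def choose_move(P_add, n_unsampled, rng=None):
    """Return "add" or "remove" for the next proposal."""
    rng = rng or random.Random()
    # Propose an addition with probability P_add, otherwise a removal.
    move = "add" if rng.random() < P_add else "remove"
    # Nothing can be removed from a tree without unsampled hosts,
    # so the proposal falls back to an addition.
    if move == "remove" and n_unsampled == 0:
        move = "add"
    return move
```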

add_remove_step(P_add=0.5, P_rewiring=0.5, P_off=0.5, metHast=True, verbose=False)[source]

Propose the addition or removal of an unsampled host in the transmission tree and compute the probability of the proposal.

Parameters:
  • P_add (float) – Probability of proposing the addition of an unsampled host. Otherwise, the removal of an unsampled host is proposed.

  • P_rewiring (float) – Probability of rewiring the new host to another sibling host.

  • P_off (float) – Probability of rewiring the new host to be a leaf.

  • metHast (bool) – If True, the Metropolis-Hastings algorithm is used to accept or reject the change.

  • verbose (bool) – If True, prints the results of the step.

MCMC_step(N_steps, verbose=False)[source]

Prior Methods

add_genetic_prior(mu_gen, gen_dist)[source]

Adds a genetic prior to the model. The prior computes the likelihood that two sampled hosts are related given the genetic distance between their viruses. Two sampled hosts are considered related if the only hosts on the path connecting them are unsampled hosts.

Parameters:
  • mu_gen (float) – Mutation rate

  • gen_dist (np.array) – Genetic distance matrix of the virus of the hosts. The index has to be identical to the index of the hosts.
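One reading of the relatedness rule above is that two sampled hosts are related when the path between them in the tree passes only through unsampled hosts. The sketch below finds such pairs on a toy tree. It is an interpretation for illustration, not the package's code: the function name related_sampled_pairs and the edge-list input are hypothetical, and whether the package also counts sampled siblings joined through an unsampled infector is an assumption.

```python
from collections import deque


def related_sampled_pairs(edges, sampled):
    """Return pairs of sampled hosts joined by a path of unsampled hosts only.

    edges is a list of (infector, infected) tuples; sampled is a set of host ids.
    """
    # Build an undirected neighbour map so paths can go up and down the tree.
    neighbours = {}
    for infector, infected in edges:
        neighbours.setdefault(infector, set()).add(infected)
        neighbours.setdefault(infected, set()).add(infector)
    pairs = set()
    for start in sampled:
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours.get(node, ()):
                if nxt in seen:
                    continue
                seen.add(nxt)
                if nxt in sampled:
                    # A sampled host ends the path: record the pair, do not pass through.
                    pairs.add(frozenset((start, nxt)))
                else:
                    # Unsampled hosts are transparent: keep walking through them.
                    queue.append(nxt)
    return pairs


# Toy tree: A infects unsampled U; U infects B and C; C infects D.
edges = [("A", "U"), ("U", "B"), ("U", "C"), ("C", "D")]
pairs = related_sampled_pairs(edges, sampled={"A", "B", "C", "D"})
```

In the toy tree, A and D are not related because the only path between them passes through the sampled host C.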

add_same_location_prior(P_NM, tau, loc_dist)[source]

Adds a same-location prior to the model, based on the location distances between hosts.

Parameters:
  • log_K (float) – Log probability of two hosts not being in the same location

  • loc_dist (np.array) – Location distance matrix of the hosts. The index has to be identical to the index of the hosts.

compute_Delta_loc_prior(T_new)[source]

Compute the change in the location prior log-likelihood for a new tree.

Parameters:

T_new (networkx.DiGraph) – The new transmission tree.

Returns:

(Delta log prior, new log prior, old log prior, old correction log-likelihood)

Return type:

tuple

Utility Methods

create_transmision_phylogeny_nets(N, mu, P_mut)[source]

Parameters:
  • N (int) – Number of hosts.

  • mu (float) – Mutation rate.

  • P_mut (float) – Probability of mutation.

get_newick(lengths=True)[source]

save_json(filename)[source]

Save the transmission tree to a JSON file.

Parameters:

filename (str) – Path to the output JSON file.

show_log_likelihoods(hosts=None, T=None, verbose=False)[source]

Print and return the log-likelihoods for the sampling, offspring, and infection models.

Parameters:
  • hosts (list, optional) – List of host objects to compute log-likelihoods for. If None, computes for all hosts in T.

  • T (networkx.DiGraph, optional) – Transmission tree. If None, uses self.T.

  • verbose (bool, optional) – If True, prints the log-likelihoods. Default is False.

Returns:

(LL_sampling, LL_offspring, LL_infection): Log-likelihoods for the sampling, offspring, and infection models.

Return type:

tuple

property T

set_T(T)[source]

samp_t_inf_between(h1, h2)[source]

Sample a time of infection between two hosts.

Uses a rejection sampling method to sample the time of infection of the infected host using the chain model from Didelot et al. 2017.

Parameters:
  • h1 (host) – Infector host.

  • h2 (host) – Infected host.

Returns:

Time of infection of the host infected by h1 and the infector of h2.

Return type:

float

Notes

This method implements the rejection sampling algorithm described in Didelot et al. (2017) for sampling infection times in transmission chains.
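A generic rejection sampler for an intermediate infection time might look like the sketch below. It is not taken from the package or from the paper: it assumes that both the delay from h1 to the new host and the delay from the new host to h2 follow the same gamma distribution with shape k >= 1 and scale theta, and the names gamma_pdf and sample_between are hypothetical.

```python
import math
import random


def gamma_pdf(x, k, theta):
    """Density of a Gamma(shape=k, scale=theta) distribution (0 for x <= 0)."""
    if x <= 0:
        return 0.0
    return math.exp((k - 1) * math.log(x) - x / theta
                    - math.lgamma(k) - k * math.log(theta))


def sample_between(t1, t2, k, theta, rng=None, max_tries=100000):
    """Sample t in (t1, t2) with density proportional to f(t - t1) * f(t2 - t)."""
    rng = rng or random.Random()
    gap = t2 - t1
    if gap <= 0:
        raise ValueError("t2 must be later than t1")
    if k < 1:
        raise ValueError("this sketch needs k >= 1 so the density is bounded")
    # Upper bound of the density on (0, gap): at the mode if it lies inside
    # the interval, otherwise at the right end (the density is still rising).
    mode = (k - 1) * theta
    bound = gamma_pdf(min(mode, gap), k, theta) if mode > 0 else 1.0 / theta
    for _ in range(max_tries):
        # Propose the first delay from the gamma distribution itself.
        d = rng.gammavariate(k, theta)
        if d >= gap:
            continue  # the proposed time would fall after t2
        # Accept in proportion to the density of the remaining delay.
        if rng.random() * bound <= gamma_pdf(gap - d, k, theta):
            return t1 + d
    raise RuntimeError("no sample accepted within max_tries")
```

Proposing the first delay from the gamma distribution and accepting on the density of the second delay gives samples whose density is proportional to the product of the two, without having to normalise that product.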

add_root(t_sampl, id='0', genetic_data=[], t_inf=0, t_sample=None)[source]

Add the root host to the transmission tree.

Parameters:
  • t_sampl (float) – Sampling time of the root host.

  • id (str, optional) – Identifier for the root host. Default is “0”.

  • genetic_data (list, optional) – Genetic data for the root host. Default is empty list.

  • t_inf (float, optional) – Infection time of the root host. Default is 0.

  • t_sample (float, optional) – Sampling time of the root host. Default is None.

Returns:

The root host object.

Return type:

host

successors(host)[source]

Get the successors (children) of a given host in the transmission tree.

Parameters:

host (host) – The host node whose successors are to be returned.

Returns:

An iterator over the successors of the host.

Return type:

iterator

parent(host)[source]

Get the parent (infector) of a given host in the transmission tree.

Parameters:

host (host) – The host node whose parent is to be returned.

Returns:

The parent host object.

Return type:

host

out_degree(host)[source]

Get the out-degree (number of children) of a host in the transmission tree.

Parameters:

host (host) – The host node whose out-degree is to be returned.

Returns:

The out-degree of the host.

Return type:

int

choose_successors(host, k=1)[source]

Choose k unique successors of a given host.

Parameters:
  • host (host) – Host whose successors will be chosen.

  • k (int, optional) – Number of successors to choose. Default is 1.

Returns:

List of k randomly chosen successors of the host.

Return type:

list

compute_Delta_loc_prior(T_new)[source]

Compute the change in the location prior log-likelihood for a new tree.

Parameters:

T_new (networkx.DiGraph) – The new transmission tree.

Returns:

(Delta log prior, new log prior, old log prior, old correction log-likelihood)

Return type:

tuple

get_candidates_to_chain()[source]

Get the list of candidate hosts for chain moves in the transmission tree.

Returns:

List of candidate host nodes for chain moves.

Return type:

list

get_N_candidates_to_chain(recompute=False)[source]

Get the number of candidate hosts for chain moves, optionally recomputing the list.

Parameters:

recompute (bool, optional) – If True, recompute the list of candidates. Default is False.

Returns:

Number of candidate hosts for chain moves.

Return type:

int

get_root_subtrees()[source]

Retrieve the root subtrees of the transmission tree.

This method searches for the first sampled siblings of the root host in the transmission tree and stores them in the roots_subtrees attribute.

Returns:

A list of root subtrees.

Return type:

list

get_unsampled_hosts()[source]

Get the list of unsampled hosts in the transmission tree (excluding the root).

Returns:

List of unsampled host nodes.

Return type:

list

get_sampling_model_likelihood(hosts=None, T=None, update=False)[source]

Compute the likelihood of the sampling model.

Computes the likelihood of the sampling model given a list of hosts. If no list is given, the likelihood of the whole transmission tree is returned.

Parameters:

hosts (list of host objects) –

Returns:

L – The likelihood of the sampling model given the list of hosts

Return type:

float

get_sampling_model_log_likelihood(hosts=None, T=None, update=False)[source]

Computes the likelihood of the sampling model given a list of hosts. If no list is given, the likelihood of the whole transmission tree is returned.

Parameters:

hosts (list of host objects) –

Returns:

L – The likelihood of the sampling model given the list of hosts

Return type:

float

Delta_log_sampling(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for the sampling model.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the sampling model.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the sampling model at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.

get_offspring_model_likelihood(hosts=None, T=None, update=False)[source]

Computes the likelihood of the offspring model given a list of hosts. If no list is given, the likelihood of the whole transmission tree is returned.

Parameters:

hosts (list of host objects) –

Returns:

L – The likelihood of the offspring model given the list of hosts

Return type:

float

get_offspring_model_log_likelihood(hosts=None, T=None, update=False)[source]

Computes the likelihood of the offspring model given a list of hosts. If no list is given, the likelihood of the whole transmission tree is returned.

Parameters:

hosts (list of host objects) –

Returns:

L – The likelihood of the offspring model given the list of hosts

Return type:

float

Delta_log_offspring(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for the offspring model.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the offspring model.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the offspring model at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.

get_infection_model_likelihood(hosts=None, T=None, update=False)[source]

Computes the likelihood of the infection model given a list of hosts. If no list is given, the likelihood of the whole transmission tree is returned.

Parameters:
  • hosts (list of host objects) –

  • T (DiGraph object) – Transmission tree on which the likelihood of the hosts is computed. If None, the tree of the model is used.

  • update (bool) – If True, the likelihood of the infection model is updated in the model object.

Returns:

L – The likelihood of the infection model given the list of hosts

Return type:

float

get_infection_model_log_likelihood(hosts=None, T=None, update=False)[source]

Computes the log-likelihood of the infection model given a list of hosts. If no list is given, the log-likelihood of the whole transmission tree is returned.

Parameters:
  • hosts (list of host objects) –

  • T (DiGraph object) – Transmission tree on which the log-likelihood of the hosts is computed. If None, the tree of the model is used.

  • update (bool) – If True, the log-likelihood of the infection model is updated in the model object.

Returns:

L – The log-likelihood of the infection model given the list of hosts

Return type:

float

Delta_log_infection(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for the infection model.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the infection model.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the infection model at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.

log_likelihood_host(host, T=None)[source]

Computes the log likelihood of a host given the transmission tree.

Parameters:
  • host (host object) –

  • T (DiGraph object) –

Returns:

log_likelihood – The log likelihood of the host in the transmission network

Return type:

float

Delta_log_likelihood_host(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for a host.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the host.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the host at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.

log_likelihood_hosts_list(hosts, T)[source]
log_likelihood_transmission_tree(T)[source]
log_posterior_transmission_tree()[source]

Compute the log-posterior of the current transmission tree.

This method calculates the log-posterior probability of the current transmission tree by summing the log-likelihood of the tree and any additional prior log-probabilities, such as genetic and location priors, if they are defined.

Returns:

The computed log-posterior of the current transmission tree.

Return type:

float

Notes

The log-posterior is computed as:

log_posterior = log_likelihood + genetic_log_prior (if defined) + same_location_log_prior (if defined)

The method uses the following attributes:
  • self.log_likelihood: Log-likelihood of the transmission tree.

  • self.genetic_log_prior: Log-prior from the genetic model (if defined).

  • self.same_location_log_prior: Log-prior from the location model (if defined).
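
The sum described in these notes can be sketched as follows. This is an illustration under stated assumptions rather than the method's actual code: the function name is hypothetical, and using None to mark a prior that was never added is an assumption.

```python
def log_posterior(log_likelihood, genetic_log_prior=None,
                  same_location_log_prior=None):
    """Sum the log-likelihood and whichever log-priors are defined."""
    total = log_likelihood
    for term in (genetic_log_prior, same_location_log_prior):
        if term is not None:  # assumption: None marks an undefined prior
            total += term
    return total


lp_no_priors = log_posterior(-120.5)
lp_both = log_posterior(-120.5, genetic_log_prior=-30.0,
                        same_location_log_prior=-4.5)
```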

get_log_posterior_transmission_tree(T)[source]

Compute and update the log-posterior of the transmission tree.

This method calculates the log-posterior probability of the given transmission tree T by combining the log-likelihood of the tree with any additional prior log-probabilities, such as genetic and location priors, if they are defined. The computed log-posterior and any relevant prior log-likelihoods are stored as attributes of the object.

Parameters:

T (networkx.DiGraph) – The transmission tree for which to compute the log-posterior.

Returns:

The computed log-posterior of the transmission tree.

Return type:

float

Notes

The log-posterior is computed as:

log_posterior = log_likelihood + genetic_log_prior (if defined) + same_location_log_prior (if defined)

The method also updates the following attributes:
  • self.log_posterior

  • self.genetic_log_prior (if applicable)

  • self.same_location_log_prior (if applicable)

show_log_likelihoods(hosts=None, T=None, verbose=False)[source]

Print and return the log-likelihoods for the sampling, offspring, and infection models.

Parameters:
  • hosts (list, optional) – List of host objects to compute log-likelihoods for. If None, computes for all hosts in T.

  • T (networkx.DiGraph, optional) – Transmission tree. If None, uses self.T.

  • verbose (bool, optional) – If True, prints the log-likelihoods. Default is False.

Returns:

(LL_sampling, LL_offspring, LL_infection): Log-likelihoods for the sampling, offspring, and infection models.

Return type:

tuple

log_likelihood_transmission_tree_old(T)[source]

Compute the log-likelihood of the entire transmission tree using the old method.

Parameters:

T (networkx.DiGraph) – Transmission tree to compute the log-likelihood for.

Returns:

The log-likelihood of the transmission tree.

Return type:

float

get_log_likelihood_transmission()[source]
add_genetic_prior(mu_gen, gen_dist)[source]

Adds a genetic prior to the model. The prior computes the likelihood that two sampled hosts are related given the genetic distance between their viruses. Two sampled hosts are considered related if the only hosts on the path connecting them are unsampled hosts.

Parameters:
  • mu_gen (float) – Mutation rate

  • gen_dist (np.array) – Genetic distance matrix of the virus of the hosts. The index has to be identical to the index of the hosts.

add_same_location_prior(P_NM, tau, loc_dist)[source]

Adds a same-location prior to the model. As described for same_location_prior_tree, the prior is based on the probability that a host stays where it lives during a characteristic time tau.

Parameters:
  • P_NM (float) – Probability that a host does not move within the characteristic time tau.

  • tau (float) – Characteristic time of the location model.

  • loc_dist (np.array) – Location distance matrix of the hosts. The index has to be identical to the index of the hosts.

create_transmision_phylogeny_nets(N, mu, P_mut)[source]

Parameters:
  • N – Number of hosts.

  • mu – Mutation rate.

  • P_mut – Probability of mutation.

get_newick(lengths=True)[source]
save_json(filename)[source]

Save the transmission tree to a JSON file.

Parameters:

filename (str) – Path to the output JSON file.

classmethod json_to_tree(filename, sampling_params=None, offspring_params=None, infection_params=None)[source]

Load a transmission model from a JSON file and reconstruct the model object.

Parameters:
  • filename (str) – Path to the JSON file.

  • sampling_params (dict, optional) – Sampling parameters to override those in the file. Default is None.

  • offspring_params (dict, optional) – Offspring parameters to override those in the file. Default is None.

  • infection_params (dict, optional) – Infection parameters to override those in the file. Default is None.

Returns:

The reconstructed transmission model.

Return type:

didelot_unsampled

infection_time_from_sampling_step(selected_host=None, metHast=True, verbose=False)[source]

Propose and possibly accept a new infection time for a sampled host using the Metropolis-Hastings algorithm.

This method samples a new infection time for a selected host (or a random sampled host if not provided), computes the acceptance probability, and updates the host’s infection time if the proposal is accepted.

Parameters:
  • selected_host (host, optional) – The host whose infection time will be changed. If None, a random sampled host is selected.

  • metHast (bool, optional) – If True, use the Metropolis-Hastings algorithm to accept or reject the proposal. Default is True.

  • verbose (bool, optional) – If True, print detailed information about the proposal. Default is False.

Returns:

  • t_inf_new (float) – The proposed new infection time.

  • gg (float) – Proposal ratio for the Metropolis-Hastings step.

  • pp (float) – Likelihood ratio for the Metropolis-Hastings step.

  • P (float) – Acceptance probability for the Metropolis-Hastings step.

  • selected_host (host) – The host whose infection time was proposed to change.
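
For orientation, the textbook Metropolis-Hastings rule combines a proposal ratio and a likelihood ratio into an acceptance probability P = min(1, gg * pp). The sketch below shows that rule only. Whether this method combines gg and pp in exactly this way is an assumption, and `mh_accept` is a hypothetical helper, not part of the package.

```python
import random


def mh_accept(gg, pp, rng=random):
    """Standard Metropolis-Hastings rule: accept with probability min(1, gg * pp)."""
    P = min(1.0, gg * pp)
    # rng.random() is uniform on [0, 1), so P = 1 always accepts.
    accepted = rng.random() < P
    return P, accepted


P, accepted = mh_accept(gg=0.5, pp=1.2, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps a run reproducible without touching the global random state.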

infection_time_from_infection_model_step(selected_host=None, metHast=True, Dt_new=None, verbose=False)[source]

Propose a new infection time for a host and accept or reject the change using the Metropolis-Hastings algorithm.

Parameters:
  • selected_host (host object, default=None) – Host whose infection time will be changed. If None, a host is randomly selected.

  • metHast (bool, default=True) – If True, the Metropolis Hastings algorithm is used to accept or reject the change.

  • Dt_new (float, default=None) – New infection time for the host. If None, a new time is sampled.

  • verbose (bool, default=False) – If True, prints the results of the step.

add_unsampled_with_times(selected_host=None, P_add=0.5, P_rewiring=0.5, P_off=0.5, verbose=False, only_geometrical=False, detailed_probs=False)[source]

Method to propose the addition of an unsampled host to the transmission tree and get the probability of the proposal.

Parameters:

selected_host: host object

Host to which the unsampled host will be added. If None, a host is randomly selected.

P_add: float

Probability of proposing to add a new host to the transmission tree.

P_rewiring: float

Probability of rewiring the new host to another sibling host.

P_off: float

Probability to rewire the new host to be a leaf.

verbose: bool

If True, prints the results of the step.

only_geometrical: bool

If True, only the proposal of the new topological structure will be considered.

detailed_probs: bool

If True, the method will return both probabilities of the proposals, of adding and removing a host.

Returns:

T_new: DiGraph object

New transmission tree with the proposed changes.

gg: float

Ratio of the probabilities of the proposals.

g_go: float

Probability of the proposal of adding a host.

g_ret: float

Probability of the proposal of removing a host.

prob_time: float

Probability of the time of infection of the new host.

unsampled: host object

Unsampled host to be added to the transmission tree.

added: bool

If True, the host was added to the transmission tree.

remove_unsampled_with_times(selected_host=None, P_add=0.5, P_rewiring=0.5, P_off=0.5, only_geometrical=False, detailed_probs=False, verbose=False)[source]

Method to propose the removal of an unsampled host from the transmission tree and get the probability of the proposal. If no unsampled hosts are available, the addition of a new host to the transmission tree is proposed instead.

Parameters:

selected_host: host object

Unsampled host to be removed from the transmission tree. If None, a host is randomly selected.

P_add: float

Probability of proposing to add a new host to the transmission tree.

P_rewiring: float

Probability of rewiring the new host to another sibling host.

P_off: float

Probability to rewire the new host to be a leaf.

verbose: bool

If True, prints the results of the step.

only_geometrical: bool

If True, only the proposal of the new topological structure will be considered.

detailed_probs: bool

If True, the method will return both probabilities of the proposals, of adding and removing a host.

Returns:

T_new: DiGraph object

New transmission tree with the proposed changes.

gg: float

Ratio of the probabilities of the proposals.

g_go: float

Probability of the proposal of adding a host.

g_ret: float

Probability of the proposal of removing a host.

prob_time: float

Probability of proposing the time of the selected_host.

added: bool

If True, the host was added to the transmission tree. Otherwise, the host was removed.

add_remove_step(P_add=0.5, P_rewiring=0.5, P_off=0.5, metHast=True, verbose=False)[source]

Method to propose the addition or removal of an unsampled host to the transmission tree and get the probability of the proposal.

Parameters:

P_add: float

Probability of proposing the addition of an unsampled host. Otherwise, the removal of an unsampled host is proposed.

P_rewiring: float

Probability of rewiring the new host to another sibling host.

P_off: float

Probability to rewire the new host to be a leaf.

metHast: bool

If True, the Metropolis Hastings algorithm is used to accept or reject the change.

verbose: bool

If True, prints the results of the step.

Returns:

MCMC_step(N_steps, verbose=False)[source]

MCMC

class transmission_models.classes.mcmc.mcmc.MCMC(model, P_rewire=0.3333333333333333, P_add_remove=0.3333333333333333, P_t_shift=0.3333333333333333, P_add=0.5, P_rewire_add=0.5, P_offspring_add=0.5, P_to_offspring=0.5)[source]

Bases: object

Markov Chain Monte Carlo sampler for transmission tree inference.

This class implements MCMC sampling algorithms for transmission network inference using various proposal mechanisms.

Parameters:
  • model (didelot_unsampled) – The transmission tree model to sample from.

  • P_rewire (float, optional) – The probability of rewiring a transmission tree. Default is 1/3.

  • P_add_remove (float, optional) – The probability of adding or removing an unsampled host in the transmission tree. Default is 1/3.

  • P_t_shift (float, optional) – The probability of shifting the infection time of the host in the transmission tree. Default is 1/3.

  • P_add (float, optional) – The probability of adding a new host to the transmission tree once an add/remove move has been proposed. Default is 0.5.

  • P_rewire_add (float, optional) – The probability of rewiring the new unsampled host once an addition has been proposed. Default is 0.5.

  • P_offspring_add (float, optional) – The probability that the new unsampled host is an offspring once an addition with rewiring has been proposed. Default is 0.5.

  • P_to_offspring (float, optional) – The probability of moving to offspring model during rewiring. Default is 0.5.

Variables:
  • model (didelot_unsampled) – The transmission model being sampled.

  • P_rewire (float) – Probability of rewiring moves.

  • P_add_remove (float) – Probability of add/remove moves.

  • P_t_shift (float) – Probability of time shift moves.

  • P_add (float) – Probability of adding vs removing hosts.

  • P_rewire_add (float) – Probability of rewiring added hosts.

  • P_offspring_add (float) – Probability of offspring vs chain model for added hosts.

  • P_to_offspring (float) – Probability of moving to offspring model.

__init__(model, P_rewire=0.3333333333333333, P_add_remove=0.3333333333333333, P_t_shift=0.3333333333333333, P_add=0.5, P_rewire_add=0.5, P_offspring_add=0.5, P_to_offspring=0.5)[source]

Initialize the MCMC sampler.

Parameters:
  • model (didelot_unsampled) – The transmission tree model to sample from.

  • P_rewire (float, optional) – The probability of rewiring a transmission tree. Default is 1/3.

  • P_add_remove (float, optional) – The probability of adding or removing an unsampled host in the transmission tree. Default is 1/3.

  • P_t_shift (float, optional) – The probability of shifting the infection time of the host in the transmission tree. Default is 1/3.

  • P_add (float, optional) – The probability of adding a new host to the transmission tree once an add/remove move has been proposed. Default is 0.5.

  • P_rewire_add (float, optional) – The probability of rewiring the new unsampled host once an addition has been proposed. Default is 0.5.

  • P_offspring_add (float, optional) – The probability that the new unsampled host is an offspring once an addition with rewiring has been proposed. Default is 0.5.

  • P_to_offspring (float, optional) – The probability of moving to offspring model during rewiring. Default is 0.5.

MCMC_iteration(verbose=False)[source]

Perform an MCMC iteration on the transmission tree model.

Parameters:

verbose (bool, optional) – Whether to print the progress of the MCMC iteration. Default is False.

Returns:

A tuple containing:

  • move (str) – The type of move proposed (‘rewire’, ‘add_remove’, or ‘time_shift’).

  • gg (float) – The ratio of proposal probabilities.

  • pp (float) – The ratio of posterior probabilities.

  • P (float) – The acceptance probability.

  • accepted (bool) – Whether the move was accepted.

  • DL (float) – The difference in log likelihood.

Return type:

tuple

Notes

The function operates as follows:

  1. Selects a move type at random.

  2. Performs the move and computes acceptance probability.

  3. Returns move details and acceptance status.
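
Step 1 above, the random choice of a move type, can be sketched with the standard library. The move names and default probabilities are taken from this page. The helper `choose_move` is hypothetical, and the way the package actually draws the move is not shown here.

```python
import random


def choose_move(P_rewire=1/3, P_add_remove=1/3, P_t_shift=1/3, rng=random):
    """Pick one move type with the given (relative) probabilities."""
    moves = ["rewire", "add_remove", "time_shift"]
    weights = [P_rewire, P_add_remove, P_t_shift]
    # random.choices treats the weights as relative, so they need not sum to 1.
    return rng.choices(moves, weights=weights, k=1)[0]


# Tally many draws to check that the three moves are roughly balanced.
rng = random.Random(42)
counts = {"rewire": 0, "add_remove": 0, "time_shift": 0}
for _ in range(3000):
    counts[choose_move(rng=rng)] += 1
```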

Priors

class transmission_models.classes.genetic_prior.genetic_prior_tree(model, mu, distance_matrix)[source]

Bases: object

__init__(model, mu, distance_matrix)[source]

Initialize the genetic prior tree object.

Parameters:
  • model (object) – The transmission model containing the tree structure.

  • mu (float) – The mutation rate parameter for the Poisson distribution.

  • distance_matrix (numpy.ndarray) – Matrix containing pairwise genetic distances between hosts.

Notes

This initializes the genetic prior calculator with:

  • A Poisson distribution with rate mu for modeling genetic distances

  • A distance matrix for pairwise host comparisons

  • A reference to the transmission model

static search_firsts_sampled_siblings(host, T, distance_matrix)[source]

Find all sampled siblings of a host in the transmission tree.

Parameters:
  • host (object) – The host for which to find sampled siblings.

  • T (networkx.DiGraph) – The transmission tree.

  • distance_matrix (numpy.ndarray) – Matrix containing pairwise genetic distances between hosts.

Returns:

List of sampled sibling hosts that have genetic distance data.

Return type:

list

Notes

This method recursively searches through the tree to find all sampled hosts that are descendants of the given host and have valid genetic distance data (non-NaN values in the distance matrix).

static search_first_sampled_parent(host, T, root)[source]

Find the first sampled ancestor of a host in the transmission tree.

Parameters:
  • host (object) – The host for which to find the first sampled parent.

  • T (networkx.DiGraph) – The transmission tree.

  • root (object) – The root host of the transmission tree.

Returns:

The first sampled parent host, or None if no sampled parent is found.

Return type:

object or None

Notes

This method traverses up the tree from the given host until it finds the first sampled ancestor, or reaches the root without finding one.

static get_mut_time_dist(hp, hs)[source]

Calculate the mutation time distance between two hosts.

Parameters:
  • hp (object) – The parent host.

  • hs (object) – The sibling host.

Returns:

The mutation time distance: (hs.t_sample + hp.t_sample - 2 * hp.t_inf).

Return type:

float

Notes

This calculates the time available for mutations to accumulate between the sampling times of two hosts, accounting for their common infection time.
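
The formula in the Returns entry can be checked with a small worked example. `ToyHost` is a hypothetical stand-in that carries only the two attributes the formula uses. It is not the package's host class.

```python
from dataclasses import dataclass


@dataclass
class ToyHost:
    """Hypothetical stand-in exposing only the attributes the formula needs."""
    t_inf: float
    t_sample: float


def mut_time_dist(hp, hs):
    # Time from hp's infection to each of the two sampling events, added up.
    return hs.t_sample + hp.t_sample - 2 * hp.t_inf


hp = ToyHost(t_inf=2.0, t_sample=6.0)
hs = ToyHost(t_inf=4.0, t_sample=9.0)
dist = mut_time_dist(hp, hs)  # (9 - 2) + (6 - 2)
```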

get_closest_sampling_siblings(T=None, verbose=False)[source]

Calculate log-likelihood correction for closest sampling siblings.

Parameters:
  • T (networkx.DiGraph, optional) – The transmission tree. If None, uses self.model.T.

  • verbose (bool, optional) – If True, print detailed information during calculation.

Returns:

The log-likelihood correction value.

Return type:

float

Notes

This method calculates correction terms for the genetic prior by finding the closest sampled siblings for each host and computing the log-likelihood of their genetic distances based on the time difference between sampling events.

prior_host(host, T, parent_dist=False)[source]

Calculate the log prior for a specific host in the transmission tree.

Parameters:
  • host (object) – The host for which to calculate the log prior.

  • T (networkx.DiGraph) – The transmission tree.

  • parent_dist (bool, optional) – If True, include parent distance in the calculation. Default is False.

Returns:

The log prior value for the host.

Return type:

float

Notes

This method calculates the log prior by considering:

  1. Direct connections to sampled hosts

  2. Connections to sampled siblings through unsampled intermediate hosts

  3. Parent distance (if parent_dist=True)

The calculation uses Poisson distributions based on the mutation rate and time differences between sampling events.

prior_pair(h1, h2)[source]

Calculate the log prior for a pair of hosts.

Parameters:
  • h1 (object) – First host in the pair.

  • h2 (object) – Second host in the pair.

Returns:

The log prior value for the pair, or 0 if either host is not sampled.

Return type:

float

Notes

This method calculates the log prior for the genetic distance between two hosts based on their sampling time difference and the Poisson distribution with rate mu * Dt.
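
A Poisson log-probability with rate mu * Dt can be written with the standard library alone. The sketch below assumes the observed genetic distance is a non-negative integer count. The function and argument names are hypothetical, and how the method derives Dt from the two hosts is not shown here.

```python
import math


def poisson_log_pmf(k, rate):
    """log P(K = k) for K ~ Poisson(rate), with rate > 0 and integer k >= 0."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def log_prior_pair(gen_dist, mu, Dt):
    # gen_dist: observed genetic distance (number of differences) for the pair
    # mu * Dt: expected number of mutations accumulated over the time Dt
    return poisson_log_pmf(gen_dist, mu * Dt)


value = log_prior_pair(gen_dist=3, mu=0.5, Dt=4.0)
```

Working in log space keeps the sum over many host pairs numerically stable, since the individual probabilities can be very small.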

log_prior_host_list(host_list, T=None)[source]

Calculate the total log prior for a list of hosts.

Parameters:
  • host_list (list) – List of hosts for which to calculate the log prior.

  • T (networkx.DiGraph, optional) – The transmission tree. If None, uses self.model.T.

Returns:

The sum of log priors for all hosts in the list.

Return type:

float

Notes

This method iterates through the host list and sums the log priors for each individual host using the log_prior_host method.

log_prior_host(host, T=None)[source]

Compute the log prior for a host.

Parameters:
  • host (object) – The host for which to compute the log prior.

  • T (object, optional) – Transmission tree. Default is None.

Returns:

The log prior value for the host.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log prior for the host based on the transmission tree.

  2. Returns the log prior value.

log_prior_T(T, update_up=True, verbose=False)[source]

Calculate the total log prior for an entire transmission tree.

Parameters:
  • T (networkx.DiGraph) – The transmission tree.

  • update_up (bool, optional) – If True, include correction terms for closest sampling siblings. Default is True.

  • verbose (bool, optional) – If True, print detailed information during calculation.

Returns:

The total log prior value for the transmission tree.

Return type:

float

Notes

This method calculates the complete log prior for a transmission tree by:

  1. Iterating through all hosts and their connections

  2. Computing log-likelihoods for direct connections to sampled hosts

  3. Computing log-likelihoods for connections to sampled siblings through unsampled hosts

  4. Adding correction terms for closest sampling siblings (if update_up=True)

The calculation uses Poisson distributions based on mutation rates and time differences.

Delta_log_prior(host, T_end, T_ini)[source]

Calculate the difference in log prior between two transmission tree states.

Parameters:
  • host (object) – The host for which to calculate the log prior difference.

  • T_end (networkx.DiGraph) – The final transmission tree state.

  • T_ini (networkx.DiGraph) – The initial transmission tree state.

Returns:

The difference in log prior: log_prior(T_end) - log_prior(T_ini).

Return type:

float

Notes

This method calculates how the log prior changes when a transmission tree transitions from state T_ini to T_end. It considers:

  1. Changes in parent relationships

  2. Changes in sibling relationships

The calculation is useful for MCMC acceptance ratios where only the difference in log prior is needed, not the absolute values.

transmission_models.classes.genetic_prior.get_roots_data_subtrees(host, T, dist_matrix)[source]

Get all sampled hosts with genetic data in subtrees rooted at a given host.

Parameters:
  • host (object) – The root host of the subtrees to search.

  • T (networkx.DiGraph) – The transmission tree.

  • dist_matrix (numpy.ndarray) – Matrix containing pairwise genetic distances between hosts.

Returns:

List of sampled hosts that have valid genetic distance data.

Return type:

list

Notes

This function recursively searches through all subtrees rooted at the given host and collects all sampled hosts that have non-NaN values in the distance matrix (indicating they have genetic sequence data).

class transmission_models.classes.location_prior.location_distance_prior_tree(model, mu, distance_matrix)[source]

Bases: object

__init__(model, mu, distance_matrix)[source]
static search_firsts_sampled_siblings(host, T)[source]
static search_first_sampleed_parent(host, T, root)[source]
static get_mut_time_dist(hp, hs)[source]
get_closest_sampling_siblings(T=None)[source]
prior_host(host, T, parent_dist=False)[source]
log_prior_T(T, update_up=True, verbose=False)[source]
class transmission_models.classes.location_prior.same_location_prior_tree(model, P_NM, tau, distance_matrix)[source]

Bases: object

Class to compute the prior on the location of the hosts in the tree. The prior models the probability that a host stays where it lives during a characteristic time tau: the host stays where it lives for a time t with probability exp(-t*P_NM/tau), where P_NM is the probability that the host does not move within tau.
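
The expression exp(-t*P_NM/tau) can be evaluated directly. The sketch below only transcribes that expression. The function names are hypothetical, and how the class applies the value to pairs of hosts is not shown here.

```python
import math


def stay_probability(t, P_NM, tau):
    """exp(-t * P_NM / tau), as written in the class description."""
    return math.exp(-t * P_NM / tau)


def log_stay_probability(t, P_NM, tau):
    # The log form avoids an exp/log round trip when summing over many hosts.
    return -t * P_NM / tau


p = stay_probability(t=5.0, P_NM=0.2, tau=10.0)
```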

__init__(model, P_NM, tau, distance_matrix)[source]
static get_roots_data_subtrees(host, T, dist_matrix)[source]
static search_firsts_sampled_siblings(host, T, distance_matrix)[source]
static get_mut_time_dist(hp, hs)[source]
get_closest_sampling_siblings(T=None)[source]
prior_host(host, T, parent_dist=False)[source]
log_prior_T(T, update_up=True, verbose=False)[source]
transmission_models.classes.location_prior.search_first_sampled_parent(host, T, root, distance_matrix)[source]

Module Documentation

Classes Module

Classes Module.

This module contains all the main classes for the transmission_models package.

Main Classes

host : Host class representing infected individuals

didelot_unsampled : Main class implementing the Didelot et al. (2017) framework

genetic_prior_tree : Prior distribution for genetic sequence data

location_distance_prior_tree : Prior distribution for location distance data

same_location_prior_tree : Prior distribution for same location probability

MCMC : Markov Chain Monte Carlo sampling algorithms

Submodules

mcmc : MCMC sampling classes and algorithms

class transmission_models.classes.host(id, index, genetic_data=[], t_inf=0, t_sample=None)[source]

Bases: object

Represents a host that has been infected with a virus.

A host object contains information about an infected individual, including their genetic data, infection time, sampling time, and other attributes.

Variables:
  • index (int) – The index of the host.

  • sampled (bool) – Indicates whether the host has been sampled or not.

  • genetic_data (list) – The genetic data of the host.

  • dict_attributes (dict) – A dictionary to store additional attributes.

  • t_inf (int) – Time of infection.

  • t_sample (int, optional) – The time the host was sampled.

  • id (str) – The identifier of the host.

t_inf : property

Getter and setter for the time of infection attribute.

get_genetic_str() : str

Returns the genetic data as a string.

__str__() : str

Returns a string with the id of the host.

__int__() : int

Returns the index of the host.

Examples

>>> h = host('host1', 1, ['A', 'T', 'C', 'G'], 10, t_sample=15)
>>> print(h.t_inf)
10
>>> h.t_inf = 20
>>> print(h.t_inf)
20
>>> print(h.get_genetic_str())
ATCG
>>> print(h)
host1

Notes

The class name host is lowercase, so it does not follow the PascalCase convention that PEP 8 recommends for class names.

__init__(id, index, genetic_data=[], t_inf=0, t_sample=None)[source]

Initialize a new instance of the Host class.

Parameters:
  • id (str) – The id of the host.

  • index (int) – The index of the host.

  • genetic_data (list, optional) – The genetic data of the host. Defaults to an empty list.

  • t_inf (int, optional) – Time of infection. Defaults to 0.

  • t_sample (int, optional) – The time the host was sampled. Defaults to None.

property t_inf

Getter for the time of infection attribute.

Returns:

The time of infection.

Return type:

int

get_genetic_str()[source]

Return the genetic data of the host as a string.

Returns:

The genetic data as a string.

Return type:

str

__str__()[source]

Return a string with the id of the host.

Returns:

The id of the host.

Return type:

str

__int__()[source]

Return the index of the host.

Returns:

The index of the host.

Return type:

int

class transmission_models.classes.didelot_unsampled(sampling_params, offspring_params, infection_params, T=None)[source]

Bases: object

Didelot unsampled transmission model.

This class implements the Didelot et al. (2017) framework for transmission tree inference with unsampled hosts. It provides methods for building transmission networks, computing likelihoods, and performing MCMC sampling.

The model incorporates three main components:

  1. Sampling model: Gamma distribution for sampling times

  2. Offspring model: Negative binomial distribution for offspring number

  3. Infection model: Gamma distribution for infection times

Parameters:
  • sampling_params (dict) – Parameters for the sampling model containing:

      - pi : float, sampling probability
      - k_samp : float, shape parameter for gamma distribution
      - theta_samp : float, scale parameter for gamma distribution

  • offspring_params (dict) – Parameters for the offspring model containing:

      - r : float, rate of infection
      - p_inf : float, probability of infection

  • infection_params (dict) – Parameters for the infection model containing:

      - k_inf : float, shape parameter for gamma distribution
      - theta_inf : float, scale parameter for gamma distribution

Variables:
  • T (networkx.DiGraph) – The transmission tree.

  • host_dict (dict) – Dictionary mapping host IDs to host objects.

  • log_likelihood (float) – Current log likelihood of the model.

  • genetic_prior (genetic_prior_tree, optional) – Prior for genetic data.

  • same_location_prior (same_location_prior_tree, optional) – Prior for location data.

References

Didelot, X., Gardy, J., & Colijn, C. (2017). Bayesian inference of transmission chains using timing of events, contact and genetic data. PLoS computational biology, 13(4), e1005496.

__init__(sampling_params, offspring_params, infection_params, T=None)[source]

Initialize the Didelot unsampled transmission model.

Parameters:
  • sampling_params (dict) – Parameters for the sampling model containing:

      - pi : float, sampling probability
      - k_samp : float, shape parameter for gamma distribution
      - theta_samp : float, scale parameter for gamma distribution

  • offspring_params (dict) – Parameters for the offspring model containing:

      - r : float, rate of infection
      - p_inf : float, probability of infection

  • infection_params (dict) – Parameters for the infection model containing:

      - k_inf : float, shape parameter for gamma distribution
      - theta_inf : float, scale parameter for gamma distribution

  • T (networkx.DiGraph, optional) – The transmission tree. If provided, the model will be initialized with this tree. Default is None.

Raises:

KeyError – If any required parameter is missing from the input dictionaries.
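
The three parameter dictionaries and the required keys listed above can be laid out as follows. The numeric values are illustrative only, and `check_params` is a hypothetical helper that mimics the documented KeyError behaviour. It is not the constructor's own validation code.

```python
REQUIRED_KEYS = {
    "sampling_params": ("pi", "k_samp", "theta_samp"),
    "offspring_params": ("r", "p_inf"),
    "infection_params": ("k_inf", "theta_inf"),
}


def check_params(**param_dicts):
    """Raise KeyError naming the first required key that is missing."""
    for name, required in REQUIRED_KEYS.items():
        for key in required:
            if key not in param_dicts[name]:
                raise KeyError(f"{name} is missing required key '{key}'")


# Illustrative values only; they are not recommended settings.
sampling_params = {"pi": 0.5, "k_samp": 2.0, "theta_samp": 1.5}
offspring_params = {"r": 2.0, "p_inf": 0.6}
infection_params = {"k_inf": 2.0, "theta_inf": 1.0}

check_params(
    sampling_params=sampling_params,
    offspring_params=offspring_params,
    infection_params=infection_params,
)
```

Dictionaries of this shape are what the constructor signature above expects as its first three arguments.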

property T
set_T(T)[source]
samp_t_inf_between(h1, h2)[source]

Sample a time of infection between two hosts.

Uses a rejection sampling method to sample the time of infection of the infected host using the chain model from Didelot et al. 2017.

Parameters:
  • h1 (host) – Infector host.

  • h2 (host) – Infected host.

Returns:

Infection time of the intermediate host that is infected by h1 and is the infector of h2.

Return type:

float

Notes

This method implements the rejection sampling algorithm described in Didelot et al. (2017) for sampling infection times in transmission chains.
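
The idea of rejection sampling an intermediate infection time can be sketched as follows. This is a generic sampler written under the assumption that both generation times (h1 to the new host, new host to h2) follow the same gamma distribution; the function names, the proposal and the acceptance rule are illustrative and may differ from the library's implementation.

```python
import math
import random


def gamma_pdf(x, k, theta):
    """Density of a gamma distribution with shape k and scale theta."""
    if x <= 0:
        return 0.0
    return x ** (k - 1) * math.exp(-x / theta) / (math.gamma(k) * theta ** k)


def sample_t_inf_between(t1, t2, k, theta, rng=random, max_tries=100_000):
    """Sample t in (t1, t2) with density proportional to
    gamma_pdf(t - t1) * gamma_pdf(t2 - t), by rejection sampling."""
    if k <= 1:
        raise ValueError("this sketch needs k > 1 so that the density is bounded")
    gap = t2 - t1
    # The gamma density peaks at its mode (k - 1) * theta; use that as the bound.
    max_pdf = gamma_pdf((k - 1) * theta, k, theta)
    for _ in range(max_tries):
        d1 = rng.gammavariate(k, theta)  # proposal: delay from t1 to the new host
        if d1 >= gap:
            continue  # proposal falls outside the interval: reject
        # Accept with probability given by the second generation-time density.
        if rng.random() * max_pdf <= gamma_pdf(gap - d1, k, theta):
            return t1 + d1
    raise RuntimeError("no sample accepted; check the parameters")
```

The returned time always lies strictly between the two infection times, which is the constraint the chain model imposes.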

add_root(t_sampl, id='0', genetic_data=[], t_inf=0, t_sample=None)[source]

Add the root host to the transmission tree.

Parameters:
  • t_sampl (float) – Sampling time of the root host.

  • id (str, optional) – Identifier for the root host. Default is “0”.

  • genetic_data (list, optional) – Genetic data for the root host. Default is empty list.

  • t_inf (float, optional) – Infection time of the root host. Default is 0.

  • t_sample (float, optional) – Sampling time of the root host. Default is None.

Returns:

The root host object.

Return type:

host

successors(host)[source]

Get the successors (children) of a given host in the transmission tree.

Parameters:

host (host) – The host node whose successors are to be returned.

Returns:

An iterator over the successors of the host.

Return type:

iterator

parent(host)[source]

Get the parent (infector) of a given host in the transmission tree.

Parameters:

host (host) – The host node whose parent is to be returned.

Returns:

The parent host object.

Return type:

host

out_degree(host)[source]

Get the out-degree (number of children) of a host in the transmission tree.

Parameters:

host (host) – The host node whose out-degree is to be returned.

Returns:

The out-degree of the host.

Return type:

int

choose_successors(host, k=1)[source]

Choose k unique successors of a given host.

Parameters:
  • host (host) – Host whose successors will be chosen.

  • k (int, optional) – Number of successors to choose. Default is 1.

Returns:

List of k randomly chosen successors of the host.

Return type:

list
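
The four tree helpers above (successors, parent, out_degree, choose_successors) can be illustrated on a toy tree. The library stores the tree as a networkx.DiGraph; this sketch uses a plain dictionary so that it runs without dependencies, and drawing with random.sample is an assumption about how "k unique successors" might be chosen.

```python
import random

# Toy transmission tree as a mapping: infector -> list of infected hosts.
children = {"0": ["A", "B", "C"], "A": ["D"], "B": [], "C": [], "D": []}


def successors(host):
    """Hosts directly infected by `host`."""
    return iter(children[host])


def parent(host):
    """Infector of `host`, or None for the root."""
    for infector, infected in children.items():
        if host in infected:
            return infector
    return None


def out_degree(host):
    """Number of hosts directly infected by `host`."""
    return len(children[host])


def choose_successors(host, k=1, rng=random):
    """k distinct successors of `host`, drawn without replacement."""
    return rng.sample(children[host], k)
```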

compute_Delta_loc_prior(T_new)[source]

Compute the change in the location prior log-likelihood for a new tree.

Parameters:

T_new (networkx.DiGraph) – The new transmission tree.

Returns:

(Delta log prior, new log prior, old log prior, old correction log-likelihood)

Return type:

tuple

get_candidates_to_chain()[source]

Get the list of candidate hosts for chain moves in the transmission tree.

Returns:

List of candidate host nodes for chain moves.

Return type:

list

get_N_candidates_to_chain(recompute=False)[source]

Get the number of candidate hosts for chain moves, optionally recomputing the list.

Parameters:

recompute (bool, optional) – If True, recompute the list of candidates. Default is False.

Returns:

Number of candidate hosts for chain moves.

Return type:

int

get_root_subtrees()[source]

Retrieve the root subtrees of the transmission tree.

This method searches for the first sampled siblings of the root host in the transmission tree and stores them in the roots_subtrees attribute.

Returns:

A list of root subtrees.

Return type:

list

get_unsampled_hosts()[source]

Get the list of unsampled hosts in the transmission tree (excluding the root).

Returns:

List of unsampled host nodes.

Return type:

list

get_sampling_model_likelihood(hosts=None, T=None, update=False)[source]

Compute the likelihood of the sampling model.

Computes the likelihood of the sampling model given a list of hosts. If no list is given, the likelihood of the whole transmission tree is returned.

Parameters:

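hosts (list of host objects, optional) – Hosts for which the likelihood is computed. If None, the whole transmission tree is used.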

Returns:

L – The likelihood of the sampling model given the list of hosts

Return type:

float

get_sampling_model_log_likelihood(hosts=None, T=None, update=False)[source]

Computes the log-likelihood of the sampling model given a list of hosts. If no list is given, the log-likelihood of the whole transmission tree is returned.

Parameters:

hosts (list of host objects, optional) – Hosts for which the log-likelihood is computed. If None, the whole transmission tree is used.

Returns:

L – The log-likelihood of the sampling model given the list of hosts

Return type:

float
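
A sketch of what a sampling-model log-likelihood of this kind can look like. It assumes the model of Didelot et al. (2017) in its simplest form: each host is sampled with probability pi, and the delay from infection to sampling is gamma distributed with shape k_samp and scale theta_samp. ToyHost and the function names are stand-ins, and any correction for a finite observation window is ignored.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyHost:
    """Stand-in for the documented host class (only the fields used here)."""
    t_inf: float
    t_sample: Optional[float] = None

    @property
    def sampled(self):
        return self.t_sample is not None


def log_gamma_pdf(x, k, theta):
    """Log-density of a gamma distribution with shape k and scale theta."""
    return (k - 1) * math.log(x) - x / theta - math.lgamma(k) - k * math.log(theta)


def sampling_log_likelihood(hosts, pi, k_samp, theta_samp):
    """Sampled hosts contribute log(pi) plus the log-density of their
    infection-to-sampling delay; unsampled hosts contribute log(1 - pi)."""
    total = 0.0
    for h in hosts:
        if h.sampled:
            delay = h.t_sample - h.t_inf
            total += math.log(pi) + log_gamma_pdf(delay, k_samp, theta_samp)
        else:
            total += math.log(1.0 - pi)
    return total
```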

Delta_log_sampling(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for the sampling model.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the sampling model.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the sampling model at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.
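
The three steps above follow a pattern shared by the Delta_log_* methods. A generic sketch, where log_lik is a hypothetical callable standing in for the model-specific log-likelihood:

```python
def delta_log(log_lik, hosts, T_end, T_ini=None):
    """Generic form of the three documented steps.

    `log_lik(hosts, T)` is any callable returning a log-likelihood; it is a
    hypothetical stand-in for the model-specific method.
    """
    value = log_lik(hosts, T_end)        # step 1: log-likelihood at T_end
    if T_ini is not None:
        value -= log_lik(hosts, T_ini)   # step 2: subtract the value at T_ini
    return value                         # step 3: return the difference
```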

get_offspring_model_likelihood(hosts=None, T=None, update=False)[source]

Computes the likelihood of the offspring model given a list of hosts. If no list is given, the likelihood of the whole transmission tree is returned.

Parameters:

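hosts (list of host objects, optional) – Hosts for which the likelihood is computed. If None, the whole transmission tree is used.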

Returns:

L – The likelihood of the offspring model given the list of hosts

Return type:

float

get_offspring_model_log_likelihood(hosts=None, T=None, update=False)[source]

Computes the log-likelihood of the offspring model given a list of hosts. If no list is given, the log-likelihood of the whole transmission tree is returned.

Parameters:

hosts (list of host objects, optional) – Hosts for which the log-likelihood is computed. If None, the whole transmission tree is used.

Returns:

L – The log-likelihood of the offspring model given the list of hosts

Return type:

float
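
A sketch of an offspring-model log-likelihood, assuming the number of secondary infections per host follows a negative binomial distribution with parameters r and p_inf, as in Didelot et al. (2017). The exact parametrization used by the library is not stated in this reference, so the form below is an assumption.

```python
import math


def log_neg_binomial_pmf(n, r, p):
    """Log-probability of n offspring under a negative binomial with
    pmf Gamma(n + r) / (n! * Gamma(r)) * p**n * (1 - p)**r."""
    return (math.lgamma(n + r) - math.lgamma(n + 1) - math.lgamma(r)
            + n * math.log(p) + r * math.log(1.0 - p))


def offspring_log_likelihood(out_degrees, r, p_inf):
    """Sum the per-host log-probabilities of the observed out-degrees."""
    return sum(log_neg_binomial_pmf(n, r, p_inf) for n in out_degrees)
```

Here out_degrees would be the out-degree of each host in the tree, as returned by out_degree(host).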

Delta_log_offspring(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for the offspring model.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the offspring model.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the offspring model at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.

get_infection_model_likelihood(hosts=None, T=None, update=False)[source]

Computes the likelihood of the infection model given a list of hosts. If no list is given, the likelihood of the whole transmission tree is returned.

Parameters:
  • hosts (list of host objects, optional) – Hosts for which the likelihood is computed. If None, the whole transmission tree is used.

  • T (DiGraph object) – Transmission tree on which the likelihood of the hosts is computed. If None, the tree of the model is used.

  • update (bool) – If True, the likelihood of the infection model is updated in the model object.

Returns:

L – The likelihood of the infection model given the list of hosts

Return type:

float

get_infection_model_log_likelihood(hosts=None, T=None, update=False)[source]

Computes the log-likelihood of the infection model given a list of hosts. If no list is given, the log-likelihood of the whole transmission tree is returned.

Parameters:
  • hosts (list of host objects, optional) – Hosts for which the log-likelihood is computed. If None, the whole transmission tree is used.

  • T (DiGraph object) – Transmission tree on which the log-likelihood of the hosts is computed. If None, the tree of the model is used.

  • update (bool) – If True, the log-likelihood of the infection model is updated in the model object.

Returns:

L – The log-likelihood of the infection model given the list of hosts

Return type:

float

Delta_log_infection(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for the infection model.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the infection model.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the infection model at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.

log_likelihood_host(host, T=None)[source]

Computes the log likelihood of a host given the transmission tree.

Parameters:
  • host (host object) –

  • T (DiGraph object) –

Returns:

log_likelihood – The log likelihood of the host in the transmission network

Return type:

float

Delta_log_likelihood_host(hosts, T_end, T_ini=None)[source]

Compute the change in log-likelihood for a host.

Parameters:
  • hosts (list) – List of host objects.

  • T_end (float) – End time.

  • T_ini (float, optional) – Initial time. Default is None.

Returns:

Change in log-likelihood for the host.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log-likelihood for the host at T_end.

  2. If T_ini is provided, subtracts the log-likelihood at T_ini.

  3. Returns the difference.

log_likelihood_hosts_list(hosts, T)[source]
log_likelihood_transmission_tree(T)[source]
log_posterior_transmission_tree()[source]

Compute the log-posterior of the current transmission tree.

This method calculates the log-posterior probability of the current transmission tree by summing the log-likelihood of the tree and any additional prior log-probabilities, such as genetic and location priors, if they are defined.

Returns:

The computed log-posterior of the current transmission tree.

Return type:

float

Notes

The log-posterior is computed as:

log_posterior = log_likelihood + genetic_log_prior (if defined) + same_location_log_prior (if defined)

The method uses the following attributes:
  • self.log_likelihood: Log-likelihood of the transmission tree.

  • self.genetic_log_prior: Log-prior from the genetic model (if defined).

  • self.same_location_log_prior: Log-prior from the location model (if defined).
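
The sum described above can be sketched as a small function. Representing an undefined prior as None is an assumption made for this illustration.

```python
def log_posterior(log_likelihood, genetic_log_prior=None, same_location_log_prior=None):
    """Add each optional prior term only when it is defined (not None)."""
    total = log_likelihood
    if genetic_log_prior is not None:
        total += genetic_log_prior
    if same_location_log_prior is not None:
        total += same_location_log_prior
    return total
```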

get_log_posterior_transmission_tree(T)[source]

Compute and update the log-posterior of the transmission tree.

This method calculates the log-posterior probability of the given transmission tree T by combining the log-likelihood of the tree with any additional prior log-probabilities, such as genetic and location priors, if they are defined. The computed log-posterior and any relevant prior log-likelihoods are stored as attributes of the object.

Parameters:

T (networkx.DiGraph) – The transmission tree for which to compute the log-posterior.

Returns:

The computed log-posterior of the transmission tree.

Return type:

float

Notes

The log-posterior is computed as:

log_posterior = log_likelihood + genetic_log_prior (if defined) + same_location_log_prior (if defined)

The method also updates the following attributes:
  • self.log_posterior

  • self.genetic_log_prior (if applicable)

  • self.same_location_log_prior (if applicable)

show_log_likelihoods(hosts=None, T=None, verbose=False)[source]

Print and return the log-likelihoods for the sampling, offspring, and infection models.

Parameters:
  • hosts (list, optional) – List of host objects to compute log-likelihoods for. If None, computes for all hosts in T.

  • T (networkx.DiGraph, optional) – Transmission tree. If None, uses self.T.

  • verbose (bool, optional) – If True, prints the log-likelihoods. Default is False.

Returns:

(LL_sampling, LL_offspring, LL_infection): Log-likelihoods for the sampling, offspring, and infection models.

Return type:

tuple

log_likelihood_transmission_tree_old(T)[source]

Compute the log-likelihood of the entire transmission tree using the old method.

Parameters:

T (networkx.DiGraph) – Transmission tree to compute the log-likelihood for.

Returns:

The log-likelihood of the transmission tree.

Return type:

float

get_log_likelihood_transmission()[source]
add_genetic_prior(mu_gen, gen_dist)[source]

Adds a genetic prior to the model. The prior computes the likelihood that two sampled hosts are related, given the genetic distance between the viruses of the two hosts. Two sampled hosts are considered related if the only hosts on the path connecting them are unsampled hosts.

Parameters:
  • mu_gen (float) – Mutation rate

  • gen_dist (np.array) – Genetic distance matrix of the virus of the hosts. The index has to be identical to the index of the hosts.

add_same_location_prior(P_NM, tau, loc_dist)[source]

Adds a same-location prior to the model (see same_location_prior_tree). The prior models the probability that a host stays where it lives over a characteristic time tau.

Parameters:
  • P_NM (float) – Probability that a host does not move within the characteristic time tau.

  • tau (float) – Characteristic time.

  • loc_dist (np.array) – Location distance matrix of the hosts. The index has to be identical to the index of the hosts.

create_transmision_phylogeny_nets(N, mu, P_mut)[source]

Parameters:
  • N – Number of hosts

  • mu – Mutation rate

  • P_mut – Probability of mutation

get_newick(lengths=True)[source]
save_json(filename)[source]

Save the transmission tree to a JSON file.

Parameters:

filename (str) – Path to the output JSON file.

classmethod json_to_tree(filename, sampling_params=None, offspring_params=None, infection_params=None)[source]

Load a transmission model from a JSON file and reconstruct the model object.

Parameters:
  • filename (str) – Path to the JSON file.

  • sampling_params (dict, optional) – Sampling parameters to override those in the file. Default is None.

  • offspring_params (dict, optional) – Offspring parameters to override those in the file. Default is None.

  • infection_params (dict, optional) – Infection parameters to override those in the file. Default is None.

Returns:

The reconstructed transmission model.

Return type:

didelot_unsampled

infection_time_from_sampling_step(selected_host=None, metHast=True, verbose=False)[source]

Propose and possibly accept a new infection time for a sampled host using the Metropolis-Hastings algorithm.

This method samples a new infection time for a selected host (or a random sampled host if not provided), computes the acceptance probability, and updates the host’s infection time if the proposal is accepted.

Parameters:
  • selected_host (host, optional) – The host whose infection time will be changed. If None, a random sampled host is selected.

  • metHast (bool, optional) – If True, use the Metropolis-Hastings algorithm to accept or reject the proposal. Default is True.

  • verbose (bool, optional) – If True, print detailed information about the proposal. Default is False.

Returns:

  • t_inf_new (float) – The proposed new infection time.

  • gg (float) – Proposal ratio for the Metropolis-Hastings step.

  • pp (float) – Likelihood ratio for the Metropolis-Hastings step.

  • P (float) – Acceptance probability for the Metropolis-Hastings step.

  • selected_host (host) – The host whose infection time was proposed to change.
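
The relationship between the returned quantities can be sketched with the standard Metropolis-Hastings rule, P = min(1, gg * pp). That the library combines gg and pp in exactly this way is an assumption; mh_accept is a hypothetical helper.

```python
import random


def mh_accept(gg, pp, rng=random):
    """Return (P, accepted) for proposal ratio gg and likelihood ratio pp."""
    P = min(1.0, gg * pp)
    accepted = rng.random() < P  # accept with probability P
    return P, accepted
```

A proposal with gg * pp >= 1 is always accepted; otherwise it is accepted with probability gg * pp.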

infection_time_from_infection_model_step(selected_host=None, metHast=True, Dt_new=None, verbose=False)[source]

Propose a new infection time for a host and accept or reject the change using the Metropolis-Hastings algorithm.

Parameters:
  • selected_host (host object, default=None) – Host whose infection time will be changed. If None, a host is randomly selected.

  • metHast (bool, default=True) – If True, the Metropolis Hastings algorithm is used to accept or reject the change.

  • Dt_new (float, default=None) – New infection time for the host. If None, a new time is sampled.

  • verbose (bool, default=False) – If True, prints the results of the step.

add_unsampled_with_times(selected_host=None, P_add=0.5, P_rewiring=0.5, P_off=0.5, verbose=False, only_geometrical=False, detailed_probs=False)[source]

Method to propose the addition of an unsampled host to the transmission tree and get the probability of the proposal.

Parameters:

selected_host: host object

Host to which the unsampled host will be added. If None, a host is randomly selected.

P_add: float

Probability of proposing to add a new host to the transmission tree.

P_rewiring: float

Probability of rewiring the new host to another sibling host.

P_off: float

Probability to rewire the new host to be a leaf.

verbose: bool

If True, prints the results of the step.

only_geometrical: bool

If True, only the proposal of the new topological structure will be considered.

detailed_probs: bool

If True, the method will return both probabilities of the proposals, of adding and removing a host.

Returns:

T_new: DiGraph object

New transmission tree with the proposed changes.

gg: float

Ratio of the probabilities of the proposals.

g_go: float

Probability of the proposal of adding a host.

g_ret: float

Probability of the proposal of removing a host.

prob_time: float

Probability of the time of infection of the new host.

unsampled: host object

Unsampled host to be added to the transmission tree.

added: bool

If True, the host was added to the transmission tree.

remove_unsampled_with_times(selected_host=None, P_add=0.5, P_rewiring=0.5, P_off=0.5, only_geometrical=False, detailed_probs=False, verbose=False)[source]

Method to propose the removal of an unsampled host from the transmission tree and get the probability of the proposal. If no unsampled hosts are available, the addition of a new host to the transmission tree is proposed instead.

Parameters:

selected_host: host object

Unsampled host to be removed from the transmission tree. If None, a host is randomly selected.

P_add: float

Probability of proposing to add a new host to the transmission tree.

P_rewiring: float

Probability of rewiring the new host to another sibling host.

P_off: float

Probability to rewire the new host to be a leaf.

verbose: bool

If True, prints the results of the step.

only_geometrical: bool

If True, only the proposal of the new topological structure will be considered.

detailed_probs: bool

If True, the method will return both probabilities of the proposals, of adding and removing a host.

Returns:

T_new: DiGraph object

New transmission tree with the proposed changes.

gg: float

Ratio of the probabilities of the proposals.

g_go: float

Probability of the proposal of adding a host.

g_ret: float

Probability of the proposal of removing a host.

prob_time: float

Probability of proposing the time of the selected_host.

added: bool

If True, the host was added to the transmission tree. Otherwise, the host was removed.

add_remove_step(P_add=0.5, P_rewiring=0.5, P_off=0.5, metHast=True, verbose=False)[source]

Method to propose the addition or removal of an unsampled host to the transmission tree and get the probability of the proposal.

Parameters:

P_add: float

Probability of proposing the addition of an unsampled host. Otherwise, the removal of an unsampled host is proposed.

P_rewiring: float

Probability of rewiring the new host to another sibling host.

P_off: float

Probability to rewire the new host to be a leaf.

metHast: bool

If True, the Metropolis Hastings algorithm is used to accept or reject the change.

verbose: bool

If True, prints the results of the step.

Returns:

MCMC_step(N_steps, verbose=False)[source]
class transmission_models.classes.genetic_prior_tree(model, mu, distance_matrix)[source]

Bases: object

__init__(model, mu, distance_matrix)[source]

Initialize the genetic prior tree object.

Parameters:
  • model (object) – The transmission model containing the tree structure.

  • mu (float) – The mutation rate parameter for the Poisson distribution.

  • distance_matrix (numpy.ndarray) – Matrix containing pairwise genetic distances between hosts.

Notes

This initializes the genetic prior calculator with:
  • A Poisson distribution with rate mu for modeling genetic distances

  • A distance matrix for pairwise host comparisons

  • A reference to the transmission model

static search_firsts_sampled_siblings(host, T, distance_matrix)[source]

Find all sampled siblings of a host in the transmission tree.

Parameters:
  • host (object) – The host for which to find sampled siblings.

  • T (networkx.DiGraph) – The transmission tree.

  • distance_matrix (numpy.ndarray) – Matrix containing pairwise genetic distances between hosts.

Returns:

List of sampled sibling hosts that have genetic distance data.

Return type:

list

Notes

This method recursively searches through the tree to find all sampled hosts that are descendants of the given host and have valid genetic distance data (non-NaN values in the distance matrix).

static search_first_sampled_parent(host, T, root)[source]

Find the first sampled ancestor of a host in the transmission tree.

Parameters:
  • host (object) – The host for which to find the first sampled parent.

  • T (networkx.DiGraph) – The transmission tree.

  • root (object) – The root host of the transmission tree.

Returns:

The first sampled parent host, or None if no sampled parent is found.

Return type:

object or None

Notes

This method traverses up the tree from the given host until it finds the first sampled ancestor, or reaches the root without finding one.
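
The upward traversal described above can be sketched on a toy chain. The parent map and the set of sampled hosts are hypothetical stand-ins for the networkx.DiGraph and host objects used by the library.

```python
def first_sampled_ancestor(host, parent_of, sampled, root):
    """Walk up from `host` and return the first sampled ancestor, or None."""
    current = host
    while current != root:
        current = parent_of[current]
        if current in sampled:
            return current
    return None


# Toy chain (hypothetical): root -> u1 -> s1 -> u2 -> leaf
parent_of = {"u1": "root", "s1": "u1", "u2": "s1", "leaf": "u2"}
sampled = {"s1", "leaf"}
```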

static get_mut_time_dist(hp, hs)[source]

Calculate the mutation time distance between two hosts.

Parameters:
  • hp (object) – The parent host.

  • hs (object) – The sibling host.

Returns:

The mutation time distance: (hs.t_sample + hp.t_sample - 2 * hp.t_inf).

Return type:

float

Notes

This calculates the time available for mutations to accumulate between the sampling times of two hosts, accounting for their common infection time.
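
A worked instance of the formula above, using toy objects that carry only the attributes the formula needs. The time values are hypothetical.

```python
from types import SimpleNamespace


def get_mut_time_dist(hp, hs):
    """Formula from the Returns section: hs.t_sample + hp.t_sample - 2 * hp.t_inf."""
    return hs.t_sample + hp.t_sample - 2 * hp.t_inf


# Toy hosts: parent infected at t=1 and sampled at t=4, sibling sampled at t=6.
hp = SimpleNamespace(t_inf=1.0, t_sample=4.0)
hs = SimpleNamespace(t_sample=6.0)
dist = get_mut_time_dist(hp, hs)  # (6 - 1) + (4 - 1) = 8.0
```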

get_closest_sampling_siblings(T=None, verbose=False)[source]

Calculate log-likelihood correction for closest sampling siblings.

Parameters:
  • T (networkx.DiGraph, optional) – The transmission tree. If None, uses self.model.T.

  • verbose (bool, optional) – If True, print detailed information during calculation.

Returns:

The log-likelihood correction value.

Return type:

float

Notes

This method calculates correction terms for the genetic prior by finding the closest sampled siblings for each host and computing the log-likelihood of their genetic distances based on the time difference between sampling events.

prior_host(host, T, parent_dist=False)[source]

Calculate the log prior for a specific host in the transmission tree.

Parameters:
  • host (object) – The host for which to calculate the log prior.

  • T (networkx.DiGraph) – The transmission tree.

  • parent_dist (bool, optional) – If True, include parent distance in the calculation. Default is False.

Returns:

The log prior value for the host.

Return type:

float

Notes

This method calculates the log prior by considering:

  1. Direct connections to sampled hosts

  2. Connections to sampled siblings through unsampled intermediate hosts

  3. Parent distance (if parent_dist=True)

The calculation uses Poisson distributions based on the mutation rate and time differences between sampling events.

prior_pair(h1, h2)[source]

Calculate the log prior for a pair of hosts.

Parameters:
  • h1 (object) – First host in the pair.

  • h2 (object) – Second host in the pair.

Returns:

The log prior value for the pair, or 0 if either host is not sampled.

Return type:

float

Notes

This method calculates the log prior for the genetic distance between two hosts based on their sampling time difference and the Poisson distribution with rate mu * Dt.
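
A sketch of the Poisson log prior described above, where the observed genetic distance is treated as a count with rate mu * Dt. The function names are illustrative, and how the library derives Dt from the two hosts is not shown here.

```python
import math


def log_poisson_pmf(d, rate):
    """Log-probability of observing d events under Poisson(rate)."""
    return d * math.log(rate) - rate - math.lgamma(d + 1)


def pair_log_prior(genetic_distance, mu, Dt):
    """Log prior of a genetic distance given mutation rate mu and time Dt."""
    return log_poisson_pmf(genetic_distance, mu * Dt)
```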

log_prior_host_list(host_list, T=None)[source]

Calculate the total log prior for a list of hosts.

Parameters:
  • host_list (list) – List of hosts for which to calculate the log prior.

  • T (networkx.DiGraph, optional) – The transmission tree. If None, uses self.model.T.

Returns:

The sum of log priors for all hosts in the list.

Return type:

float

Notes

This method iterates through the host list and sums the log priors for each individual host using the log_prior_host method.

log_prior_host(host, T=None)[source]

Compute the log prior for a host.

Parameters:
  • host (object) – The host for which to compute the log prior.

  • T (object, optional) – Transmission tree. Default is None.

Returns:

The log prior value for the host.

Return type:

float

Notes

The function operates as follows:

  1. Computes the log prior for the host based on the transmission tree.

  2. Returns the log prior value.

log_prior_T(T, update_up=True, verbose=False)[source]

Calculate the total log prior for an entire transmission tree.

Parameters:
  • T (networkx.DiGraph) – The transmission tree.

  • update_up (bool, optional) – If True, include correction terms for closest sampling siblings. Default is True.

  • verbose (bool, optional) – If True, print detailed information during calculation.

Returns:

The total log prior value for the transmission tree.

Return type:

float

Notes

This method calculates the complete log prior for a transmission tree by:

  1. Iterating through all hosts and their connections

  2. Computing log-likelihoods for direct connections to sampled hosts

  3. Computing log-likelihoods for connections to sampled siblings through unsampled hosts

  4. Adding correction terms for closest sampling siblings (if update_up=True)

The calculation uses Poisson distributions based on mutation rates and time differences.

Delta_log_prior(host, T_end, T_ini)[source]

Calculate the difference in log prior between two transmission tree states.

Parameters:
  • host (object) – The host for which to calculate the log prior difference.

  • T_end (networkx.DiGraph) – The final transmission tree state.

  • T_ini (networkx.DiGraph) – The initial transmission tree state.

Returns:

The difference in log prior: log_prior(T_end) - log_prior(T_ini).

Return type:

float

Notes

This method calculates how the log prior changes when a transmission tree transitions from state T_ini to T_end. It considers:

  1. Changes in parent relationships

  2. Changes in sibling relationships

The calculation is useful for MCMC acceptance ratios where only the difference in log prior is needed, not the absolute values.

class transmission_models.classes.location_distance_prior_tree(model, mu, distance_matrix)[source]

Bases: object

__init__(model, mu, distance_matrix)[source]
static search_firsts_sampled_siblings(host, T)[source]
static search_first_sampleed_parent(host, T, root)[source]
static get_mut_time_dist(hp, hs)[source]
get_closest_sampling_siblings(T=None)[source]
prior_host(host, T, parent_dist=False)[source]
log_prior_T(T, update_up=True, verbose=False)[source]
class transmission_models.classes.same_location_prior_tree(model, P_NM, tau, distance_matrix)[source]

Bases: object

Class to compute the prior on the location of the hosts in the tree. The prior models the probability that a host stays where it lives over a characteristic time tau. A host stays where it lives for a time t with probability exp(-t*P_NM/tau), where P_NM is the probability that the host does not move in tau.

__init__(model, P_NM, tau, distance_matrix)[source]
static get_roots_data_subtrees(host, T, dist_matrix)[source]
static search_firsts_sampled_siblings(host, T, distance_matrix)[source]
static get_mut_time_dist(hp, hs)[source]
get_closest_sampling_siblings(T=None)[source]
prior_host(host, T, parent_dist=False)[source]
log_prior_T(T, update_up=True, verbose=False)[source]
class transmission_models.classes.MCMC(model, P_rewire=0.3333333333333333, P_add_remove=0.3333333333333333, P_t_shift=0.3333333333333333, P_add=0.5, P_rewire_add=0.5, P_offspring_add=0.5, P_to_offspring=0.5)[source]

Bases: object

Markov Chain Monte Carlo sampler for transmission tree inference.

This class implements MCMC sampling algorithms for transmission network inference using various proposal mechanisms.

Parameters:
  • model (didelot_unsampled) – The transmission tree model to sample from.

  • P_rewire (float, optional) – The probability of rewiring a transmission tree. Default is 1/3.

  • P_add_remove (float, optional) – The probability of adding or removing an unsampled host in the transmission tree. Default is 1/3.

  • P_t_shift (float, optional) – The probability of shifting the infection time of the host in the transmission tree. Default is 1/3.

  • P_add (float, optional) – The probability of adding a new host to the transmission tree once an add/remove move has been proposed. Default is 0.5.

  • P_rewire_add (float, optional) – The probability of rewiring the new unsampled host once an addition has been proposed. Default is 0.5.

  • P_offspring_add (float, optional) – The probability that the new unsampled host is an offspring once an addition and a rewiring have been proposed. Default is 0.5.

  • P_to_offspring (float, optional) – The probability of moving to offspring model during rewiring. Default is 0.5.

Variables:
  • model (didelot_unsampled) – The transmission model being sampled.

  • P_rewire (float) – Probability of rewiring moves.

  • P_add_remove (float) – Probability of add/remove moves.

  • P_t_shift (float) – Probability of time shift moves.

  • P_add (float) – Probability of adding vs removing hosts.

  • P_rewire_add (float) – Probability of rewiring added hosts.

  • P_offspring_add (float) – Probability of offspring vs chain model for added hosts.

  • P_to_offspring (float) – Probability of moving to offspring model.

__init__(model, P_rewire=0.3333333333333333, P_add_remove=0.3333333333333333, P_t_shift=0.3333333333333333, P_add=0.5, P_rewire_add=0.5, P_offspring_add=0.5, P_to_offspring=0.5)[source]

Initialize the MCMC sampler.

Parameters:
  • model (didelot_unsampled) – The transmission tree model to sample from.

  • P_rewire (float, optional) – The probability of rewiring a transmission tree. Default is 1/3.

  • P_add_remove (float, optional) – The probability of adding or removing an unsampled host in the transmission tree. Default is 1/3.

  • P_t_shift (float, optional) – The probability of shifting the infection time of the host in the transmission tree. Default is 1/3.

  • P_add (float, optional) – The probability of adding a new host to the transmission tree once an add/remove move has been proposed. Default is 0.5.

  • P_rewire_add (float, optional) – The probability of rewiring the new unsampled host once an addition has been proposed. Default is 0.5.

  • P_offspring_add (float, optional) – The probability that the new unsampled host is an offspring once an addition and a rewiring have been proposed. Default is 0.5.

  • P_to_offspring (float, optional) – The probability of moving to offspring model during rewiring. Default is 0.5.

MCMC_iteration(verbose=False)[source]

Perform an MCMC iteration on the transmission tree model.

Parameters:

verbose (bool, optional) – Whether to print the progress of the MCMC iteration. Default is False.

Returns:

A tuple containing:

  • move (str) – The type of move proposed (‘rewire’, ‘add_remove’, or ‘time_shift’).

  • gg (float) – The ratio of proposal probabilities.

  • pp (float) – The ratio of posterior probabilities.

  • P (float) – The acceptance probability.

  • accepted (bool) – Whether the move was accepted.

  • DL (float) – The difference in log likelihood.

Return type:

tuple

Notes

The function operates as follows:

  1. Selects a move type at random.

  2. Performs the move and computes acceptance probability.

  3. Returns move details and acceptance status.
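
Step 1 above, the random choice of a move type, can be sketched with the three move probabilities from the constructor. Using random.choices is an assumption about the mechanism; the move names are those listed in the Returns section.

```python
import random

MOVES = ("rewire", "add_remove", "time_shift")


def choose_move(P_rewire=1/3, P_add_remove=1/3, P_t_shift=1/3, rng=random):
    """Draw one move type with the given probabilities."""
    weights = (P_rewire, P_add_remove, P_t_shift)
    return rng.choices(MOVES, weights=weights, k=1)[0]
```

The selected move would then be carried out and accepted or rejected as in steps 2 and 3.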

MCMC Module

This module contains Markov Chain Monte Carlo sampling algorithms for transmission network inference.

Main Classes

MCMC : Main MCMC sampler class for transmission tree inference

The MCMC module provides methods for sampling from the posterior distribution of transmission trees using various proposal mechanisms including:
  • Tree topology changes (rewiring)

  • Adding/removing unsampled hosts

  • Infection time updates

transmission_models.classes.mcmc.random() → x in the interval [0, 1).