This function carries out a hypothesis test in which the null hypothesis is that the two populations of networks share the same underlying probability distribution, against the alternative hypothesis that the two populations come from different distributions. The test is non-parametric and relies on a permutation framework in which several test statistics can be used, together with several choices of network matrix representations and distances between networks.

test2_global(
  x,
  y,
  representation = "adjacency",
  distance = "frobenius",
  stats = c("flipr:t_ip", "flipr:f_ip"),
  B = 1000L,
  test = "exact",
  k = 5L,
  seed = NULL
)

Arguments

x

An nvd object listing networks in sample 1.

y

An nvd object listing networks in sample 2.

representation

A string specifying the desired type of representation, among: "adjacency", "laplacian" and "modularity". Defaults to "adjacency".
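
As a rough illustration of what these representations contain, the base R sketch below derives a Laplacian and a modularity matrix from a small undirected adjacency matrix. It uses the usual textbook definitions (degree matrix minus adjacency; adjacency minus the expected edge weights given the vertex degrees). Whether the function applies exactly these conventions or normalizations internally is an assumption, and the helper names are hypothetical.

```r
# Hypothetical helpers for illustration only; not part of the package API.
# A is the symmetric adjacency matrix of an undirected graph without self-loops.
laplacian_matrix <- function(A) {
  # Degree matrix minus adjacency matrix
  diag(rowSums(A), nrow = nrow(A)) - A
}

modularity_matrix <- function(A) {
  deg <- rowSums(A)
  m2 <- sum(deg)  # twice the number of edges
  # Observed adjacency minus expected adjacency given the degrees
  A - tcrossprod(deg) / m2
}

# Path graph on three vertices: 1 - 2 - 3
A <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)
L <- laplacian_matrix(A)
M <- modularity_matrix(A)

# With these definitions, every row of both matrices sums to zero
rowSums(L)
rowSums(M)
```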

distance

A string specifying the chosen distance for calculating the test statistic, among: "hamming", "frobenius", "spectral" and "root-euclidean". Defaults to "frobenius".
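
For intuition about the default choice, the Frobenius distance between two networks can be sketched in base R as the square root of the sum of squared entry-wise differences between their matrix representations. The helper name below is hypothetical, and the snippet only illustrates the standard definition, not the package's own implementation.

```r
# Hypothetical helper for illustration only; not part of the package API.
frobenius_distance <- function(X, Y) {
  stopifnot(identical(dim(X), dim(Y)))
  sqrt(sum((X - Y)^2))
}

# Two graphs on three vertices that differ by one moved edge
A1 <- matrix(c(0, 1, 1,
               1, 0, 0,
               1, 0, 0), nrow = 3, byrow = TRUE)
A2 <- matrix(c(0, 1, 0,
               1, 0, 1,
               0, 1, 0), nrow = 3, byrow = TRUE)
frobenius_distance(A1, A2)
#> [1] 2
```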

stats

A character vector specifying the chosen test statistic(s), among: "original_edge_count", "generalized_edge_count", "weighted_edge_count", "student_euclidean", "welch_euclidean" or any statistics based on inter-point distances available in the flipr package: "flipr:student_ip", "flipr:fisher_ip", "flipr:bg_ip", "flipr:energy_ip", "flipr:cq_ip". Defaults to c("flipr:t_ip", "flipr:f_ip"), as shown in the usage above.

B

The number of permutations or the tolerance. If this number is lower than 1, it is interpreted as a tolerance. Otherwise, it is interpreted as the number of required permutations. Defaults to 1000L.

test

A character string specifying the formula to be used to compute the permutation p-value. Choices are "estimate", "upper_bound" and "exact". Defaults to "exact", which provides exact tests.
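
To make the difference between the first two choices concrete, here is a minimal base R sketch of the formulas commonly associated with these names: the plain proportion of permuted statistics at least as extreme as the observed one, and the same proportion with one added to both numerator and denominator. The helper name is hypothetical, larger statistic values are assumed to be more extreme, and the correspondence with the package's internal formulas is an assumption. The "exact" formula is more involved and is not reproduced here.

```r
# Hypothetical helper for illustration only; not part of the package API.
# t_obs:  statistic computed on the original two samples
# t_perm: statistics computed on B sampled permutations
# Assumes larger values of the statistic are more extreme.
permutation_pvalue <- function(t_obs, t_perm,
                               type = c("estimate", "upper_bound")) {
  type <- match.arg(type)
  B <- length(t_perm)
  b <- sum(t_perm >= t_obs)  # permuted statistics at least as extreme
  switch(type,
    estimate = b / B,
    upper_bound = (b + 1) / (B + 1)
  )
}

t_perm <- c(0.2, 1.5, 0.7, 2.1, 0.9, 0.4, 1.1, 0.3, 0.6)
permutation_pvalue(1.0, t_perm, "estimate")
#> [1] 0.3333333
permutation_pvalue(1.0, t_perm, "upper_bound")
#> [1] 0.4
```

The upper bound can never be zero, which is why it is often preferred over the plain estimate when only a sample of all possible permutations is used.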

k

An integer specifying the density of the minimum spanning tree used for the edge count statistics. Defaults to 5L.

seed

An integer specifying the seed of the random number generator, for reproducible results. Defaults to NULL, in which case a message is emitted and `seed = 1234` is used, as the examples below show.

Value

A list with three components: the value of the statistic for the original two samples, the p-value of the resulting permutation test and a numeric vector storing the values of the permuted statistics.

Examples

n <- 10L
gnp_params <- list(p = 1/3)
k_regular_params <- list(k = 8L)

# Two different models for the two populations
x <- nvd(model = "gnp", n = n, model_params = gnp_params)
y <- nvd(model = "k_regular", n = n, model_params = k_regular_params)
t1 <- test2_global(x, y, representation = "modularity")
#> ! Setting the seed for sampling permutations is mandatory for obtaining a continuous p-value function. Using `seed = 1234`.
t1$pvalue
#> [1] 0.0009962984

# Same model for the two populations
x <- nvd(model = "gnp", n = n, model_params = gnp_params)
y <- nvd(model = "gnp", n = n, model_params = gnp_params)
t2 <- test2_global(x, y, representation = "modularity")
#> ! Setting the seed for sampling permutations is mandatory for obtaining a continuous p-value function. Using `seed = 1234`.
t2$pvalue
#> [1] 0.9960013