Local Two-Sample Test for Network-Valued Data
Usage
test2_local(
  x,
  y,
  partition,
  representation = "adjacency",
  distance = "frobenius",
  stats = c("flipr:t_ip", "flipr:f_ip"),
  B = 1000L,
  alpha = 0.05,
  test = "exact",
  k = 5L,
  seed = NULL,
  verbose = FALSE
)
Arguments
- x
Either an object of class nvd listing networks in sample 1 or a distance matrix of size \(n_1 + n_2\).
- y
One of: an object of class nvd listing networks in sample 2, an integer value specifying the size of sample 1, or an integer vector specifying the indices of the observations belonging to sample 1.
- partition
Either a list or an integer vector specifying vertex memberships into partition elements.
- representation
A string specifying the desired type of representation, among: "adjacency", "laplacian" and "modularity". Defaults to "adjacency".
- distance
A string specifying the chosen distance for calculating the test statistic, among: "hamming", "frobenius", "spectral" and "root-euclidean". Defaults to "frobenius".
- stats
A character vector specifying the chosen test statistic(s), among: "original_edge_count", "generalized_edge_count", "weighted_edge_count", "student_euclidean", "welch_euclidean" or any statistics based on inter-point distances available in the flipr package: "flipr:student_ip", "flipr:fisher_ip", "flipr:bg_ip", "flipr:energy_ip", "flipr:cq_ip". Defaults to c("flipr:t_ip", "flipr:f_ip"), matching the function signature.
- B
The number of permutations or the tolerance. If this number is lower than 1, it is interpreted as a tolerance. Otherwise, it is interpreted as the number of required permutations. Defaults to 1000L.
- alpha
Significance level for hypothesis testing. If set to 1, the function outputs properly adjusted p-values. If lower than 1, then only p-values lower than alpha are properly adjusted. Defaults to 0.05.
- test
A character string specifying the formula to be used to compute the permutation p-value. Choices are "estimate", "upper_bound" and "exact". Defaults to "exact", which provides exact tests.
- k
An integer specifying the density of the minimum spanning tree used for the edge count statistics. Defaults to 5L.
- seed
An integer specifying the seed of the random generator, for reproducible results. Defaults to NULL.
- verbose
A boolean specifying whether information on intermediate tests should be printed in the process. Defaults to FALSE.
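The partition argument accepts either an integer vector or a list. The base R sketch below shows how the two forms could correspond, assuming the list form holds, for each partition element, the indices of the vertices it contains. That reading is an assumption, and the names membership and partition_list are hypothetical.

```r
# Integer-vector form: one membership label per vertex
# (68 vertices split into 4 blocks of 17).
membership <- as.integer(rep(1:4, each = 17))

# Assumed list form: one entry per partition element,
# each entry holding the indices of its vertices.
partition_list <- split(seq_along(membership), membership)

# Converting back from the list form to the integer-vector form.
back <- integer(length(membership))
for (k in seq_along(partition_list)) {
  back[partition_list[[k]]] <- k
}

# Both forms describe the same partition.
stopifnot(identical(back, membership))
```

The integer-vector form is the more compact of the two when blocks are contiguous.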
Value
A length-2 list reporting the adjusted p-values of each element of the partition for the intra- and inter-tests.
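To make the shape of the returned list concrete, the following base R sketch enumerates the tests for a partition with four elements: one intra-test per element and one inter-test per unordered pair of elements. The labels P1 to P4 follow the printed example output. The data frames here are hypothetical placeholders that only mirror the layout, not objects produced by test2_local().

```r
# Labels of the partition elements, as they appear in the example output.
elements <- paste0("P", 1:4)

# One intra-test per partition element.
intra <- data.frame(E = elements, stringsAsFactors = FALSE)

# One inter-test per unordered pair of distinct elements.
pairs <- t(combn(elements, 2))
inter <- data.frame(E1 = pairs[, 1], E2 = pairs[, 2],
                    stringsAsFactors = FALSE)

n_intra <- nrow(intra)  # number of elements
n_inter <- nrow(inter)  # choose(number of elements, 2)
```

With four elements this gives 4 intra-tests and choose(4, 2) = 6 inter-tests.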
Examples
n <- 5L
p1 <- matrix(
  data = c(0.1, 0.4, 0.1, 0.4,
           0.4, 0.4, 0.1, 0.4,
           0.1, 0.1, 0.4, 0.4,
           0.4, 0.4, 0.4, 0.4),
  nrow = 4,
  ncol = 4,
  byrow = TRUE
)
p2 <- matrix(
  data = c(0.1, 0.4, 0.4, 0.4,
           0.4, 0.4, 0.4, 0.4,
           0.4, 0.4, 0.1, 0.1,
           0.4, 0.4, 0.1, 0.4),
  nrow = 4,
  ncol = 4,
  byrow = TRUE
)
sim <- sample2_sbm(n, 68, p1, c(17, 17, 17, 17), p2, seed = 1234)
m <- as.integer(c(rep(1, 17), rep(2, 17), rep(3, 17), rep(4, 17)))
test2_local(sim$x, sim$y, m,
            seed = 1234,
            alpha = 0.05,
            B = 19)
#> $intra
#> # A tibble: 4 × 3
#> E pvalue truncated
#> <chr> <dbl> <lgl>
#> 1 P1 0.548 TRUE
#> 2 P2 0.548 TRUE
#> 3 P3 0.0480 FALSE
#> 4 P4 0.548 TRUE
#>
#> $inter
#> # A tibble: 6 × 4
#> E1 E2 pvalue truncated
#> <chr> <chr> <dbl> <lgl>
#> 1 P1 P2 0.548 TRUE
#> 2 P1 P3 0.0480 FALSE
#> 3 P1 P4 0.548 TRUE
#> 4 P2 P3 0.0480 FALSE
#> 5 P2 P4 0.548 TRUE
#> 6 P3 P4 0.0480 FALSE
#>
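One possible way to read this result is to keep only the tests whose adjusted p-value falls below alpha. The sketch below does this in base R on data frames copied by hand from the output above; res is a hypothetical name for the returned list. It also assumes that truncated = TRUE marks p-values that were not fully adjusted because they exceed alpha, which is consistent with the description of alpha but not stated explicitly.

```r
# Hand-copied from the printed output; not produced by running the package.
res <- list(
  intra = data.frame(
    E = c("P1", "P2", "P3", "P4"),
    pvalue = c(0.548, 0.548, 0.0480, 0.548),
    truncated = c(TRUE, TRUE, FALSE, TRUE),
    stringsAsFactors = FALSE
  ),
  inter = data.frame(
    E1 = c("P1", "P1", "P1", "P2", "P2", "P3"),
    E2 = c("P2", "P3", "P4", "P3", "P4", "P4"),
    pvalue = c(0.548, 0.0480, 0.548, 0.0480, 0.548, 0.0480),
    truncated = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
    stringsAsFactors = FALSE
  )
)

alpha <- 0.05

# Keep the tests that are significant at level alpha.
sig_intra <- res$intra[res$intra$pvalue < alpha, ]
sig_inter <- res$inter[res$inter$pvalue < alpha, ]

print(sig_intra)
print(sig_inter)
```

In this example, the only significant intra-test is for P3, and every significant inter-test involves P3.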