Local Two-Sample Test for Network-Valued Data

Usage

test2_local(
  x,
  y,
  partition,
  representation = "adjacency",
  distance = "frobenius",
  stats = c("flipr:t_ip", "flipr:f_ip"),
  B = 1000L,
  alpha = 0.05,
  test = "exact",
  k = 5L,
  seed = NULL,
  verbose = FALSE
)

Arguments

x

Either an object of class nvd listing the networks in sample 1, or a distance matrix of size \(n_1 + n_2\), where \(n_1\) and \(n_2\) are the sizes of samples 1 and 2.

y

Either an object of class nvd listing the networks in sample 2, an integer value specifying the size of sample 1, or an integer vector specifying the indices of the observations belonging to sample 1. The latter two forms apply when x is a distance matrix.

partition

Either a list or an integer vector specifying vertex memberships into partition elements.
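
As an illustration of the two accepted encodings, the sketch below builds a membership vector for 68 vertices split into four blocks of 17 and converts it to a list using base R only. That the list form holds one vector of vertex indices per partition element is an assumption made here for illustration; the names P1 to P4 simply mirror the labels shown in the example output.

```r
# Integer-vector encoding: entry i is the partition element of vertex i.
membership <- rep(1:4, each = 17L)

# List encoding (assumed layout): one vector of vertex indices per element.
partition_list <- split(seq_along(membership), membership)
names(partition_list) <- paste0("P", names(partition_list))

# Both encodings describe the same 68 vertices, 17 per element.
lengths(partition_list)
```

Either object could then be supplied as the partition argument, subject to the assumption above about the list layout.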

representation

A string specifying the desired type of representation, among: "adjacency", "laplacian" and "modularity". Defaults to "adjacency".

distance

A string specifying the chosen distance for calculating the test statistic, among: "hamming", "frobenius", "spectral" and "root-euclidean". Defaults to "frobenius".

stats

A character vector specifying the chosen test statistic(s), among: "original_edge_count", "generalized_edge_count", "weighted_edge_count", "student_euclidean", "welch_euclidean" or any statistics based on inter-point distances available in the flipr package: "flipr:student_ip", "flipr:fisher_ip", "flipr:bg_ip", "flipr:energy_ip", "flipr:cq_ip". Defaults to c("flipr:t_ip", "flipr:f_ip"), as shown in the usage above.

B

The number of permutations or the tolerance. If this number is lower than 1, it is interpreted as a tolerance. Otherwise, it is interpreted as the number of required permutations. Defaults to 1000L.

alpha

Significance level for hypothesis testing. If set to 1, the function outputs properly adjusted p-values. If lower than 1, then only p-values lower than alpha are properly adjusted. Defaults to 0.05.

test

A character string specifying the formula to be used to compute the permutation p-value. Choices are "estimate", "upper_bound" and "exact". Defaults to "exact" which provides exact tests.
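
To make the difference between the formulas concrete, the sketch below computes the two simplest ones from a vector of permuted statistics. It assumes the conventions commonly used for permutation tests: "estimate" as the proportion b / B of permuted statistics at least as large as the observed one, and "upper_bound" as (b + 1) / (B + 1). The helper perm_pvalue() is hypothetical and not part of the package, and the "exact" formula is not reproduced here.

```r
# Hypothetical helper, not part of the package: permutation p-value under
# the two assumed conventions described above.
perm_pvalue <- function(t_obs, t_perm, type = c("estimate", "upper_bound")) {
  type <- match.arg(type)
  B <- length(t_perm)        # number of permutations
  b <- sum(t_perm >= t_obs)  # permuted statistics at least as large as observed
  switch(type,
    estimate = b / B,
    upper_bound = (b + 1) / (B + 1)
  )
}

# Toy permuted statistics, made up for illustration.
t_perm <- c(0.2, 1.5, 0.7, 2.1, 0.9)
perm_pvalue(1.8, t_perm, "estimate")     # b = 1, B = 5
perm_pvalue(1.8, t_perm, "upper_bound")  # (1 + 1) / (5 + 1)
```

Under these conventions the upper bound is never zero: with B permutations it cannot fall below 1 / (B + 1), whereas the plain estimate can.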

k

An integer specifying the density of the minimum spanning tree used for the edge count statistics. Defaults to 5L.

seed

An integer for specifying the seed of the random generator for result reproducibility. Defaults to NULL.

verbose

A boolean specifying whether information on intermediate tests should be printed in the process. Defaults to FALSE.

Value

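A list of length 2 with elements intra and inter. The intra element reports the adjusted p-value of the intra-test for each element of the partition; the inter element reports the adjusted p-value of the inter-test for each pair of partition elements.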

Examples

n <- 5L
p1 <- matrix(
  data = c(0.1, 0.4, 0.1, 0.4,
           0.4, 0.4, 0.1, 0.4,
           0.1, 0.1, 0.4, 0.4,
           0.4, 0.4, 0.4, 0.4),
  nrow = 4,
  ncol = 4,
  byrow = TRUE
)
p2 <- matrix(
  data = c(0.1, 0.4, 0.4, 0.4,
           0.4, 0.4, 0.4, 0.4,
           0.4, 0.4, 0.1, 0.1,
           0.4, 0.4, 0.1, 0.4),
  nrow = 4,
  ncol = 4,
  byrow = TRUE
)
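# Simulate two samples of n networks on 68 vertices from stochastic block
# models with four blocks of 17 vertices each, using p1 and p2 respectively.
sim <- sample2_sbm(n, 68, p1, c(17, 17, 17, 17), p2, seed = 1234)
# Vertex memberships: vertices 1-17 in element 1, 18-34 in element 2, etc.
m <- as.integer(c(rep(1, 17), rep(2, 17), rep(3, 17), rep(4, 17)))
# Run the local test with a small number of permutations.
test2_local(sim$x, sim$y, m,
            seed = 1234,
            alpha = 0.05,
            B = 19)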
#> $intra
#> # A tibble: 4 × 3
#>   E     pvalue truncated
#>   <chr>  <dbl> <lgl>    
#> 1 P1    0.548  TRUE     
#> 2 P2    0.548  TRUE     
#> 3 P3    0.0480 FALSE    
#> 4 P4    0.548  TRUE     
#> 
#> $inter
#> # A tibble: 6 × 4
#>   E1    E2    pvalue truncated
#>   <chr> <chr>  <dbl> <lgl>    
#> 1 P1    P2    0.548  TRUE     
#> 2 P1    P3    0.0480 FALSE    
#> 3 P1    P4    0.548  TRUE     
#> 4 P2    P3    0.0480 FALSE    
#> 5 P2    P4    0.548  TRUE     
#> 6 P3    P4    0.0480 FALSE    
#>
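
As a sketch of how the returned tables might be read, the code below rebuilds the $intra table from the output above as a plain data frame (standing in for the tibble) and keeps the partition elements whose adjusted p-value falls below alpha. In an actual session the table would be taken from the list returned by test2_local() rather than typed in by hand.

```r
# Stand-in for the $intra table, with values copied from the output above.
intra <- data.frame(
  E = c("P1", "P2", "P3", "P4"),
  pvalue = c(0.548, 0.548, 0.0480, 0.548),
  truncated = c(TRUE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

alpha <- 0.05

# Partition elements with an adjusted p-value below the significance level.
significant <- intra$E[intra$pvalue < alpha]
significant
```

In this output every row with truncated equal to TRUE has a p-value above alpha, which is consistent with the description of the alpha argument: only p-values lower than alpha are properly adjusted.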