r - Efficient way to analyse neighbours of subsets of nodes in a large graph
I have a graph of 6 million nodes, such as:
    require(igraph)
    # graph of 1000 nodes
    g <- ba.game(1000)
with the following 4 attributes defined for each node:
    # attributes
    V(g)$attribute1 <- V(g) %in% sample(V(g), 20)
    V(g)$attribute2 <- V(g) %in% sample(V(g), 20)
    V(g)$attribute3 <- V(g) %in% sample(V(g), 20)
    V(g)$attribute4 <- V(g) %in% sample(V(g), 20)
Among the nodes I have a subset of 12,000 that are of particular interest:
    # subset of 100 nodes
    V(g)$subset <- V(g) %in% sample(V(g), 100)
What I want to obtain is an analysis (count) of the neighbourhood of the subset. That is, I want to define
    V(g)$neigh.attr1 <- rep(NA, vcount(g))
    V(g)$neigh.attr2 <- rep(NA, vcount(g))
    V(g)$neigh.attr3 <- rep(NA, vcount(g))
    V(g)$neigh.attr4 <- rep(NA, vcount(g))
such that NA is replaced, for every node in the subset, by the corresponding count of neighbouring nodes with V(g)$attribute{1..4}==TRUE.
I can create a list of the neighbourhoods of interest with
    neighbours <- neighborhood(g, order = 1, V(g)[V(g)$subset == TRUE], mode = "out")
but I can't think of an efficient way to iterate over every element of neighbours and compute the statistics for each of the 4 attributes. Indeed, the only way I've come up with is a loop which, given the size of the original graph, takes too long:
    subset_indices <- as.numeric(V(g)[V(g)$subset == TRUE])
    # for each subset node, count its neighbours that have each attribute
    for (i in 1:length(neighbours)) {
      V(g)$neigh.attr1[subset_indices[i]] <- sum(V(g)$attribute1[neighbours[[i]]])
      V(g)$neigh.attr2[subset_indices[i]] <- sum(V(g)$attribute2[neighbours[[i]]])
      V(g)$neigh.attr3[subset_indices[i]] <- sum(V(g)$attribute3[neighbours[[i]]])
      V(g)$neigh.attr4[subset_indices[i]] <- sum(V(g)$attribute4[neighbours[[i]]])
    }
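One possible direction, given only as a sketch and not a benchmarked answer: the loop reads and writes vertex attributes through V(g)$... on every iteration, so it may help to copy the attributes into a plain matrix once, do the counting outside the graph, and write each result column back in a single assignment. The example below assumes a recent igraph, where sample_pa() and ego() are the current names for ba.game() and neighborhood(). The helper names (attr_names, subset_idx, nb_ids, counts) are made up for illustration.

```r
library(igraph)

set.seed(42)
g <- sample_pa(1000)                      # toy graph, as in the question
n <- vcount(g)
attr_names <- paste0("attribute", 1:4)    # hypothetical helper name
for (a in attr_names) {
  g <- set_vertex_attr(g, a, value = seq_len(n) %in% sample(n, 20))
}
V(g)$subset <- seq_len(n) %in% sample(n, 100)

subset_idx <- which(V(g)$subset)

# Order-1 out-neighbourhoods (each one includes the node itself),
# converted to plain integer vertex ids.
neighbours <- ego(g, order = 1, nodes = subset_idx, mode = "out")
nb_ids <- lapply(neighbours, as.integer)

# Pull the four logical attributes out of the graph once: an n x 4 matrix.
attrs <- sapply(attr_names, function(a) vertex_attr(g, a))

# Flatten all neighbourhoods, remember which subset node each id belongs to,
# and sum the attribute rows per subset node in one call.
ids <- unlist(nb_ids)
grp <- rep(seq_along(nb_ids), lengths(nb_ids))
counts <- rowsum(attrs[ids, , drop = FALSE] * 1, grp)

# Write each result column back in one assignment; NA outside the subset.
for (k in seq_along(attr_names)) {
  vals <- rep(NA_real_, n)
  vals[subset_idx] <- counts[, k]
  g <- set_vertex_attr(g, paste0("neigh.attr", k), value = vals)
}
```

Because ego() with order = 1 includes the node itself, the counts follow the same definition as the loop above. Whether this is fast enough for 6 million nodes and a 12,000-node subset would need to be measured on the real data.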