ergm
functions such as ergm
and simulate
(for ERGMs) may operate in two modes: binary and weighted/valued, with the latter activated by passing a non-NULL value as the response
argument, giving the edge attribute name to be modeled/simulated.
Binary ERGM statistics cannot be used in valued mode and vice versa. However, a substantial number of binary ERGM statistics, particularly the ones with dyadic independence, have simple generalizations to valued ERGMs, and have been adapted in ergm
. They have the same form as their binary ERGM counterparts, with an additional argument: form
, which, at this time, has two possible values: “sum”
(the default) and “nonzero”
. The former creates a statistic of the form ∑_{i,j} x_{i,j} y_{i,j}, where y_{i,j} is the value of dyad (i,j) and x_{i,j} is the term’s covariate associated with it. The latter computes the binary version, with the edge considered to be present if its value is not 0.
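The two forms can be illustrated with a minimal sketch in plain Python. This is not the ergm implementation: the function name valued_statistic and the dictionary-based dyad representation are assumptions made only for illustration.

```python
def valued_statistic(values, covariate, form="sum"):
    """Illustrative only. `values` maps each dyad (i, j) to y_{i,j};
    `covariate` maps each dyad (i, j) to x_{i,j}."""
    if form == "sum":
        # Sum of x_{i,j} * y_{i,j} over all dyads.
        return sum(covariate[d] * y for d, y in values.items())
    if form == "nonzero":
        # Binary version: a dyad counts as an edge when its value is not 0.
        return sum(covariate[d] for d, y in values.items() if y != 0)
    raise ValueError("form must be 'sum' or 'nonzero'")


# Hypothetical toy data: three dyads, one of them with value 0.
y = {(1, 2): 3, (1, 3): 0, (2, 3): 2}
x = {(1, 2): 0.5, (1, 3): 1.0, (2, 3): 2.0}

stat_sum = valued_statistic(y, x, form="sum")          # 0.5*3 + 1.0*0 + 2.0*2
stat_nonzero = valued_statistic(y, x, form="nonzero")  # 0.5 + 2.0
```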
Valued versions of some binary ERGM terms have an argument threshold
, which sets the value above which a dyad is considered to have a tie. (A value less than or equal to threshold
is considered a nontie.)
Some terms taking nodal or dyadic covariates take optional transform
and transformname
arguments. transform
should be a function with one argument, taking a data structure of the same mode as the covariate and returning a similarly structured data structure, transforming the covariate as needed.
For example, nodecov("a", transform=function(x) x^2)
will add a nodal covariate having the square of the value of the nodal attribute “a”
.
transformname
, if given, will be added to the term’s name to help identify it.
ergm
package
A cross-referenced HTML version of the term documentation is available via vignette('ergm-term-crossRef')
and terms can also be searched via search.ergmTerms
.
absdiff(attrname, pow=1)
(binary) (dyad-independent) (frequently-used) (directed) (undirected) (quantitative nodal attribute), absdiff(attrname, pow=1, form="sum")
(valued) (dyad-independent) (directed) (undirected) (quantitative nodal attribute)
Absolute difference: The attrname
argument is a character string giving the name of a quantitative attribute in the network’s vertex attribute list. This term adds one network statistic to the model equaling the sum of abs(attrname[i]-attrname[j])^pow
for all edges (i,j) in the network.
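As a rough illustration of the binary absdiff statistic, the following plain Python sketch computes the sum directly. It is not ergm code; the function name, the edge-list representation, and the argument name power (standing in for pow) are assumptions.

```python
def absdiff(edges, attr, power=1):
    """Sum of |attr[i] - attr[j]|**power over all edges (i, j)."""
    return sum(abs(attr[i] - attr[j]) ** power for i, j in edges)


# Hypothetical toy network: three nodes with a quantitative attribute.
grade = {1: 9, 2: 10, 3: 12}
edges = [(1, 2), (2, 3)]

stat_linear = absdiff(edges, grade)          # |9-10| + |10-12|
stat_squared = absdiff(edges, grade, power=2)  # 1**2 + 2**2
```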
absdiffcat(attrname, base=NULL)
(binary) (dyad-independent) (directed) (undirected) (categorical nodal attribute), absdiffcat(attrname, base=NULL, form="sum")
(valued) (dyad-independent) (directed) (undirected) (categorical nodal attribute)
Categorical absolute difference: The attrname
argument is a character string giving the name of a quantitative attribute in the network’s vertex attribute list. This term adds one statistic for every possible nonzero distinct value of abs(attrname[i]-attrname[j])
in the network; the value of each such statistic is the number of edges in the network with the corresponding absolute difference. The optional base
argument is a vector indicating which nonzero differences, in order from smallest to largest, should be omitted from the model (i.e., treated like the zero-difference category). The base
argument, if used, should contain indices, not differences themselves. For instance, if the possible values of abs(attrname[i]-attrname[j])
are 0, 0.5, 3, 3.5, and 10, then to omit 0.5 and 10 one should set base=c(1, 4)
. Note that this term should generally be used only when the quantitative attribute has a limited number of possible values; an example is the “Grade”
attribute of the faux.mesa.high
or faux.magnolia.high
datasets.
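The indexing rule for base can be sketched as follows. This is a plain Python illustration, not ergm code; the helper name kept_differences is hypothetical, and indices are 1-based to mirror the R convention used above.

```python
def kept_differences(possible, base=None):
    """Return the nonzero differences that still get a statistic.

    `base` holds 1-based positions into the sorted nonzero differences,
    not the difference values themselves.
    """
    nonzero = sorted(d for d in set(possible) if d != 0)
    omitted = {nonzero[k - 1] for k in (base or [])}
    return [d for d in nonzero if d not in omitted]


# The example from the text: omit 0.5 (position 1) and 10 (position 4).
remaining = kept_differences([0, 0.5, 3, 3.5, 10], base=[1, 4])
```

With no base given, all four nonzero differences would each receive a statistic.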
altkstar(lambda, fixed=FALSE)
(binary) (undirected) (curved) (categorical nodal attribute)
Alternating k-star: This term adds one network statistic to the model equal to a weighted alternating sequence of k-star statistics with weight parameter lambda
. This is the version given in Snijders et al. (2006). The gwdegree
and altkstar
produce mathematically equivalent models, as long as they are used together with the edges
(or kstar(1)
) term, yet the interpretation of the gwdegree
parameters is slightly more straightforward than the interpretation of the altkstar
parameters. For this reason, we recommend the use of the gwdegree
instead of altkstar
. See Section 3 and especially equation (13) of Hunter (2007) for details. The optional argument fixed
indicates whether the scale parameter lambda
is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE
, which means the scale parameter is not fixed and thus the model is a CEF model. This term can only be used with undirected networks.
asymmetric(attrname=NULL, diff=FALSE, keep=NULL)
(binary) (directed) (dyad-independent) (triad-related)
Asymmetric dyads: This term adds one network statistic to the model equal to the number of pairs of actors for which exactly one of (i,j) or (j,i) exists. This term can only be used with directed networks. If the optional attrname
argument is used, only asymmetric pairs that match on the named vertex attribute are counted. The optional modifiers diff
and keep
are used in the same way as for the nodematch
term; refer to this term for details and an example.
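A minimal sketch of the asymmetric count without the optional arguments, in plain Python rather than ergm code. The function name and the representation of a directed network as a list of (i, j) pairs are assumptions.

```python
def asymmetric(edges):
    """Count pairs of actors with exactly one of (i, j) or (j, i) present."""
    edge_set = set(edges)
    # Each asymmetric pair contributes exactly one directed edge whose
    # reverse is absent, so every pair is counted once.
    return sum(1 for (i, j) in edge_set if (j, i) not in edge_set)


# Hypothetical directed network: 1<->2 is mutual, 1->3 and 3->2 are asymmetric.
directed_edges = [(1, 2), (2, 1), (1, 3), (3, 2)]
n_asymmetric = asymmetric(directed_edges)
```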
atleast(threshold=0)
(valued) (directed) (undirected)
Number of ties with values greater than or equal to a threshold: Adds one statistic equal to the number of ties whose values equal or exceed threshold
.
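The following plain Python sketch illustrates the count. It is not ergm code; the function name and the dictionary of tie values (holding only dyads that have a tie) are assumptions.

```python
def atleast(tie_values, threshold=0):
    """Number of ties whose value is greater than or equal to `threshold`."""
    return sum(1 for y in tie_values.values() if y >= threshold)


# Hypothetical valued network with three ties.
tie_values = {(1, 2): 3, (1, 3): 1, (2, 3): 5}
n_strong = atleast(tie_values, threshold=3)  # ties valued 3 and 5 qualify
```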
b1concurrent(by=NULL)
(binary) (bipartite) (undirected) (categorical nodal attribute)
Concurrent node count for the first mode in a bipartite (aka two-mode) network: This term adds one network statistic to the model, equal to the number of nodes in the first mode of the network with degree 2 or higher. The first mode of a bipartite network object is sometimes known as the “actor” mode. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list; it functions just like the by
argument of the b1degree
term. Without the optional argument, this statistic is equivalent to b1mindegree(2)
. This term can only be used with undirected bipartite networks.
b1degrange(from, to=+Inf, by=NULL, homophily=FALSE)
(binary) (bipartite) (undirected)
Degree range for the first mode in a bipartite (a.k.a. two-mode) network: The from
and to
arguments are vectors of distinct integers (or +Inf
, for to
(its default)). If one of the vectors has length 1, it is recycled to the length of the other. Otherwise, they must have the same length. This term adds one network statistic to the model for each element of from
(or to
); the ith such statistic equals the number of nodes of the first mode (“actors”) in the network of degree greater than or equal to from[i]
but strictly less than to[i]
, i.e. with edge count in semiopen interval [from,to)
. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified and homophily
is TRUE
, then degrees are calculated using the subnetwork consisting of only edges whose endpoints have the same value of the by
attribute. If by
is specified and homophily
is FALSE
(the default), then separate degree range statistics are calculated for nodes having each separate value of the attribute.
This term can only be used with bipartite networks; for directed networks see idegrange
and odegrange
. For undirected networks, see degrange
, and see b2degrange
for degrees of the second mode (“events”).
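The semiopen interval can be illustrated with a short plain Python sketch for a single (from, to) pair and no by attribute. It is not ergm code; the names are assumptions, and lower and upper stand in for from and to because from is a reserved word in Python.

```python
import math
from collections import Counter


def b1degrange(edges, b1_nodes, lower, upper=math.inf):
    """Count first-mode nodes whose degree d satisfies lower <= d < upper.

    `edges` is a list of (actor, event) pairs.
    """
    degree = Counter(actor for actor, _ in edges)
    return sum(1 for node in b1_nodes if lower <= degree[node] < upper)


# Hypothetical bipartite network: a1 has degree 3, a2 degree 1, a3 degree 0.
actors = ["a1", "a2", "a3"]
bip_edges = [("a1", "e1"), ("a1", "e2"), ("a1", "e3"), ("a2", "e1")]

in_1_to_3 = b1degrange(bip_edges, actors, 1, 3)  # a1 is excluded: 3 is not < 3
at_least_1 = b1degrange(bip_edges, actors, 1)    # upper defaults to +Inf
```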
b1degree(d, by=NULL)
(binary) (bipartite) (undirected) (categorical nodal attribute) (frequently-used)
Degree for the first mode in a bipartite (aka two-mode) network: The d
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d
; the ith such statistic equals the number of nodes of degree d[i]
in the first mode of a bipartite network, i.e. with exactly d[i]
edges. The first mode of a bipartite network object is sometimes known as the “actor” mode. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified then each node’s degree is tabulated only with other nodes having the same value of the by
attribute. This term can only be used with undirected bipartite networks.
b1factor(attrname, base=1)
(binary) (bipartite) (undirected) (dyad-independent) (frequently-used) (categorical nodal attribute)
Factor attribute effect for the first mode in a bipartite (aka two-mode) network: The attrname
argument is a character string giving the name of a categorical attribute in the network’s vertex attribute list. This term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname
attribute. Each of these statistics gives the number of times a node with that attribute in the first mode of the network appears in an edge. The first mode of a bipartite network object is sometimes known as the “actor” mode. To include all attribute values is usually not a good idea, because the sum of all such statistics equals the number of edges and hence a linear dependency would arise in any model also including edges
. Thus, the base
argument tells which value(s) (numbered in order according to the sort
function) should be omitted. The default value, base=1
, means that the smallest (i.e., first in sorted order) attribute value is omitted. For example, if the “fruit” factor has levels “orange”, “apple”, “banana”, and “pear”, then to add just two terms, one for “apple” and one for “pear”, set “banana” and “orange” to the base (remember to sort the values first) by using b1factor("fruit", base=2:3)
. This term can only be used with undirected bipartite networks.
b1mindegree(d)
(binary) (bipartite) (undirected)
Minimum degree for the first mode in a bipartite (aka two-mode) network: The d
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d
; the ith such statistic equals the number of nodes in the first mode of a bipartite network with at least degree d[i]
. The first mode of a bipartite network object is sometimes known as the “actor” mode. This term can only be used with undirected bipartite networks.
b1nodematch(attrname, diff=FALSE, keep=NULL, by=NULL, alpha=1, beta=1, byb2attr=NULL)
(binary) (bipartite) (undirected) (dyad-independent) (categorical nodal attribute) (frequently-used)
Nodal attribute-based homophily effect for the first mode in a bipartite (aka two-mode) network: This term is introduced in Bomiriya et al. (2014). The attrname
argument is a character string giving the name of a categorical attribute in the network’s vertex attribute list. Out of the two arguments (discount parameters) alpha
and beta
, both of which take values in [0,1], only one should be set at a time. If neither is set to a value other than 1, this term will simply be a homophily-based two-star statistic. This term adds one statistic to the model unless diff
is set to TRUE
, in which case the term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname
attribute. To include only the attribute values you wish, use the keep
argument. If an alpha
discount parameter is used, each of these statistics gives the sum of the number of common second-mode nodes raised to the power alpha
for each pair of first-mode nodes with that attribute. If a beta
discount parameter is used, each of these statistics gives half the sum of the number of two-paths with two first-mode nodes with that attribute as the two ends of the two path raised to the power beta
for each edge in the network. The byb2attr
argument is a character string giving the name of a second mode categorical attribute in the network’s attribute list. Setting this argument will separate the original statistics based on the values of the set second mode attribute; for example, if diff
is FALSE
, then the sum of all the statistics for each level of this second-mode attribute will be equal to the original b1nodematch
statistic with byb2attr
set to NULL
. This term can only be used with undirected bipartite networks.
b1star(k, attrname=NULL)
(binary) (bipartite) (undirected) (categorical nodal attribute)
k-Stars for the first mode in a bipartite (aka two-mode) network: The k
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k
. The ith such statistic counts the number of distinct k[i]
-stars whose center node is in the first mode of the network. The first mode of a bipartite network object is sometimes known as the “actor” mode. A k-star is defined to be a center node N and a set of k different nodes {O_1, …, O_k} such that the ties {N, O_i} exist for i=1, …, k. The optional argument attrname
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified then the count is over the number of k-stars (with center node in the first mode) where all nodes have the same value of the attribute. This term can only be used for undirected bipartite networks. Note that b1star(1)
is equal to b2star(1)
and to edges
.
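Because a k-star is a center node together with any k of its neighbors, a first-mode node of degree d is the center of "d choose k" k-stars. The plain Python sketch below uses that identity for the case without attrname. It is not ergm code, and the function name and (actor, event) edge list are assumptions.

```python
from collections import Counter
from math import comb


def b1star(edges, k):
    """Count k-stars centered on first-mode nodes.

    `edges` is a list of (actor, event) pairs; each actor of degree d
    contributes comb(d, k) stars.
    """
    degree = Counter(actor for actor, _ in edges)
    return sum(comb(d, k) for d in degree.values())


# Hypothetical bipartite network: a1 has degree 3, a2 has degree 1.
star_edges = [("a1", "e1"), ("a1", "e2"), ("a1", "e3"), ("a2", "e1")]

one_stars = b1star(star_edges, 1)  # equals the number of edges
two_stars = b1star(star_edges, 2)  # comb(3, 2) + comb(1, 2)
```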
b1starmix(k, attrname, base=NULL, diff=TRUE)
(binary) (bipartite) (undirected) (categorical nodal attribute)
Mixing matrix for k-stars centered on the first mode of a bipartite network: Only a single value of k is allowed. This term counts all k-stars in which the b2 nodes (called events in some contexts) are homophilous in the sense that they all share the same value of attrname
. However, the b1 node (in some contexts, the actor) at the center of the k-star does NOT have to have the same value as the b2 nodes; indeed, the values taken by the b1 nodes may be completely distinct from those of the b2 nodes, which allows for the use of this term in cases where there are two separate nodal attributes, one for the b1 nodes and another for the b2 nodes (in this case, however, these two attributes should be combined to form a single nodal attribute called attrname
). A different statistic is created for each value of attrname
seen in a b1 node, even if no k-stars are observed with this value. Whether a different statistic is created for each value seen in a b2 node depends on the value of the diff
argument: When diff=TRUE
, the default, a different statistic is created for each value and thus the behavior of this term is reminiscent of the nodemix
term, from which it takes its name; when diff=FALSE
, all homophilous k-stars are counted together, though these k-stars are still categorized according to the value of the central b1 node. The base
term may be used to control which of the possible terms are left out of the model: By default, all terms are included, but if base
is set to a vector of indices then the corresponding terms (in the order they would be created when base=NULL
) are left out.
b1twostar(b1attrname, b2attrname, base=NULL)
(binary) (bipartite) (undirected) (categorical nodal attribute)
Two-star census for central nodes centered on the first mode of a bipartite network: This term takes two nodal attribute names, one for b1 nodes (actors in some contexts) and one for b2 nodes (events in some contexts). Only b1attrname
is required; if b2attrname
is not passed, it is assumed to be the same as b1attrname
. Assuming that there are n_1 values of b1attrname
among the b1 nodes and n_2 values of b2attrname
among the b2 nodes, then the total number of distinct categories of two stars according to these two attributes is n_1(n_2)(n_2+1)/2. This model term creates a distinct statistic counting each of these categories. The base
term may be used to leave some of these categories out; when passed as a vector of integer indices (in the order the statistics would be created when base=NULL
), the corresponding terms will be left out.
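The category count n_1(n_2)(n_2+1)/2 can be checked by enumeration: each category pairs one b1 value for the center with an unordered pair of b2 values (repeats allowed) for the two ends. The plain Python sketch below is illustrative only; the function names and attribute values are hypothetical, and the enumeration order is not claimed to match the order ergm uses.

```python
from itertools import combinations_with_replacement


def b1twostar_category_count(n1, n2):
    """Closed-form number of two-star categories."""
    return n1 * n2 * (n2 + 1) // 2


def b1twostar_categories(b1_values, b2_values):
    """Enumerate (center value, end value, end value) categories."""
    return [
        (center, end_a, end_b)
        for center in b1_values
        for end_a, end_b in combinations_with_replacement(b2_values, 2)
    ]


# Hypothetical attributes: 2 values among b1 nodes, 3 among b2 nodes.
categories = b1twostar_categories(["x", "y"], ["p", "q", "r"])
```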
b2concurrent(by=NULL)
(binary) (bipartite) (undirected) (frequently-used)
Concurrent node count for the second mode in a bipartite (aka two-mode) network: This term adds one network statistic to the model, equal to the number of nodes in the second mode of the network with degree 2 or higher. The second mode of a bipartite network object is sometimes known as the “event” mode. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list; it functions just like the by
argument of the b2degree
term. Without the optional argument, this statistic is equivalent to b2mindegree(2)
. This term can only be used with undirected bipartite networks.
b2degrange(from, to=+Inf, by=NULL, homophily=FALSE)
(binary) (bipartite) (undirected)
Degree range for the second mode in a bipartite (a.k.a. two-mode) network: The from
and to
arguments are vectors of distinct integers (or +Inf
, for to
(its default)). If one of the vectors has length 1, it is recycled to the length of the other. Otherwise, they must have the same length. This term adds one network statistic to the model for each element of from
(or to
); the ith such statistic equals the number of nodes of the second mode (“events”) in the network of degree greater than or equal to from[i]
but strictly less than to[i]
, i.e. with edge count in semiopen interval [from,to)
. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified and homophily
is TRUE
, then degrees are calculated using the subnetwork consisting of only edges whose endpoints have the same value of the by
attribute. If by
is specified and homophily
is FALSE
(the default), then separate degree range statistics are calculated for nodes having each separate value of the attribute.
This term can only be used with bipartite networks; for directed networks see idegrange
and odegrange
. For undirected networks, see degrange
, and see b1degrange
for degrees of the first mode (“actors”).
b2degree(d, by=NULL)
(binary) (bipartite) (undirected) (categorical nodal attribute) (frequently-used)
Degree for the second mode in a bipartite (aka two-mode) network: The d
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d
; the ith such statistic equals the number of nodes of degree d[i]
in the second mode of a bipartite network, i.e. with exactly d[i]
edges. The second mode of a bipartite network object is sometimes known as the “event” mode. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified then each node’s degree is tabulated only with other nodes having the same value of the by
attribute. This term can only be used with undirected bipartite networks.
b2factor(attrname, base=1)
(binary) (bipartite) (undirected) (dyad-independent) (categorical nodal attribute) (frequently-used)
Factor attribute effect for the second mode in a bipartite (aka two-mode) network: The attrname
argument is a character string giving the name of a categorical attribute in the network’s vertex attribute list. This term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname
attribute. Each of these statistics gives the number of times a node with that attribute in the second mode of the network appears in an edge. The second mode of a bipartite network object is sometimes known as the “event” mode. To include all attribute values is usually not a good idea, because the sum of all such statistics equals the number of edges and hence a linear dependency would arise in any model also including edges
. Thus, the base
argument tells which value(s) (numbered in order according to the sort
function) should be omitted. The default value, base=1
, means that the smallest (i.e., first in sorted order) attribute value is omitted. For example, if the “fruit” factor has levels “orange”, “apple”, “banana”, and “pear”, then to add just two terms, one for “apple” and one for “pear”, set “banana” and “orange” to the base (remember to sort the values first) by using b2factor("fruit", base=2:3)
. This term can only be used with undirected bipartite networks.
b2mindegree(d)
(binary) (bipartite) (undirected)
Minimum degree for the second mode in a bipartite (aka two-mode) network: The d
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d
; the ith such statistic equals the number of nodes in the second mode of a bipartite network with at least degree d[i]
. The second mode of a bipartite network object is sometimes known as the “event” mode. This term can only be used with undirected bipartite networks.
b2nodematch(attrname, diff=FALSE, keep=NULL, by=NULL, alpha=1, beta=1, byb1attr=NULL)
(binary) (bipartite) (undirected) (dyad-independent) (categorical nodal attribute) (frequently-used)
Nodal attribute-based homophily effect for the second mode in a bipartite (aka two-mode) network: This term is introduced in Bomiriya et al. (2014). The attrname
argument is a character string giving the name of a categorical attribute in the network’s vertex attribute list. Out of the two arguments (discount parameters) alpha
and beta
, both of which take values in [0,1], only one should be set at a time. If neither is set to a value other than 1, this term will simply be a homophily-based two-star statistic. This term adds one statistic to the model unless diff
is set to TRUE
, in which case the term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname
attribute. To include only the attribute values you wish, use the keep
argument. If an alpha
discount parameter is used, each of these statistics gives the sum of the number of common first-mode nodes raised to the power alpha
for each pair of second-mode nodes with that attribute. If a beta
discount parameter is used, each of these statistics gives half the sum of the number of two-paths with two second-mode nodes with that attribute as the two ends of the two path raised to the power beta
for each edge in the network. The byb1attr
argument is a character string giving the name of a first mode categorical attribute in the network’s attribute list. Setting this argument will separate the original statistics based on the values of the set first mode attribute; for example, if diff
is FALSE
, then the sum of all the statistics for each level of this first-mode attribute will be equal to the original b2nodematch
statistic with byb1attr
set to NULL
. This term can only be used with undirected bipartite networks.
b2star(k, attrname=NULL)
(binary) (bipartite) (undirected) (categorical nodal attribute)
k-Stars for the second mode in a bipartite (aka two-mode) network: The k
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k
. The ith such statistic counts the number of distinct k[i]
-stars whose center node is in the second mode of the network. The second mode of a bipartite network object is sometimes known as the “event” mode. A k-star is defined to be a center node N and a set of k different nodes {O_1, …, O_k} such that the ties {N, O_i} exist for i=1, …, k. The optional argument attrname
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified then the count is over the number of k-stars (with center node in the second mode) where all nodes have the same value of the attribute. This term can only be used for undirected bipartite networks. Note that b2star(1)
is equal to b1star(1)
and to edges
.
b2starmix(k, attrname, base=NULL, diff=TRUE)
(binary) (bipartite) (undirected) (categorical nodal attribute)
Mixing matrix for k-stars centered on the second mode of a bipartite network: This term is exactly the same as b1starmix
except that the roles of b1 and b2 are reversed.
b2twostar(b1attrname, b2attrname, base=NULL)
(binary) (bipartite) (undirected) (categorical nodal attribute)
Two-star census for central nodes centered on the second mode of a bipartite network: This term is exactly the same as b1twostar
except that the roles of b1 and b2 are reversed.
balance
(binary) (triad-related) (directed) (undirected)
Balanced triads: This term adds one network statistic to the model equal to the number of triads in the network that are balanced. The balanced triads are those of type 102
or 300
in the categorization of Davis and Leinhardt (1972). For details on the 16 possible triad types, see ?triad.classify
in the {sna}
package. For an undirected network, the balanced triads are those with an even number of ties (i.e., 0 and 2).
coincidence(d=NULL,active=0)
(binary) (bipartite) (undirected)
Coincident node count for the second mode in a bipartite (aka two-mode) network: By default this term adds one network statistic to the model for each pair of nodes of mode two. It is equal to the number of (first mode) mutual partners of that pair. The first mode of a bipartite network object is sometimes known as the “actor” mode and the second as the “event” mode. So this is the number of actors going to both events in the pair. The optional argument d
is a two-column matrix of (row-wise) pair indices in which the first index of each pair is less than the second. The second optional argument, active
, selects pairs for which the observed count is at least active
. This term can only be used with undirected bipartite networks.
concurrent(by=NULL)
(binary) (undirected) (categorical nodal attribute)
Concurrent node count: This term adds one network statistic to the model, equal to the number of nodes in the network with degree 2 or higher. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list; it functions just like the by
argument of the degree
term. This term can only be used with undirected networks.
concurrentties(by=NULL)
(binary) (undirected) (categorical nodal attribute)
Concurrent tie count: This term adds one network statistic to the model, equal to the number of ties incident on each actor beyond the first. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list; it functions just like the by
argument of the degree
term. This term can only be used with undirected networks.
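The difference between concurrent (a count of nodes) and concurrentties (a count of ties) can be seen in a short plain Python sketch without the by argument. It is not ergm code; the function names and the undirected edge-list representation are assumptions.

```python
from collections import Counter


def degrees(edges):
    """Degree of every node that appears in an undirected edge list."""
    deg = Counter()
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


def concurrent(edges):
    """Number of nodes with degree 2 or higher."""
    return sum(1 for d in degrees(edges).values() if d >= 2)


def concurrentties(edges):
    """Number of ties incident on each node beyond the first, summed over nodes."""
    return sum(d - 1 for d in degrees(edges).values())


# Hypothetical network with degrees 3, 2, 2, 1 for nodes 1 to 4.
und_edges = [(1, 2), (1, 3), (1, 4), (2, 3)]
n_concurrent = concurrent(und_edges)           # nodes 1, 2 and 3
n_concurrent_ties = concurrentties(und_edges)  # 2 + 1 + 1 + 0
```

Isolated nodes never appear in the edge list and contribute zero to both statistics, so they can be ignored in this sketch.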
ctriple(attrname=NULL)
(binary) (directed) (triad-related) (categorical nodal attribute), a.k.a. ctriad
(binary) (directed) (triad-related) (categorical nodal attribute)
Cyclic triples: This term adds one statistic to the model, equal to the number of cyclic triples in the network, defined as a set of edges of the form {(i,j), (j,k), (k,i)}. Note that for all directed networks, triangle
is equal to ttriple+ctriple
, so at most two of these three terms can be in a model. The optional argument attrname
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified then the count is over the number of cyclic triples where all three nodes have the same value of the attribute. This term can only be used with directed networks.
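A cyclic triple can be counted once by requiring that the edge (i,j) start from the smallest-labelled node of the cycle. The plain Python sketch below illustrates the count without attrname. It is not ergm code; the function name and the directed edge-list representation are assumptions.

```python
def ctriple(edges):
    """Count cyclic triples {(i, j), (j, k), (k, i)} in a directed network."""
    edge_set = set(edges)
    nodes = {v for edge in edge_set for v in edge}
    return sum(
        1
        for (i, j) in edge_set
        for k in nodes
        # i is the smallest node of the cycle, so each cycle is found once.
        if i < j and i < k and j != k
        and (j, k) in edge_set and (k, i) in edge_set
    )


# Hypothetical network: 1->2->3->1 is a cycle; the extra edge 1->3 is not part of one.
cyc_edges = [(1, 2), (2, 3), (3, 1), (1, 3)]
n_cyclic = ctriple(cyc_edges)
```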
cycle(k)
(binary) (directed) (undirected)
Cycles: The k
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k
; the ith such statistic equals the number of cycles in the network with length exactly k[i]
. The cycle statistic applies to both directed and undirected networks. For directed networks, it counts directed cycles of length k, as opposed to undirected cycles in the undirected case. The directed cycle terms of lengths 2 and 3 are equivalent to mutual
and ctriple
(respectively). The undirected cycle term of length 3 is equivalent to triangle
, and there is no undirected cycle term of length 2.
cyclicalties(attrname=NULL)
(binary) (directed), cyclicalties(threshold=0)
(valued) (directed) (undirected)
Cyclical ties: This term adds one statistic, equal to the number of ties i->j such that there exists a two-path from i to j. (Related to the ttriple
term.) The binary version takes a nodal attribute attrname
, and, if given, all three nodes involved (i, j, and the node on the two-path) must match on this attribute in order for i->j to be counted. The binary version of this term can only be used with directed networks. The valued version can be used with both directed and undirected networks.
cyclicalweights(twopath="min",combine="max",affect="min")
(valued) (directed) (undirected)
Cyclical weights: This statistic implements the cyclical weights statistic, like that defined by Krivitsky (2012), Equation 13, but with the focus dyad being y_{j,i} rather than y_{i,j}. The currently implemented options for twopath
are the minimum of the constituent dyads (“min”
) or their geometric mean (“geomean”
); for combine
, the maximum of the 2-path strengths (“max”
) or their sum (“sum”
); and for affect
, the minimum of the focus dyad and the combined strength of the two paths (“min”
) or their geometric mean (“geomean”
). For each of these options, the first (and the default) is more stable but also more conservative, while the second is more sensitive but more likely to induce a multimodal distribution of networks.
degrange(from, to=+Inf, by=NULL, homophily=FALSE)
(binary) (undirected) (categorical nodal attribute)
Degree range: The from
and to
arguments are vectors of distinct integers (or +Inf
, for to
(its default)). If one of the vectors has length 1, it is recycled to the length of the other. Otherwise, they must have the same length. This term adds one network statistic to the model for each element of from
(or to
); the ith such statistic equals the number of nodes in the network of degree greater than or equal to from[i]
but strictly less than to[i]
, i.e. with edges in semiopen interval [from,to)
. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified and homophily
is TRUE
, then degrees are calculated using the subnetwork consisting of only edges whose endpoints have the same value of the by
attribute. If by
is specified and homophily
is FALSE
(the default), then separate degree range statistics are calculated for nodes having each separate value of the attribute.
This term can only be used with undirected networks; for directed networks see idegrange
and odegrange
. This term can be used with bipartite networks, and will count nodes of both first and second mode in the specified degree range. To count only nodes of the first mode (“actors”), use b1degrange
and to count only those of the second mode (“events”), use b2degrange
.
degree(d, by=NULL, homophily=FALSE)
(binary) (undirected) (categorical nodal attribute) (frequently-used)
Degree: The d
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d
; the ith such statistic equals the number of nodes in the network of degree d[i]
, i.e. with exactly d[i]
edges. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified and homophily
is TRUE
, then degrees are calculated using the subnetwork consisting of only edges whose endpoints have the same value of the by
attribute. If by
is specified and homophily
is FALSE
(the default), then separate degree statistics are calculated for nodes having each separate value of the attribute. This term can only be used with undirected networks; for directed networks see idegree
and odegree
.
degreepopularity
(binary) (undirected)
Degree popularity: This term adds one network statistic to the model equaling the sum over the actors of each actor’s degree taken to the 3/2 power (or, equivalently, multiplied by its square root). This term is an undirected analog to the terms of Snijders et al. (2010), equations (11) and (12). This term can only be used with undirected networks.
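As a rough illustration, the statistic is the sum of d^(3/2) over all node degrees d. The plain Python sketch below is not ergm code; the function name and the undirected edge-list representation are assumptions.

```python
from collections import Counter


def degreepopularity(edges):
    """Sum over nodes of degree ** 1.5 (degree times its square root)."""
    deg = Counter()
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return sum(d ** 1.5 for d in deg.values())


# Hypothetical network with degrees 3, 2, 2, 1.
pop_edges = [(1, 2), (1, 3), (1, 4), (2, 3)]
popularity = degreepopularity(pop_edges)  # 3**1.5 + 2 * 2**1.5 + 1**1.5
```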
degcrossprod
(binary) (undirected)
Degree Cross-Product: This term adds one network statistic equal to the mean of the cross-products of the degrees of all pairs of nodes in the network which are tied. Only coded for undirected networks.
degcor
(binary) (undirected)
Degree Correlation: This term adds one network statistic equal to the correlation of the degrees of all pairs of nodes in the network which are tied. Only coded for undirected networks.
density
(binary) (dyad-independent) (directed) (undirected)
Density: This term adds one network statistic equal to the density of the network. For undirected networks, density
equals kstar(1)
or edges
divided by n(n-1)/2; for directed networks, density
equals edges
or istar(1)
or ostar(1)
divided by n(n-1).
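As a small worked illustration of the two denominators, here is a sketch in plain Python (not ergm code; the function name is hypothetical) that divides an edge count by the number of dyads.

```python
def density(n, n_edges, directed=False):
    """Edge count divided by the number of dyads in an n-node network."""
    dyads = n * (n - 1) if directed else n * (n - 1) // 2
    return n_edges / dyads

print(density(5, 4))                  # undirected: 4 edges out of 10 dyads
print(density(5, 4, directed=True))   # directed: 4 edges out of 20 ordered pairs
```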
dsp(d)
(binary) (directed) (undirected)
Dyadwise shared partners: The d
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d
; the ith such statistic equals the number of dyads in the network with exactly d[i]
shared partners. This term can be used with directed and undirected networks. For directed networks the count is over homogeneous shared partners only (i.e., only partners on a directed two-path connecting the nodes in the dyad).
dyadcov(x, attrname=NULL)
(binary) (dyad-independent) (directed) (undirected) (categorical nodal attribute)
Dyadic covariate: If the network is directed, x
is either a (symmetric) matrix of covariates, one for each possible dyad (i,j), or an undirected network; if the latter, optional argument attrname
provides the name of the quantitative edge attribute to use for covariate values (in this case, missing edges in x
are assigned a covariate value of zero). This term adds three statistics to the model, each equal to the sum of the covariate values for all dyads occupying one of the three possible non-empty dyad states (mutual, upper-triangular asymmetric, and lower-triangular asymmetric dyads, respectively), with the empty or null state serving as a reference category. If the network is undirected, x
is either a matrix of edgewise covariates, or a network; if the latter, optional argument attrname
provides the name of the edge attribute to use for edge values. This term adds one statistic to the model, equal to the sum of the covariate values for each edge appearing in the network. The edgecov
and dyadcov
terms are equivalent for undirected networks.
edgecov(x, attrname=NULL)
(binary) (dyad-independent) (directed) (undirected) (frequently-used) , edgecov(x, attrname=NULL, form=“sum”)
(valued) (directed) (undirected) (dyad-independent)
Edge covariate: The x
argument is either a square matrix of covariates, one for each possible edge in the network, the name of a network attribute of covariates, or a network; if the latter, optional argument attrname
provides the name of the quantitative edge attribute to use for covariate values (in this case, missing edges in x
are assigned a covariate value of zero). This term adds one statistic to the model, equal to the sum of the covariate values for each edge appearing in the network. The edgecov
term applies to both directed and undirected networks. For undirected networks the covariates are also assumed to be undirected. The edgecov
and dyadcov
terms are equivalent for undirected networks.
edges
(binary) (valued) (dyad-independent) (directed) (undirected) (frequently-used) , a.k.a. nonzero
(valued) (directed) (undirected) (dyad-independent)
Edges: This term adds one network statistic equal to the number of edges (i.e. nonzero values) in the network. For undirected networks, edges
is equal to kstar(1)
; for directed networks, edges
is equal to both ostar(1)
and istar(1)
.
esp(d)
(binary) (directed) (undirected)
Edgewise shared partners: This is just like the dsp
term, except this term adds one network statistic to the model for each element in d
where the ith such statistic equals the number of edges (rather than dyads) in the network with exactly d[i]
shared partners. This term can be used with directed and undirected networks. For directed networks the count is over homogeneous shared partners only (i.e., only partners on a directed two-path connecting the nodes in the edge and in the same direction).
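The difference between counting over dyads (dsp) and counting over edges (esp) can be seen in a short sketch. It covers the undirected case only, is plain Python rather than ergm code, and uses hypothetical function names and a made-up four-node network.

```python
from itertools import combinations

def _shared_partners(n, edges):
    """Shared-partner count for every unordered dyad, plus the neighbor sets."""
    nbrs = {v: set() for v in range(n)}
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    counts = {(i, j): len(nbrs[i] & nbrs[j]) for i, j in combinations(range(n), 2)}
    return counts, nbrs

def dsp(n, edges, d):
    """Number of dyads with exactly d[i] shared partners."""
    counts, _ = _shared_partners(n, edges)
    return [sum(1 for c in counts.values() if c == k) for k in d]

def esp(n, edges, d):
    """Number of edges (tied dyads only) with exactly d[i] shared partners."""
    counts, nbrs = _shared_partners(n, edges)
    tied = [c for (i, j), c in counts.items() if j in nbrs[i]]
    return [sum(1 for c in tied if c == k) for k in d]

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]   # a triangle with one pendant node
print(dsp(4, edges, [0, 1]))
print(esp(4, edges, [0, 1]))
```

In this example every dyad except the pendant edge has one shared partner, so the two terms differ only by the untied dyads.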
greaterthan(threshold=0)
(valued) (directed) (undirected) (dyad-independent)
Number of dyads with values strictly greater than a threshold: Adds one statistic equal to the number of ties whose values exceed threshold
.
gwb1degree(decay, fixed=FALSE, cutoff=30)
(binary) (bipartite) (undirected) (curved)
Geometrically weighted degree distribution for the first mode in a bipartite (aka two-mode) network: This term adds one network statistic to the model equal to the weighted degree distribution with decay controlled by the decay
parameter, for nodes in the first mode of a bipartite network. The first mode of a bipartite network object is sometimes known as the “actor” mode. The decay
parameter is the same as theta_s in equation (14) in Hunter (2007). The value supplied for this parameter may be fixed (if fixed=TRUE
), or it may be used as merely the starting value for the estimation in a curved exponential family model (the default). The optional argument cutoff
is only relevant if fixed=FALSE
. In that case it only uses this number of terms in computing the statistics to reduce the computational burden. This term can only be used with undirected bipartite networks.
gwb2degree(decay, fixed=FALSE, cutoff=30)
(binary) (bipartite) (undirected) (curved)
Geometrically weighted degree distribution for the second mode in a bipartite (aka two-mode) network: This term adds one network statistic to the model equal to the weighted degree distribution with decay controlled by the decay
parameter, for nodes in the second mode of a bipartite network. The second mode of a bipartite network object is sometimes known as the “event” mode. The decay
parameter is the same as theta_s in equation (14) in Hunter (2007). The value supplied for this parameter may be fixed (if fixed=TRUE
), or it may be used as merely the starting value for the estimation in a curved exponential family model (the default). The optional argument cutoff
is only relevant if fixed=FALSE
. In that case it only uses this number of terms in computing the statistics to reduce the computational burden. This term can only be used with undirected bipartite networks.
gwdegree(decay, fixed=FALSE, cutoff=30)
(binary) (undirected) (curved) (frequently-used)
Geometrically weighted degree distribution: This term adds one network statistic to the model equal to the weighted degree distribution with decay controlled by the decay
parameter. The decay
parameter is the same as theta_s in equation (14) in Hunter (2007). The value supplied for this parameter may be fixed (if fixed=TRUE
), or it may be used as merely the starting value for the estimation in a curved exponential family model (the default). The optional argument cutoff
is only relevant if fixed=FALSE
. In that case it only uses this number of terms in computing the statistics to reduce the computational burden. This term can only be used with undirected networks.
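To show how the decay parameter weights the degree distribution, the sketch below assumes the statistic has the form e^θ ∑_{i=1}^{n-1} {1 - (1 - e^{-θ})^i} D_i(y), where D_i(y) is the number of nodes of degree i and θ is the decay value. That form is my reading of Hunter (2007) and should be checked against the paper; the code is plain Python, not ergm code, and the function name is hypothetical.

```python
import math
from collections import Counter

def gwdegree(n, edges, decay):
    """Geometrically weighted degree statistic under the assumed formula above."""
    deg = Counter()
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    dist = Counter(deg[v] for v in range(n))      # dist[i] = D_i(y)
    w = 1.0 - math.exp(-decay)
    return math.exp(decay) * sum((1.0 - w ** i) * dist[i] for i in range(1, n))

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]          # node 4 is an isolate
print(gwdegree(5, edges, 0.0))    # decay 0 counts the non-isolated nodes
print(gwdegree(5, edges, 10.0))   # large decay approaches the sum of degrees
```

Under this assumed form, the statistic moves between the number of non-isolates (decay 0) and twice the edge count (large decay), which is one way to interpret the parameter.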
gwdsp(alpha=0, fixed=FALSE, cutoff=30)
(binary) (directed) (undirected) (curved)
Geometrically weighted dyadwise shared partner distribution: This term adds one network statistic to the model equal to the geometrically weighted dyadwise shared partner distribution with weight parameter alpha
> 0. The optional argument fixed
indicates whether the scale parameter lambda
is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE
, which means the scale parameter is not fixed and thus the model is a CEF model. This term can be used with directed and undirected networks. For directed networks the count is over homogeneous shared partners only (i.e., only partners on a directed two-path connecting the nodes in the dyad). The optional argument cutoff
is only relevant if fixed=FALSE
. In that case it only uses this number of terms in computing the statistics to reduce the computational burden.
gwesp(alpha=0, fixed=FALSE, cutoff=30)
(binary) (frequently-used) (directed) (undirected) (curved)
Geometrically weighted edgewise shared partner distribution: This term is just like gwdsp
except it adds a statistic equal to the geometrically weighted edgewise (not dyadwise) shared partner distribution with weight parameter alpha
. The optional argument fixed
indicates whether the scale parameter lambda
is to be fit as a curved exponential-family model (see Hunter and Handcock, 2006). The default is FALSE
, which means the scale parameter is not fixed and thus the model is a CEF model. This term can be used with directed and undirected networks. For directed networks the geometric weighting is over homogeneous shared partners only (i.e., only partners on a directed two-path connecting the nodes in the edge and in the same direction). The optional argument cutoff
is only relevant if fixed=FALSE
. In that case it only uses this number of terms in computing the statistics to reduce the computational burden.
gwidegree(decay, fixed=FALSE, cutoff=30)
(binary) (directed) (curved)
Geometrically weighted in-degree distribution: This term adds one network statistic to the model equal to the weighted in-degree distribution with weight parameter decay
. The optional argument fixed
indicates whether the scale parameter lambda
is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE
, which means the scale parameter is not fixed and thus the model is a CEF model. This term can only be used with directed networks. The optional argument cutoff
is only relevant if fixed=FALSE
. In that case it only uses this number of terms in computing the statistics to reduce the computational burden.
gwnsp(alpha=0, fixed=FALSE, cutoff=30)
(binary) (directed) (undirected) (curved)
Geometrically weighted nonedgewise shared partner distribution: This term is just like gwesp
and gwdsp
except it adds a statistic equal to the geometrically weighted nonedgewise (that is, over dyads that do not have an edge) shared partner distribution with weight parameter alpha
. The optional argument fixed
indicates whether the scale parameter lambda
is to be fit as a curved exponential-family model (see Hunter and Handcock, 2006). The default is FALSE
, which means the scale parameter is not fixed and thus the model is a CEF model. This term can be used with directed and undirected networks. For directed networks the geometric weighting is over homogeneous shared partners only (i.e., only partners on a directed two-path connecting the nodes in the non-edge and in the same direction). The optional argument cutoff
is only relevant if fixed=FALSE
. In that case it only uses this number of terms in computing the statistics to reduce the computational burden.
gwodegree(decay, fixed=FALSE, cutoff=30)
(binary) (directed) (curved)
Geometrically weighted out-degree distribution: This term adds one network statistic to the model equal to the weighted out-degree distribution with weight parameter decay
. The optional argument fixed
indicates whether the scale parameter lambda
is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE
, which means the scale parameter is not fixed and thus the model is a CEF model. This term can only be used with directed networks. The optional argument cutoff
is only relevant if fixed=FALSE
. In that case it only uses this number of terms in computing the statistics to reduce the computational burden.
hamming(x, cov, attrname=NULL)
(binary) (dyad-independent) (directed) (undirected)
Hamming distance: This term adds one statistic to the model equal to the weighted or unweighted Hamming distance of the network from the network specified by x
. (If no argument is given, x
is taken to be the observed network, i.e., the network on the left side of the ~ in the formula that defines the ERGM.) Unweighted Hamming distance is defined as the total number of pairs (i,j) (ordered or unordered, depending on whether the network is directed or undirected) on which the two networks differ. If the optional argument cov
is specified, then the weighted Hamming distance is computed instead, where each pair (i,j) contributes a pre-specified weight toward the distance when the two networks differ on that pair. The argument cov
is either a matrix of edgewise weights or a network; if the latter, the optional argument attrname
provides the name of the edge attribute to use for weight values.
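The unweighted and weighted distances described above can be sketched as a set difference. This is plain Python, not ergm code; the function name is hypothetical, networks are represented as sets of pairs, and for an undirected network each pair is assumed to be stored once as (smaller, larger).

```python
def hamming(net, ref, weights=None):
    """Hamming distance between two edge sets, optionally weighted per pair."""
    differing = net ^ ref                 # pairs present in exactly one network
    if weights is None:
        return len(differing)
    return sum(weights[pair] for pair in differing)

net = {(0, 1), (1, 2)}
ref = {(0, 1), (2, 3)}
w = {(0, 1): 1.0, (1, 2): 0.5, (2, 3): 2.0}   # hypothetical pair weights
print(hamming(net, ref))
print(hamming(net, ref, w))
```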
hammingmix(attrname, x, base=0)
(binary) (directed) (dyad-independent)
Hamming distance within mixing: This term adds one statistic to the model for every possible pairing of
attribute values of the network for the vertex attribute named attrname
. Each such statistic is the Hamming distance (i.e., the number of differences) between the appropriate subset of dyads in the network and the corresponding subset in x
. The ordering of the attribute values is alphabetical. The option base
gives the index of statistics to be omitted from the tabulation. For example base=2
will omit the second statistic, making it the de facto reference category. This term can only be used with directed networks.
idegrange(from, to=+Inf, by=NULL, homophily=FALSE)
(binary) (directed) (categorical nodal attribute)
In-degree range: The from
and to
arguments are vectors of distinct integers (or +Inf
, for to
(its default)). If one of the vectors has length 1, it is recycled to the length of the other. Otherwise, they must have the same length. This term adds one network statistic to the model for each element of from
(or to
); the ith such statistic equals the number of nodes in the network of in-degree greater than or equal to from[i]
but strictly less than to[i]
, i.e. with in-edge count in semiopen interval [from,to)
. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified and homophily
is TRUE
, then degrees are calculated using the subnetwork consisting of only edges whose endpoints have the same value of the by
attribute. If by
is specified and homophily
is FALSE
(the default), then separate degree range statistics are calculated for nodes having each separate value of the attribute.
This term can only be used with directed networks; for undirected networks (bipartite and not) see degrange
. For degrees of specific modes of bipartite networks, see b1degrange
and b2degrange
. For out-degrees, see odegrange
.
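The recycling rule and the semiopen interval can be made concrete with a short sketch. It is plain Python rather than ergm code; the function and argument names (lower, upper) are hypothetical stand-ins for from and to, since from is a reserved word in Python.

```python
import math
from collections import Counter

def idegrange(n, edges, lower, upper=(math.inf,)):
    """Count nodes whose in-degree lies in [lower[i], upper[i]) for each i."""
    lower, upper = list(lower), list(upper)
    if len(lower) == 1:
        lower = lower * len(upper)        # recycle a length-1 vector
    if len(upper) == 1:
        upper = upper * len(lower)
    if len(lower) != len(upper):
        raise ValueError("lower and upper must have equal length or length 1")
    indeg = Counter(j for _, j in edges)  # edges are (sender, receiver)
    return [sum(1 for v in range(n) if lo <= indeg[v] < hi)
            for lo, hi in zip(lower, upper)]

edges = [(0, 2), (1, 2), (3, 2), (2, 0), (1, 0)]
print(idegrange(4, edges, [0, 2], [2, math.inf]))
print(idegrange(4, edges, [0, 1, 2], [3]))
```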
idegree(d, by=NULL, homophily=FALSE)
(binary) (directed) (categorical nodal attribute) (frequently-used)
In-degree: The d
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d
; the ith such statistic equals the number of nodes in the network of in-degree d[i]
, i.e. the number of nodes with exactly d[i]
in-edges. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified and homophily
is TRUE
, then degrees are calculated using the subnetwork consisting of only edges whose endpoints have the same value of the by
attribute. If by
is specified and homophily
is FALSE
(the default), then separate degree statistics are calculated for nodes having each separate value of the attribute. This term can only be used with directed networks; for undirected networks see degree
.
idegreepopularity
(binary) (directed)
In-degree popularity: This term adds one network statistic to the model equaling the sum over the actors of each actor’s in-degree taken to the 3/2 power (or, equivalently, multiplied by its square root). This term is analogous to the term of Snijders et al. (2010), equation (11). This term can only be used with directed networks.
ininterval(lower=-Inf, upper=+Inf, open=c(TRUE,TRUE))
(valued) (directed) (undirected) (dyad-independent)
Number of ties whose values are in an interval: Adds one statistic equal to the number of ties whose values are between lower
and upper
. Argument open
is a logical
vector of length 2 that controls whether the interval is open (exclusive) on the lower and on the upper end, respectively.
intransitive
(binary) (directed) (triad-related)
Intransitive triads: This term adds one statistic to the model, equal to the number of triads in the network that are intransitive. The intransitive triads are those of type 111D
, 201
, 111U
, 021C
, or 030C
in the categorization of Davis and Leinhardt (1972). For details on the 16 possible triad types, see triad.classify
in the sna
package. Note the distinction from the ctriple
term. This term can only be used with directed networks.
isolates
(binary) (directed) (undirected) (frequently-used)
Isolates: This term adds one statistic to the model equal to the number of isolates in the network. For an undirected network, an isolate is defined to be any node with degree zero. For a directed network, an isolate is any node with both in-degree and out-degree equal to zero.
istar(k, attrname=NULL)
(binary) (directed) (categorical nodal attribute)
In-stars: The k
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k
. The ith such statistic counts the number of distinct k[i]
-instars in the network, where a k-instar is defined to be a node N and a set of k different nodes {O_1, …, O_k} such that the ties (O_j, N) exist for j=1, …, k. The optional argument attrname
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified then the count is over the number of k-instars where all nodes have the same value of the attribute. This term can only be used for directed networks; for undirected networks see kstar
. Note that istar(1)
is equal to both ostar(1)
and edges
.
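Because a k-instar is a node together with k of its in-neighbors, the count can be written as a sum of binomial coefficients over in-degrees. The sketch below is plain Python, not ergm code, with hypothetical function names; it ignores the attrname option.

```python
from collections import Counter
from math import comb

def istar(n, edges, k):
    """Number of k[i]-instars: sum over nodes of C(in-degree, k[i])."""
    indeg = Counter(j for _, j in edges)  # edges are (sender, receiver)
    return [sum(comb(indeg[v], kk) for v in range(n)) for kk in k]

def ostar(n, edges, k):
    """Number of k[i]-outstars: sum over nodes of C(out-degree, k[i])."""
    outdeg = Counter(i for i, _ in edges)
    return [sum(comb(outdeg[v], kk) for v in range(n)) for kk in k]

edges = [(0, 2), (1, 2), (3, 2), (2, 0), (1, 0)]
print(istar(4, edges, [1, 2, 3]))
print(ostar(4, edges, [1]))
```

With k=1 both functions return the number of edges, matching the note that istar(1), ostar(1), and edges coincide.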
kstar(k, attrname=NULL)
(binary) (undirected) (categorical nodal attribute)
k-Stars: The k
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k
. The ith such statistic counts the number of distinct k[i]
-stars in the network, where a k-star is defined to be a node N and a set of k different nodes {O_1, …, O_k} such that the ties {N, O_i} exist for i=1, …, k. The optional argument attrname
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified then the count is over the number of k-stars where all nodes have the same value of the attribute. This term can only be used for undirected networks; for directed networks, see istar
, ostar
, twopath
and m2star
. Note that kstar(1)
is equal to edges
.
localtriangle(x)
(binary) (triad-related) (directed) (undirected)
Triangles within neighborhoods: This term adds one statistic to the model equal to the number of triangles in the network between nodes “close to” each other. For an undirected network, a local triangle is defined to be any set of three edges between nodal pairs {(i,j), (j,k), (k,i)} that are in the same neighborhood. For a directed network, a triangle is defined as any set of three edges (i,j), (j,k) and either (k→i) or (k←i) where again all nodes are within the same neighborhood. The argument x
is an undirected network or a symmetric adjacency matrix that specifies whether the two nodes are in the same neighborhood. Note that triangle
, with or without an argument, is a special case of localtriangle
.
m2star
(binary) (directed)
Mixed 2-stars, a.k.a. 2-paths: This term adds one statistic to the model, equal to the number of mixed 2-stars in the network, where a mixed 2-star is a pair of distinct edges (i,j), (j,k). A mixed 2-star is sometimes called a 2-path because it is a directed path of length 2 from i to k via j. However, in the case of a 2-path the focus is usually on the endpoints i and k, whereas for a mixed 2-star the focus is usually on the midpoint j. This term can only be used with directed networks; for undirected networks see kstar(2)
. See also twopath
.
meandeg
(binary) (dyad-independent) (directed) (undirected)
Mean vertex degree: This term adds one network statistic to the model equal to the average degree of a node. Note that this term is a constant multiple of both edges
and density
.
mutual(same=NULL, diff=FALSE, by=NULL, keep=NULL)
(binary) (directed) (dyad-independent) (frequently-used), mutual(form=“min”,threshold=0)
(valued) (directed) (dyad-independent)
Mutuality: In binary ERGMs, equal to the number of pairs of actors i and j for which (i,j) and (j,i) both exist. For valued ERGMs, equal to ∑_{i<j} m(y_{i,j}, y_{j,i}), where m is determined by form
argument: “min”
for min(y_{i,j}, y_{j,i}), “nabsdiff”
for -|y_{i,j} - y_{j,i}|, “product”
for y_{i,j} y_{j,i}, and “geometric”
for √{y_{i,j}} √{y_{j,i}}. See Krivitsky (2012) for a discussion of these statistics. form=“threshold”
simply computes the binary mutuality
after thresholding at threshold
.
This term can only be used with directed networks. The binary version also has the following capabilities: if the optional same
argument is passed the name of a vertex attribute, only mutual pairs that match on the attribute are counted; separate counts for each unique matching value can be obtained by using diff=TRUE
with same
; and if by
is passed the name of a vertex attribute, then each node is counted separately for each mutual pair in which it occurs and the counts are tabulated by unique values of the attribute. This means that the sum of the mutual statistics when by
is used will equal twice the standard mutual statistic. Only one of same
or by
may be used, and only the former is affected by diff
; if both same
and by
are passed, by
is ignored. Finally, if keep
is passed a numerical vector, this vector of integers tells which statistics should be kept whenever the mutual
term would ordinarily result in multiple statistics.
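The binary count and the valued forms can be compared side by side in a small sketch. It is plain Python, not ergm code, and the function names are hypothetical. It reads “min” as the smaller of the two dyad values and “nabsdiff” as the negative absolute difference, which is an interpretation of the form names rather than something taken from the package source.

```python
import math

def mutual_binary(edges):
    """Number of unordered pairs {i, j} with both (i, j) and (j, i) present."""
    present = set(edges)
    return sum(1 for i, j in present if i < j and (j, i) in present)

def mutual_valued(y, form="min"):
    """Sum of m(y[i,j], y[j,i]) over pairs i < j; absent dyads have value 0."""
    m = {
        "min": lambda a, b: min(a, b),
        "nabsdiff": lambda a, b: -abs(a - b),
        "product": lambda a, b: a * b,
        "geometric": lambda a, b: math.sqrt(a) * math.sqrt(b),
    }[form]
    nodes = sorted({v for pair in y for v in pair})
    return sum(m(y.get((i, j), 0), y.get((j, i), 0))
               for i in nodes for j in nodes if i < j)

print(mutual_binary([(0, 1), (1, 0), (1, 2), (2, 0), (0, 2)]))
y = {(0, 1): 4, (1, 0): 1, (1, 2): 3}     # hypothetical dyad values
for form in ("min", "nabsdiff", "product", "geometric"):
    print(form, mutual_valued(y, form))
```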
nearsimmelian
(binary) (directed) (triad-related)
Near simmelian triads: This term adds one statistic to the model equal to the number of near Simmelian triads, as defined by Krackhardt and Handcock (2007). This is a sub-graph of size three which is exactly one tie short of being complete. This term can only be used with directed networks.
nodecov(attrname, transform, transformname)
(binary) (dyad-independent) (frequently-used) (directed) (undirected) (quantitative nodal attribute) , nodecov(attrname, transform, transformname, form=“sum”)
(valued) (dyad-independent) (directed) (undirected) (quantitative nodal attribute) , a.k.a. nodemain
(binary) (directed) (undirected)
Main effect of a covariate: The attrname
argument is a character string giving the name of a numeric (not categorical) attribute in the network’s vertex attribute list. This term adds a single network statistic to the model equaling the sum of attrname(i)
and attrname(j)
for all edges (i,j) in the network. For categorical attributes, see nodefactor
. Note that for directed networks, nodecov
equals nodeicov
plus nodeocov
.
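The relationship between the three covariate main-effect terms on a directed network can be checked numerically. The sketch below is plain Python, not ergm code; the function names mirror the term names only for readability, and transform is applied per node here, whereas the documented argument receives the whole covariate vector.

```python
def nodecov(edges, attr, transform=lambda x: x):
    """Sum of transform(attr[i]) + transform(attr[j]) over edges (i, j)."""
    return sum(transform(attr[i]) + transform(attr[j]) for i, j in edges)

def nodeicov(edges, attr):
    """Sum of the receiver's covariate over directed edges (i, j)."""
    return sum(attr[j] for _, j in edges)

def nodeocov(edges, attr):
    """Sum of the sender's covariate over directed edges (i, j)."""
    return sum(attr[i] for i, _ in edges)

edges = [(0, 1), (0, 2), (1, 2)]
age = [10, 20, 30]                        # hypothetical numeric nodal attribute
print(nodecov(edges, age), nodeicov(edges, age), nodeocov(edges, age))
print(nodecov(edges, age, transform=lambda x: x ** 2))
```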
nodecovar
(valued) (directed) (undirected) (quantitative nodal attribute)
Uncentered covariance of dyad values incident on each actor: This term adds one statistic equal to ∑_{i,j,k} (y_{i,j}y_{i,k}+y_{i,j}y_{k,j}). This can be viewed as a valued analog of the kstar(2)
statistic.
nodefactor(attrname, base=1)
(binary) (dyad-independent) (directed) (undirected) (categorical nodal attribute) (frequently-used) , nodefactor(attrname, base=1, form=“sum”)
(dyad-independent) (valued) (directed) (undirected) (categorical nodal attribute)
Factor attribute effect: The attrname
argument is a character vector giving one or more names of categorical attributes in the network’s vertex attribute list. This term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname
attribute (or each combination of the attributes given). Each of these statistics gives the number of times a node with that attribute or those attributes appears in an edge in the network. In particular, for edges whose endpoints both have the same attribute values, this value is counted twice. To include all attribute values is usually not a good idea – though this may be accomplished if desired by setting base=0
– because the sum of all such statistics equals twice the number of edges and hence a linear dependency would arise in any model also including edges
. Thus, the base
argument tells which value(s) (numbered in order according to the sort
function) should be omitted. The default value, base=1
, means that the smallest (i.e., first in sorted order) attribute value is omitted. For example, if the “fruit” factor has levels “orange”, “apple”, “banana”, and “pear”, then to add just two terms, one for “apple” and one for “pear”, set “banana” and “orange” as the base (remember to sort the values first) by using nodefactor(“fruit”, base=2:3)
. For an analogous term for quantitative vertex attributes, see nodecov
.
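The “fruit” example can be traced through with a short sketch of how sorted levels interact with base. This is plain Python, not ergm code; the function name and the five-node network are hypothetical, and base is passed as a tuple of 1-based positions in sorted order.

```python
def nodefactor(edges, attr, base=(1,)):
    """Per-level count of edge endpoints, omitting the levels listed in base."""
    levels = sorted(set(attr))
    counts = {lev: 0 for lev in levels}
    for i, j in edges:
        counts[attr[i]] += 1              # an edge within one level counts twice
        counts[attr[j]] += 1
    return {lev: counts[lev]
            for pos, lev in enumerate(levels, start=1) if pos not in base}

fruit = ["orange", "apple", "banana", "pear", "apple"]
edges = [(0, 1), (1, 2), (1, 4), (3, 4)]
print(nodefactor(edges, fruit, base=(2, 3)))   # keeps apple and pear
print(nodefactor(edges, fruit, base=(0,)))     # base 0 omits nothing
```

With nothing omitted the counts sum to twice the number of edges, which is the linear dependency with edges that the text warns about.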
nodeicov(attrname, transform, transformname)
(binary) (directed) (quantitative nodal attribute) (frequently-used) , nodeicov(attrname, transform, transformname, form=“sum”)
(valued) (directed) (quantitative nodal attribute)
Main effect of a covariate for in-edges: The attrname
argument is a character string giving the name of a numeric (not categorical) attribute in the network’s vertex attribute list. This term adds a single network statistic to the model equaling the total value of attrname(j)
for all edges (i,j) in the network. This term may only be used with directed networks. For categorical attributes, see nodeifactor
.
nodeicovar
(valued) (directed) (quantitative nodal attribute)
Uncentered covariance of in-dyad values incident on each actor: This term adds one statistic equal to ∑_{i,j,k} y_{i,j}y_{k,j}. This can be viewed as a valued analog of the istar(2)
statistic.
nodeifactor(attrname, base=1)
(binary) (dyad-independent) (directed) (categorical nodal attribute) (frequently-used) , nodeifactor(attrname, base=1, form=“sum”)
(valued) (dyad-independent) (directed) (categorical nodal attribute)
Factor attribute effect for in-edges: The attrname
argument is a character vector giving one or more names of categorical attributes in the network’s vertex attribute list. This term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname
attribute (or each combination of the attributes given). Each of these statistics gives the number of times a node with that attribute or those attributes appears as the terminal node of a directed tie. To include all attribute values is usually not a good idea – though this may be accomplished if desired by setting base=0
– because the sum of all such statistics equals the number of edges and hence a linear dependency would arise in any model also including edges
. Thus, the base
argument tells which value(s) (numbered in order according to the sort
function) should be omitted. The default value, base=1
, means that the smallest (i.e., first in sorted order) attribute value is omitted. For example, if the “fruit” factor has levels “orange”, “apple”, “banana”, and “pear”, then to add just two terms, one for “apple” and one for “pear”, set “banana” and “orange” as the base (remember to sort the values first) by using nodeifactor(“fruit”, base=2:3)
. For an analogous term for quantitative vertex attributes, see nodeicov
.
nodeisqrtcovar
(valued) (directed) (non-negative) (quantitative nodal attribute)
Uncentered covariance of square roots of in-dyad values incident on each actor: This term adds one statistic equal to ∑_{i,j,k} √{y_{i,j}}√{y_{k,j}}. This can be viewed as a valued analog of the istar(2)
statistic.
nodematch(attrname, diff=FALSE, keep=NULL)
(binary) (dyad-independent) (frequently-used) (directed) (undirected) (categorical nodal attribute) , nodematch(attrname, diff=FALSE, keep=NULL, form=“sum”)
(valued) (dyad-independent) (directed) (undirected) (categorical nodal attribute) a.k.a. match
(binary) (directed) (dyad-independent) (undirected) (categorical nodal attribute)
Uniform homophily and differential homophily: The attrname
argument is a character vector giving one or more names of attributes in the network’s vertex attribute list. When diff=FALSE
, this term adds one network statistic to the model, which counts the number of edges (i,j) for which attrname(i)==attrname(j)
. (When multiple names are given, the statistic counts only those on which all the named attributes match.) When diff=TRUE
, p network statistics are added to the model, where p is the number of unique values of the attrname
attribute. The kth such statistic counts the number of edges (i,j) for which attrname(i) == attrname(j) == value(k)
, where value(k)
is the kth smallest unique value of the attrname attribute. If set to non-NULL, the optional keep
argument should be a vector of integers giving the values of k
that should be considered for matches; other values are ignored (this works for both diff=FALSE
and diff=TRUE
). For instance, to add two statistics, counting the matches for just the 2nd and 4th categories, use nodematch
with diff=TRUE
and keep=c(2,4)
.
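The effect of diff and keep can be illustrated with a small sketch. It is plain Python, not ergm code; the function name and example data are hypothetical, only a single attribute is handled, and keep holds 1-based positions among the sorted unique values.

```python
def nodematch(edges, attr, diff=False, keep=None):
    """Count edges whose endpoints match on attr, overall or per kept value."""
    levels = sorted(set(attr))
    if keep is not None:
        levels = [levels[k - 1] for k in keep]   # restrict to the kept values
    per_level = [sum(1 for i, j in edges if attr[i] == attr[j] == lev)
                 for lev in levels]
    return per_level if diff else [sum(per_level)]

group = ["a", "a", "a", "b", "b", "c"]
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (2, 3), (4, 5)]
print(nodematch(edges, group))                          # uniform homophily
print(nodematch(edges, group, diff=True))               # one count per value
print(nodematch(edges, group, diff=True, keep=[2, 3]))  # only "b" and "c"
```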
nodemix(attrname, base=NULL)
(binary) (dyad-independent) (frequently-used) (directed) (undirected) (categorical nodal attribute) , nodemix(attrname, base=NULL, form=“sum”)
(valued) (dyad-independent) (directed) (undirected) (categorical nodal attribute)
Nodal attribute mixing: The attrname
argument is a character vector giving the names of categorical attributes in the network’s vertex attribute list. By default, this term adds one network statistic to the model for each possible pairing of attribute values. The statistic equals the number of edges in the network in which the nodes have that pairing of values. (When multiple names are given, a statistic is added for each combination of attribute values for those names.) In other words, this term produces one statistic for every entry in the mixing matrix for the attribute(s). The ordering of the attribute values is alphabetical (for nominal categories) or numerical (for ordered categories). The optional base
argument is a vector of integers corresponding to the pairings that should not be included. If base
contains only negative integers, then these integers correspond to the only pairings that should be included. By default (i.e., with base=NULL
or base=0
), all pairings are included.
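The “one statistic per mixing-matrix entry” description can be sketched for an undirected network with a single attribute. This is plain Python, not ergm code; the function name is hypothetical, the base argument is not modeled, and the ordering of the returned pairings is a choice made here that may differ from the ordering ergm uses.

```python
from collections import Counter

def nodemix_undirected(edges, attr):
    """Edge count for every unordered pairing of attribute values."""
    levels = sorted(set(attr))
    counts = Counter(tuple(sorted((attr[i], attr[j]))) for i, j in edges)
    return {(a, b): counts[(a, b)]
            for pos, a in enumerate(levels) for b in levels[pos:]}

group = ["a", "a", "a", "b", "b", "c"]
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (2, 3), (4, 5)]
mix = nodemix_undirected(edges, group)
print(mix)
```

Each edge falls in exactly one pairing, so the statistics sum to the number of edges.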
nodeocov(attrname, transform, transformname)
(binary) (directed) (dyad-independent) (quantitative nodal attribute) , nodeocov(attrname, transform, transformname, form=“sum”)
(valued) (directed) (dyad-independent) (quantitative nodal attribute)
Main effect of a covariate for out-edges: The attrname
argument is a character string giving the name of a numeric (not categorical) attribute in the network’s vertex attribute list. This term adds a single network statistic to the model equaling the total value of attrname(i)
for all edges (i,j) in the network. This term may only be used with directed networks. For categorical attributes, see nodeofactor
.
nodeocovar
(valued) (directed) (quantitative nodal attribute)
Uncentered covariance of out-dyad values incident on each actor: This term adds one statistic equal to ∑_{i,j,k} y_{i,j}y_{i,k}. This can be viewed as a valued analog of the ostar(2)
statistic.
nodeofactor(attrname, base=1)
(binary) (dyad-independent) (directed) (categorical nodal attribute) , nodeofactor(attrname, base=1, form=“sum”)
(valued) (dyad-independent) (categorical nodal attribute) (directed)
Factor attribute effect for out-edges: The attrname
argument is a character vector giving one or more names of categorical attributes in the network’s vertex attribute list. This term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname
attribute (or each combination of the attributes given). Each of these statistics gives the number of times a node with that attribute or those attributes appears as the node of origin of a directed tie. To include all attribute values is usually not a good idea – though this may be accomplished if desired by setting base=0
– because the sum of all such statistics equals the number of edges and hence a linear dependency would arise in any model also including edges
. Thus, the base
argument tells which value(s) (numbered in order according to the sort
function) should be omitted. The default value, base=1
, means that the smallest (i.e., first in sorted order) attribute value is omitted. For example, if the “fruit” factor has levels “orange”, “apple”, “banana”, and “pear”, then to add just two terms, one for “apple” and one for “pear”, set “banana” and “orange” as the base (remember to sort the values first, giving “apple”, “banana”, “orange”, “pear”) by using nodeofactor(“fruit”, base=2:3)
. For an analogous term for quantitative vertex attributes, see nodeocov
.
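The effect of base on the sorted levels can be sketched in base R as follows. The vector fruit and the matrix A are hypothetical, and the hand computation only mimics the description above; it does not call the package.

```r
# Hypothetical 5-node directed network with a categorical vertex attribute.
fruit <- c("orange", "apple", "banana", "pear", "apple")
A <- matrix(0, 5, 5)
A[1, 2] <- 1; A[2, 3] <- 1; A[4, 1] <- 1; A[5, 4] <- 1; A[2, 5] <- 1

levels_sorted <- sort(unique(fruit))  # "apple" "banana" "orange" "pear"
base <- 2:3                           # omit "banana" and "orange"
kept <- levels_sorted[-base]          # "apple" "pear"

# One statistic per kept level: the number of ties sent by nodes at that level.
stats <- sapply(kept, function(lv) sum(rowSums(A)[fruit == lv]))
print(stats)
```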
nodeosqrtcovar
(valued) (directed) (non-negative) (quantitative nodal attribute)
Uncentered covariance of square roots of out-dyad values incident on each actor: This term adds one statistic equal to ∑_{i,j,k} √{y_{i,j}} √{y_{i,k}}. This can be viewed as a valued analog of the ostar(2)
statistic.
nodesqrtcovar(center=TRUE)
(valued) (non-negative) (directed) (undirected) (quantitative nodal attribute)
Covariance of square roots of dyad values incident on each actor: This term adds one statistic equal to ∑_{i,j,k} (√{y_{i,j}} √{y_{i,k}} + √{y_{k,j}} √{y_{k,j}}) if center=FALSE
. This can be viewed as a valued analog of the kstar(2)
statistic. If center=TRUE
(the default), the statistic is instead ∑_{i,j,k} ((√{y_{i,j}} - \overline{√{y}})(√{y_{i,k}} - \overline{√{y}}) + (√{y_{k,j}} - \overline{√{y}})(√{y_{k,j}} - \overline{√{y}})), where \overline{√{y}} is the mean of the square roots of the dyad values.
nsp(d)
(binary) (directed) (undirected)
Nonedgewise shared partners: This is just like the dsp
and esp
terms, except this term adds one network statistic to the model for each element in d
where the ith such statistic equals the number of non-edges (that is, dyads that do not have an edge) in the network with exactly d[i]
shared partners. This term can be used with directed and undirected networks. For directed networks the count is over homogeneous shared partners only (i.e., only partners on a directed two-path connecting the nodes in the non-edge and in the same direction).
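For the undirected case, the nsp counts can be sketched by hand in base R. The 4-node path network below is hypothetical, and the code is only an illustration of the definition, not the package's implementation.

```r
# Hypothetical undirected network with edges 1-2, 2-3, 3-4.
A <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(2, 3), c(3, 4))
A[edges] <- 1
A[edges[, 2:1]] <- 1                  # symmetrize

shared <- A %*% A                     # shared[i, j] = partners common to i and j
pairs <- upper.tri(A)                 # visit each unordered dyad once
d <- 0:1
# For each d[i], count the non-edges with exactly d[i] shared partners.
nsp_stats <- sapply(d, function(k) sum(pairs & A == 0 & shared == k))
print(nsp_stats)
```

The non-edges are (1,3), (1,4), and (2,4); only (1,4) has no shared partner.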
odegrange(from, to=+Inf, by=NULL, homophily=FALSE)
(binary) (directed) (categorical nodal attribute)
Out-degree range: The from
and to
arguments are vectors of distinct integers (or +Inf
, for to
(its default)). If one of the vectors has length 1, it is recycled to the length of the other. Otherwise, they must have the same length. This term adds one network statistic to the model for each element of from
(or to
); the ith such statistic equals the number of nodes in the network of out-degree greater than or equal to from[i]
but strictly less than to[i]
, i.e. with out-edge count in semiopen interval [from,to)
. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified and homophily
is TRUE
, then degrees are calculated using the subnetwork consisting of only edges whose endpoints have the same value of the by
attribute. If by
is specified and homophily
is FALSE
(the default), then separate degree range statistics are calculated for nodes having each separate value of the attribute.
This term can only be used with directed networks; for undirected networks (bipartite and not) see degrange
. For degrees of specific modes of bipartite networks, see b1degrange
and b2degrange
. For in-degrees, see idegrange
.
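The semiopen intervals can be illustrated with a short base R sketch. The adjacency matrix is hypothetical, and the by and homophily arguments are not modeled here.

```r
# Hypothetical 4-node directed network: A[i, j] = 1 means an edge i -> j.
A <- matrix(c(0, 1, 1, 1,
              1, 0, 0, 0,
              0, 0, 0, 0,
              1, 1, 0, 0), nrow = 4, byrow = TRUE)
outdeg <- rowSums(A)                  # out-degrees 3, 1, 0, 2

from <- c(0, 2)
to   <- c(2, Inf)                     # intervals [0, 2) and [2, Inf)
# Count nodes whose out-degree is >= from[i] and strictly < to[i].
odegrange_stats <- mapply(function(lo, hi) sum(outdeg >= lo & outdeg < hi),
                          from, to)
print(odegrange_stats)
```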
odegree(d, by=NULL, homophily=FALSE)
(binary) (directed) (categorical nodal attribute) (frequently-used)
Out-degree: The d
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d
; the ith such statistic equals the number of nodes in the network of out-degree d[i]
, i.e. the number of nodes with exactly d[i]
out-edges. The optional argument by
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified and homophily
is TRUE
, then degrees are calculated using the subnetwork consisting of only edges whose endpoints have the same value of the by
attribute. If by
is specified and homophily
is FALSE
(the default), then separate degree statistics are calculated for nodes having each separate value of the attribute. This term can only be used with directed networks; for undirected networks see degree
.
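The difference between plain out-degree counts and the homophily=TRUE variant can be sketched in base R. The matrix A and the attribute grp are hypothetical, and the code only follows the verbal description above.

```r
# Hypothetical 4-node directed network and a categorical vertex attribute.
A <- matrix(c(0, 1, 1, 1,
              1, 0, 0, 0,
              0, 0, 0, 0,
              1, 1, 0, 0), nrow = 4, byrow = TRUE)
grp <- c("a", "a", "b", "b")
d <- 0:1

count_deg <- function(deg) sapply(d, function(k) sum(deg == k))

# Plain out-degree counts for d = 0 and d = 1.
plain <- count_deg(rowSums(A))

# homophily = TRUE analogue: keep only edges whose endpoints share grp,
# then compute out-degrees on that subnetwork.
same <- outer(grp, grp, "==")
homoph <- count_deg(rowSums(A * same))

print(plain)
print(homoph)
```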
odegreepopularity
(binary) (directed)
Out-degree popularity: This term adds one network statistic to the model equaling the sum over the actors of each actor’s outdegree taken to the 3/2 power (or, equivalently, multiplied by its square root). This term is analogous to the term of Snijders et al. (2010), equation (12). This term can only be used with directed networks.
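A minimal base R sketch of this sum, using a hypothetical 5-node network, shows that the two phrasings (3/2 power, or out-degree times its square root) agree:

```r
# Hypothetical directed network: node 1 sends four ties, node 2 sends one.
A <- matrix(0, 5, 5)
A[1, 2:5] <- 1
A[2, 1] <- 1
outdeg <- rowSums(A)

stat <- sum(outdeg^(3/2))             # out-degree to the 3/2 power
stat_alt <- sum(outdeg * sqrt(outdeg))  # out-degree times its square root
print(c(stat, stat_alt))
```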
opentriad
(binary) (undirected) (triad-related)
Open triads: This term adds one statistic to the model equal to the number of 2-stars minus three times the number of triangles in the network. It is currently only implemented for undirected networks.
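The definition can be checked by hand on a small example. The sketch below uses a hypothetical undirected network (a triangle with one pendant node) and base R only; it is not the package's implementation.

```r
# Hypothetical undirected network: triangle 1-2-3 plus the edge 3-4.
A <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4))
A[edges] <- 1
A[edges[, 2:1]] <- 1                  # symmetrize

deg <- rowSums(A)
two_stars <- sum(choose(deg, 2))              # pairs of edges sharing a node
triangles <- sum(diag(A %*% A %*% A)) / 6     # each triangle is counted 6 times
open_triads <- two_stars - 3 * triangles
print(open_triads)
```

The two open triads here are 1-3-4 and 2-3-4.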
ostar(k, attrname=NULL)
(binary) (directed) (categorical nodal attribute)
k-Outstars: The k
argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k
. The ith such statistic counts the number of distinct k[i]
-outstars in the network, where a k-outstar is defined to be a node N and a set of k different nodes {O_1, …, O_k} such that the ties (N,O_j) exist for j=1, …, k. The optional argument attrname
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified then the count is the number of k-outstars where all nodes have the same value of the attribute. This term can only be used with directed networks; for undirected networks see kstar
. Note that ostar(1)
is equal to both istar(1)
and edges
.
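Without the attrname argument, a node with out-degree m is the center of choose(m, k) distinct k-outstars, which gives a simple hand computation. The adjacency matrix below is hypothetical.

```r
# Hypothetical 4-node directed network: A[i, j] = 1 means an edge i -> j.
A <- matrix(c(0, 1, 1, 1,
              1, 0, 0, 0,
              0, 0, 0, 0,
              1, 1, 0, 0), nrow = 4, byrow = TRUE)
outdeg <- rowSums(A)                  # out-degrees 3, 1, 0, 2

k <- 1:2
ostar_stats <- sapply(k, function(kk) sum(choose(outdeg, kk)))
print(ostar_stats)
print(sum(A))                         # edge count, equal to the 1-outstar count
```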
receiver(base=1)
(binary) (directed) (dyad-independent)
Receiver effect: This term adds one network statistic for each node equal to the number of in-ties for that node. This measures the popularity of the node. The term for the first node is omitted by default because of linear dependence that arises if this term is used together with edges
, but its coefficient can be computed as the negative of the sum of the coefficients of all the other actors. That is, the average coefficient is zero, following the Holland-Leinhardt parametrization of the p1 model (Holland and Leinhardt, 1981). The base
argument allows the user to determine which nodes’ statistics should be omitted. The base
argument can also be a vector of negative indices, to specify which should be added instead of deleted, and base=0
specifies that all statistics should be included. This term can only be used with directed networks. For undirected networks, see sociality
.
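The statistics themselves are just in-degrees with some nodes dropped. A base R sketch on a hypothetical adjacency matrix, mimicking the default base=1 and a negative base by plain vector indexing (an analogy only, not the package's code):

```r
# Hypothetical 4-node directed network: A[i, j] = 1 means an edge i -> j.
A <- matrix(c(0, 1, 1, 1,
              1, 0, 0, 0,
              0, 0, 0, 0,
              1, 1, 0, 0), nrow = 4, byrow = TRUE)
indeg <- colSums(A)                   # in-degrees 2, 2, 1, 1

receiver_default <- indeg[-1]         # analogue of base = 1: omit node 1
receiver_subset  <- indeg[c(2, 3)]    # analogue of base = -c(2, 3): keep only nodes 2 and 3
print(receiver_default)
print(receiver_subset)
```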
sender(base=1)
(binary) (directed) (dyad-independent)
Sender effect: This term adds one network statistic for each node equal to the number of out-ties for that node. This measures the activity of the node. The term for the first node is omitted by default because of linear dependence that arises if this term is used together with edges
, but its coefficient can be computed as the negative of the sum of the coefficients of all the other actors. That is, the average coefficient is zero, following the Holland-Leinhardt parametrization of the p1 model (Holland and Leinhardt, 1981). The base
argument allows the user to determine which nodes’ statistics should be omitted. The base
argument can also be a vector of negative indices, to specify which should be added instead of deleted, and base=0
specifies that all statistics should be included. This term can only be used with directed networks. For undirected networks, see sociality
.
simmelian
(binary) (directed) (triad-related)
Simmelian triads: This term adds one statistic to the model equal to the number of Simmelian triads, as defined by Krackhardt and Handcock (2007). This is a complete sub-graph of size three. This term can only be used with directed networks.
simmelianties
(binary) (triad-related) (directed)
Ties in simmelian triads: This term adds one statistic to the model equal to the number of ties in the network that are associated with Simmelian triads, as defined by Krackhardt and Handcock (2007). Each Simmelian has six ties in it but, because Simmelians can overlap in terms of nodes (and associated ties), the total number of ties in these Simmelians can be less than six times the number of Simmelians. Hence this is a measure of the clustering of Simmelians (given the number of Simmelians). This term can only be used with directed networks.
smalldiff(attrname, cutoff)
(binary) (dyad-independent) (directed) (undirected) (quantitative nodal attribute)
Number of ties between actors with similar (but not necessarily identical) attribute values: The attrname
argument is a character string giving the name of a quantitative attribute in the network’s vertex attribute list. This term adds one statistic, having as its value the number of edges in the network for which the incident actors’ attribute values differ by less than cutoff
; that is, the number of edges between i
and j
such that abs(attrname[i]-attrname[j])<cutoff
.
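A base R sketch of the count for an undirected network follows. The attribute x, the path network A, and the cutoff value are hypothetical.

```r
# Hypothetical undirected network with edges 1-2, 2-3, 3-4.
x <- c(10, 12, 20, 21)                # assumed quantitative vertex attribute
A <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(2, 3), c(3, 4))
A[edges] <- 1
A[edges[, 2:1]] <- 1                  # symmetrize

cutoff <- 3
diffs <- abs(outer(x, x, "-"))        # diffs[i, j] = abs(x[i] - x[j])
# Count each edge once (upper triangle) when the difference is below cutoff.
smalldiff_stat <- sum(A == 1 & upper.tri(A) & diffs < cutoff)
print(smalldiff_stat)
```

The edges 1-2 and 3-4 qualify (differences 2 and 1); the edge 2-3 does not (difference 8).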
sociality(attrname=NULL, base=1)
(binary) (undirected) (categorical nodal attribute)
Undirected degree: This term adds one network statistic for each node equal to the number of ties of that node. The optional attrname
argument is a character string giving the name of an attribute in the network’s vertex attribute list that takes categorical values. If provided, this term only counts ties between nodes with the same value of the attribute (an actor-specific version of the nodematch
term). This term can only be used with undirected networks. For directed networks, see sender
and receiver
. By default, base=1
means that the statistic for the first node will be omitted, but this argument may be changed to control which statistics are included just as for the sender
and receiver
terms.
sum(pow=1)
(valued) (directed) (undirected)
Sum of dyad values (optionally taken to a power): This term adds one statistic equal to the sum of dyad values taken to the power pow
, which defaults to 1.
threetrail(keep=1:4)
(binary) (directed) (undirected) (triad-related)
Three-trails: a.k.a. threepath
. For an undirected network, this term adds one statistic equal to the number of 3-trails, where a 3-trail is defined as a “trail” of length three that traverses three distinct edges. Note that a 3-trail need not include four distinct nodes; in particular, a triangle counts as three 3-trails. For a directed network, this term adds four statistics (or some subset of these four specified by the keep
argument), one for each of the four distinct types of directed 3-trails. If the nodes of the trail are written from left to right such that the middle edge points to the right (R), then the four types are RRR, RRL, LRR, and LRL. That is, an RRR 3-trail is of the form i–>j–>k–>l, an RRL 3-trail is of the form i–>j–>k<–l, etc. As in the undirected case, there is no requirement that the nodes be distinct in a directed 3-trail. However, the three edges must all be distinct. Thus, a mutual tie i<–>j does not count as a 3-trail of the form i–>j–>i<–j; however, in the subnetwork i<–>j–>k, there are two directed 3-trails, one LRR (k<–j–>i–>j) and one RRR (j–>i–>j–>k).
This term used to be (inaccurately) called threepath
. That name has been deprecated and may be removed in a future version.
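For the undirected case, one way to count 3-trails by hand is to sum, over every edge {u,v} taken as the middle edge, the product (deg(u)-1)(deg(v)-1). The sketch below is a base R illustration under that assumption; the helper three_trails is hypothetical and is not part of the package.

```r
# Count undirected 3-trails: for each middle edge {u, v}, pick one other edge
# at u and one other edge at v. The two end edges may meet (a triangle).
three_trails <- function(A) {
  deg <- rowSums(A)
  e <- which(upper.tri(A) & A == 1, arr.ind = TRUE)  # one row per edge
  sum((deg[e[, 1]] - 1) * (deg[e[, 2]] - 1))
}

triangle_net <- 1 - diag(3)           # complete graph on 3 nodes

path_net <- matrix(0, 4, 4)           # path 1-2-3-4
edges <- rbind(c(1, 2), c(2, 3), c(3, 4))
path_net[edges] <- 1
path_net[edges[, 2:1]] <- 1

print(three_trails(triangle_net))
print(three_trails(path_net))
```

The triangle yields three 3-trails, in agreement with the note above, and the 4-node path yields one.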
transitive
(binary) (directed) (triad-related)
Transitive triads: This term adds one statistic to the model, equal to the number of triads in the network that are transitive. The transitive triads are those of type 120D
, 030T
, 120U
, or 300
in the categorization of Davis and Leinhardt (1972). For details on the 16 possible triad types, see triad.classify
in the sna
package. Note the distinction from the ttriple
term. This term can only be used with directed networks.
transitiveties(attrname=NULL)
(binary) (directed) (triad-related) (categorical nodal attribute), transitiveties(threshold=0)
(valued) (directed) (undirected) (triad-related)
Transitive ties: This term adds one statistic, equal to the number of ties i–>j such that there exists a two-path from i to j. (Related to the ttriple
term.) The binary version takes a nodal attribute attrname
, and, if given, all three nodes involved (i, j, and the node on the two-path) must match on this attribute in order for i–>j to be counted. The binary version of this term can only be used with directed networks. The valued version can be used with both directed and undirected.
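The relation to ttriple can be seen in a small hand computation for the binary directed case without attrname. The network below is hypothetical and the code is a base R sketch, not the package's implementation.

```r
# Hypothetical directed network: 1->2, 2->3, 1->3, 1->5, 5->3.
A <- matrix(0, 5, 5)
A[rbind(c(1, 2), c(2, 3), c(1, 3), c(1, 5), c(5, 3))] <- 1

two_paths <- A %*% A                  # two_paths[i, j] = number of two-paths i -> k -> j

# A tie i -> j counts once if at least one two-path from i to j exists.
transitive_ties <- sum(A == 1 & two_paths > 0)

# A transitive-triple count, by contrast, counts every such two-path.
ttriples <- sum(A * two_paths)

print(c(transitive_ties, ttriples))
```

The tie 1->3 is supported by two two-paths (through nodes 2 and 5), so it contributes one transitive tie but two transitive triples.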
transitiveweights(twopath=“min”,combine=“max”,affect=“min”)
(valued) (directed) (undirected) (non-negative) (triad-related)
Transitive weights: This statistic implements the transitive weights statistic defined by Krivitsky (2012), Equation 13. The currently implemented options for twopath
are the minimum of the constituent dyads (“min”
) or their geometric mean (“geomean”
); for combine
, the maximum of the 2-path strengths (“max”
) or their sum (“sum”
); and for affect
, the minimum of the focus dyad and the combined strength of the two paths (“min”
) or their geometric mean (“geomean”
). For each of these options, the first (and the default) is more stable but also more conservative, while the second is more sensitive but more likely to induce a multimodal distribution of networks.
triadcensus(d)
(binary) (triad-related) (directed) (undirected)
Triad census: For a directed network, this term adds one network statistic for each of an arbitrary subset of the 16 possible types of triads categorized by Davis and Leinhardt (1972) as 003, 012, 102, 021D, 021U, 021C, 111D, 111U, 030T, 030C, 201, 120D, 120U, 120C, 210,
and 300
. Note that at least one category should be dropped; otherwise a linear dependency will exist among the 16 statistics, since they must sum to the total number of three-node sets. By default, the category 003
, which is the category of completely empty three-node sets, is dropped. This is considered category zero, and the others are numbered 1 through 15 in the order given above. By specifying a numeric vector of integers from 0 to 15 as the d
argument, the user may specify a set of terms to add other than the default value of 1:15
. Each statistic is the count of the corresponding triad type in the network. For details on the 16 types, see ?triad.classify
in the sna
package, on which this code is based. For an undirected network, the triad census is over the four types defined by the number of ties (i.e., 0, 1, 2, and 3), and the default is to add 1:3
, which is to say that the 0 is dropped; however, this too may be controlled by changing the d
argument to a numeric vector giving a subset of {0, 1, 2, 3}.
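For an undirected network, the four-category census can be sketched in base R by classifying every three-node set by its number of ties. The network below is hypothetical and the code is an illustration only.

```r
# Hypothetical undirected network: triangle 1-2-3 plus the edge 3-4.
A <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4))
A[edges] <- 1
A[edges[, 2:1]] <- 1                  # symmetrize

triples <- combn(nrow(A), 3)          # each column is one three-node set
# Ties inside a triple: half the sum of the symmetric 3x3 submatrix.
ties <- apply(triples, 2, function(v) sum(A[v, v]) / 2)
census <- table(factor(ties, levels = 0:3))

d <- 1:3                              # default for undirected networks: drop type 0
stats <- as.integer(census[as.character(d)])
print(stats)
```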
triangle(attrname=NULL)
(binary) (frequently-used) (triad-related) (directed) (undirected) (categorical nodal attribute)
Triangles: This term adds one statistic to the model equal to the number of triangles in the network. For an undirected network, a triangle is defined to be any set {(i,j), (j,k), (k,i)} of three edges. For a directed network, a triangle is defined as any set of three edges (i,j) and (j,k) and either (k,i) or (i,k). The former case is called a “cyclic triple” and the latter is called a “transitive triple”, so in the case of a directed network, triangle
equals ttriple
plus ctriple
— thus at most two of these three terms can be in a model. The optional argument attrname
restricts the count to those triples of nodes with equal values of the vertex attribute specified by attrname
.
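The decomposition of directed triangles into transitive and cyclic triples can be sketched by hand in base R, ignoring attrname. The 3-node network below is hypothetical, and the matrix formulas are a hand computation rather than the package's implementation.

```r
# Hypothetical directed network: the cycle 1->2->3->1 plus the edge 1->3.
A <- matrix(0, 3, 3)
A[rbind(c(1, 2), c(2, 3), c(3, 1), c(1, 3))] <- 1

A2 <- A %*% A
# Transitive triples: edges (i,j), (j,k) closed by (i,k).
ttriple <- sum(A * A2)
# Cyclic triples: edges (i,j), (j,k) closed by (k,i); each 3-cycle
# appears three times on the diagonal of A^3.
ctriple <- sum(diag(A2 %*% A)) / 3

triangle <- ttriple + ctriple
print(c(ttriple, ctriple, triangle))
```

Because the three counts are linearly dependent in this way, a model can include at most two of them.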
tripercent(attrname=NULL)
(binary) (undirected) (triad-related) (categorical nodal attribute)
Triangle percentage: This term adds one statistic to the model equal to 100 times the ratio of the number of triangles in the network to the sum of the number of triangles and the number of 2-stars not in triangles (the latter is considered a potential but incomplete triangle). In case the denominator equals zero, the statistic is defined to be zero. For the definition of triangle, see triangle
. The optional argument attrname
restricts the counts (both numerator and denominator) to those triples of nodes with equal values of the vertex attribute specified by attrname
. This is often called the mean correlation coefficient. This term can only be used with undirected networks; for directed networks, it is difficult to define the numerator and denominator in a consistent and meaningful way.
ttriple(attrname=NULL)
(binary) (directed) (triad-related) (categorical nodal attribute), a.k.a. ttriad
(binary) (directed) (triad-related) (categorical nodal attribute)
Transitive triples: This term adds one statistic to the model, equal to the number of transitive triples in the network, defined as a set of edges {(i,j), (j,k), (i,k)}. Note that triangle
equals ttriple+ctriple
for a directed network, so at most two of the three terms can be in a model. The optional argument attrname
is a character string giving the name of an attribute in the network’s vertex attribute list. If this is specified then the count is over the number of transitive triples where all three nodes have the same value of the attribute. This term can only be used with directed networks.
twopath
(binary) (directed) (undirected)
2-Paths: This term adds one statistic to the model, equal to the number of 2-paths in the network. For a directed network this is defined as a pair of edges (i,j), (j,k), where i and j must be distinct. That is, it is a directed path of length 2 from i to k via j. For directed networks a 2-path is also a mixed 2-star but the interpretation is usually different; see m2star
. For undirected networks a twopath is defined as a pair of edges {i,j}, {j,k}. That is, it is an undirected path of length 2 from i to k via j, also known as a 2-star.