Omission-to-Modification Error Ratio
- Omission-to-Modification Error Ratio is a quantitative measure comparing omitted true ties (false negatives) with incorrectly reported ties (false positives) in network data.
- It is employed in ROC-based aggregation methods where adjusting thresholds and weighting factors balances omission and modification errors for optimal network recovery.
- Empirical studies in CSS networks show O:M ratios ranging from 4 to 20, highlighting the trade-off between liberal and conservative reporting in sparse social structures.
The Omission-to-Modification error ratio (O:M ratio) arises in the context of cognitive social structure (CSS) network studies, where individuals report not only their own direct social ties but also their perceptions of ties among all others in a bounded network. The O:M ratio serves as a quantitative metric that captures the trade-off between omission errors (false negatives, where actual ties are missed) and modification errors, also called commission errors (false positives, where non-ties are incorrectly reported as present). Used in conjunction with ROC-curve based aggregation methods, the O:M ratio enables explicit calibration of network estimates by controlling the relative emphasis placed on the minimization of each error type (Yenigun et al., 2016).
1. Formal Definition
The true relational structure is represented by a directed adjacency matrix $A$ of size $N \times N$, where $A_{ij} = 1$ if a true tie exists from $i$ to $j$ and $0$ otherwise. The estimated or perceived structure, denoted $\hat{A}$, may be inferred via aggregation or may refer to a single respondent's report. Using the indicator $\mathbb{1}\{\cdot\}$, which is $1$ if its argument is true and $0$ otherwise, the error rates are defined as:
- Omission error rate (false negatives):
$$E_O = \frac{\sum_{i \neq j} \mathbb{1}\{A_{ij} = 1,\ \hat{A}_{ij} = 0\}}{\sum_{i \neq j} \mathbb{1}\{A_{ij} = 1\}}$$
This quantifies the proportion of true ties missed in estimation.
- Modification (commission) error rate (false positives):
$$E_M = \frac{\sum_{i \neq j} \mathbb{1}\{A_{ij} = 0,\ \hat{A}_{ij} = 1\}}{\sum_{i \neq j} \mathbb{1}\{A_{ij} = 0\}}$$
This reflects the proportion of non-ties incorrectly categorized as ties.
- O:M error ratio:
$$\mathrm{O{:}M} = \frac{E_O}{E_M}$$
This ratio encapsulates the relative prevalence of omission versus modification errors in any given network estimation scenario.
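The definitions above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the cited study: the function name `error_rates`, the exclusion of self-ties, and the convention of returning infinity when the modification rate is zero are all assumptions made here.

```python
def error_rates(true_adj, est_adj):
    """Return (omission rate, modification rate, O:M ratio) for two 0/1 matrices."""
    n = len(true_adj)
    missed = false_pos = ties = non_ties = 0
    for i in range(n):
        for j in range(n):
            if i == j:  # assumption: self-ties are not counted
                continue
            if true_adj[i][j] == 1:
                ties += 1
                missed += est_adj[i][j] == 0      # true tie reported as absent
            else:
                non_ties += 1
                false_pos += est_adj[i][j] == 1   # non-tie reported as present
    if ties == 0 or non_ties == 0:
        raise ValueError("need at least one true tie and one true non-tie")
    e_o = missed / ties
    e_m = false_pos / non_ties
    # The ratio is undefined when e_m is 0; infinity is used here as a convention.
    om = e_o / e_m if e_m > 0 else float("inf")
    return e_o, e_m, om


# Hypothetical 3-actor example: 3 true ties, 3 true non-ties.
true_adj = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
est_adj = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
e_o, e_m, om = error_rates(true_adj, est_adj)
print(e_o, e_m, om)
```

In this toy case two of the three true ties are missed and one of the three non-ties is reported, so omissions are twice as prevalent as modifications.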
2. Role in ROC–Based Aggregation and Weighting
In ROC-curve based CSS aggregation, a threshold parameter $k$ represents the minimum number of respondents required to agree on a tie for it to be considered present. For each $k$, the omission rate $E_O(k)$ and the modification rate $E_M(k)$ are calculated, generating an empirical ROC curve that plots the true positive rate ($\mathrm{TPR} = 1 - E_O$) against the false positive rate ($\mathrm{FPR} = E_M$). The classical ROC criterion minimizes the Euclidean distance to the ideal $(0, 1)$ point, corresponding to
$$\delta(k) = \sqrt{E_M(k)^2 + E_O(k)^2},$$
which treats omission and commission errors equally (equal weights).
The introduction of an explicit weighting factor $w$ on the modification rate $E_M(k)$, relative to the omission rate $E_O(k)$, generalizes this to
$$\delta_w(k) = \sqrt{w\,E_M(k)^2 + E_O(k)^2},$$
where $w$ corresponds to the desired O:M emphasis. Selecting $w > 1$ up-weights the cost of modification/commission errors, leading the optimization to favor thresholds that reduce false positives, often at the expense of increased omissions. The procedure thus allows the analyst to choose or justify a particular O:M trade-off directly in accordance with substantive considerations or network sparsity (Yenigun et al., 2016).
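A short sketch of threshold selection under a weighted distance follows. It assumes the weight multiplies the squared modification rate, i.e. $\delta_w = \sqrt{w E_M^2 + E_O^2}$; the function names are hypothetical. The two rate pairs are taken from the "High Tech Managers" table in this article, reading its columns as modification rate and omission rate (the reading consistent with the tabulated O:M values). Only thresholds 1 and 4 are tabulated, so the comparison is restricted to those two.

```python
import math


def weighted_distance(e_o, e_m, w=1.0):
    """Distance to the ideal ROC point, with weight w on the modification rate (assumed form)."""
    return math.sqrt(w * e_m ** 2 + e_o ** 2)


def best_threshold(rates_by_k, w=1.0):
    """rates_by_k maps threshold k -> (omission rate, modification rate)."""
    return min(rates_by_k, key=lambda k: weighted_distance(*rates_by_k[k], w))


# k -> (E_O, E_M), from the two tabulated thresholds
rates = {1: (0.083, 0.295), 4: (0.667, 0.034)}

print(best_threshold(rates, w=1.0))   # -> 1 (equal weighting favors the liberal threshold)
print(best_threshold(rates, w=10.0))  # -> 4 (a heavy penalty on false positives favors the strict one)
```

The value $w = 10$ is an arbitrary illustration; between these two thresholds the choice flips once $w$ exceeds roughly 5.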
3. Conceptual and Empirical Patterns in CSS Work
CSS studies characteristically involve networks with low density and a large imbalance between the number of ties and non-ties. Empirically, omission error rates ($E_O$) are observed between 0.54 and 0.72, while modification error rates ($E_M$) range from 0.03 to 0.14, yielding raw O:M ratios in the 4 to 20 range across studied datasets.
There is a strong negative correlation between $E_O$ and $E_M$ at the respondent level, indicating a trade-off: respondents who are "liberal" in their perceptions tend toward low omission but high commission errors, while "conservative" respondents exhibit the reverse pattern. This supports the methodological need for explicit O:M balancing in aggregation (Yenigun et al., 2016).
4. O:M Ratio in Network Estimation Algorithm
The ROC-based threshold method for network aggregation (RTM) uses the O:M ratio via the weighting parameter $w$ to make the error trade-off both explicit and data-driven:
- Randomly sample $n$ CSS slices; compute their average density $\bar{d}$.
- Set $w = 1/\bar{d}$ (or another value according to analytic priorities).
- For each candidate threshold $k$, calculate the omission rate $E_O(k)$, the modification rate $E_M(k)$, and the weighted ROC distance $\delta_w(k)$.
- Select $k^* = \arg\min_k \delta_w(k)$, and aggregate the network using this threshold.
In sparse networks, the recommended weight $w = 1/\bar{d}$ exceeds 1 and so amplifies the importance of reducing false positives, which are disproportionately likely amidst a majority of non-ties.
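The steps above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: it assumes a `reference` network is available for scoring each candidate threshold, that the weight is the inverse of the average slice density, and that the weighted distance has the form $\sqrt{w E_M^2 + E_O^2}$. All function names are hypothetical.

```python
import math
import random


def density(adj):
    """Share of ordered pairs (i, j), i != j, reported as tied."""
    n = len(adj)
    return sum(adj[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))


def aggregate(slices, k):
    """Keep tie i -> j when at least k of the slices report it."""
    n = len(slices[0])
    return [[int(i != j and sum(s[i][j] for s in slices) >= k) for j in range(n)]
            for i in range(n)]


def error_rates(reference, estimate):
    """Return (omission rate, modification rate) of estimate against reference."""
    n = len(reference)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    ties = [p for p in pairs if reference[p[0]][p[1]] == 1]
    non_ties = [p for p in pairs if reference[p[0]][p[1]] == 0]
    e_o = sum(estimate[i][j] == 0 for i, j in ties) / len(ties)
    e_m = sum(estimate[i][j] == 1 for i, j in non_ties) / len(non_ties)
    return e_o, e_m


def rtm(css_slices, reference, n_sample, seed=0):
    """Return (k_star, aggregated network, w) for a random sample of slices."""
    rng = random.Random(seed)
    sample = rng.sample(css_slices, n_sample)
    d_bar = sum(density(s) for s in sample) / n_sample
    w = 1.0 / d_bar  # assumed weight: inverse of the average slice density
    best = None
    for k in range(1, n_sample + 1):
        estimate = aggregate(sample, k)
        e_o, e_m = error_rates(reference, estimate)
        dist = math.sqrt(w * e_m ** 2 + e_o ** 2)  # assumed weighted distance
        if best is None or dist < best[0]:
            best = (dist, k, estimate)
    return best[1], best[2], w


# Hypothetical 3-actor data: one accurate, one liberal, one conservative slice.
reference = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
slices = [
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],
    [[0, 1, 1], [0, 0, 1], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 0], [0, 0, 0]],
]
k_star, network, w = rtm(slices, reference, n_sample=3)
print(k_star, w, network)
```

With these toy slices the majority threshold recovers the reference network exactly. How the error rates are scored when the true network is unknown is outside this sketch.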
5. Numerical and Simulation Results
Across five canonical CSS datasets, observed mean omission rates $E_O$ are 0.54–0.72 and modification rates $E_M$ are 0.03–0.14, with O:M ratios generally between 4 and 20. An illustrative example ("High Tech Managers") demonstrates how varying the threshold $k$ affects the O:M ratio:
| Threshold $k$ | $E_M$ | $E_O$ | O:M |
|---|---|---|---|
| 1 | 0.295 | 0.083 | 0.28 |
| 4 | 0.034 | 0.667 | 19.6 |
Correlation with the true network structure is higher under the $w$-weighted ROC choice than under the unweighted choice ($w = 1$). Large-scale simulations confirm that using $w = 1/\bar{d}$ in the ROC minimization yields robust recovery, performing comparably to or better than adaptive thresholding, which constrains only commission errors.
6. Guidelines and Applied Selection
Choice of $w$ should be context-driven:
- If false positives (commission errors) are much costlier, set $w > 1$ to push the modification rate $E_M$ lower.
- If false negatives (omission errors) are to be avoided, set $w < 1$.
- If no clear preference exists, use $w = 1$ (classical ROC methodology).
The recommended approach is $w = 1/\bar{d}$, the inverse of the average slice density, in sparse organizational networks, balancing the error rates for superior network recovery (Yenigun et al., 2016).
7. Worked Example
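Consider a toy network with 4 true ties and 6 true non-ties. Let $E_M$ and $E_O$ be the false positive and false negative rates, $\delta = \sqrt{E_M^2 + E_O^2}$, and $\delta_w = \sqrt{w\,E_M^2 + E_O^2}$. The observed counts at each threshold are as follows:
| $k$ (threshold) | False Positives | False Negatives | $E_M$ | $E_O$ | $\delta$ | $\delta_w$ (for $w = 5$) |
|---|---|---|---|---|---|---|
| 1 | 2 | 1 | 0.333 | 0.25 | 0.417 | 0.786 |
| 2 | 1 | 2 | 0.167 | 0.5 | 0.527 | 0.624 |
| 3 | 0 | 3 | 0 | 0.75 | 0.75 | 0.75 |
With equal weighting ($w = 1$), $k = 1$ minimizes $\delta$. With $w = 5$, corresponding to higher stringency, $k = 2$ minimizes $\delta_w$: the heavier penalty on false positives moves the choice to a stricter threshold. This formalism directly operationalizes the trade-off between omission and commission errors within ROC-guided network estimation (Yenigun et al., 2016).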
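The distances in a worked example like this one can be recomputed from the raw counts. The sketch below assumes 4 true ties and 6 true non-ties (the denominators implied by the tabulated rates) and the weighted form $\delta_w = \sqrt{w E_M^2 + E_O^2}$; both are assumptions of this illustration, and the names are hypothetical.

```python
import math

N_TIES, N_NON_TIES = 4, 6  # assumed denominators, implied by the tabulated rates
COUNTS = {1: (2, 1), 2: (1, 2), 3: (0, 3)}  # k -> (false positives, false negatives)


def distances(w):
    """Map each threshold k to its weighted distance for weight w."""
    out = {}
    for k, (fp, fn) in COUNTS.items():
        e_m = fp / N_NON_TIES  # modification (false positive) rate
        e_o = fn / N_TIES      # omission (false negative) rate
        out[k] = math.sqrt(w * e_m ** 2 + e_o ** 2)
    return out


for w in (1, 5):
    d = distances(w)
    k_star = min(d, key=d.get)  # threshold with the smallest weighted distance
    print(w, {k: round(v, 3) for k, v in d.items()}, k_star)
```

Changing `w` in the loop shows how quickly the selected threshold moves as false positives are penalized more heavily.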