Scalable Holistic Analysis of Multi-Source, Data-Intensive Problems Using Multilayered Networks (1611.01546v1)
Abstract: Holistic analysis of many real-world problems is based on data collected from multiple sources, each contributing to some aspect of the problem. The term fusion has also been used in the literature for such problems involving disparate data types. Holistically understanding traffic patterns, causes of accidents, bombings, terrorist planning, and many natural phenomena such as storms and earthquakes falls into this category. Some problems may have real-time requirements, and some may need to be analyzed after the fact (post-mortem or forensic analysis). What these problems have in common is the amount and variety of data associated with the event. Data may also be incomplete, and the trustworthiness of sources may vary. Currently, manual and ad hoc approaches are used to aggregate data in different ways for analyzing and understanding these problems. In this paper, we approach this problem in a novel way using multilayered networks. We identify features of a central event and propose a network layer for each feature. This approach allows us to study each feature independently and its impact on the event. We also establish that the proposed approach allows us to compose these features in arbitrary ways (without loss of information) to analyze their combined effect. Additionally, formulating relationships is simpler (e.g., a distance measure for a single feature instead of several at the same time). Further, computations can be done once on each layer and reused when mixing and matching features to study aggregate impacts and "what if" scenarios, so that the problem can be understood holistically. We demonstrate this by recreating the communities of the AND-composed network from the communities of the individual layers. We believe that the techniques proposed here make an important contribution to the nascent yet fast-growing area of data fusion.
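
To make the layer-per-feature idea concrete, here is a minimal Python sketch under stated assumptions. It assumes every layer shares one node set (the events), that each layer is an undirected edge set for a single feature, and that AND-composition keeps only the edges present in every layer. The abstract does not specify the community detection method or the reconstruction procedure, so connected components are used purely as a stand-in for communities. The layer names (`weather`, `road`), the node labels, and all function names are hypothetical.

```python
def normalize(edges):
    """Store undirected edges as frozensets so (a, b) equals (b, a)."""
    return {frozenset(e) for e in edges}


def and_compose(*layers):
    """Keep only edges present in every layer (assumed meaning of AND-composition)."""
    return set.intersection(*layers)


def components(nodes, edges):
    """Connected components via union-find; a stand-in for community detection."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for edge in edges:
        a, b = tuple(edge)  # assumes no self-loops
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return {frozenset(g) for g in groups.values()}


def intersect_partitions(p1, p2):
    """Pairwise non-empty intersections of the groups of two partitions."""
    return {a & b for a in p1 for b in p2 if a & b}


# Hypothetical data: five accidents, one layer per feature.
nodes = {"a1", "a2", "a3", "a4", "a5"}
weather = normalize([("a1", "a2"), ("a2", "a3"), ("a4", "a5")])
road = normalize([("a1", "a2"), ("a3", "a4"), ("a4", "a5")])

# Per-layer results are computed once and can be reused for any composition.
weather_groups = components(nodes, weather)
road_groups = components(nodes, road)

# Candidate groups derived from the layers alone, without building the composed graph.
candidates = intersect_partitions(weather_groups, road_groups)

# Reference result computed directly on the AND-composed network.
composed_groups = components(nodes, and_compose(weather, road))

print(sorted(sorted(g) for g in candidates))
print(sorted(sorted(g) for g in composed_groups))
```

In this toy example the two results coincide. For connected components in general, the intersected groups only bound the composed groups from above: two nodes joined in the AND-composed network are joined in every layer, so each composed group lies inside one intersected group, but an intersected group may still split further.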