Motivation: Network analysis is an ubiquitous tool in systems research. Abstracting complicated social, economical, biological systems in the form of networks help us to obtain important inferences about the structure and the underlying processes related to the system. Several critical processes have been modelled by thorough investigation, such as growth, densification, spreading etc. However recently it has come to our notice that due to shortcomings, owing to data collection strategies which may be caused by many unforeseen circumstances, the graph we end up getting may not be an actual representation of the targeted system. Applying existing state of art network models on this noisy network may give us erroneous results. Besides many of this strategies has been built keeping in mind static networks, which may not be a valid assumption in the case of modern real world networks which continuously change and evolve with time
Objectives: Our objective is to develop statistical techniques to model noise, which may be biased i.e., conditioned on some node properties or independent of them. We are also actively involved in defining general set of properties of stable network. Understanding these properties is valuable because it gives us template regarding how to build a robust system which is resilient to attacks. Network vulnerability is a central issue for cybersecurity, thus, knowing how susceptible a network is to noise and how easy it is to destroy it will shed light onto how to keep the network secure against different forms of attack. Understanding how random/targeted noise/attacks on the network change the parameters of the network, will make it clearer how to defend the network and make sure it remains operational.
Understanding Sensitivity and Reliability of Network Metrics
Abstract : Network analysis is an important tool in understanding the behavior of complex systems of interacting entities. However, due to the limitations of data gathering technologies, some interactions might be missing from the network model. This is a ubiquitous problem in all domains that use network analysis, from social networks to hyper-linked web networks to biological networks. Consequently, an important question in analyzing networks is to understand how increasing the noise level (i.e. percentage of missing edges) affects different network parameters. In this work we evaluate the effect of noise on community scoring and centrality-based parameters with respect to two different aspects of network analysis: (i) sensitivity, that is how the parameter value changes as edges are removed and (ii) reliability in the context of message spreading, that is how the time taken to broadcast a message changes as edges are removed. Our experiments on synthetic and real-world networks and three different noise models demonstrate that for both the aspects over all networks and all noise models, permanence qualifies as the most effective metric. For the sensitivity experiments closeness centrality is a close second. For the message spreading experiments, closeness and betweenness centrality based initiator selection closely competes with permanence. This is because permanence has a dual characteristic where the cumulative permanence over all vertices is sensitive to noise but the ids of the top-rank vertices, which are used to find seeds during message spreading remain relatively stable under noise.
- Sarkar, Soumya, Suhansanu Kumar, Sanjukta Bhowmick, and Animesh Mukherjee. “Sensitivity and reliability in incomplete networks: Centrality metrics to community scoring functions.” In Advances in Social Networks Analysis and Mining (ASONAM), 2016 IEEE/ACM International Conference on, pp. 69-72. IEEE, 2016.
- Soumya Sarkar, Sanjukta Bhowmick, Suhansanu Kumar, Animesh Mukherjee. Centrality and Community Scoring Functions in Incomplete Networks: Their Sensitivity, Robustness and Reliability accepted in Lecture Notes in Social Networks (LNSN 2017) an edited book series by Springer (to appear)
Understanding Stability of Noisy Networks
Abstract: Networks created from real-world data contain some inaccuracies or noise, manifested as small changes in the network structure. An important question is whether these small changes can significantly affect the analysis results. In this paper, we study the effect of noise in changing ranks of the high centrality vertices. We compare, using the Jaccard Index (JI), how many of the top-k high centrality nodes from the original network are also part of the top-k ranked nodes from the noisy network. We deem a network as stable if the JI value is high. We observe two features that affect the stability. First, the stability is dependent on the number of top-ranked vertices considered. When the vertices are ordered according to their centrality values, they group into clusters. Perturbations to the network can change the relative ranking within the cluster, but vertices rarely move from one cluster to another. Second, the stability is dependent on the local connections of the high ranking vertices. The network is highly stable if the high ranking vertices are connected to each other. Our findings show that the stability of a network is affected by the local properties of high centrality vertices, rather than the global properties of the entire network. Based on these local properties we can identify the stability of a network, without explicitly applying a noise model.
- Ufimtsev, Vladimir, Soumya Sarkar, Animesh Mukherjee, and Sanjukta Bhowmick. “Understanding stability of noisy networks through centrality measures and local connections.” In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pp. 2347-2352. ACM, 2016.
Members and Collaborators
- Soumya Sarkar, Dept. of CSE, IIT Kharagpur, India (https://sites.google.com/site/soumyasarkar881980/)
- Animesh Mukherjee, Dept. of CSE, IIT Kharagpur, India (http://cse.iitkgp.ac.in/~animeshm/)
- Sanjukta Bhowmick ,University of Nebraska Ohama (http://faculty.ist.unomaha.edu/sbhowmick/Site/index.html)