This document provides an introduction to social network analysis and centrality measures in social networks. It defines social networks as graphs with nodes representing individuals and edges representing social connections between them. It describes several centrality measures used in social network analysis including degree centrality, betweenness centrality, closeness centrality, eigen centrality, and PageRank centrality. These measures are used to quantify how important and influential individual nodes are within the overall network structure.
Lecture 12
Introduction to Data Science
Dr. Irfan Yousuf
Department of Computer Science (New Campus)
UET, Lahore
(Lecture # 12; February 24, 2023)

Outline
• Online Networks as Graphs
• Social Networks Analysis

Graphs

Networks
• Network Analysis (NA) is a set of integrated techniques to depict relations among actors and to analyze the social structures that emerge from the recurrence of these relations.
• Network Science is an academic field which studies complex networks such as telecommunication networks, computer networks, biological networks, cognitive and semantic networks, and social networks.

Network as a Graph
• We can represent any network in the form of a graph.
• Nodes become the users and an edge between two nodes shows their relationship.

Social Networks as Graphs
• A social network graph is a graph where the nodes represent people and the lines between nodes, called edges, represent social connections between them, such as friendship or working together on a project.

Social Network Analysis
• Social network analysis (SNA) is the process of investigating social structures through the use of networks and graph theory.
• It characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties, edges, or links (relationships or interactions) that connect them.

Centrality Measures in Social Network Analysis
• Centrality Measures: Centrality is a collection of metrics used to quantify how important and influential a specific node is to the network as a whole.
• It is important to remember that centrality measures are used on specific nodes within the network, and do not provide information on a network level.
• These algorithms / measures use graph theory to calculate the importance of any given node in a network.
• Each measure has its own definition of ‘importance’, so you need to understand how they work to find the best one for your needs.
• Degree Centrality
• Betweenness Centrality
• Closeness Centrality
• Eigen Centrality
• PageRank Centrality

Degree Centrality
• Degree centrality assigns an importance score based simply on the number of links held by each node.
• What it tells us: How many direct, ‘one hop’ connections each node has to other nodes in the network.
• When to use it: For finding very connected individuals, popular individuals, individuals who are likely to hold most information or individuals who can quickly connect with the wider network.

Degree Centrality: Undirected Graphs
The degree centrality of a vertex v, for a given graph G := (V, E) with |V| vertices and |E| edges, is defined as
C_D(v) = deg(v)
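The definition C_D(v) = deg(v) can be tried out with a minimal sketch like the one below. The edge list and the helper names (`build_adjacency`, `degree_centrality`) are assumptions made for illustration; they are not taken from the lecture.

```python
# Hypothetical undirected friendship network, stored as an edge list.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]


def build_adjacency(edge_list):
    """Return {node: set of neighbours} for an undirected graph."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_centrality(adjacency):
    """C_D(v) = deg(v): the number of direct neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


adjacency = build_adjacency(edges)
degree = degree_centrality(adjacency)
most_connected = max(degree, key=degree.get)

print(degree)          # {'A': 3, 'B': 2, 'C': 2, 'D': 2, 'E': 1}
print(most_connected)  # A
```

Storing the graph as a dictionary of neighbour sets keeps the degree lookup trivial: the score is just the size of each set.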
Degree Centrality: Directed Graphs
• A node with a higher out-degree is more central (choices made).
• A node with a higher in-degree is more prestigious (choices received).

Betweenness Centrality
• Betweenness centrality measures the number of times a node lies on the shortest path between other nodes.
• What it tells us: This measure shows which nodes are ‘bridges’ between nodes in a network. It does this by identifying all the shortest paths and then counting how many times each node falls on one.
• When to use it: For finding the individuals who influence the flow around a system.
Find the betweenness centrality of node 2.
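The counting procedure described for betweenness can be sketched as follows. The five-node graph is a hypothetical stand-in, not the graph from the lecture's figure, and the function names are illustrative. When a pair of nodes has several shortest paths, this sketch assumes the common convention of giving each intermediate node the fraction of those paths that pass through it.

```python
from collections import deque

# Hypothetical undirected graph (NOT the lecture's figure).
adjacency = {
    1: {2},
    2: {1, 3, 4},
    3: {2, 5},
    4: {2, 5},
    5: {3, 4},
}


def bfs_counts(adjacency, source):
    """Return (dist, sigma): hop distances and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in dist:              # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:     # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness_centrality(adjacency):
    """Sum, over unordered pairs (s, t), the fraction of shortest s-t paths through v."""
    nodes = sorted(adjacency)
    info = {s: bfs_counts(adjacency, s) for s in nodes}
    score = {v: 0.0 for v in nodes}
    for i, s in enumerate(nodes):
        dist_s, sigma_s = info[s]
        for t in nodes[i + 1:]:
            if t not in dist_s:            # s and t are not connected
                continue
            dist_t, sigma_t = info[t]
            for v in nodes:
                if v in (s, t) or v not in dist_s or v not in dist_t:
                    continue
                # v lies on a shortest s-t path iff the distances add up.
                if dist_s[v] + dist_t[v] == dist_s[t]:
                    score[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return score


betweenness = betweenness_centrality(adjacency)
print(betweenness)  # {1: 0.0, 2: 3.5, 3: 1.0, 4: 1.0, 5: 0.5}
```

In this example node 2 scores highest because every path from node 1 to the rest of the graph has to pass through it.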
Closeness Centrality
• Closeness centrality scores each node based on its ‘closeness’ to all other nodes in the network.
• What it tells us: This measure calculates the shortest paths between all nodes, then assigns each node a score based on the sum of its shortest-path distances to all other nodes.
• When to use it: For finding the individuals who are best placed to influence the entire network most quickly.
• Closeness centrality measures how short the shortest paths are from node x to all other nodes.
• It is usually expressed as the normalised inverse of the sum of the topological distances in the graph.

Eigen Centrality
• Eigen Centrality (eigenvector centrality) measures a node’s influence based on the number of links it has to other nodes in the network. It also takes into account how well connected a node is, and how many links its connections have, and so on through the network.
• What it tells us: By calculating the extended connections of a node, Eigen Centrality can identify nodes with influence over the whole network, not just those directly connected to it.
• When to use it: Eigen Centrality is a good ‘all-round’ SNA score, handy for understanding human social networks, but also for understanding networks like malware propagation.
• Normalized Value = sqrt(2^2 + 2^2 + 1^2 + 3^2 + 1^2 + 1^2) = sqrt(20) ≈ 4.47
• As a rule of thumb, the number of iterations needed for the normalized eigenvector to converge is expected to be less than or equal to the number of vertices in the graph, although the actual count depends on the graph.

PageRank Centrality
• PageRank centrality is a variant of Eigen Centrality designed for ranking web content, using hyperlinks between pages as a measure of importance. It can be used for any kind of network, though.
• What it tells us: PageRank’s main difference from Eigen Centrality is that it accounts for link direction. Nodes with many incoming links are influential, and the nodes they link to share some of that influence.
• When to use it: When the direction of links matters, for example when ranking pages by the hyperlinks that point to them.
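Returning to the closeness and Eigen Centrality slides above, the sketch below shows one way both scores could be computed. It makes several assumptions:

- The graph is a hypothetical connected, undirected example.
- Closeness is taken as (n - 1) divided by the sum of hop distances.
- Eigen Centrality is approximated by power iteration, dividing the scores by their Euclidean norm after every step, in the same spirit as the "Normalized Value" calculation.
- The function names and the `max_iter` and `tol` parameters are illustrative.

```python
from collections import deque
from math import sqrt

# Hypothetical connected undirected graph: a triangle A-B-C with a tail C-D-E.
adjacency = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}


def closeness_centrality(adjacency):
    """(n - 1) / (sum of hop distances to all other nodes); assumes a connected graph."""
    n = len(adjacency)
    scores = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:                       # breadth-first search from source
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (n - 1) / total if total else 0.0
    return scores


def eigenvector_centrality(adjacency, max_iter=1000, tol=1e-9):
    """Power iteration: sum the neighbours' scores, then divide by the Euclidean norm."""
    scores = {v: 1.0 for v in adjacency}
    for _ in range(max_iter):
        new = {v: sum(scores[w] for w in adjacency[v]) for v in adjacency}
        norm = sqrt(sum(x * x for x in new.values()))
        new = {v: x / norm for v, x in new.items()}
        if max(abs(new[v] - scores[v]) for v in adjacency) < tol:
            return new
        scores = new
    return scores


closeness = closeness_centrality(adjacency)
eigen = eigenvector_centrality(adjacency)

print(closeness["C"])                # 0.8
print(max(eigen, key=eigen.get))     # C
```

Plain power iteration like this can oscillate instead of converging on bipartite graphs, which is one reason the example graph contains a triangle.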
PageRank Centrality
• The score for a given vertex may be thought of as the fraction of time spent 'visiting' that vertex (measured over all time) in a random walk over the vertices.
• PageRank modifies this random walk by adding to the model a probability (specified as 'alpha' in the constructor) of jumping to any vertex. Here alpha is the jump probability, which is the complement of the damping factor used in some other formulations. If alpha is 0, this is equivalent to the eigenvector centrality algorithm; if alpha is 1, all vertices will receive the same score (1/|V|). Thus, alpha acts as a sort of score smoothing parameter.

Centrality Measures Social Network Analysis
• Find the most central node in this graph using all the centrality measures.

Summary
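As a recap of the PageRank random walk described earlier, here is a minimal sketch in which `alpha` is the probability of jumping to a uniformly random vertex, following the lecture's convention. The directed graph, the function name, the default `alpha=0.15`, and the handling of nodes without outgoing links are all assumptions made for illustration.

```python
# Hypothetical directed graph: each key links to the nodes in its list.
out_links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}


def pagerank(out_links, alpha=0.15, max_iter=1000, tol=1e-12):
    """Random-walk PageRank; alpha is the probability of jumping to a random vertex."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Every vertex receives the same share from random jumps.
        new = {v: alpha / n for v in nodes}
        for u in nodes:
            targets = out_links[u]
            if targets:
                # Otherwise the walker follows one outgoing link at random.
                share = (1.0 - alpha) * rank[u] / len(targets)
                for w in targets:
                    new[w] += share
            else:
                # Assumption: a node with no out-links spreads its score evenly.
                for w in nodes:
                    new[w] += (1.0 - alpha) * rank[u] / n
        if sum(abs(new[v] - rank[v]) for v in nodes) < tol:
            return new
        rank = new
    return rank


ranks = pagerank(out_links)
uniform = pagerank(out_links, alpha=1.0)

print(max(ranks, key=ranks.get))  # C
print(uniform)                    # {'A': 0.25, 'B': 0.25, 'C': 0.25, 'D': 0.25}
```

Node C ranks highest because three of the four nodes link to it, while setting `alpha` to 1 reproduces the uniform score 1/|V| mentioned above.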