Method	DeepWalk	node2vec
Estimator	hierarchical softmax	negative sampling
The estimate	unbiased	biased (cancels the degree bias)

Thank you!

Backward contact tracing · Nature Physics 2021

Sadamori Kojaku

Laurent Hébert-Dufresne

Enys Mones

Sune Lehmann

Implicit degree bias in link prediction · ICML 2025

Rachith Aiyappa

Xin Wang

Munjung Kim

Ozgur Can Seckin

Sadamori Kojaku

Embedding scientific migration · PNAS 2023

Dakota Murray

Jisung Yoon

Sadamori Kojaku

Rodrigo Costas

Woosung Jung

Staša Milojević

Mobility & invisible borders · preprint 2025

Guangyuan Weng

Minsuk Kim

Esteban Moro

Debiasing graph embedding · NeurIPS 2021

Sadamori Kojaku

Jisung Yoon

Isabel Constantino

Embedding community structure · Nature Communications 2024

Sadamori Kojaku

Filippo Radicchi

Santo Fortunato

Belief embedding · Nature Human Behaviour 2025

Byunghwee Lee

Rachith Aiyappa

Haewoon Kwak

Jisun An

Robust disruptiveness · Science Advances 2026

Munjung Kim

Sadamori Kojaku

Support

Shall we take a random walk into latent space?

On bumping into the same fundamental ideas, again and again

Imagine one of your coauthors who's here at NetSci...

Do they have more coauthors and more citations than you?

Friendship paradox

And a random walk is basically repeated sampling through edges.
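To make "sampling through edges" concrete, here is a minimal sketch that compares a uniformly chosen node with a node reached by following an edge. The toy network (grown by preferential attachment) and the function name `friendship_paradox_demo` are illustrative assumptions, not something taken from the talk.

```python
import random

def friendship_paradox_demo(n=2000, m=2, seed=0):
    """Return (mean degree of a random node, mean degree of a random edge end)."""
    rng = random.Random(seed)
    # Toy heavy-tailed network grown by preferential attachment (assumed setup).
    edges = [(0, 1), (1, 2), (0, 2)]
    stubs = [u for e in edges for u in e]  # a node of degree k appears k times
    for new in range(3, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))  # picking a stub = sampling through an edge
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))

    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Random node: every node counts once.
    mean_degree = sum(degree.values()) / len(degree)
    # Random neighbor: every edge end counts once, so hubs are over-represented.
    mean_neighbor_degree = sum(degree[u] for u in stubs) / len(stubs)
    return mean_degree, mean_neighbor_degree

mean_k, mean_neighbor_k = friendship_paradox_demo()
print(mean_k, mean_neighbor_k)
```

The second number equals the ratio of the mean squared degree to the mean degree, so it exceeds the first whenever degrees vary at all.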

This is NetSci 101!

And yet, we still keep bumping into it.

So here are some of my random walks.

Stroll #1

A virus is a random walk that branches.

Epidemic spreading is sampling through edges

Hubs light up first

Hubs then hit everyone

A simple, intuitive approximation

One infection makes how many more?

Excess degree: how many possible further infections can a node make?

Each remaining edge transmits with probability $T$
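For reference, the approximation can be written out as follows. This is the standard result for an uncorrelated random network with degree distribution $p_k$, assuming every edge transmits independently with the per-edge probability $T$. The symbols $R_0$, $p_k$, and $q_k$ are notation introduced here and do not appear on the slides.

$$
q_k = \frac{(k+1)\,p_{k+1}}{\langle k \rangle},
\qquad
R_0 = T \sum_k k\, q_k = T\,\frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle}
$$

Here $q_k$ is the excess degree distribution: the probability that a node reached by following an edge has $k$ other edges left to transmit along.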

We all know these classic results.

But how about contact tracing?

Korea and New Zealand kept deaths low with no real lockdown.

How could they do this "dance"?

Contact tracing

Contact tracing is often described as

"going one step ahead"

Does tracing just ride the friendship paradox?

One event, half of all cases, revealed through investigation of the "source"

Can tracing backward to the source go beyond the friendship paradox?

"From whom" backward contact tracing is the friendship paradox, squared!

Three degree distributions
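A sketch of what the three distributions could look like, assuming a tree-like branching process on an uncorrelated network with degree distribution $p_k$. The labels below are introduced here for illustration, and the exact expressions in the Nature Physics paper may differ in detail.

$$
\underbrace{p_k}_{\text{random node}}
\qquad
\underbrace{\frac{k\,p_k}{\langle k \rangle}}_{\text{node reached through an edge}}
\qquad
\underbrace{\frac{k(k-1)\,p_k}{\langle k(k-1) \rangle}}_{\text{parent reached by tracing backward}}
$$

The third distribution picks up a second factor of roughly $k$ because a node must first be infected (proportional to $k$) and then infect the traced case through one of its $k-1$ remaining edges.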

First Stroll, recap

Stroll #2

A walk to find missing links.

Link prediction benchmark

Can we just use degree?

... and this shortcut works

Degree is all you need

Clever Hans

How do we fix this?

A simple solution: a degree-corrected benchmark

Degree is all you need, until you fix the test
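The effect can be reproduced in a few lines. The sketch below scores node pairs by the product of their degrees and evaluates that single score against two kinds of negative pairs: uniformly sampled ones and degree-biased ones. The toy graph, the function names, and the choice to sample negative endpoints in proportion to degree are all assumptions made for illustration; the benchmark in the ICML paper may be constructed differently.

```python
import random

def make_graph(n, m, rng):
    """Toy heavy-tailed graph grown by preferential attachment (assumed setup)."""
    edges = [(0, 1), (1, 2), (0, 2)]
    stubs = [u for e in edges for u in e]
    for new in range(3, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

def auc(pos_scores, neg_scores):
    """Probability that a positive outscores a negative (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            wins += 1.0 if p > q else 0.5 if p == q else 0.0
    return wins / (len(pos_scores) * len(neg_scores))

def degree_only_auc(n=1000, m=3, n_test=300, seed=0):
    """Return (AUC with uniform negatives, AUC with degree-biased negatives)."""
    rng = random.Random(seed)
    edges = make_graph(n, m, rng)
    rng.shuffle(edges)
    test_pos, train = edges[:n_test], edges[n_test:]
    existing = {frozenset(e) for e in edges}

    degree = [0] * n
    for u, v in train:
        degree[u] += 1
        degree[v] += 1
    stubs = [u for e in train for u in e]  # degree-proportional sampling pool

    def sample_negatives(pick):
        negatives = []
        while len(negatives) < n_test:
            u, v = pick(), pick()
            if u != v and frozenset((u, v)) not in existing:
                negatives.append((u, v))
        return negatives

    def score(pairs):
        # The shortcut: predict a link from the product of degrees alone.
        return [degree[u] * degree[v] for u, v in pairs]

    uniform_neg = sample_negatives(lambda: rng.randrange(n))
    degree_neg = sample_negatives(lambda: rng.choice(stubs))
    pos = score(test_pos)
    return auc(pos, score(uniform_neg)), auc(pos, score(degree_neg))

auc_uniform, auc_corrected = degree_only_auc()
print(auc_uniform, auc_corrected)
```

In this toy setup the first number should be clearly larger than the second: once negatives carry the same degree bias as positives, degree alone stops separating them.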

Second Stroll, recap

Stroll #3

A walk into latent space.

What is an embedding?

Embedding often has one job: predict its context

word2vec: geometry can encode semantics

word2vec: two vectors + skip-gram

Negative sampling
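As a reminder of what the objective looks like, here is a minimal sketch of the skip-gram negative-sampling loss for a single (center, context) pair. The "two vectors" are the center vector and the context vector. The function names and the plain-list representation are assumptions chosen to keep the example dependency-free.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sgns_loss(center_vec, context_vec, negative_vecs):
    """Skip-gram negative-sampling loss for one observed pair.

    center_vec:    vector of the center word
    context_vec:   vector of the observed context word
    negative_vecs: context vectors of the sampled negative words
    """
    # Pull the observed pair together.
    loss = -math.log(sigmoid(dot(center_vec, context_vec)))
    # Push each sampled negative away.
    for neg in negative_vecs:
        loss -= math.log(sigmoid(-dot(center_vec, neg)))
    return loss

print(sgns_loss([1.0, 1.0], [1.0, 1.0], [[-1.0, -1.0]]))
```

The loss falls as the center vector aligns with its observed context and points away from the sampled negatives, so the choice of the negative-sampling distribution shapes what the geometry ends up encoding.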

Can a network live in such a space?

DeepWalk & node2vec: a walk is a sentence
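A minimal sketch of how walks become "sentences": each walk is a list of node ids that can be handed to a skip-gram model as if it were a sequence of words. This version uses unbiased first-order walks; node2vec additionally biases each step with its return and in-out parameters, which are omitted here. The function name and the adjacency-dictionary format are assumptions.

```python
import random

def random_walks(adjacency, walk_length=5, walks_per_node=2, seed=0):
    """Generate uniform random walks; each walk plays the role of a sentence."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in adjacency:
            walk = [start]
            # Stop early only if the walker reaches a node without neighbors.
            while len(walk) < walk_length and adjacency[walk[-1]]:
                walk.append(rng.choice(adjacency[walk[-1]]))
            walks.append(walk)
    return walks

toy_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
for walk in random_walks(toy_graph):
    print(walk)
```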

Graph embedding

Did you just say we're sampling through random walks?

A random walk mixes up two signals
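One of the two signals is plain degree. The sketch below iterates the walker's position distribution on a small connected, non-bipartite graph and shows that it settles on a distribution proportional to degree, regardless of any other structure. The function name and the toy graph are assumptions for illustration.

```python
def stationary_distribution(adjacency, n_steps=200):
    """Iterate the random-walk update until the visit probabilities settle."""
    nodes = list(adjacency)
    prob = {u: 1.0 / len(nodes) for u in nodes}  # start from a uniform walker
    for _ in range(n_steps):
        nxt = {u: 0.0 for u in nodes}
        for u in nodes:
            share = prob[u] / len(adjacency[u])  # split mass evenly over edges
            for v in adjacency[u]:
                nxt[v] += share
        prob = nxt
    return prob

# Triangle 0-1-2 with a pendant node 3 attached to node 2 (degrees 2, 2, 3, 1).
toy_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
visit = stationary_distribution(toy_graph)
for node, p in visit.items():
    print(node, round(p, 3), len(toy_graph[node]) / 8)
```

Each visit probability matches the node's degree divided by twice the number of edges, so a walk's co-occurrence counts carry degree on top of whatever proximity structure the graph has.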

Then, are graph embedding methods affected by the degree bias?

Stroll #4

A walk that finds communities.

Community detection

Detectability limit: where does community structure become invisible?
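For orientation, the best-known form of this limit is for a sparse stochastic block model with two equal-sized communities. Assuming $n$ nodes, within-community edge probability $c_{\text{in}}/n$, and between-community edge probability $c_{\text{out}}/n$ (notation introduced here, not from the slides), communities are detectable in principle only when

$$
\left| c_{\text{in}} - c_{\text{out}} \right| > 2\sqrt{c},
\qquad
c = \frac{c_{\text{in}} + c_{\text{out}}}{2}
$$

where $c$ is the average degree. Below this threshold the planted communities are still there, but no algorithm is expected to recover them better than chance.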

Graph embedding methods ~ spectral embedding

node2vec is near-optimal

At the same time, DeepWalk often fails in practice

However, it turns out that negative sampling accidentally removes the degree bias.

Same objective, but the difference boils down to degree bias
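A sketch of why the estimator matters, assuming the standard analysis of skip-gram with negative sampling. If negatives are drawn from a noise distribution $P_0(j)$, the optimum of the objective satisfies

$$
u_i^\top v_j = \log \frac{P(j \mid i)}{P_0(j)} + \text{const}
$$

where $P(j \mid i)$ is the probability that node $j$ appears in the walk context of node $i$. Because walks visit $j$ roughly in proportion to its degree $d_j$, taking $P_0(j) \propto d_j$ cancels that factor in the ratio. Hierarchical softmax instead fits $P(j \mid i)$ directly, so the degree factor stays in the embedding. The symbols $u_i$, $v_j$, $P_0$, and $d_j$ are notation introduced here, and the cancellation is only approximate in implementations that sample negatives from a smoothed frequency distribution.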

Degree obscures other structure in DeepWalk

All due to the unintended degree bias cancellation (or lack thereof)!

node2vec works better in many tasks because it accidentally removes the degree bias from random walks.

Unintended bias cancelling degree bias

Stroll #5

Using bias to de-bias

What if we alter the null distribution?

residual2vec: a designed baseline

Describe the bias as a null model
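One way to write the idea down, as a sketch rather than the paper's exact formulation: replace the fixed noise distribution with the walk probability $P_0(j \mid i)$ under a null model of your choosing, so that the embedding is trained to capture only what the null model does not explain.

$$
u_i^\top v_j \approx \log \frac{P(j \mid i)}{P_0(j \mid i)} + \text{const}
$$

Here $P(j \mid i)$ is the walk probability in the observed network, and $u_i$, $v_j$, and $P_0$ are notation assumed for illustration. Whatever structure is built into the null model is divided out of the ratio and left out of the embedding.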

A concrete bias: recency in citations

Make the embedding blind to time

So it can see the disciplinary structure more clearly

residual2vec allows us to remove bias, once we can concretely model it.

Stroll #6

Mobility as a walk, pulled by gravity

How far apart are two places?
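For reference, the generic gravity model of mobility, written in notation assumed here: the flow $F_{ij}$ between places $i$ and $j$ grows with their sizes $m_i$ and $m_j$ and decays with the distance $d_{ij}$ between them.

$$
F_{ij} = C\, m_i\, m_j\, f(d_{ij}),
\qquad
f(d) = d^{-\alpha} \ \text{ or } \ f(d) = e^{-d/d_0}
$$

The constants $C$, $\alpha$, and $d_0$ are fitted to data. The open question on this slide is which distance to plug in: geographic distance, or a distance measured in latent space.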
