Co-authorship network (/coauthorship/)

Builds the data behind the interactive graph at alex-loftus.com/coauthorship.

Run

cd experiments/coauthorship
uv run build_graph.py        # hits OpenAlex, writes ../../assets/data/coauthorship.json

The page (_pages/coauthorship.html + assets/js/coauthorship-network.js) loads that static JSON — no live API calls at view time.

What the page does

The ~49 listed researchers are anchors. A “Reach” slider reveals how they connect:

  • 1 hop — only direct co-authorships between listed people.
  • 2 hops — plus the outside co-author who bridges two listed people.
  • 3 / 4 hops — plus chains through two / three intermediate people.

Each node carries minhop = the length of the shortest listed-pair path that first brings it in; the UI shows everything with minhop ≤ slider. Clicking a name in the side list raises the slider just enough to reveal that person, then centres on them. Listed people with no traceable path to the rest are pinned as a row of isolated avatars below the network.
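The visibility rule is small enough to sketch in Python, even though the real logic lives in assets/js/coauthorship-network.js. This is only an illustration: the node dictionaries, the "name" key and both function names are assumptions, not the shipped schema or code.

```python
def visible_nodes(nodes, slider):
    """Names of the nodes the slider currently reveals (minhop <= slider)."""
    return [n["name"] for n in nodes if n["minhop"] <= slider]


def slider_after_click(node, slider):
    """Raise the slider just enough to reveal the clicked node; never lower it."""
    return max(slider, node["minhop"])


# Hypothetical nodes, for illustration only.
nodes = [
    {"name": "a", "minhop": 1},
    {"name": "b", "minhop": 2},
    {"name": "c", "minhop": 4},
]
print(visible_nodes(nodes, 2))
print(slider_after_click(nodes[2], 2))
```

Because the slider is only ever raised, clicking someone who is already visible leaves the current view unchanged.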

Each node also carries shared_papers = the number of papers it co-authored with at least one other node shown here. The hover tooltip reports this count rather than a raw OpenAlex total, because some people’s profiles merge same-named strangers. Those merged papers inflate the raw total but not the in-network count, since a stranger’s papers share no authors with this set.
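A minimal sketch of that count, assuming each paper is available as a collection of author names and the shown nodes as a set of names. The function name and input shapes are illustrative and may not match build_graph.py.

```python
def count_shared_papers(papers, shown):
    """For each shown node, count papers it shares with at least one other shown node."""
    counts = {name: 0 for name in shown}
    for authors in papers:
        in_network = set(authors) & shown
        # A paper counts only if two or more shown nodes appear on it.
        if len(in_network) >= 2:
            for name in in_network:
                counts[name] += 1
    return counts


# Hypothetical data: the second paper belongs to a same-named stranger merged
# into A's profile, so it has no other in-network author and is not counted.
papers = [{"A", "B"}, {"A", "X"}, {"A", "B", "C"}]
print(count_shared_papers(papers, {"A", "B", "C"}))
```

Here A has three papers in total but a shared count of two, which is the behaviour described above.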

How the data is built

  1. Resolve each name to OpenAlex author profiles (surname-gated search → union of in-field same-name profiles, handling OpenAlex ID fragmentation; spelling fixes + FORCE_IDS). Five names have no usable record and stay unresolved.
  2. Fetch papers and build a name-keyed co-authorship graph (so fragmented/duplicate profiles and "Last, First" spellings collapse to one node). Papers with more than MAX_AUTHORS authors add only list↔list edges, not edges through outsiders, which prevents mega-paper cliques.
  3. Hop reveal — for every pair of listed people, take shortest paths up to K_MAX hops; the union of those path nodes/edges is the shipped graph, each tagged with its minhop.
  4. Communities (greedy_modularity_communities) → node colour; spring_layout seeds the force layout. Junk intermediaries (single-token / common-name collisions) are dropped first.
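Steps 2 and 3 can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the code in build_graph.py: the function names are hypothetical, papers are plain lists of author names, and every shortest path per pair is kept rather than capping at PATHS_PER_PAIR.

```python
from collections import deque
from itertools import combinations


def build_graph(papers, listed, max_authors):
    """Name-keyed adjacency map; oversized papers contribute only list<->list edges."""
    adj = {}
    for authors in papers:
        authors = set(authors)
        if len(authors) > max_authors:
            authors &= listed  # drop outsiders on mega-papers
        for a, b in combinations(sorted(authors), 2):
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def bfs_dist(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def hop_reveal(adj, listed, k_max):
    """Tag each node on a listed-pair shortest path with the shortest such path length."""
    dist = {s: bfs_dist(adj, s) for s in listed}
    minhop = {}
    for s, t in combinations(sorted(listed), 2):
        d = dist[s].get(t)
        if d is None or d > k_max:
            continue
        # In an undirected graph, v lies on a shortest s-t path
        # exactly when dist(s, v) + dist(v, t) == dist(s, t).
        for v, ds in dist[s].items():
            if v in dist[t] and ds + dist[t][v] == d:
                minhop[v] = min(minhop.get(v, d), d)
    return minhop


listed = {"A", "B", "C"}
adj = build_graph([["A", "B"], ["B", "X"], ["X", "C"]], listed, max_authors=3)
print(hop_reveal(adj, listed, k_max=4))
```

In this toy input, X is an outsider bridging B and C, so it enters at 2 hops. Listed people who never appear in the result are the ones the page would pin as isolated avatars.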

Avatars / photos

Nodes are monogram avatars (initials on the community colour) by default. To use real photos, drop image files in assets/images/coauthors/ named <slug>.jpg (or .png/.webp), where slug is the lowercase name with spaces replaced by hyphens (e.g. can-rager.jpg), then re-run the build. photo_url() picks them up and the UI fills the node, falling back to the monogram if an image fails to load. There is no WhatsApp/contacts integration; photos must be supplied as files.
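The naming rule can be checked with a short sketch. The helper names slugify and find_photo are hypothetical stand-ins, and the real photo_url() may differ, for example in extension order or in the path it returns.

```python
from pathlib import Path

PHOTO_DIR = Path("assets/images/coauthors")
EXTENSIONS = (".jpg", ".png", ".webp")  # search order is an assumption


def slugify(name):
    """Lowercase the name and join its space-separated parts with hyphens."""
    return "-".join(name.lower().split())


def find_photo(name, photo_dir=PHOTO_DIR):
    """Return the first matching image path as a string, or None if none exists."""
    slug = slugify(name)
    for ext in EXTENSIONS:
        candidate = photo_dir / f"{slug}{ext}"
        if candidate.is_file():
            return candidate.as_posix()
    return None  # caller keeps the monogram avatar


print(slugify("Can Rager"))
```

Running slugify on a name before saving the image is a quick way to confirm the expected filename.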

Knobs (top of build_graph.py)

constant                   meaning
K_MAX                      longest listed-pair path traced (= max slider hops)
PATHS_PER_PAIR             shortest paths kept per pair
MAX_AUTHORS                papers bigger than this only add list↔list edges
COMMON_STOP                common names dropped to avoid collision false bridges
COMMUNITY_MERGES           groups of anchor people whose auto-detected communities are unioned
COMMUNITY_LABEL_ANCHORS    force a community’s legend label to a specific person’s name

Caveats

  • Identity is by normalised name; a stoplist guards against collisions, but unusual ones can still slip through.
  • Reflects OpenAlex coverage at build time; some preprints/venues are missing.
  • A few peripheral names resolve to an imperfect profile; these are almost always the isolated “no traceable path” people and don’t affect the connected structure.
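To make the first caveat concrete, here is one way a normalised-name key and stoplist check could look. Everything in it is an assumption for illustration: the function names, the accent stripping, and the stoplist entries are hypothetical and are not taken from build_graph.py or its COMMON_STOP.

```python
import unicodedata

# Hypothetical stoplist entries, not the real COMMON_STOP contents.
COMMON_STOP_EXAMPLE = {"wei wang", "john smith"}


def normalise_name(raw):
    """Collapse spelling variants ("Last, First", accents, case, spacing) to one key."""
    if "," in raw:
        last, first = raw.split(",", 1)
        raw = f"{first} {last}"
    decomposed = unicodedata.normalize("NFKD", raw)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def is_junk_intermediary(key, stoplist=COMMON_STOP_EXAMPLE):
    """Flag single-token keys and stoplisted common names as unsafe bridges."""
    return len(key.split()) < 2 or key in stoplist


print(normalise_name("Rager, Can"))
print(is_junk_intermediary("smith"))
```

Two different people whose names normalise to the same key, and who are not on the stoplist, would still merge into one node. That is the kind of collision the caveat warns about.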