Leveraging Social Networks for Career Longevity using Big Data Analytics
GitHub repo here.
"No man is an island, entire of itself; every man is a piece of the continent, a part of the main."—John Donne
I love John Donne's beautiful meditation on human interconnectedness (the era's sexist pronouns aside). It resonates with me, now more than ever as I explore how networks shape career trajectories in my PhD dissertation. Using IMDb data spanning from 2000 to 2023, I trace how collaboration networks change over time and examine how these changes influence people's career longevity and productivity. I specifically focus on the different impacts networks have on men and women, offering strategies for building more beneficial and equitable professional networks.
Navigating the complexity of large-scale networks
Imagine a network of about 100,000 individuals. In any given year, this network could hold nearly 5 billion potential connections (100,000 × 99,999 / 2), which balloons to over 100 billion across two decades. This astounding complexity arises from two inherent characteristics of networks. First is rapid growth: the number of potential connections grows roughly with the square of the network's size, because every newcomer can in theory connect with all existing members. Second is constant flux: like the ebb and flow of the ocean, networks are never static; old ties dissipate while new ties form continually.
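The figures above can be checked with a few lines of Python. This is a minimal sketch, not code from the project's notebooks: the function name `potential_ties` is made up for illustration, and multiplying by 21 snapshots assumes one full set of potential ties per network graph.

```python
from math import comb


def potential_ties(n_people: int) -> int:
    """Count the distinct undirected pairs among n_people: n * (n - 1) / 2."""
    return comb(n_people, 2)


n_people = 100_000   # network size used in the text
n_snapshots = 21     # assumption: one snapshot per three-year network graph

per_snapshot = potential_ties(n_people)
print(per_snapshot)                 # 4999950000
print(per_snapshot * n_snapshots)   # 104998950000

# One newcomer adds one potential tie per existing member.
print(potential_ties(n_people + 1) - per_snapshot)   # 100000
```

Only a small fraction of these potential ties ever materializes, but the count shows why pairwise comparisons become expensive at this scale.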
To capture these shifts, I've constructed and analyzed 21 sequential network graphs. Each graph covers a three-year window that advances one year at a time, from 2000-2002 through 2020-2022. This method allows me to track the evolution of social capital and its impact on career trajectories up to the year 2023. Indeed, in our interconnected world, no one is an island.
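A short sketch of how such windows could be generated and used to slice credit records. The helper names (`rolling_windows`, `credits_in_window`) and the sample records are hypothetical and are not taken from the project's notebooks.

```python
def rolling_windows(first_year: int, last_year: int, width: int = 3):
    """Return (start, end) year pairs for windows that advance one year at a time."""
    return [(start, start + width - 1)
            for start in range(first_year, last_year - width + 2)]


def credits_in_window(credits, window):
    """Keep the (title_id, person_id, year) records whose year falls inside the window."""
    start, end = window
    return [record for record in credits if start <= record[2] <= end]


windows = rolling_windows(2000, 2022)
print(len(windows))              # 21
print(windows[0], windows[-1])   # (2000, 2002) (2020, 2022)

# Hypothetical records, for illustration only.
sample_credits = [("t1", "p1", 2000), ("t2", "p2", 2002), ("t3", "p1", 2005)]
print(credits_in_window(sample_credits, windows[0]))
# [('t1', 'p1', 2000), ('t2', 'p2', 2002)]
```

Because consecutive windows overlap by two years, a tie formed in one year stays in the graph for three snapshots before it drops out.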
Phases of analysis
The Python code for all the analyses is provided in the project's GitHub repository. You can also open each notebook with these direct links:
Phase 1 Tracking movie directors' careers: I identify first-time directors and follow their filmography from 2003 to 2023.
Phase 2 Constructing filmmaker network: I build dynamic collaboration networks within the film industry from 2000 to 2023 and calculate the yearly brokerage social capital for every creative worker in the network using a 3-year moving window.
Phase 3 Predicting directors' gender: I use directors' first names from IMDb data and U.S. Social Security data to predict gender.
Phase 4 Building time series data: I create a dataset that tracks each director's career year by year, setting up for a survival analysis.
As this project evolves, more notebooks will be added.
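To make the Phase 2 idea concrete, here is a minimal sketch of one way to score brokerage inside a single three-year window. It assumes brokerage is measured with Burt's effective size in Borgatti's simplified form for unweighted networks (number of contacts minus 2 × ties among those contacts / number of contacts). The notebooks may use a different measure, and the credits and function names below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical credits: (title_id, person_id, release_year).
credits = [
    ("t1", "ana", 2000), ("t1", "ben", 2000), ("t1", "cho", 2000),
    ("t2", "ana", 2001), ("t2", "dev", 2001),
    ("t3", "ana", 2003), ("t3", "eli", 2003),
]


def build_adjacency(credits, start, end):
    """Link every pair of people credited on the same title released in [start, end]."""
    crews = defaultdict(set)
    for title, person, year in credits:
        if start <= year <= end:
            crews[title].add(person)
    neighbors = defaultdict(set)
    for crew in crews.values():
        for a, b in combinations(sorted(crew), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def effective_size(neighbors, ego):
    """Effective size: contacts minus the redundancy created by ties among them."""
    alters = neighbors.get(ego, set())
    n = len(alters)
    if n == 0:
        return 0.0
    ties = sum(1 for a, b in combinations(sorted(alters), 2) if b in neighbors[a])
    return n - 2 * ties / n


window_a = build_adjacency(credits, 2000, 2002)
print(round(effective_size(window_a, "ana"), 2))   # 2.33 (ben and cho know each other)
print(effective_size(window_a, "ben"))             # 1.0

window_b = build_adjacency(credits, 2001, 2003)
print(effective_size(window_b, "ana"))             # 2.0
```

A higher score means more of a person's contacts are disconnected from one another, which is the position a broker occupies. Repeating the calculation for each window yields a yearly series per person.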
This work uses publicly available data from IMDb and the U.S. Social Security Administration. The analyses can be fully reproduced by following the provided Python code. Keep in mind, though, that results may vary depending on when you access the data, because the data is updated regularly.