The Network Effects of Following Fewer Women


The theme of my work over the past year has been: "What does diversity data look like?" Diversity is an area that many in the tech industry agree needs improvement. As a woman in tech, I see the discrimination in day-to-day interactions with my peers in the workplace, and as an attendee at meetups and conferences, where you can feel a bit out of place wearing a skirt in a sea of beards clad in startup tees and jeans. What data can help us do is see the net effects of these micro-interactions on a broad level. What happens to the overall system when we listen to certain voices over others, and as a result, what perspectives are we missing out on?

Obviously data cannot solve all these problems. Nor can it directly resolve those moments when someone is being viciously attacked in a Twitter shitstorm, when someone of a marginalised background is not given the chance to speak in a room full of white males, or when another person of privilege fails to recognise this privilege and abuses their position of power. Even though someone in my position may feel these moments more strongly than others, I hope that exposing the data in broad daylight can awaken someone who is unaware that these things are happening, and that these issues are systemic. Perhaps it will push over the fence-sitter who just needed that extra reason to get behind the cause of supporting women and minorities in the tech industry.

Males following males and females following females

What my Twitter network looks like

After completing my previous project, a reflection on the diversity and biases that exist in my Twitter network, I felt that it had been an interesting exercise for me, but not necessarily compelling for others. Catching your own reflection in the mirror is always more fascinating than seeing someone else's. I had made the conscious decision to open source the code so that others could run the same process on their own networks. But between Twitter's API being particularly tedious and restrictive to work with for social data extraction, and my general lack of confidence in producing comprehensible code, this was not as easy and clean as I had hoped.

So why not a simulation? This would get around the need to work with Twitter's horrendous rate limiting, which is so crippling that I imagine the only data visualisers who could produce meaningful work with this data are Twitter's own in-house team. With randomness as your friend, a simulation lets you see your own network without it actually being your own network.

I decided to go with the same visual representation as my previous project. Individuals are arranged in a radial layout, grouped by gender. Those in the outer bands have more followers, while those in the inner bands have fewer followers.
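For readers who think in code, here is a rough Python sketch of how the band placement could be worked out from a list of who follows whom. It is an illustration only, not the code behind the graphics: the function names, the `follows` dictionary shape, the default of five bands and the equal-width binning are all assumptions made for the example.

```python
from collections import Counter


def follower_counts(follows, n_people):
    """Count how many followers each person has.

    `follows` maps a person's index to the list of indices they follow
    (a hypothetical data shape chosen for this sketch).
    """
    counts = Counter(target for targets in follows.values() for target in targets)
    return [counts.get(person, 0) for person in range(n_people)]


def assign_bands(counts, n_bands=5):
    """Place each person in a band: 0 is the innermost, n_bands - 1 the outermost.

    Bands are equal-width slices of the range between the smallest and
    largest follower count (an assumption; other binning schemes would work).
    """
    lowest, highest = min(counts), max(counts)
    if highest == lowest:
        # Everyone has the same number of followers, so everyone shares a band.
        return [0] * len(counts)
    width = (highest - lowest) / n_bands
    return [min(int((c - lowest) / width), n_bands - 1) for c in counts]


# Tiny example: person 0 follows 1 and 2, person 1 follows 2, person 2 follows nobody.
counts = follower_counts({0: [1, 2], 1: [2], 2: []}, n_people=3)
print(counts)                # [0, 1, 2]
print(assign_bands(counts))  # [0, 2, 4]
```

Person 2 has the most followers and lands in the outermost band, while person 0, with none, sits in the innermost.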

How to read the simulations

Key

Each simulation is a random network of people. The number of people is variable, as is the number of people each of them follows (which is the same for every individual) and the percentage of men and women present in the network. When you refresh, the people who are followed are randomly reassigned and another possible network is created.
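A minimal sketch of how one such random network could be generated, written in Python. This is not the code that drives the simulations here. The name `simulate_network` and its parameters are invented for illustration, and the sketch assumes that nobody follows themselves or follows the same person twice.

```python
import random


def simulate_network(n_people, n_following, pct_women_followed,
                     pct_women=0.5, seed=None):
    """Build one random network (hypothetical helper for illustration).

    Returns a list of genders ("F" or "M") and a dict mapping each
    person's index to the list of people they follow.
    """
    rng = random.Random(seed)
    n_women = round(n_people * pct_women)
    genders = ["F"] * n_women + ["M"] * (n_people - n_women)
    women = [i for i, g in enumerate(genders) if g == "F"]
    men = [i for i, g in enumerate(genders) if g == "M"]

    # Every individual follows the same number of women and of men.
    n_women_followed = round(n_following * pct_women_followed)

    follows = {}
    for person in range(n_people):
        # Assumption: people cannot follow themselves.
        women_pool = [w for w in women if w != person]
        men_pool = [m for m in men if m != person]
        # Cap at the pool size so tiny networks do not raise an error.
        k_women = min(n_women_followed, len(women_pool))
        k_men = min(n_following - k_women, len(men_pool))
        follows[person] = rng.sample(women_pool, k_women) + rng.sample(men_pool, k_men)
    return genders, follows


# A 20-person network where each person follows 10 others, 30% of them women.
genders, follows = simulate_network(20, 10, 0.3, seed=1)
print(len(follows[0]))                              # 10
print(sum(genders[t] == "F" for t in follows[0]))   # 3
```

Calling the function again with a different seed (or none) plays the role of the refresh: the same settings, but a different set of randomly assigned follows.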

An equal network

In this first simulation, as in all of these simulations, there is an equal number of men and women. Each person follows 10 other people in the network, 50% of whom are women (red circles). Networks of two different sizes are presented side by side for comparison. This acts as a baseline: what the ideal world would look like if everyone followed an equal number of men and women.

20-person network
100-person network

Given a random network of people, the visual formation is quite random, as you'd expect. As you refresh, you can see a few variations and patterns emerge in these formations, but the placement of red nodes vs blue nodes is unpredictable.

The effect of following fewer women

Now let's reduce the percentage of women that each person follows: each person still follows 10 other people, but only 30% of them are women.

20-person network
100-person network

These simulations show a distinct difference in the placement of the red nodes vs the blue nodes. The red nodes cluster towards the centre, while the blue nodes are prevalent on the outer bands. The red circles never seem to reach the outer two bands.

The significance of 30% is that this is the percentage of women that I follow. As a diversity advocate, this is not a number I am proud of, but I can only assume that this number would be worse for others in the industry. In reality, I imagine the network graph for the whole industry would be a lot more skewed than this.

Increasing the size of the network

Ten people is an unrealistically low number for the average person to follow. So what happens when we increase it? I don't want to dial this number up to where the browser can't handle it, so let's just double it and see if there are differences. Here, there are 100 people in the network, and each person follows 20 other people.

Each person follows 50% women
Each person follows 30% women

The increase in scale seems to exaggerate the difference between the red and blue nodes.

This result is hardly surprising, but I hope that you can find some value in seeing it plotted out. You can of course extrapolate these findings to other forms of inequality in social networks, from race and income level to political or religious orientation, by replacing the labels 'men' and 'women' on the different coloured nodes with something else.

Some possible worlds

Here are some simulated networks created with different variables. There is an interactive version for you to play with to generate your own network. You may want to put in the percentage of women, or some other marginalised group, that you follow.


100 people, 50% women, following 2 people, 50% are women
100 people, 50% women, following 2 people, 50% are women (variation)
20 people, 50% women, following 18 people, 100% are women

30 people, 50% women, following 30 people, 10% are women
20 people, 50% women, following 20 people, 50% are women
100 people, 70% women, following 20 people, 20% are women

I recognise that these simulations only address the first degree of connectivity in the social graph. A worthwhile second step would be to investigate what changes when further degrees of connectivity are added (followers of followers of followers and so on). This may be a potential next step, but my interests might take me elsewhere after this is published.

I hope that these simulations can act as a lens into your own social network. It may be one thing to say "yes, I do follow fewer women", but what would happen if this were the case for everyone? In recognising this, I hope that people can become more conscious about the voices they opt into hearing. I also hope that we can reflect on the reality of the tech industry and society in general. While it may be perceived as a meritocracy, the net result of any minority having a lesser following is that, as a member of that minority, you are less likely to be listened to, through no fault of your own.

You can play with a standalone interactive version of this simulation app where you can tweak any combination of variables.

The code for this, which you can use to embed your own simulated network diversity graph, can be found on GitHub.