By Shruti Jain*
The fast-paced evolution of technology in the 21st century has provided too little time for all aspects of social, political and economic life to readjust to the disruptive changes. The next few years are expected to be largely shaped by Artificial Intelligence (AI) and the ‘internet-of-things.’ Therefore, this is a crucial time to build the principles upon which the foundation of AI will be set. Research has shown how algorithms are susceptible to carrying the racial, cultural and gender bias of their designers and engineers in the manner in which they process data. Moreover, recent studies have revealed how inaccuracies in the data itself, or reliance on non-disaggregated data, can further amplify the bias inherent in algorithms.
Inequity in data sets
In her recent book ‘Invisible Women’, Caroline Criado Perez discusses how little programmers understand about the problems in the data they use. Gender bias is not always accounted for, partly because privacy concerns around proprietary software keep training data from scrutiny. Moreover, when algorithms are built on data that is not ‘sex-disaggregated’ – where male and female data are lumped together – the resulting analysis may churn out only half-truths and an incomplete picture. When private companies and governments rely on such biased data sets for policy-making, they fail to address gender imbalance.
For instance, if the data from a survey to estimate agricultural productivity is not sex-disaggregated, it offers little opportunity to understand why productivity is low and how to improve it. Sex-disaggregated data from a survey conducted by the World Bank in six countries under the International Development Association (IDA) showed a gender gap in agricultural productivity, driven by the time women spend caring for their families. In Mozambique, when such women were introduced to a pre-school enrolment programme, their likelihood of working in the labour market rose by six percentage points. These findings inspired Congo to provide rural childcare facilities to increase women’s productivity in agriculture. With increased productivity, women contributed more towards household income and gained a greater say in decision-making. These countries would not have been able to address the loss of productivity and the existing gender gap had they relied on the inaccurate analysis produced by non-sex-disaggregated data.
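The effect of aggregation can be illustrated with a toy calculation (all figures below are made up for illustration, not taken from the IDA survey): a pooled average can look acceptable while masking a large gap between male and female farmers.

```python
# Hypothetical plot-level productivity figures (yield per hectare).
# These numbers are illustrative only, not from any real survey.
male_yields = [2.1, 2.4, 2.3, 2.2]
female_yields = [1.5, 1.6, 1.4, 1.7]

# Non-disaggregated view: a single pooled average hides the gap.
pooled = male_yields + female_yields
pooled_avg = sum(pooled) / len(pooled)

# Sex-disaggregated view: separate averages reveal the gap.
male_avg = sum(male_yields) / len(male_yields)
female_avg = sum(female_yields) / len(female_yields)
gap = (male_avg - female_avg) / male_avg  # relative productivity gap

print(f"pooled average: {pooled_avg:.2f}")
print(f"male: {male_avg:.2f}, female: {female_avg:.2f}, gap: {gap:.0%}")
```

With these illustrative numbers, the pooled average (1.90) reveals nothing, while disaggregation exposes a roughly 31-percent gap – precisely the information a policy-maker needs.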
Perez’s argument could be extended from women to non-binary and gender-fluid identities. The opaque algorithms behind AI systems are mostly built by straight white men and often exclude minorities’ perspectives.
Ironically, it was women during the Second World War who pioneered the skill of writing software for machines. Over time, lopsided digital skills education and training led to learning and confidence gaps. Gradually, only a fraction of students pursuing advanced studies in computer science and information technology were women. These differences grew starker in women’s transition from education to work. According to one survey, in 2019, only about 11 percent of coders worldwide were women.
Gender bias in AI
Today, AI is used to curate information from search engines, make loan decisions, rank job profiles and influence preferences of people, amongst numerous other functions. Gender-bias is hidden in some of the most common AI systems that are universally used such as search engines, recruitment software and services like voice-assistants.
People’s worldviews and decisions can shift based on the results returned by leading search engines. A study by the University of Washington found gross underrepresentation of women across professions in image search results. An image search for ‘authors’ returned only 25 percent women, even though women make up 56 percent of authors in the US. Similarly, women CEOs and doctors were significantly underrepresented, while women nurses were over-represented. Algorithms are often written to surface information from a limited set of data and fail to pick up context. For example, under a headline such as “A recent decline in women nurses results in gender parity in the profession”, an algorithm is likely to connect the accompanying image with women, because it latches on to the keywords “nurses” and “women” without understanding the context. The study also argued that current image results reiterate gender stereotypes, sexualise women and depict them in supporting roles.
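The context failure described above can be sketched as a naive bag-of-words tagger (a deliberately simplified illustration, not any search engine’s actual pipeline): it sees the tokens “women” and “nurses” in a headline and tags the image accordingly, regardless of what the sentence actually says.

```python
# A naive keyword tagger: it associates an image with a keyword whenever
# that word appears in the accompanying text, ignoring sentence meaning.
# This is a simplified illustration, not a real search engine's logic.
def naive_tags(headline, keywords):
    tokens = set(headline.lower().replace(",", "").split())
    return {kw for kw in keywords if kw in tokens}

headline = ("A recent decline in women nurses results in "
            "gender parity in the profession")
tags = naive_tags(headline, {"women", "nurses"})
print(tags)  # both keywords match, linking the image to 'women nurses'
```

The headline describes women *leaving* nursing, yet the tagger still links the image to women nurses, because co-occurrence of keywords is all it measures.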
To counter the objectification of women in sports, Getty Images announced its decision to improve the depiction of women athletes in stock pictures by focussing on their skill, strength and speed rather than their appearance.
Another biased AI domain is hiring algorithms, increasingly used by companies to ease recruitment by scoring resumes, assessing competencies, and screening and flagging candidates who do not meet the criteria. Although intended to bring more objectivity to hiring, they carry added risks. Most often, they find and replicate patterns in past behaviour. If the data is not disaggregated by sex or race and the algorithm finds that the most-hired candidates are white men, it teaches itself to recommend white men over the rest. Like search engines, hiring algorithms tend to pick up words without context and connect them to a trait. Amazon recently abandoned its hiring algorithm after discovering that it was widening the gender gap in recruitment. Amazon had trained the model on resumes submitted over a 10-year period, most of which came from men. As a result, the algorithm taught itself that male candidates were preferable and began downgrading resumes from women candidates.
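The feedback loop can be sketched with a toy resume scorer (entirely hypothetical, not Amazon’s actual model): trained only on the words of historically hired versus rejected resumes, it rewards terms frequent among past hires and, because those hires skewed male, penalises terms that appear mostly elsewhere.

```python
from collections import Counter

# Toy historical data: resumes of past hires (skewed male) vs rejections.
# Entirely hypothetical, for illustration only.
hired = ["software engineer chess captain", "engineer rugby captain python"]
rejected = ["software engineer womens chess club", "engineer womens coding society"]

def word_weights(hired, rejected):
    # Weight = how much more often a word appears among hires than rejections.
    h, r = Counter(" ".join(hired).split()), Counter(" ".join(rejected).split())
    return {w: h[w] - r[w] for w in set(h) | set(r)}

def score(resume, weights):
    return sum(weights.get(w, 0) for w in resume.split())

weights = word_weights(hired, rejected)
# Two candidates of equivalent substance; one mentions a women's club.
print(score("engineer chess captain", weights))         # prints 2
print(score("engineer womens chess captain", weights))  # prints 0
```

Nothing in the code mentions gender, yet the word “womens” acquires a negative weight purely because it appeared mostly in rejected resumes – the same mechanism by which a model trained on skewed history reproduces that history.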
It is no coincidence that leading voice assistants are female by default in voice and name (Alexa, Cortana and Siri). One company reasoned, through its own study, that women’s voices were perceived as psychologically more “pleasant” and “sympathetic”, making them an obvious commercial choice for assistance. The female identity of home voice assistants also reinforces the stereotype that women are associated with household activities and chores. It further invigorates the master-slave dialectic through the idea of a female AI offering submissive servility to its owners. Moreover, voice assistants’ responses to sexual harassment and profane language were found to be apologetic and deflecting rather than corrective or punitive. The personality traits and responses encoded into voice assistant algorithms negatively shape gender roles and behaviour, especially among users of impressionable ages.
Towards an inclusive environment
While it is difficult to do away with inherent human biases, inequalities in AI systems can be controlled by training designers to be aware of such inequities and their potential impact on societal perceptions. This can be done by improving the pool and representativeness of data sets used for training AI. For more transparency, companies can make efforts to explain their algorithmic decision-making processes and adopt disaggregation of data for equitable outcomes. As explained above, gender- or sex-disaggregated data will not only help identify gender biases but also forge equitable solutions.
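One concrete form such disaggregation can take is a per-group error audit of a trained model (a minimal sketch with made-up predictions, not any specific fairness toolkit): overall accuracy can look fine while one group’s error rate is far worse.

```python
# Hypothetical model evaluation records: (label, prediction, group).
# All values are made up for illustration.
results = [
    (1, 1, "male"), (0, 0, "male"), (1, 1, "male"), (0, 0, "male"),
    (1, 0, "female"), (0, 0, "female"), (1, 1, "female"), (1, 0, "female"),
]

def accuracy(rows):
    return sum(y == p for y, p, _ in rows) / len(rows)

overall = accuracy(results)
by_group = {g: accuracy([r for r in results if r[2] == g])
            for g in {"male", "female"}}
print(f"overall: {overall:.2f}")  # looks acceptable in aggregate
print(by_group)                   # disaggregation exposes the gap
```

Here the aggregate accuracy is 0.75, yet the model is perfect for one group and wrong half the time for the other – a gap that only a disaggregated audit reveals.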
Simone de Beauvoir famously said, “Humanity is male and man defines woman not in herself but as relative to him”. This resonates with today’s data handling practices, where ‘male’ data is often used as the default. It is, therefore, prudent to recognise the importance and independence of gender diversity in machine learning and data collection. Gender balance is critical to prevent algorithms from reinforcing and augmenting the ideologies that already disadvantage the marginalised.
*The author is a Research Intern at ORF Mumbai.