Gender Data Collection, Tagging, and Analysis

What are the problems that exist in collection, tagging, and analysis of gender data today?

Hi @lepri, @trigoana, @key2xanadu, @yanchen, @adanvers,
Given your vast experience in gender data, I feel you may be able to contribute to Karan’s discussion. Your input would help us understand the core problems relating to the gender data gap.

Hello everybody. From my experience as a gender and water expert, the main obstacle to collecting gender data is the absence of realistic indicators that can reflect a real gap within a specific community. Communities are not homogeneous, even within the same geographic area, so general indicators and measurements do not point to the same gender markers, which leads to a data gap.

Hi @Hanadibader, thank you for sharing your experience.

What recommendations do you have for how we might develop realistic indicators that reflect gaps within a specific community?

Hi @Pauwow, @dappel, @clddrd, @pms, @kkdata, @ingmarweber, @mlow, @rosiecampbell,
Given your background in data and computer science, we feel you may be able to provide input on Karan’s discussion and on the comments from @Hanadibader and @Kathleen_Hamrick.

Hi @fabienaccominotti, @aksharacentreindia, @MaikaSondarjee, @clausdh, @VictorOrozco1,
Join the discussion to let us know what problems you have faced during data collection and analysis.

I think one of the problems is asking the right questions. For instance, in many surveys about street harassment in the Netherlands, participants were asked “have you ever been harassed?” Many said no, and politicians concluded that it was not such a bad problem after all. But if you ask “are there places where you don’t go in order to not be harassed?” or “have you changed your behavior in order to not be harassed?” then we get a totally different picture of the level of safety experienced on Dutch streets. The right questions can be found through dialogue with the target group.

This is a good point, @ÅsaEkvall. I appreciate how it reflects a more “bottom-up,” experience-based approach that starts with the people. It also demonstrates the value of open-ended questions and framing.

If we accept a position on gender that assumes that gender is something people continuously do and not (only) something they are or have, then the very first problem is that we cannot use knowledge of biological sex alone to collect, tag or analyse gender thoroughly. Men and women are different in some respects, but much research also points to the fact that men and women are very similar to each other. So in my opinion it would not be enough just to be able to tag data as either male or female, even if that could be a small improvement, because both men and women are heterogeneous groups, as the discussion of intersectionality presupposes.

@clausdh Indeed. I am reminded of thinking about gender not as a fixed “identity,” but in terms of gender “identification.” A process that is not static.

Copy-pasting @Suneetharani’s input, sent via direct message:
“My fieldwork, experience and observations have revealed that the gender gap in data collection has several reasons which are culture-specific. Most women do not want to articulate their experiences and opinions due to close surveillance by people around them. Take it as protest or as reluctance, they remain silent and invisible. On the other hand, men take the initiative to speak on behalf of women as they think that women do not know anything, are incapable of speaking, and also cannot speak as it will draw unnecessary attention. Not only do women not speak about themselves, but men also speak on their behalf. This leads to gaps and misrepresentations in data collection. Some genders are not even taken into consideration while collecting the data, for instance, transgender people. Research has still not reached transmen and transwomen, except for theoretical discussions. Also, conventional tools of gender/women research are still being used to analyse the collected data, which will only reiterate the existing notions, beliefs and stereotypes. Any detail outside the existing frameworks is also being dismissed as a stray instance in qualitative as well as quantitative research on gender.
The biggest challenge to gender research on health is the gender roles and stereotypes that allow health issues to be seen only from a restricted perspective. Women’s health is always understood by focusing on their reproductive roles. Social and cultural institutions not only ignore women’s ailments and complaints but also emphasise able bodies with almost superhuman capabilities to handle home, workplace and other responsibilities. Women are also made to feel conscious of the need to be recognised as healthy bodies without any health-related issues. As mentioned above, transgender health issues are completely ignored and neglected and, on the other hand, their identities, especially in terms of health, are scandalised.
The data collected also remains incomplete, apart from all the reasons mentioned, as the research focuses only on the “mainstreamised” women while the women of marginalised communities are hardly taken into consideration except for tokenistic representation.
I think the tools and methods of research should be based on inclusiveness and critical thought. Data collection should go beyond the stereotypes and reach out to all corners of society. Not “their” men and their “families”, but women have to speak for themselves. Only then can the gap be filled.”

Adding to what has been mentioned above, we encounter problems throughout the data value chain.

First, there is a lack of political and financial prioritization to ensure that national surveys collecting sex-disaggregated data are done regularly (of course, it would be even better if data were disaggregated by gender and other stratifiers, or captured intersectionalities). Furthermore, many statistics offices do not have the staff capacity and expertise needed to bring a strong gender lens to the data analysis process. There is no strong accountability mechanism that focuses on this challenge (national performance audits on gender equality are few and far between).

There are also biases in questionnaire design. For example, household surveys assume that women who earn money have control over it; questions are asked of only one person in the household (usually the men answer); and there is a pervasive lack of questions on adolescent girls’ and young women’s sexual and reproductive health unless they are married. Connected to the points from both @Hanadibader and @ÅsaEkvall, we have to work together to develop questions that accurately capture concepts that we talk about frequently (e.g. great advances have been made in developing assessments to capture respectful maternity care, starting with a clear definition of the concept informed by the target population. How do we then standardize these measures and ensure frequent collection?). Additionally, how can we gather nationally representative and accurate data on sensitive topics while still prioritizing the safety of girls and women (e.g. when there is no privacy while answering questions on intimate partner violence, or when there is fear of backlash)?

Once data is collected, it doesn’t reflect girls’ and women’s full experience and the links between different aspects of their lives. This makes analysis of data difficult, as finding different data sets that complement each other yet are also comparable is hard. Translating data into accessible and actionable language is also needed to increase use of gender data when available.

Curious to hear if others have encountered or identified biases in AI when it comes to using AI for data analysis/prediction/modelling/etc?

Hi, everyone! You can take a look at this newsletter by the NYT about the absence of disaggregated COVID-19 data.

Thanks @stellunak for sharing this insightful resource. Do you know of any existing / potential technology or model being used to overcome this issue?

Not yet, @Shashi, but I will follow up on that as soon as I find something.