EHR & Social Media Data for Insights in Mental Health

Big data such as health administrative data can be useful sources for understanding the epidemiology of mental health and gaps in care, but a key limitation is having only a male/female variable. Even as a proxy, that variable allows possible gender-based differences in the treatment of depression between women and men to be evaluated. These datasets can still be used creatively to unpack sex-based differences from gender-based differences with some newly collected gender-specific data. Ultimately, a key challenge in this area of mental health and gender is that most survey-based measures (e.g. CES-D score) have differential response bias by gender that will over-estimate depression in females and under-estimate it in males. It would be interesting to understand how different cultures describe and understand mental health concerns and how these meanings are constructed in a gendered way. Technology (AI) could be helpful to extract text from local print/online media or conversations and capture local experiences of distress along with mental health data. The voices of more socially disadvantaged populations, however, may not be captured in the dominant discourse using this method.

Thanks @AnnalijnUBC for sharing your thoughts. We will definitely take into account some of the points.

@AnnalijnUBC Good point on the limitations of datasets pulled from administrative health data, especially on the voices not captured. It is important to keep in mind that even if administrative health data captured more nuanced variables, we would still have significant gaps because large populations are not represented in such data.

I think using EHR data may be more straightforward than using social media data, since EHR data is collected for a health purpose in the first place. Since collecting sensitive health data indicators from social media is more of an evolving process/framework, I found this article interesting (it has a Finnish point of view, but seems like a reasonable starting point for thinking through data collection):

In addition to EHR data, is there a way to capture data from people in locations where paper medical records predominate? For example, opt-in surveys via mobile phone, with generic QR codes provided on paper at the point of care and completed by individuals at a place and time of their own choosing, so that the health care provider does not see the survey responses?

Regarding use of information aggregated from mobile phone use: It may be possible to determine what time phones were used or where they were used, but drawing conclusions from shifts in that data is difficult. One possible approach could be to correlate this data against known local and global events - for example, if people are staying up later or going to bed earlier during dates that correspond to the pandemic, or migrating out of cities during the same time period, that may be visible in smartphone use data. What it says about mental health, however, is an open question unless context can be gathered from people about, e.g., why their sleep patterns have changed. Maybe it’s because they need to get work done from home when other family members have gone to sleep.
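As a rough illustration of the comparison described above, here is a minimal Python sketch that measures how the average time of last phone use shifts around a known event date. Everything in it is an assumption made for illustration: the function name `mean_shift`, the record layout (date, hour of last use), the cutoff date, and the values themselves, which are made up and not real data. It only shows that a shift can be computed; it says nothing about why the shift happened, which still needs context from the people involved.

```python
from datetime import date
from statistics import mean


def mean_shift(records, event_date):
    """Return mean last-use hour on/after event_date minus the mean before it.

    records: iterable of (date, last_use_hour) pairs. Hours are counted on a
    continuous scale so that use after midnight is above 24 (e.g. 24.5 means
    00:30 the next morning). This layout is a hypothetical choice.
    """
    before = [hour for day, hour in records if day < event_date]
    after = [hour for day, hour in records if day >= event_date]
    if not before or not after:
        raise ValueError("need observations on both sides of the event date")
    return mean(after) - mean(before)


# Made-up illustrative values, NOT real data.
records = [
    (date(2020, 3, 1), 22.5),
    (date(2020, 3, 2), 23.0),
    (date(2020, 3, 20), 24.0),
    (date(2020, 3, 21), 24.5),
]

# Hypothetical cutoff date standing in for a known local or global event.
shift = mean_shift(records, date(2020, 3, 15))
print(f"Shift in mean last-use hour: {shift:+.2f} hours")
```

A positive value would mean phones were used later on average after the event. A real analysis would also need to account for seasonality, weekdays versus weekends, and who is missing from the data.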

Thanks @stephaniel for sharing this important resource and highlighting details on the use of information gathered from mobile phones.