Multimodal Health Assistant

PUBLICATION

The Good, the Bad, and the Facts: Multimodal Representation of Medical Conversations for Patient Understanding,
MIT Master's Thesis, Spring 2019

MOTIVATION

In my first year of grad school, my dad was diagnosed with, and passed away from, a very rare form of leukemia. In the seven months between his diagnosis and death, my parents struggled to make sense of the overwhelming amount of information doctors gave them. The consequence of each decision was literally life or death, yet they were often unsure whether they completely understood what the doctors were telling them. After my dad passed, I realized many cancer patients must face similar challenges, so I wanted to make a tool to help.

THE PROJECT

In this project, I created and tested a prototype web app to help address the challenge of reviewing and synthesizing information from medical appointments for patients enduring serious and emotionally demanding health conditions.

DESIGN

Emotional medical conversations are difficult for patients to record, synthesize, and review, so the solution needed to:

Capture the raw content of conversations to reduce cognitive demand on patients during appointments.
Present text and audio records of conversations so patients can review the information easily.
Identify important information to help patients review their appointments without reliving them.

USER JOURNEY

My prototype web app addresses the Review & Synthesize step of the user journey. The prototype needed to highlight important information in the conversation, but I realized that the notion of “important” information was too ambiguous. I chose instead to focus on the positive and negative information from the doctor. While these labels do not capture all important information, I assumed that positive and negative comments were likely to contain important information.

The final prototype transcribed recorded audio with automatic speech-to-text technology and labeled positive and negative speech events. Once the data was processed, the prototype allowed users to review the transcript alongside its audio recording. Positive and negative speech events were visually highlighted to help users identify important information in the conversation. During user studies, the labels were assigned manually, but I later developed an algorithm to assign them with machine learning.

Users could also use the filters to isolate positive and negative information from the transcript. They could click on certain parts of the transcript to hear snippets of the conversation and remember what and how the doctor communicated the information. The presentation of information allowed users to review all (or parts) of the information from appointments in their own time as many times as they chose.
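The review features described above can be pictured as a list of timestamped transcript segments, each carrying an optional label. The following is a minimal sketch in Python, not the prototype's actual code: the `Segment` fields, the `filter_by_label` and `audio_range` helpers, and the sample sentences are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Segment:
    """One speech event in the transcript (hypothetical structure)."""
    speaker: str
    text: str
    start_s: float  # offset into the audio recording, in seconds
    end_s: float
    label: Optional[str] = None  # "positive", "negative", or None if unlabeled


def filter_by_label(segments: List[Segment], label: str) -> List[Segment]:
    """Return only the segments carrying the given label, in original order."""
    return [seg for seg in segments if seg.label == label]


def audio_range(seg: Segment) -> Tuple[float, float]:
    """Time range an audio player would seek to when a segment is clicked."""
    return (seg.start_s, seg.end_s)


# Made-up example data, for illustration only.
transcript = [
    Segment("doctor", "The survival rate for this cancer is high.", 12.0, 16.5, "positive"),
    Segment("doctor", "Treatment starts next week.", 16.5, 19.0),
    Segment("doctor", "The side effects can be severe.", 19.0, 23.2, "negative"),
]

negative_only = filter_by_label(transcript, "negative")
```

Keeping the timestamps on each segment is what would let a click on highlighted text play back just that snippet of audio, so the user hears how the information was delivered as well as what was said.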

USER STUDY METHOD

I designed a controlled study to evaluate how well the web interface assisted patients in reviewing and understanding information from their medical appointments. In particular, I was interested in whether:

The web interface was a useful tool for reviewing conversations
The labels influenced users’ perception of information
The labels encouraged reflection on information

The study consisted of two phases:

Appointment
Because access to real patients and doctors was not possible for this study, I created fictional doctor-patient scenarios for the user studies. Participants and the study proctor enacted fictional appointments where the participant was the patient and the proctor was the doctor. The proctor delivered a fictional cancer diagnosis to the participant. The conversation included relatively positive information (e.g. high survival rates) and relatively negative information (e.g. severe side effects).
Review
After 4 hours, participants reconvened with the proctor and completed a survey about their experience in the earlier appointment, first by referencing only their handwritten notes and then by using the web interface. After completing the survey, participants were interviewed about their experience in the fictional appointment and their reactions to the web interface.

USER STUDY RESULTS

In total, 25 adults agreed to participate in the study. Despite the fictional scenarios, several participants expressed that receiving even a fictional diagnosis was an emotional experience.

Influence on Perception

More than half of participants reported a change in their perceived valence of the conversation after using the web interface and indicated that the interface influenced their opinion of the conversation. Some reported a positive change (they saw more positive information with the interface) and some reported a negative change (they saw more negative information with the interface).

Participants responded to the question, “Based on your understanding, the overall content of the conversation was:” on a scale of 0 (Very Negative) to 4 (Very Positive). Participants rated the conversation first using only their handwritten notes and again after using the web interface to review the conversation. These charts show the difference between each participant's ratings before and after using the interface.
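The before-and-after comparison amounts to a simple per-participant difference on the 0 to 4 scale. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, and the example pairs are made up and are not study data.

```python
from typing import Dict, List, Tuple

SCALE_MIN, SCALE_MAX = 0, 4  # 0 = Very Negative, 4 = Very Positive


def rating_change(before: int, after: int) -> int:
    """Difference between the rating after using the interface and before."""
    for value in (before, after):
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"rating {value} is outside the 0-4 scale")
    return after - before


def summarize_changes(pairs: List[Tuple[int, int]]) -> Dict[str, int]:
    """Count how many participants shifted positive, negative, or not at all."""
    counts = {"more positive": 0, "more negative": 0, "no change": 0}
    for before, after in pairs:
        diff = rating_change(before, after)
        if diff > 0:
            counts["more positive"] += 1
        elif diff < 0:
            counts["more negative"] += 1
        else:
            counts["no change"] += 1
    return counts


# Made-up (before, after) pairs, for illustration only; not study results.
example_pairs = [(2, 3), (3, 1), (2, 2)]
summary = summarize_changes(example_pairs)
```

A positive difference means the participant saw the conversation as more positive after using the interface, and a negative difference means the opposite.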

Influence on Reflection

The positive and negative labels were manually assigned for these user studies. However, almost all participants assumed these were assigned algorithmically. With this assumption, 96% of participants indicated they trusted the classification. In fact, when the labels differed from their opinion, participants said it encouraged them to review the conversation more closely and reflect on how the information could be interpreted differently.

Usability & Usefulness

Participants described the usability of the web interface, and their experience with it, as very positive. Survey results show that participants found the interface easy to use and helpful for finding important information.

NEXT STEPS

If I were to continue developing this project, I would like to explore further methods for summarizing appointments and retrieving “important” information from transcripts. Unfortunately, results regarding the prototype’s influence on participants’ understanding of the conversation were inconclusive. Therefore, I would also like any further exploration to clarify how this type of interface influences patient understanding of medical conversations.