Ankit Gupta

I recently graduated with an M.S. in Computer Science (Research Thesis) from the CS Department at Carnegie Mellon's School of Computer Science. I defended my graduate thesis on vision-language models for medical report generation. Previously, I interned at IMC Trading, Facebook/Instagram, and Amazon.

Email  /  Google Scholar  /  Twitter  /  GitHub

Research

See Google Scholar for more.
Analyzing Multimodal Machine Learning Model Performance and Evaluation Metrics for Medical Report Generation
Ankit Gupta, Min Xu, Martin Zhang, Bryan Wilder
MSCS Thesis Defense
[paper] [talk]

We compare several approaches to generating medical reports on a dataset of chest X-ray reports, including a unimodal fine-tuned medical LLM, a multimodal model without symptom data, and a multimodal model with symptom data. We then introduce four new metrics for evaluating the similarity between generated and reference medical reports: Word Pairs, Sentence Average, Sentence Pairs, and Sentence Pairs (Bio).


Website design from Jon Barron