Mental Health LLM Evaluation
December 1, 2024
This research project evaluates Large Language Models (LLMs) for mental health applications, focusing on bias and fairness across demographic groups and clinical scenarios.
- Bias Detection: Systematically identify and measure bias in LLM responses related to mental health topics
- Fairness Assessment: Evaluate fairness across various demographic groups including age, gender, ethnicity, and socioeconomic status
- Clinical Accuracy: Assess the accuracy and appropriateness of mental health-related responses
- Safety Evaluation: Examine potential risks and safety concerns in mental health applications
- Dataset Creation: Curated comprehensive test sets covering various mental health scenarios and demographic contexts
- Multi-Model Evaluation: Tested multiple state-of-the-art LLMs including GPT-4, Claude, and specialized mental health models
- Bias Metrics: Applied established bias detection metrics and developed custom evaluation criteria for mental health contexts
- Expert Review: Collaborated with mental health professionals for clinical validation
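One common way to build demographically controlled test sets is counterfactual prompting: the clinical content of a prompt stays fixed while only the demographic attributes change, so responses can be compared across variants. The following is a minimal sketch of that idea, not the project's actual dataset or code. The template text, the attribute values, and the function name `build_counterfactual_prompts` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical template and attribute values, for illustration only.
TEMPLATE = (
    "I am a {age} {gender} and I have been feeling hopeless lately. "
    "What should I do?"
)
ATTRIBUTES = {
    "age": ["19-year-old", "45-year-old", "72-year-old"],
    "gender": ["man", "woman", "nonbinary person"],
}


def build_counterfactual_prompts(template, attributes):
    """Return one prompt per combination of attribute values.

    Every prompt shares the same clinical content and differs only in the
    demographic slots, so model responses can be compared across variants.
    """
    keys = list(attributes)
    prompts = []
    for values in product(*(attributes[k] for k in keys)):
        slots = dict(zip(keys, values))
        prompts.append({"slots": slots, "prompt": template.format(**slots)})
    return prompts


prompts = build_counterfactual_prompts(TEMPLATE, ATTRIBUTES)
```

Each entry keeps its `slots` dictionary next to the prompt text, which makes it straightforward to group model responses by attribute later on.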
- Python: Primary programming language for data analysis and model evaluation
- Machine Learning Libraries: PyTorch, Transformers, scikit-learn for model interaction and analysis
- Statistical Analysis: Advanced statistical methods for bias measurement and significance testing
- Visualization: Custom dashboards for presenting bias patterns and evaluation results
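As an illustration of the significance testing mentioned above, the sketch below runs a two-sided permutation test on the difference in mean response scores between two demographic variants. It is an assumption about what such a test could look like, not the method the project used. The function name `permutation_test` and the scores are made up, and only the Python standard library is used.

```python
import random
from statistics import mean


def permutation_test(scores_a, scores_b, n_permutations=5_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)
    observed = mean(scores_a) - mean(scores_b)
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign scores to the two groups.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up rubric scores (1-5 scale) for responses to two prompt variants.
group_a = [4.0, 4.5, 4.0, 5.0, 4.5, 4.0]
group_b = [3.0, 3.5, 3.0, 4.0, 3.5, 3.0]
observed, p_value = permutation_test(group_a, group_b)
```

A permutation test makes no distributional assumptions about the scores, which is convenient for small samples of rubric-based ratings. When many demographic pairs are compared, a multiple-comparison correction would also be needed.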
The research revealed significant bias patterns across different demographic groups and highlighted the importance of careful consideration when deploying LLMs in mental health contexts. The findings contribute to the development of more responsible AI systems in healthcare.
This work contributes to the growing body of research on responsible AI in healthcare, providing frameworks and metrics that can be used by other researchers and practitioners working on mental health AI applications.
Related projects
- F1 AI Race Predictor: Built an end-to-end race prediction platform using historical race data, weather, driver performance, and qualifying results, achieving 68.5% accuracy
- Portfolio Optimization Dashboard: Designed a full-stack investment optimization system supporting strategies like Markowitz, Black-Litterman, and Risk Parity, with real-time analytics dashboards
- Nashville Airbnb Data Analysis: Enhancing InsideAirbnb.com with Predictive Analytics on Nashville Listings: A Data-Driven Approach to Price and Rating Predictions
- Restaurant Review Data Dive: Uncovering Customer Satisfaction Drivers through Sentiment Analysis and Predictive Modeling of Restaurant Reviews across Multiple States
- Car Sales Data Dive: Predicting Car Sale Prices through Advanced Data Cleaning, Feature Engineering, and Regression Modeling
- Lyft Market Analysis: Lyft Market Expansion Strategies and Optimization in Washington D.C. using Tableau visualization
- COVID-19 Clustering Analysis: Comparing Spectral Clustering with K-means and KNN to determine relationships between COVID-19 transmission peaks
- Bird Data Analysis: Exploratory data analysis and pattern recognition in bird observation datasets