Research

I am currently interested in the following areas of research:

AI Safety, more specifically:
- Evaluations
- Red Teaming
- Interpretability

Machine Learning
Natural Language Processing
AI for Social Good

Publications

Check Yourself Before You Wreck Yourself: Selectively Quitting Improves LLM Agent Safety
Vamshi Krishna Bonagiri, Ponnurangam Kumaraguru, Khanh Nguyen, Benjamin Plaut
Reliable ML and Regulatable ML workshops, NeurIPS 2025
If Pigs Could Fly... Can LLMs Logically Reason Through Counterfactuals?
Ishwar B Balappanawar*, Vamshi Krishna Bonagiri*, Anish R Joishy*, Manas Gaur, Krishnaprasad Thirunarayan, Ponnurangam Kumaraguru
arXiv preprint, 2025 (Under Review)
SaGE: Evaluating Moral Consistency in Large Language Models
Vamshi Krishna Bonagiri, Sreeram Vennam, Priyanshul Govil, Ponnurangam Kumaraguru, Manas Gaur
International Conference on Computational Linguistics, COLING 2024
Dark Side of the Tune: Investigating the maladaptive outcomes of excessive music consumption in the age of unlimited music access
Vamshi Krishna Bonagiri, Vinoo Alluri
18th International Conference on Music Perception and Cognition, ICMPC 2025
Measuring Moral Inconsistencies in Large Language Models
Vamshi Krishna Bonagiri, Sreeram Vennam, Manas Gaur, Ponnurangam Kumaraguru
The Sixth BlackboxNLP Workshop, EMNLP 2023
From Human Judgements to Predictive Models: Unravelling Acceptability in Code-Mixed Sentences
Prashant Kodali, Anmol Goel, Likhith Asapu, Vamshi Krishna Bonagiri, Anirudh Govil, Monojit Choudhury, Manish Shrivastava, Ponnurangam Kumaraguru
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP 2025)
Towards Effective Paraphrasing for Information Disguise
Anmol Agarwal, Shrey Gupta, Vamshi Krishna Bonagiri, Manas Gaur, Joseph Reagle, Ponnurangam Kumaraguru
European Conference on Information Retrieval, ECIR 2023
Are Deepfakes Concerning? Analyzing Conversations of Deepfakes on Reddit and Exploring Societal Implications
Dilrukshi Gamage, Piyush Ghasiya, Vamshi Krishna Bonagiri, Mark E Whiting, Kazutoshi Sasahara
CHI Conference on Human Factors in Computing Systems, CHI 2022
Cobias: Contextual Reliability in Bias Assessment
Priyanshul Govil, Hemang Jain, Vamshi Krishna Bonagiri, Aman Chadha, Sanorita Dey, Ponnurangam Kumaraguru, Manas Gaur
Web Science Conference, WebSci 2025
Representation Learning for Identifying Depression Causes in Social Media
Priyanshul Govil, Vamshi Krishna Bonagiri, Mayank Gaur, Ponnurangam Kumaraguru
Third ACM SIGKDD Workshop on Knowledge-infused Learning (KiL 2023)

Ongoing Research Projects

AI Safety Evaluations - Developing robust evaluation frameworks for LLMs and investigating their consistency and their logical and moral reasoning capabilities.
Quantifying Uncertainty in AI Systems - Developing robust measures for complex events in which a model must recognize its own uncertainty.