Research

I am interested in the following areas of research:

  • AI Safety & Robustness
  • AI Interpretability
  • Natural Language Processing
  • AI for Social Good
  • Human-AI Interaction

Publications

Conference & Workshop Papers

  • SaGE: Evaluating Moral Consistency in Large Language Models
    Vamshi Krishna Bonagiri, Sreeram Vennam, Priyanshul Govil, Ponnurangam Kumaraguru, Manas Gaur
    Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024
    [Paper]
  • Towards Effective Paraphrasing for Information Disguise
    Anmol Agarwal, Shrey Gupta, Vamshi Krishna Bonagiri, Manas Gaur, Joseph Reagle, Ponnurangam Kumaraguru
    European Conference on Information Retrieval, ECIR 2023
    [Paper]
  • Are Deepfakes Concerning? Analyzing Conversations of Deepfakes on Reddit and Exploring Societal Implications
    Dilrukshi Gamage, Piyush Ghasiya, Vamshi Krishna Bonagiri, Mark E Whiting, Kazutoshi Sasahara
    CHI Conference on Human Factors in Computing Systems, 2022
    [Paper]
  • COBIAS: Contextual Reliability in Bias Assessment
    Priyanshul Govil, Hemang Jain, Vamshi Krishna Bonagiri, Aman Chadha, Sanorita Dey, Ponnurangam Kumaraguru, Manas Gaur
    ACM Web Science Conference, WebSci 2025
    [Paper]
  • Measuring Moral Inconsistencies in Large Language Models
    Vamshi Krishna Bonagiri, Sreeram Vennam, Manas Gaur, Ponnurangam Kumaraguru
    The Sixth BlackboxNLP Workshop, EMNLP 2023
    [Paper]
  • Representation Learning for Identifying Depression Causes in Social Media
    Prerna Govil, Vamshi Krishna Bonagiri, Manas Gaur, Ponnurangam Kumaraguru
    Third ACM SIGKDD Workshop on Knowledge-infused Learning (KiL), 2023
    [Paper]

Current Research Projects

  • AI Safety Evaluations - Developing robust evaluation frameworks for LLMs and investigating their consistency and reasoning capabilities.
  • Adversarial Attacks - Establishing adversarial attack bounds for LLMs and multimodal ML systems.
  • Quantifying Uncertainty in AI Systems - Developing robust uncertainty measures for complex events where a model must quantify its own uncertainty.