Research

I am currently interested in the following areas of research:

  • AI Safety, more specifically:
    • Evaluations
    • Red Teaming
    • Interpretability
  • Machine Learning
  • Natural Language Processing
  • AI for Social Good

Publications

Ongoing Research Projects

  • AI Safety Evaluations - Developing robust evaluation frameworks for LLMs and investigating their consistency and their logical and moral reasoning capabilities.
  • Quantifying Uncertainty in AI Systems - Developing robust measures for complex events in which a model must recognize its own uncertainty.