About Me

I work on Language Model Intelligence. I’m helping build the next generation of Evals @ Hugging Face. I’m currently a senior at UIUC studying CS, and an incoming PhD student. I’m advised by the wonderful Dilek Hakkani-Tür and Heng Ji.

In a previous life, I worked at Rivian and Yahoo Research.

I’m a member of Giving What We Can and I’ve pledged to donate at least 10% of my lifetime income to charity.

DM me on X or email me at sumuk[atsymbol]sumuk.org.

Select Publications

* denotes first-author work.

  • YourBench: Easy Custom Evaluation Sets for Everyone. Paper.
    • Sumuk Shashidhar, Clémentine Fourrier, Alina Lozovskaya, Thomas Wolf, Gokhan Tur, Dilek Hakkani-Tür. Under Review.
  • Unsupervised Human Preference Learning. Paper. Slides. Poster.
    • Sumuk Shashidhar *, Abhinav Chinta, Vaibhav Sahai, and Dilek Hakkani-Tür. 2024. Unsupervised Human Preference Learning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, pages pending-pending, United States. Association for Computational Linguistics.
  • Democratizing LLMs: An Exploration of Cost-Performance Trade-offs in Self-Refined Open-Source Models. Paper. Slides.
    • Sumuk Shashidhar *, Abhinav Chinta, Vaibhav Sahai, Zhenhailong Wang, and Heng Ji. 2023. Democratizing LLMs: An Exploration of Cost-Performance Trade-offs in Self-Refined Open-Source Models. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 9070–9084, Singapore. Association for Computational Linguistics.
  • Improving Task-Oriented Dialogue. Paper. Slides.
    • Takyoung Kim, Emre Can Acikgoz, Beyza Bozdag, Janvijay Singh, Sagnik Mukherjee, Shuhaib Mehri, Sumuk Shashidhar, Dilek Hakkani-Tür. Under Review.
  • Scaling Laws For Natural Language Planning Models.
    • Abhinav Chinta, Sumuk Shashidhar, Vaibhav Sahai, Yangyi Chen, Heng Ji. Under Review.

Talks

Talks and paper presentations that I’ve given.

  • Democratizing LLMs: An Exploration of Cost-Performance Trade-offs in Self-Refined Open-Source Models. Slides.
  • Accessing GPT-4 Level Mathematical Olympiad Solutions via MCTSr with LLaMa-3 8B. Slides.
  • Direct Preference Optimization. Slides.
  • GPTFuzzer. Slides.

Work

  • Sep ’23 - Present: Rivian. NDA. R2 New Features.
  • May ’23 - Aug ’23: Yahoo. ML for ad pricing on demand-side platform.
  • May ’22 - May ’23: AGCO. Supply chain analytics for risk forecasting.
  • Jun ’21 - May ’22: QuantIllinois. HFT infrastructure for anomaly detection.
  • Jun ’19 - Mar ’20: Cisco. Caching for internal microservices.

Preprints

Some of these are works that were personally interesting to me but never cleared my own quality threshold to be formalized.

  • SpaceKraft: A Vision Language Approach For Interior Design Inpainting. Paper.
  • Ideal Prompt Generation for Text-Conditional Image Synthesis from Large Text Corpora. Paper.
  • Decentralisation of Election Processes in Developing Nations. Paper.
  • Token Efficient Deep Conversational Reasoning With ConvoDAGs. Paper.