About Me

I’m a senior at UIUC studying Computer Science, working on Language Model Intelligence. I’m fortunate to be advised by Professor Dilek Hakkani-Tur and Professor Heng Ji.

I’m currently working at Rivian. I’ve previously interned at Yahoo Research.

DM me on X or email me at sumuk[atsymbol]sumuk.org.

Select Publications

* denotes first-author work.

  • Unsupervised Human Preference Learning. Paper. Slides. Poster.
    • Sumuk Shashidhar *, Abhinav Chinta, Vaibhav Sahai and Dilek Hakkani-Tur. 2024. Unsupervised Human Preference Learning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, pages pending-pending, United States. Association for Computational Linguistics.
  • Democratizing LLMs: An Exploration of Cost-Performance Trade-offs in Self-Refined Open-Source Models. Paper. Slides.
    • Sumuk Shashidhar *, Abhinav Chinta, Vaibhav Sahai, Zhenhailong Wang, and Heng Ji. 2023. Democratizing LLMs: An Exploration of Cost-Performance Trade-offs in Self-Refined Open-Source Models. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 9070–9084, Singapore. Association for Computational Linguistics.
  • Improving Task-Oriented Dialogue. Paper. Slides.
    • Takyoung Kim, Emre Can Acikgoz, Beyza Bozdag, Janvijay Singh, Sagnik Mukherjee, Shuhaib Mehri, Sumuk Shashidhar, Dilek Hakkani-Tür. Under Review.
  • Scaling Laws For Natural Language Planning Models
    • Abhinav Chinta, Sumuk Shashidhar, Vaibhav Sahai, Yangyi Chen, Heng Ji. Under Review.

Talks

Talks and paper presentations that I’ve given:

  • Democratizing LLMs: An Exploration of Cost-Performance Trade-offs in Self-Refined Open-Source Models. Slides.
  • Accessing GPT-4 Level Mathematical Olympiad Solutions via MCTSr with LLaMa-3 8B. Slides.
  • Direct Preference Optimization. Slides.
  • GPTFuzzer. Slides.

Work

  • Sep ‘23 - Present: Rivian. NDA. R2 New Features.
  • May ‘23 - Aug ‘23: Yahoo. ML for ad pricing on demand-side platform.
  • May ‘22 - May ‘23: AGCO. Supply chain analytics for risk forecasting.
  • Jun ‘21 - May ‘22: QuantIllinois. HFT infrastructure for anomaly detection.
  • Jun ‘19 - Mar ‘20: Cisco. Caching for internal microservices.

Preprints

Some of these are works that were personally interesting to me but never cleared my own quality threshold to be formalized.

  • SpaceKraft: A Vision Language Approach For Interior Design Inpainting. Paper.
  • Ideal Prompt Generation for Text-Conditional Image Synthesis from Large Text Corpora. Paper.
  • Decentralisation of Election Processes in Developing Nations. Paper.
  • Token Efficient Deep Conversational Reasoning With ConvoDAGs. Paper.