Pratyush Chatterjee

Work

CNeRG- I.I.T. Kharagpur
|

Research Intern

Summary

Investigated cultural sensitivity in small-parameter large language models (LLMs), focusing on mitigating culturally harmful outputs. Fine-tuned LLMs, including Mistral-v0.2 (7B), Llama-2 (7B, 13B), Vicuna (13B), and Phi (4B), using alignment techniques such as Direct Preference Optimization (DPO) and Odds Ratio Preference Optimization (ORPO), achieving up to a 70% reduction in harmful outputs. Used Captum, an explainability library, to interpret model decisions and understand the attention distributions that influence harmful responses.

Deloitte
|

AI-Powered Employee Wellness & Counselling Platform

Summary

Developed a multi-agent GenAI system to detect and support emotionally distressed employees through adaptive, empathetic conversations. Orchestrated autonomous agents using LangGraph and Agno, enabling context-aware, emotion-driven interactions powered by GPT-4o-mini and RAG. Implemented anomaly detection with Isolation Forest and LOF to flag at-risk employees for proactive intervention. Deployed a full-stack solution with real-time reporting, HR escalation, and scalable infrastructure using GCP, GKE, Vercel, and GitHub Actions.

800Club
|

AI Developer | Agentic Mock Paper Generation System

Summary

Architected and implemented an AI-driven exam question generator leveraging LangGraph, OpenAI's GPT-4o, and open-source LLMs to dynamically construct CUET exam papers. Integrated vector embeddings for semantic indexing and retrieval of historical exam content, enabling context-aware information extraction. Designed a scalable multi-agent architecture with dedicated agents for topic distribution, context synthesis, and question generation, ensuring domain-specific precision. Engineered a resilient RAG-based workflow with layered exception handling and graceful degradation, incorporating fallback mechanisms for uninterrupted exam paper creation.

|

Stock Market Analysis and Prediction

Summary

Developed a stock market analysis and prediction model in Python, leveraging historical data and technical indicators to achieve an accuracy score of 0.81. Used the yfinance library to retrieve historical stock data. Built features from technical indicators such as SMA, EWMA, MACD, Bollinger Bands, RSI, and the Stochastic Oscillator to support informed trading decisions.

Open IIT Data Analytics
|

Airline Operations Optimization

Summary

Predicted flight delays with CatBoost, achieving an R² score of 0.918 after extensive hyperparameter tuning. Reduced flight delays by 53.55% at Atlanta Airport using an OR-based optimization model built in Gurobi, incorporating constraints such as airport capacity and timing adjustments. Processed and cleaned 1.3M flight records, integrating weather data via the Meteostat API and handling missing values with MICE imputation.

About

Education

Indian Institute of Technology, Kharagpur

Dual Degree

Industrial and Systems Engineering

Grade: 9.03

B.G.K.V., Kolkata

Class XII, AISSCE

Grade: 96.2%

B.G.K.V., Kolkata

Class X, AISSE

Grade: 96%

Awards

First Place - General Championship

Awarded By

OpenSoft IIT KGP

Third Place

Awarded By

Open IIT Data Analytics

Publications

Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment

Summary

Proposed Soteria, a novel safety alignment strategy that modifies only ~3% of language-specific model parameters, significantly reducing harmful content generation. Developed XThreatBench, a 3,000-instance multilingual safety benchmark covering 12 languages and 10 high-risk categories derived from real-world policy guidelines. Achieved a 40-60% reduction in attack success rates across high-, mid-, and low-resource languages while maintaining general model performance. Conducted large-scale experiments with open-source LLMs (Llama 3.1, Qwen 2, Mistral, Phi 3.5), demonstrating consistent improvements in multilingual safety. Read More: arXiv

Skills

Programming Languages

Python, C, C++.

Frameworks and Libraries

PyTorch, TensorFlow, NumPy, Pandas, Matplotlib, Scikit-learn, LangChain, LangGraph.

Interests

Achievements

Secured All India Rank 4991 in JEE Advanced 2023, among 2.5 lakh candidates. Ranked in the top 1% in JEE Mains 2023, among 12 lakh candidates. Secured a rank of 264 in WBJEE 2023.

Coursework Information

Advanced Calculus, Linear Algebra, Numerical and Complex Analysis, Operations Research, Programming and Data Structures, Probability and Statistics.