Technical Expertise
Generative AI & LLMs
Agentic AI, Advanced RAG, LangChain, OpenAI, Hugging Face, LlamaIndex, Multimodal Models, Prompt Engineering, Fine-Tuning, Semantic Chunking, Hybrid Retrieval.
Data & ML Engineering
Python, R, SQL, TensorFlow, PyTorch, Keras, Scikit-learn, Pandas, Matplotlib, Data Structures and Algorithms, Text/Image Analysis.
Cloud & MLOps/DevOps
AWS (EC2, Lambda, S3), Azure Data Warehousing, HPC, CI/CD, MLOps, Prefect, Spark/PySpark, Git/GitHub.
Deep Learning
Transformers, CNNs, RNNs, LSTMs, Deep Reinforcement Learning.
Education
M.S. in Information Science (Specialization: ML)
University of Arizona
July 2024 - May 2026 | Current GPA: 4.0
Coursework: Machine Learning, Deep Learning, Applied NLP, Data Mining, Cloud Data Warehousing (Azure).
Bachelor's in Data Science and Analytics
Jain University
Aug 2021 - July 2024 | CGPA: 9.3/10
Coursework: Statistics, Machine Learning, Data Mining, Time-Series Analysis.
Work Experience
Research Collaborator at University of Arizona - ECE
August 2025 - Present | Tucson, AZ
Lead research on classifying healthcare data (EEG/fMRI). Apply **transformer-based multimodal models** and use High-Performance Computing (HPC) for large-scale dataset preprocessing and training, as part of complex multimodal GenAI workflows.
Data Analytics Training Specialist at Tops Technologies Pvt. Ltd.
September 2024 - May 2025 | Surat
Trained learners in data analytics tools (Python, Power BI, SQL). Optimized database design and complex SQL queries, achieving a documented efficiency increase of up to **70%**.
Data Science Project Lead at Omdena Local Chapter
March 2023 - May 2023 | Mumbai
Led a team of 25+ members on an AQI monitoring and prediction project. Developed a **time-series forecasting model** to predict AQI in Mumbai, achieving over **80% accuracy**, evaluated using RMSE.
Projects & Achievements
- Designed an NL2SQL GenAI app using LlamaIndex and an LLM to automate query generation.
- Incorporated **Advanced RAG** techniques: Semantic Chunking, Metadata Filtering, and Vector Search Optimization.
- Implemented LLM optimization (Prompt Compression, Caching) to reduce latency.
- Performed end-to-end unstructured text data processing and cleaning.
- Applied **word2vec embeddings** with an XGBoost classifier, achieving a log loss of 12.
- Analyzed data using dabl and Pandas, handled missing values, and performed categorical encoding.
- Utilized **Principal Component Analysis (PCA)** to reduce dataset dimensionality.