Avi Kumar Talaviya

Data Scientist | ML Engineer | Gen AI Consultant

Specializing in Generative AI, Advanced RAG solutions, and scalable MLOps.

Technical Expertise

Generative AI & LLMs

Agentic AI, Advanced RAG, LangChain, OpenAI, Hugging Face, LlamaIndex, Multimodal Models, Prompt Engineering, Fine-Tuning, Semantic Chunking, Hybrid Retrieval.

Data & ML Engineering

Python, R, SQL, TensorFlow, PyTorch, Keras, Scikit-learn, Pandas, Matplotlib, Data Structures and Algorithms, Text/Image Analysis.

Cloud & MLOps/DevOps

AWS (EC2, Lambda, S3), Azure Data Warehousing, HPC, CI/CD, MLOps, Prefect, Spark/PySpark, Git/GitHub.

Deep Learning

Transformers, CNNs, RNNs, LSTMs, Deep Reinforcement Learning.

Education

M.S. in Information Science (Specialization: ML)

University of Arizona

July 2024 - May 2026 | Current GPA: 4.0

Coursework: Machine Learning, Deep Learning, Applied NLP, Data Mining, Cloud Data Warehousing (Azure).

Bachelor's in Data Science and Analytics

Jain University

August 2021 - July 2024 | CGPA: 9.3/10

Coursework: Statistics, Machine Learning, Data Mining, Time-Series Analysis.

Work Experience

Research Collaborator at University of Arizona - ECE

August 2025 - Present | Tucson, AZ

Lead research on the classification of healthcare data (EEG/fMRI). Apply **transformer-based multimodal models** and use High-Performance Computing (HPC) for large-scale dataset preprocessing and training, demonstrating expertise in complex multimodal GenAI workflows.

Data Analytics Training Specialist at Tops Technologies Pvt. Ltd.

September 2024 - May 2025 | Surat

Trained learners in data analytics tools (Python, Power BI, SQL). Optimized database design and complex SQL queries, achieving a documented efficiency gain of up to **70%**.

Data Science Project Lead at Omdena Local Chapter

March 2023 - May 2023 | Mumbai

Led a team of 25+ members on an AQI monitoring and prediction project. Developed a **time-series forecasting model** to predict AQI in Mumbai, evaluated with RMSE, with reported accuracy above **80%**.

Projects & Achievements

NL2SQL Generation Application
  • Designed an NL2SQL GenAI app using LlamaIndex and an LLM to automate SQL query generation from natural-language questions.
  • Incorporated **Advanced RAG** techniques: Semantic Chunking, Metadata Filtering, and Vector Search Optimization.
  • Implemented LLM optimization (Prompt Compression, Caching) to reduce latency.
GenAI | RAG | LlamaIndex
Cyberbullying Detection using NLP
  • Performed end-to-end unstructured text data processing and cleaning.
  • Applied **word2vec embeddings** with an XGBoost classifier, achieving a log loss of 12.
NLP | XGBoost | Classification
Road Traffic Severity Classification Project
  • Analyzed data using dabl and Pandas, handled missing values, and encoded categorical features.
  • Utilized **Principal Component Analysis (PCA)** to reduce dataset dimensionality.
ML | Data Analysis | Python