SARANG DEB SAHA

Data Scientist | Machine Learning Engineer

LinkedIn | GitHub

About

Data Scientist with expertise in deep learning, computer vision, and advanced analytics. Experienced in developing and deploying production-grade machine learning solutions and optimizing data pipelines across financial technology, sports analytics, and road safety. Seeking to apply strong technical skills and leadership experience to complex data challenges and product development.

Work Experience

Analyst - Analytics

Bureau Inc.

Nov 2024 - Apr 2025

Collaborated with Data Engineering and Data Science teams to design scalable dashboards and automate data pipelines.

  • Built the dashboards in Apache Superset and automated the pipelines with Python, SQL, and Airbyte, optimizing task scheduling with DAGs.

Data Scientist

Rally Vision

Aug 2024 - Nov 2024

Developed and fine-tuned deep learning models for real-time sports ball tracking.

  • Developed and fine-tuned deep learning models (TrackNet) for real-time squash ball tracking using Python, TensorFlow, PyTorch, and OpenCV.
  • Generated comprehensive datasets and processed match footage with FFmpeg, yielding key insights into player behavior and match dynamics.
  • Deployed models in Docker environments and collaborated with cross-functional teams to integrate tracking data into broadcasting systems, significantly enhancing viewer engagement.

Data Scientist

Bureau Inc.

May 2025 - Present

Leading the end-to-end development of a flagship production tool for financial technology.

  • Led end-to-end development of 'FinSpector', a flagship production tool for bank statement parsing, transaction categorization, and mule detection, one of a select few such platforms in India.
  • Engineered a robust PDF and OCR pipeline leveraging regex and NLP to achieve 97%+ accuracy in transaction extraction and classification, deploying a rule-based fraud engine to identify suspicious patterns.
  • Enabled clients to perform real-time creditworthiness assessment and fraud detection directly from uploaded statements, reducing manual underwriting time by 60%.
  • Positioned the tool for enterprise rollout with SaaS pricing, forecasting significant revenue opportunities through B2B lending partnerships.

Project Intern

CiSTUP, IISc (Indian Institute of Science)

Jan 2024 - Jul 2024

Conducted a comparative study on road safety datasets and implemented computer vision models for traffic violation detection.

  • Conducted a comparative study of road safety datasets and blackspot definitions across India, the UK, the US, and France, identifying critical feature gaps and inconsistencies in Indian data.
  • Implemented YOLO-based computer vision models with centroid tracking to detect traffic violations including triple riding, helmet-less riding, and wrong-way driving.
  • Developed a web scraping tool using BeautifulSoup4 to extract and structure First Information Report (FIR) data from Karnataka (post-2016) via bounding-box-based parsing.
  • Performed spatio-temporal analysis of violations across 50 Bengaluru traffic junctions, integrating IUDX, BTP, and meteorological datasets to identify violation trends and build predictive models.

Data Science Intern

Quidich Innovation Labs

Oct 2023 - Apr 2024

Contributed to the development of a recommendation engine and improved real-time player tracking solutions.

  • Developed a recommendation engine to automate storyline generation for cricket commentators.
  • Tested and benchmarked the company's real-time player tracking solutions, QT (Quidich Tracker) and QStat, improving efficiency by 20%.
  • Contributed to the development of new YOLO-based object detection models for cricket ball tracking, leveraging highlight videos for real-time detection and tracking in QT.

Education

Computer Science and Engineering

The LNM Institute of Information Technology

CGPA: 7.5

Nov 2020 - Jul 2024

Positions of Responsibility

  • Served as Chairman of the ACM LNMIIT Students Chapter.
  • Served as Coordinator of the Gender Sensitization and Equality Council.
  • Elected Senator of the Students' Gymkhana, LNMIIT (the student council body), for 2022-23.

Physics, Chemistry, Maths, English

Delhi Public School

Percentage: 96%

Apr 2019 - Jun 2020

Projects

Cognizance of the Premier League

Jan 2023 - Jan 2024

Exploratory data analysis and predictive modeling in football analytics, focusing on Premier League teams and the transfer market.

VaahanFlow

Jan 2023 - Jan 2024

Developed a real-time traffic density estimation system using YOLOv8 for urban traffic management.

Evidence vs Eminence

Jan 2023 - Jan 2024

Explored the IPL auction market using statistical analysis and machine learning models to uncover insights into player valuation and predict auction prices.

Awards

Research Paper Acceptance: Exploring Anomaly Detection Techniques for Crime Detection

ICRTCIS 2024 / Springer Book Series

Jan 2024

Paper accepted for presentation at the ICRTCIS 2024 conference and published in the Springer Book Series 'Algorithms of Intelligent Systems'.

Research Paper: Cognizance of the Premier League (Peer-Review Process)

IJSMM by InderScience

Jan 2024

Paper on football analytics currently under peer review for publication in a Q2 journal.

Chanakya UG Fellowship

iHUB DivyaSampark, IIT Roorkee

Jan 2023

Awarded the fellowship as the only team shortlisted from Jaipur.

TATA Crucible Campus Quiz 2022 Finalist (Rajasthan Zone)

TATA Crucible

Jan 2022

Placed in the top 6 out of 20,000+ students in the Rajasthan zone finals and 1st in the Cluster wildcard round.

Publications

Exploring Anomaly Detection Techniques for Crime Detection

Springer Book Series 'Algorithms of Intelligent Systems'

Jan 2024

Research on detecting anomalous events with criminal intent using deep learning, specifically Convolutional Neural Networks.

Cognizance of the Premier League: An In-Depth Exploration of Team Performance, Player Transfers, Referee Dynamics, and Player Position Prediction for Scouting

IJSMM by InderScience (peer-review process)

Jan 2024

Exploration of Premier League team performance, player transfers, referee dynamics, and player position prediction for scouting, using exploratory data analysis and predictive modeling.

Languages

English (Fluent), Hindi (Native)

Skills

Programming Languages

  • Python
  • C++
  • C

Libraries & Frameworks

  • PyTorch
  • TensorFlow
  • OpenCV
  • Scikit-learn
  • Matplotlib
  • Seaborn
  • spaCy
  • Apache Superset
  • Airbyte

Tools & Platforms

  • Docker
  • AWS
  • GCP
  • Tableau
  • CleverTap
  • FFmpeg
  • BeautifulSoup4
  • HTML5
  • CSS
  • JavaScript
  • PHP
  • Unreal Engine

Databases

  • MySQL

Machine Learning

  • Deep Learning
  • Computer Vision
  • Predictive Modeling
  • Anomaly Detection
  • Recommendation Systems
  • Regression Models
  • YOLO
  • TrackNet
  • CNNs (VGG19, DenseNet121, ResNet50, MobileNetV2)

Data Analysis & Engineering

  • Exploratory Data Analysis (EDA)
  • Data Pipelines
  • SQL
  • Data Scraping
  • Video Analytics
  • Spatio-temporal Analysis
  • Dataset Creation