PROFILE
Experienced Data Science professional with years of IT experience, including Data Science projects, currently working in a high-performing team at JPMorgan Chase & Co, Hyderabad. Kaggle Competitions Expert & Kaggle Kernels Expert.
- 2024 - Present: Applied AI ML Lead at JPMorgan Chase & Co, Hyderabad
- 2021 - 2024: Senior Lead Data Scientist at Tokopedia, Hyderabad
- 2019 - 2021: Data Scientist at ADP, Hyderabad
- 2017 - 2019: Data Scientist at DBS Bank, Hyderabad
- 2013 - 2017: Software Engineer at Tech Mahindra, Hyderabad
SKILLS & ABILITIES
Data Science, Machine Learning, Deep Learning, Python, R, AWS (EC2, Lambda, Sagemaker), GCP (Vertex AI), MLOps, SQL, NLP, Gen AI (Prompt Engineering, RAG), LLMs (PaLM 2, Llama 2), ML Frameworks (Keras, Tensorflow, Flask, LlamaIndex), Vector Databases (Milvus, ChromaDB), Technical Leadership, Team Management
PROFESSIONAL EXPERIENCE
JPMorgan Chase & Co | Applied AI ML Lead | Jul 2024 - Present | Hyderabad, India
- Personalization and Customer Insights, CCB Digital
- Building Account Engagement (CTR) and Origination (CVR) ML models for the Digital Intelligence project under Personalization and Customer Insights, in the Customer Acquisition and Marketing team, CCB Digital.
Tokopedia | Data Science Senior Lead | Jan 2023 - Jul 2024 | Hyderabad, India
- Search Ads - TopAds
- Led 2 data science projects (Ads Ranking and Keyword Recommendation), managed 4 data scientists, and contributed to DS initiatives that benefited the Ads DS team and GoTo DS.
- Consistently delivered high-quality data science solutions for TopAds business problems and contributed positively to Tokopedia's revenue by using Gen AI (PaLM 2 LLM on Vertex AI) for augmented ad title generation for products with low or no impressions, and for product keyword creation for products with low keyword supply.
- Oversaw research and development projects at various stages ranging from initial exploration to deployment into production systems.
- Collaborated with engineering teams to integrate successful A/B tested ML models into large-scale and highly complex production services.
Tokopedia | Principal Data Scientist | Dec 2021 - Dec 2022 | Hyderabad, India
- Real-time Ads Ranking - TopAds
- Built a Real-time Ads Ranking (RTR) system that is aware of product, user and query context.
- Developed a tree-based Learning-to-Rank model to increase the relevancy of ads displayed on the search results page (SRP), which enabled the Real-time Ranking system to achieve a 3.25% increase in CTR and a 1.57% increase in Ads Revenue (~1.6 billion IDR) while maintaining system-level ROAS >= 7.
- Deployed a highly scalable, low-latency, high-throughput model inference pipeline for the Real-time Ranking system, achieving sub-80 ms latency at 1K requests served per second.
- Optimised inference of an existing deep learning solution using post-training quantization techniques, which increased revenue by 1.43% (~1 billion IDR monthly) and saved up to 166 million IDR monthly in inference costs.
ADP | Data Scientist | Nov 2019 - Dec 2021 | Hyderabad, India
Worked as a Data Scientist in the ADP DataCloud team building machine learning models for Job Classification and PayCode Classification.
- Job Classification - ADP DataCloud
- Technologies: Python 3, Machine Learning - NLP, Classification, AWS, Amazon Sagemaker
- The Job Classification model normalises client-entered jobs to standard ADP Job Taxonomies, which enables the benchmarking of an organization's workforce metrics against industry benchmarks in ADP DataCloud.
- Built multi-lingual Job Classification for French & Spanish using Language-agnostic BERT Sentence Embedding (LaBSE) & multi-lingual Universal Sentence Encoder (mUSE) in a Tensorflow Keras model & reduced time to market from 6 to 2 months.
- Developed a multi-class Keras classifier with FastText word embeddings on Amazon Sagemaker and productionised it using Step Functions, ECR, Sagemaker Endpoint and API Gateway.
- Deployed the model as both batch transform and real-time endpoint.
- PayCode Classification - ADP DataCloud
- Technologies: Python 3, Machine Learning - NLP, Classification, AWS, Databricks
- The PayCode Classification model assigns a paycode group and paycode subgroup to the text entered by end users when submitting their timesheets, which powers the workforce trends analytics and reporting in ADP DataCloud.
- Developed a multi-class classifier using FastText on Amazon Sagemaker and productionised it using AWS Batch, Step Functions, ECR, Sagemaker Endpoint and API Gateway.
- Deployed the model as both batch transform and real-time endpoint.
DBS | Data Scientist | Oct 2017 - Oct 2019 | Hyderabad, India
Worked as a Data Scientist in the Core Banking division and built machine learning models for Self-Service Banking - Cash In Transit and Cheque Analytics.
- Cheque Analytics - Self-Service Banking
- Technologies: Python 3, PySpark 2.3, Machine Learning - Classification
- The Cheque Risk Scoring model provides a risk score for all inward cheques using historic cheque attribute data and customer information.
- Developed a binary classifier using LightGBM for the Cheque Risk Scoring model.
- Reduced false negatives by 18% by handling class imbalance with SMOTE and TomekLinks, and derived the best hyperparameters for the model using Bayesian Hyperparameter Optimisation.
- Cash In Transit (CIT) - Self-Service Banking
- Technologies: Python 3, PySpark 2.3, Machine Learning - Time Series Forecasting
- Forecast "no cash" situations for different types of ATM, BTM & CRS machines in PySpark.
- Developed a time-series forecasting model with FB Prophet for ATM, BTM and CRS cash withdrawals, and backtested it against actual results to tune the model and reduce forecast error, resulting in a 15% reduction in cash shortage incidents.
Tech Mahindra | Software Engineer | Nov 2013 - Oct 2017 | Hyderabad, India
Worked as a Software Engineer for the client General Electric (GE) on 4 different projects spanning data ingestion to data science.
- Technologies: Python, R, SQL, Machine Learning - Time Series Forecasting, Clustering
- Developed a Cash Collection machine learning model using Clustering (KMeans) and Decision Trees, and a GE US Payroll model using ARIMA.
- Converted JSON data obtained from APIs into R dataframes and CSV files.
- Developed dynamic Talend jobs for integration of data on Oracle, MySQL, Teradata and PostgreSQL databases using Talend components.
- Created stored procedures and completed code stubs received from team members.
EDUCATION
Electronics and Communications Engineering, GITAM University | 2009 – 2013 | Hyderabad, India
- Bachelor of Technology (B.Tech) in Electronics and Communications Engineering (ECE) with 8.4 CGPA
COURSES
- Summer School on NLP, IIIT Hyderabad | Jun 2024 | Hyderabad, India
- Attended Summer School on NLP with a focus on Large Language Models (LLM) and Natural Language Generation (NLG), conducted by LTRC, IIIT Hyderabad
- Summer School on CV and DL, IIIT Hyderabad | Jul 2018 | Hyderabad, India
- Attended Summer School on Computer Vision and Deep Learning conducted by CVIT, IIIT Hyderabad
AWARDS
- Best Summer School Project, IIIT Hyderabad | Jul 2024
- Awarded 2nd place for the project "Continued Pre-training of LLM on Legal Contracts - LLM Domain Adaptation" at Summer School on NLP, IIIT Hyderabad
- Super Techie, DBS | Dec 2018
- Awarded the Pride Connect Award for being the Super Techie at C2E, Consumer Banking Analytics
- Workplace Catalyst, DBS | Jul 2018
- Awarded as part of the team named Workplace Catalyst across the whole DBS Tech Organisation
When not making a difference in the world, I spend a significant amount of leisure time watching TV shows or playing Apex Legends.
Let's connect on LinkedIn. Peace!