PROFILE
Experienced Data Science professional with a background in IT and Data Science projects, currently leading a high-performing team at Tokopedia, Hyderabad. Kaggle Competitions Expert and Kaggle Kernels Expert.
- 2021 - Present: Senior Lead Data Scientist at Tokopedia, Hyderabad
- 2019 - 2021: Data Scientist at ADP, Hyderabad
- 2017 - 2019: Data Scientist at DBS Bank, Hyderabad
- 2013 - 2017: Software Engineer at Tech Mahindra, Hyderabad
SKILLS & ABILITIES
Data Science, Machine Learning, Deep Learning, Python, R, AWS (EC2, Lambda, SageMaker), GCP (Vertex AI), MLOps, SQL, NLP, Gen AI (Prompt Engineering, RAG), LLMs (PaLM 2, Llama 2), ML Frameworks (Keras, TensorFlow, Flask, LlamaIndex), Vector Databases (Milvus, ChromaDB), Technical Leadership, Team Management
PROFESSIONAL EXPERIENCE
Tokopedia | Data Science Senior Lead | Jan 2023 - present | Hyderabad, India
- Search Ads - TopAds
- Leading 2 data science projects, Ads Ranking and Keyword Recommendation, managing 4 data scientists, and contributing to DS initiatives that have benefited the Ads DS team and GoTo DS.
- Delivering data science solutions for TopAds business problems and contributing to Tokopedia's revenue by using Gen AI (Vertex AI, PaLM 2 LLM) for augmented ad title generation for products with low or no impressions and for product keyword creation for products with low keyword supply.
- Overseeing research and development projects at various stages ranging from initial exploration to deployment into production systems.
- Collaborating with engineering teams to integrate successful A/B tested ML models into large-scale and highly complex production services.
Tokopedia | Principal Data Scientist | Dec 2021 - Dec 2022 | Hyderabad, India
- Real-time Ads Ranking - TopAds
- Worked on building a product, user and query context-aware Real-time Ads Ranking (RTR) system.
- Developed a tree-based Learning-to-Rank model to increase the relevance of ads displayed on the search results page (SRP), which enabled the Real-time Ranking system to achieve a 3.25% increase in CTR, a 2.24% increase in Revenue per Impression and a 1% increase in Avg CPC, while maintaining system-level ROAS >= 7.
- Deployed a highly scalable, low-latency, high-throughput model inference pipeline for the Real-time Ranking system, achieving sub-80 ms latency at 1K requests served per second.
- Optimised inference of an existing deep learning solution using post-training quantization techniques, which increased revenue by 1.43% (~1 billion IDR monthly) and reduced inference costs by up to 166 million IDR monthly.
ADP | Data Scientist | Nov 2019 - Dec 2021 | Hyderabad, India
Worked as a Data Scientist in the ADP DataCloud team building machine learning models for Job Classification and PayCode Classification.
- Job Classification - ADP DataCloud
- Technologies: Python 3, Machine Learning - NLP, Classification, AWS, Amazon SageMaker
- The Job Classification model normalises client-entered jobs to standard ADP Job Taxonomies, which enables the benchmarking of an organization's workforce metrics against industry benchmarks in ADP DataCloud.
- Built multi-lingual Job Classification for French and Spanish using Language-agnostic BERT Sentence Embedding (LaBSE) and multi-lingual Universal Sentence Encoder (mUSE) in a TensorFlow Keras model, and reduced time to market from 6 to 2 months.
- Developed a multi-class Keras classifier with FastText word embeddings on Amazon SageMaker and productionised it using Step Functions, ECR, SageMaker Endpoint and API Gateway.
- Deployed the model as both a batch transform job and a real-time endpoint.
- PayCode Classification - ADP DataCloud
- Technologies: Python 3, Machine Learning - NLP, Classification, AWS, Databricks
- The PayCode Classification model assigns a paycode group and paycode subgroup to the text entered by the end user when submitting a timesheet, which powers workforce trends analytics and reporting in ADP DataCloud.
- Developed a multi-class classifier using FastText on Amazon SageMaker and productionised it using AWS Batch, Step Functions, ECR, SageMaker Endpoint and API Gateway.
- Deployed the model as both a batch transform job and a real-time endpoint.
DBS | Data Scientist | Oct 2017 - Oct 2019 | Hyderabad, India
Worked as a Data Scientist in the Core Banking division and built machine learning models for Self-Service Banking - Cash In Transit and Cheque Analytics.
- Cheque Analytics - Self-Service Banking
- Technologies: Python 3, PySpark 2.3, Machine Learning - Classification
- The Cheque Risk Scoring model provides a risk score for all inward cheques using historic cheque attribute data and customer information, with class imbalance handled using SMOTE/TomekLinks.
- Developed a binary classifier using LightGBM and derived the best hyperparameters for the model using Bayesian Hyperparameter Optimisation.
- Cash In Transit (CIT) - Self-Service Banking
- Technologies: Python 3, PySpark 2.3, Machine Learning - Time Series Forecasting
- Forecast "no cash" situations for different types of ATM, BTM and CRS machines in PySpark.
- Developed a time-series forecasting model with FB Prophet to forecast ATM, BTM and CRS cash withdrawals, and backtested it against actual results to tune the model and reduce forecast error.
Tech Mahindra | Software Engineer | Nov 2013 - Oct 2017 | Hyderabad, India
Worked as a Software Engineer for the General Electric (GE) client on 4 different projects spanning data ingestion to data science.
- Technologies: Python, R, SQL, Machine Learning - Time Series Forecasting, Clustering
- Developed a Cash Collection machine learning model using Clustering (KMeans) and Decision Trees, and a GE US Payroll model using ARIMA.
- Converted JSON data obtained from APIs to R data frames and CSV.
- Developed dynamic Talend jobs for integration of data on Oracle, MySQL, Teradata and PostgreSQL databases using Talend components.
- Created Stored Procedures and finished code stubs received from team members.
EDUCATION
Electronics and Communications Engineering, GITAM University | 2009 – 2013 | Hyderabad, India
- Bachelor of Technology (B.Tech) in Electronics and Communications Engineering (ECE) with 8.4 CGPA
COURSES
Summer School on CV and DL, IIIT Hyderabad | Jul 2018 | Hyderabad, India
- Attended the Summer School on Computer Vision and Deep Learning conducted by the CVIT research group, IIIT Hyderabad
AWARDS
Super Techie, DBS | Dec 2018
- Awarded the Pride Connect Award for being the Super Techie at C2E, Consumer Banking Analytics
Workplace Catalyst, DBS | Jul 2018
- Awarded for being part of the team recognised as Workplace Catalyst across the whole DBS Tech Organisation
When not making a difference in the world, I spend a significant amount of leisure time watching TV shows or playing Apex Legends.
Let's connect on LinkedIn. Peace!